I am trying to write a regex in Go to match S3 bucket URLs.
So far I have:
"https://s3.amazonaws.com/(.+?)/",
"http://s3.amazonaws.com/(.+?)/",
"//s3-us-east-2.amazonaws.com/(.+?)/",
"//s3-us-west-1.amazonaws.com/(.+?)/",
"//s3-us-west-2.amazonaws.com/(.+?)/",
"//s3.ca-central-1.amazonaws.com/(.+?)/",
"//s3-ap-south-1.amazonaws.com/(.+?)/",
"//s3-ap-northeast-2.amazonaws.com/(.+?)/",
"//s3-ap-southeast-1.amazonaws.com/(.+?)/",
"//s3-ap-northeast-1.amazonaws.com/(.+?)/",
"//s3-eu-central-1.amazonaws.com/(.+?)/",
"//s3-eu-west-1.amazonaws.com/(.+?)/",
"//s3-eu-west-2.amazonaws.com/(.+?)/",
"//s3-eu-west-3.amazonaws.com/(.+?)/",
"//s3.sa-east-1.amazonaws.com/(.+?)/",
"https://(.+?).s3.amazonaws.com",
"//s3.amazonaws.com/([A-z0-9-]+)",
"//s3-ap-southeast-2.amazonaws.com/(.+?)/",
But this is overkill, so I was looking at
//s3.amazonaws.com/([A-z0-9-]+)
but this misses the . in bucket names. When I try //s3.amazonaws.com/([A-z0-9-]\.+) instead, it does not match any of the strings.
I am currently trying to match it against
//s3.amazonaws.com/bucket.name/
and
//s3.amazonaws.com/bucket-name-here
Any suggestions?
In your regex you use [A-z0-9-]. Note that [A-z] is different from [A-Za-z]: it also matches the six ASCII characters that sit between Z and a, namely [, \, ], ^, _ and `.
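As a rough illustration of that difference, the sketch below compares the two ranges on a few strings. The variable names `loose` and `strict` and the sample inputs are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// loose uses the range from the question; strict spells out both letter ranges.
// Both names are hypothetical and only serve this comparison.
var (
	loose  = regexp.MustCompile(`^[A-z]+$`)
	strict = regexp.MustCompile(`^[A-Za-z]+$`)
)

func main() {
	// "_", "^", "[" and "]" lie between 'Z' and 'a' in ASCII,
	// so [A-z] accepts them while [A-Za-z] does not.
	for _, s := range []string{"bucket", "a_b", "a^b", "a[b]"} {
		fmt.Printf("%-8q loose=%-5v strict=%v\n", s, loose.MatchString(s), strict.MatchString(s))
	}
}
```

If only letters, digits, dots and hyphens should be allowed, [A-Za-z0-9.-] is probably the safer class.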
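To match a literal dot, escape it: \.
The part ([A-z0-9-]\.+) in the regex //s3.amazonaws.com/([A-z0-9-]\.+) matches exactly one character from the character class, followed by one or more dots, like j..... That is why it matches neither of your example strings: after the slash, the first character of the bucket name is not followed by a dot.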
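To see this behaviour directly, here is a small sketch that runs the pattern from the question, unchanged, against the two example strings and against an input of the `j.....` shape. The variable name `attempt` is only an assumption for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// attempt is the pattern from the question, copied as-is.
var attempt = regexp.MustCompile(`//s3.amazonaws.com/([A-z0-9-]\.+)`)

func main() {
	inputs := []string{
		"//s3.amazonaws.com/bucket.name/",     // 'b' is followed by 'u', not a dot
		"//s3.amazonaws.com/bucket-name-here", // contains no dot after the slash
		"//s3.amazonaws.com/j.....",           // one class character, then dots
	}
	for _, in := range inputs {
		// FindStringSubmatch returns nil when the pattern does not match.
		fmt.Printf("%q -> %q\n", in, attempt.FindStringSubmatch(in))
	}
}
```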
To fully match the two URLs from your example, you could add a dot to the character class and an optional forward slash at the end. You can also omit the capturing group (the parentheses around the character class) if you only want to match the full URL and do not need the captured data for anything else.
//s3\.amazonaws\.com/[.A-z0-9-]+/?
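If you do want the bucket name itself, a minimal sketch in Go could keep the capturing group. Note the assumptions: the helper `bucketName` is hypothetical, and the class is written as [A-Za-z0-9.-] rather than [.A-z0-9-] so that characters such as _ and ^ are not accepted.

```go
package main

import (
	"fmt"
	"regexp"
)

// pathStyle is a variant of the pattern above with a capturing group
// around the bucket name and explicit letter ranges (an assumption).
var pathStyle = regexp.MustCompile(`//s3\.amazonaws\.com/([A-Za-z0-9.-]+)/?`)

// bucketName is a hypothetical helper: it returns the captured bucket
// name and true, or an empty string and false when nothing matches.
func bucketName(url string) (string, bool) {
	m := pathStyle.FindStringSubmatch(url)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	for _, u := range []string{
		"//s3.amazonaws.com/bucket.name/",
		"//s3.amazonaws.com/bucket-name-here",
	} {
		name, ok := bucketName(u)
		fmt.Println(u, name, ok)
	}
}
```

The pattern is not anchored, so it also finds a URL embedded in a longer string, which seems to be what the original list of patterns was used for.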
Looking at the other URLs in your example, maybe this regex can help, and you can adapt it to your further requirements:
(?:https?:)?//[A-z0-9.-]+\.amazonaws\.com(?:/(?:[A-z0-9.-]*/?))?
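As a quick check, the sketch below compiles that last pattern in Go and tries it on a few of the URL shapes from the question. The helper name `isS3URL` and the concrete bucket names are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// general is the broader pattern from above, copied unchanged.
var general = regexp.MustCompile(`(?:https?:)?//[A-z0-9.-]+\.amazonaws\.com(?:/(?:[A-z0-9.-]*/?))?`)

// isS3URL is a hypothetical helper that reports whether s contains
// something the pattern recognises. The pattern is not anchored.
func isS3URL(s string) bool {
	return general.MatchString(s)
}

func main() {
	for _, u := range []string{
		"https://s3.amazonaws.com/bucket/",
		"//s3-us-west-2.amazonaws.com/bucket/",
		"//s3.ca-central-1.amazonaws.com/bucket.name/",
		"https://my-bucket.s3.amazonaws.com",
		"https://example.com/bucket/",
	} {
		fmt.Printf("%-48s %v\n", u, isS3URL(u))
	}
}
```

Keep in mind that this pattern accepts any host ending in .amazonaws.com, not only S3 endpoints, so you may want to tighten the host part for your use case.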