I have the following string, which is a discovery packet from a projector on our network:
AMXB<-SDKClass=VideoProjector><-UUID=ABCDEFG><-Make=DELL><-Model=S300w><-Revision=0.2.0>
I'm trying to write some Golang code that turns this into a map, so I can call details["UUID"] and have it return ABCDEFG. I wrote a regular expression that looks like this:
(?:UUID=)(?P<UUID>(.*?))>|(?:Make=)(?P<Make>(.*?))>|(?:Model=)(?P<Model>(.*?))>|(?:SDKClass=)(?P<SDKClass>(.*?))>
When I test it online with regex101, it seems to match everything just fine, except for the numbered groups, but I can easily ignore those:
MATCH 1
SDKClass [15-29] VideoProjector
- [15-29] VideoProjector
MATCH 2
UUID [37-49] B8AC6FDFE1E2
- [37-49] B8AC6FDFE1E2
MATCH 3
Make [57-61] DELL
- [57-61] DELL
MATCH 4
Model [70-75] S300w
- [70-75] S300w
But when I try it in Golang, I get different results (note: these results were tidied up using go-spew to make them easier to read):
([][]string) (len=4 cap=10) {
 ([]string) (len=9 cap=9) {
  (string) (len=24) "SDKClass=VideoProjector>",
  (string) "",
  (string) "",
  (string) "",
  (string) "",
  (string) "",
  (string) "",
  (string) (len=14) "VideoProjector",
  (string) (len=14) "VideoProjector"
 },
 ([]string) (len=9 cap=9) {
  (string) (len=18) "UUID=B8AC6FDFE1E2>",
  (string) (len=12) "B8AC6FDFE1E2",
  (string) (len=12) "B8AC6FDFE1E2",
  (string) "",
  (string) "",
  (string) "",
  (string) "",
  (string) "",
  (string) ""
 },
 ([]string) (len=9 cap=9) {
  (string) (len=10) "Make=DELL>",
  (string) "",
  (string) "",
  (string) (len=4) "DELL",
  (string) (len=4) "DELL",
  (string) "",
  (string) "",
  (string) "",
  (string) ""
 },
 ([]string) (len=9 cap=9) {
  (string) (len=12) "Model=S300w>",
  (string) "",
  (string) "",
  (string) "",
  (string) "",
  (string) (len=5) "S300w",
  (string) (len=5) "S300w",
  (string) "",
  (string) ""
 }
}
What's wrong with my regex and how do I fix it? I've tried just about every combination of expressions (I'm nearly a regex master now :\ )
As far as I can see, it works exactly as you wrote it, and identically in both regex101 and Go. The difference you observe comes only from how the results are presented.
Let's look closely at the results returned by regex101. For example, this one:
MATCH 1
SDKClass [15-29] `VideoProjector`
8. [15-29] `VideoProjector`
It basically says that it found two submatches: one is named, and the other is at index 8. Now let's look at the Go output:
([]string) (len=9 cap=9) {
 (string) (len=24) "SDKClass=VideoProjector>",
 (string) "",
 (string) "",
 (string) "",
 (string) "",
 (string) "",
 (string) "",
 (string) (len=14) "VideoProjector",
 (string) (len=14) "VideoProjector"
},
It says that it found two submatches, for groups 7 and 8. To get the name of group 7, call r.SubexpNames(), which returns the group names indexed by group number: r.SubexpNames()[7] is SDKClass.
So both return the same result.
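To make that concrete, here is a minimal sketch that pairs each submatch with its name from r.SubexpNames() and builds the map the question asks for. It reuses the regex from the question unchanged; the function name parseNamed is only illustrative, and skipping empty values is an assumption about what you want in the map.

```go
package main

import (
	"fmt"
	"regexp"
)

// parseNamed (hypothetical helper name) runs the alternation regex from the
// question and keeps only the groups that are named and non-empty.
func parseNamed(packet string) map[string]string {
	r := regexp.MustCompile(`(?:UUID=)(?P<UUID>(.*?))>|(?:Make=)(?P<Make>(.*?))>|(?:Model=)(?P<Model>(.*?))>|(?:SDKClass=)(?P<SDKClass>(.*?))>`)
	names := r.SubexpNames() // names[i] is "" for the whole match and for unnamed groups
	details := make(map[string]string)
	for _, m := range r.FindAllStringSubmatch(packet, -1) {
		for i, v := range m {
			// Only one alternative participates per match, so the other groups are empty.
			if names[i] != "" && v != "" {
				details[names[i]] = v
			}
		}
	}
	return details
}

func main() {
	packet := "AMXB<-SDKClass=VideoProjector><-UUID=ABCDEFG><-Make=DELL><-Model=S300w><-Revision=0.2.0>"
	details := parseNamed(packet)
	fmt.Println(details["UUID"], details["SDKClass"]) // ABCDEFG VideoProjector
}
```

Note that Revision is not in the map, simply because the original regex has no alternative for it.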
So with help from AlexAtNet, I got an answer -- enough to get me going. Here's my final code:
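// Each match looks like "<-Key=Value>".
r := regexp.MustCompile("<-([^=]+)=([^>]+)>")
match := r.FindAllString(string(msg), -1)
result := make(map[string]string)
for _, p := range match {
	// Strip the "<-" and ">" delimiters, then split on the first "=" only.
	p = strings.TrimSuffix(strings.TrimPrefix(p, "<-"), ">")
	split := strings.SplitN(p, "=", 2)
	result[split[0]] = split[1]
}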
The results come out like so:
([]string) (len=4 cap=10) {
 (string) (len=23) "SDKClass=VideoProjector",
 (string) (len=17) "UUID=B8AC6FDFE1E2",
 (string) (len=9) "Make=DELL",
 (string) (len=11) "Model=S300w",
 (string) (len=14) "Revision=0.2.0"
}
but I can simply Split() the strings by = and get both the property name and the value.
I'm still looking for an improvement to my regex and / or code, just so I can see how to do it right without needing additional splits or excessive code.
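One possible way to drop the extra split, sketched here rather than offered as the definitive answer: since the regex already has two capture groups, regexp's FindAllStringSubmatch hands back the key and the value directly as m[1] and m[2]. The names pairRe and parseDiscovery are made up for this example, and it assumes every field has the form <-Key=Value> with no = in the key and no > in the value.

```go
package main

import (
	"fmt"
	"regexp"
)

// pairRe captures the key in group 1 and the value in group 2
// of each "<-Key=Value>" field.
var pairRe = regexp.MustCompile(`<-([^=]+)=([^>]+)>`)

// parseDiscovery (hypothetical helper name) turns a discovery packet
// into a key/value map without any additional string splitting.
func parseDiscovery(msg string) map[string]string {
	result := make(map[string]string)
	for _, m := range pairRe.FindAllStringSubmatch(msg, -1) {
		// m[0] is the whole field, m[1] the key, m[2] the value.
		result[m[1]] = m[2]
	}
	return result
}

func main() {
	msg := "AMXB<-SDKClass=VideoProjector><-UUID=ABCDEFG><-Make=DELL><-Model=S300w><-Revision=0.2.0>"
	details := parseDiscovery(msg)
	fmt.Println(details["UUID"])     // ABCDEFG
	fmt.Println(details["Revision"]) // 0.2.0
}
```

Because the pattern is generic, any new field the projector adds to the packet ends up in the map without touching the regex.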