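I have a superheroes string. Every hero in it has a name, but not every hero has attributes.
The format is ⛦name⛯attrName☾attrData☽, where the attrName☾attrData☽ part is optional and may repeat.
The superheroes string looks like this:
⛦superman⛯shirt☾blue☽⛦joker⛯⛦spiderman⛯age☾15yo☽girlFriend☾Cindy☽
I want to use a regex to extract the data and fill the result into a slice of maps, like this: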
[ {name: superman, shirt: blue},
{name: joker},
{name: spiderman, age: 15yo, girlFriend: Cindy} ]
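I could not get this working in Go. I tried the regex ⛦(\w+)⛯(?:(\w+)☾(\w+)☽)*, but it captures only a single attribute per name; for example, it fails to capture the age attribute.
My code: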
package main
import (
	"fmt"
	"regexp"
)
func main() {
	re := regexp.MustCompile("⛦(\\w+)⛯(?:(\\w+)☾(\\w+)☽)*")
	fmt.Printf("%q\n", re.FindAllStringSubmatch("⛦superman⛯shirt☾blue☽⛦joker⛯⛦spiderman⛯age☾15yo☽girlFriend☾Cindy☽", -1))
}
The Go code is here: https://play.golang.org/p/Epv66LVwuRK
It prints the following:
[
["⛦superman⛯shirt☾blue☽" "superman" "shirt" "blue"]
["⛦joker⛯" "joker" "" ""]
["⛦spiderman⛯age☾15yo☽girlFriend☾Cindy☽" "spiderman" "girlFriend" "Cindy"]
]
The age attribute is missing. What went wrong?
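You cannot capture an arbitrary number of substrings with a single capturing group: when a group is repeated, only the text from its last iteration is kept, which is why age is overwritten by girlFriend. You need to match the whole record first, and then match its subparts with another regex.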
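The "last iteration wins" behavior is easy to check in isolation. The following is a minimal sketch, not part of the original code: the helper name lastIteration and the comma-separated input are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// lastIteration is a hypothetical helper: it matches a comma-separated
// list of single word characters with a repeated capturing group and
// returns whatever Group 1 holds after the match.
func lastIteration(s string) string {
	re := regexp.MustCompile(`^(?:(\w),?)*$`)
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// The group matched "a", "b" and "c" in turn, but only the last survives.
	fmt.Println(lastIteration("a,b,c")) // prints "c"
}
```

The same thing happens to (\w+)☾(\w+)☽ inside (?:...)* in the question: each repetition overwrites the previous capture.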
See an example of the two-step approach:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := "⛦superman⛯shirt☾blue☽⛦joker⛯⛦spiderman⛯age☾15yo☽girlFriend☾Cindy☽"
	// Group 1: the name; Group 2: the whole run of key☾value☽ pairs (may be empty).
	reMain := regexp.MustCompile(`⛦(\w+)⛯((?:\w+☾\w+☽)*)`)
	// Applied to Group 2 only, to pull out each key and value.
	reAux := regexp.MustCompile(`(\w+)☾(\w+)☽`)
	for _, match := range reMain.FindAllStringSubmatch(str, -1) {
		fmt.Printf("%v\n", match[1])
		for _, matchAux := range reAux.FindAllStringSubmatch(match[2], -1) {
			fmt.Printf("%v: %v\n", matchAux[1], matchAux[2])
		}
		fmt.Println("--END OF MATCH--")
	}
}
See the Go demo
Output:
superman
shirt: blue
--END OF MATCH--
joker
--END OF MATCH--
spiderman
age: 15yo
girlFriend: Cindy
--END OF MATCH--
Here, ⛦(\w+)⛯((?:\w+☾\w+☽)*) is the main regex: it captures the main "key" (the name) into Group 1 and the whole run of key-value pairs that follows it into Group 2. Then iterate over the matches and collect every key-value pair from Group 2 using (\w+)☾(\w+)☽.
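The question asks for a slice of maps rather than printed lines. The same two regexes can fill one; the following is a sketch under a few assumptions: the helper name parseHeroes is hypothetical, the name is stored under the key "name" as in the desired output, and an attribute that is itself called "name" would overwrite it.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Group 1: the name; Group 2: the run of key☾value☽ pairs (may be empty).
	recordRe = regexp.MustCompile(`⛦(\w+)⛯((?:\w+☾\w+☽)*)`)
	// Applied to Group 2 of each record to split it into pairs.
	pairRe = regexp.MustCompile(`(\w+)☾(\w+)☽`)
)

// parseHeroes (hypothetical helper) returns one map per record,
// holding the name plus any attributes found after it.
func parseHeroes(s string) []map[string]string {
	var heroes []map[string]string
	for _, rec := range recordRe.FindAllStringSubmatch(s, -1) {
		hero := map[string]string{"name": rec[1]}
		for _, kv := range pairRe.FindAllStringSubmatch(rec[2], -1) {
			hero[kv[1]] = kv[2]
		}
		heroes = append(heroes, hero)
	}
	return heroes
}

func main() {
	s := "⛦superman⛯shirt☾blue☽⛦joker⛯⛦spiderman⛯age☾15yo☽girlFriend☾Cindy☽"
	for _, hero := range parseHeroes(s) {
		fmt.Println(hero)
	}
}
```

Compiling both patterns once at package level avoids recompiling them on every call; a record without attributes simply yields a map that contains only the name.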
Your regex, ⛦(\w+)⛯(?:(\w+)☾(\w+)☽)*, contains only one key group and one value group, so each match can report only one key/value pair. This is what it prints:
[["⛦superman⛯shirt☾blue☽" "superman" "shirt" "blue"]
["⛦joker⛯" "joker" "" ""]
["⛦spiderman⛯age☾15yo☽girl☾Cindy☽" "spiderman" "girl" "Cindy"]]
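I added one more key/value group to the regex so that it returns the age value as well. Note that each group has to be optional (?) rather than repeated (*): a greedy (...)* would consume every pair itself and leave nothing for the second group. This handles at most two attributes per name:
re := regexp.MustCompile("⛦(\\w+)⛯(?:(\\w+)☾(\\w+)☽)?(?:(\\w+)☾(\\w+)☽)?")
fmt.Printf("%q\n", re.FindAllStringSubmatch("⛦superman⛯shirt☾blue☽⛦joker⛯⛦spiderman⛯age☾15yo☽girl☾Cindy☽", -1))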