Say I want to extract all numbers from a string (most likely using regex matching) and also replace each match with a generic placeholder like "#".
This is easily done in two steps using FindAll, then ReplaceAll. However, I have serious doubts about the performance cost of doing so.
So take a string
"sdasd 3.2% sadas 6 ... +8.9"
replace it with
"sdasd #% sadas # ... +#"
and get a slice
[3.2,6.0,8.9]
In the most performant way possible.
Edit: I implemented regexp.FindAllString + regexp.ReplaceAllString and the performance hit to my app was minimal. I hope to try Elliot Chance's approach and compare the two when I have time.
If you need raw performance, then regexp is rarely the way to achieve it, even if it is convenient. Iterating token by token should be pretty fast. Some code:
// Needs the "fmt", "strconv" and "strings" packages.
input := "sdasd 3.2 sadas 6"
output := []string{}
numbers := []float64{}
for _, tok := range strings.Split(input, " ") {
	// A token that parses as a float is recorded and replaced with "#".
	if f, err := strconv.ParseFloat(tok, 64); err == nil {
		numbers = append(numbers, f)
		tok = "#"
	}
	output = append(output, tok)
}
finalString := strings.Join(output, " ")
fmt.Println(finalString, numbers) // sdasd # sadas # [3.2 6]
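I'm sure there are a few more optimizations that could be made, but this is the general approach I'd take. Note that splitting on spaces only catches numbers that stand alone as tokens, so something like "3.2%" from the question would be left untouched.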
Never underestimate the power of regex, especially the RE2 engine of Go.
Also, never, ever, assume anything about performance without benchmarking. It always surprises.
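A quick way to get numbers without setting up a _test.go file is testing.Benchmark, which can be called from an ordinary program. The sketch below is only illustrative: maskRegexp, maskTokens and the pattern are names and choices I made up for the comparison, and the input is the simple space-separated string so that both functions produce the same result.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"testing"
)

// Assumed pattern: unsigned decimals such as "6" or "3.2".
var numRe = regexp.MustCompile(`\d+(?:\.\d+)?`)

// sink keeps the results alive so the calls are not optimized away.
var sink string

func maskRegexp(s string) string { return numRe.ReplaceAllString(s, "#") }

func maskTokens(s string) string {
	toks := strings.Split(s, " ")
	for i, tok := range toks {
		if _, err := strconv.ParseFloat(tok, 64); err == nil {
			toks[i] = "#"
		}
	}
	return strings.Join(toks, " ")
}

func main() {
	input := "sdasd 3.2 sadas 6"
	cases := []struct {
		name string
		fn   func(string) string
	}{
		{"regexp", maskRegexp},
		{"tokens", maskTokens},
	}
	for _, c := range cases {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = c.fn(input)
			}
		})
		// Prints the iteration count and ns/op for each variant.
		fmt.Printf("%-8s %s\n", c.name, res)
	}
}
```

The absolute figures will vary by machine and Go version; the point is to compare the variants on input that resembles your real data.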
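Note that Go's regexp package does not cache compiled patterns for you. Compile the expression once with regexp.MustCompile (for example in a package-level variable) and reuse it, rather than recompiling on every call.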