I have a large dataset where I need to do some string manipulation (I know strings are immutable). The Replace() function in the strings package does exactly what I need, except I need it to search in reverse.
Say I have this string: AA-BB-CC-DD-EE
Run this script:
package main

import (
	"fmt"
	"strings"
)

func main() {
	fmt.Println(strings.Replace("AA-BB-CC-DD-EE", "-", "", 1))
}
It outputs: AABB-CC-DD-EE
What I need is: AA-BBCCDDEE, where the first instance of the search key is kept and the rest are discarded.
Splitting the string, inserting the dash, and joining it back together works, but I suspect there is a more performant way to achieve this.
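For reference, the split-and-join approach mentioned above might look roughly like this. It is only a sketch: the helper name keepFirstSplit is made up for illustration and is not part of the strings package.

```go
package main

import (
	"fmt"
	"strings"
)

// keepFirstSplit (hypothetical helper) keeps the first occurrence of sep
// in s and removes the rest by splitting on sep and rejoining the pieces.
func keepFirstSplit(s, sep string) string {
	parts := strings.Split(s, sep)
	if len(parts) < 2 {
		return s // sep does not occur in s
	}
	// Put sep back after the first piece only; glue the rest together.
	return parts[0] + sep + strings.Join(parts[1:], "")
}

func main() {
	fmt.Println(keepFirstSplit("AA-BB-CC-DD-EE", "-")) // AA-BBCCDDEE
}
```

This allocates a slice of substrings plus the joined result, which is the overhead the question is hoping to avoid.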
String slices!
in := "AA-BB-CC-DD-EE"
afterDash := strings.Index(in, "-") + 1
fmt.Println(in[:afterDash] + strings.Replace(in[afterDash:], "-", "", -1))
(With no dashes in the input, strings.Index returns -1, so afterDash is 0 and the input comes back unchanged. It might require some tweaking if you want different behavior in that case.)
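If you want the no-match case to be explicit, and want to support separators longer than one byte, the same slicing idea can be wrapped in a small function. This is a sketch under those assumptions; keepFirst is a hypothetical name, and strings.ReplaceAll needs Go 1.12 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// keepFirst (hypothetical helper) keeps the first occurrence of sep in s
// and removes every later occurrence.
func keepFirst(s, sep string) string {
	i := strings.Index(s, sep)
	if i < 0 {
		return s // sep not found: return the input unchanged
	}
	cut := i + len(sep) // index just past the first separator
	return s[:cut] + strings.ReplaceAll(s[cut:], sep, "")
}

func main() {
	fmt.Println(keepFirst("AA-BB-CC-DD-EE", "-")) // AA-BBCCDDEE
	fmt.Println(keepFirst("AABB", "-"))           // AABB
	fmt.Println(keepFirst("a--b--c", "--"))       // a--bc
}
```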
Here is another solution, which reverses the string before and after replacing:
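package main

import (
	"fmt"
	"strings"
)

// Reverse returns s with its runes in reverse order.
func Reverse(s string) string {
	n := len(s)
	runes := make([]rune, n)
	for _, r := range s { // r, not rune, to avoid shadowing the built-in type
		n--
		runes[n] = r
	}
	return string(runes[n:])
}

func main() {
	S := "AA-BB-CC-DD-EE"
	// In the reversed string the dashes to drop come first, so replace all but one.
	S = Reverse(strings.Replace(Reverse(S), "-", "", strings.Count(S, "-")-1))
	fmt.Println(S)
}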
Another solution:
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Note: this assumes the input never contains the placeholder "*".
	S := strings.Replace("AA-BB-CC-DD-EE", "-", "*", 1)
	S = strings.Replace(S, "-", "", -1)
	fmt.Println(strings.Replace(S, "*", "-", 1))
}
This was a fun question to answer. While the solutions offered work neatly, splitting and replacing, to say nothing of calling Replace three times, doesn't seem likely to be performant.
The answer? Don't reinvent the wheel. The Go standard library has already almost solved this problem with Replace(), so let's tweak it. I stumbled a bit over how the API of the new function should work, and finally settled on keeping the shape of the strings.Replace signature with minimal change:
func ReplaceAfter(s,old,new string,skip int) string
The variable skip replaces n to clarify what it does, since the caller specifies how many instances of old to skip before replacing. skip==0 is defined as replacing every instance and skip==-1 is defined as replacing no instances.
From here there were really only a few bits of the function that needed changing.
// Requires the imports "strings" and "unicode/utf8".
func ReplaceAfter(s, old, new string, skip int) string {
	if old == new || skip == -1 { // changed
		return s // avoid allocation
	}

	// Compute number of replacements.
	m := strings.Count(s, old)
	if m == 0 || m < skip { // changed
		return s // avoid allocation
	} // changed (removed else if)

	// Apply replacements to buffer.
	n := m - skip // changed, n means the same thing but is calculated
	t := make([]byte, len(s)+n*(len(new)-len(old))) // longer buffer
	w := 0
	start := 0
	for i := 0; i < m; i++ {
		j := start
		if len(old) == 0 {
			if i > 0 {
				_, wid := utf8.DecodeRuneInString(s[start:])
				j += wid
			}
		} else {
			j += strings.Index(s[start:], old)
		}
		if i >= skip { // changed, replace
			w += copy(t[w:], s[start:j])
			w += copy(t[w:], new)
		} else { // changed, skip ahead
			w += copy(t[w:], s[start:j+len(old)])
		}
		start = j + len(old)
	}
	w += copy(t[w:], s[start:])
	return string(t[0:w])
}
Here's a playground link with a working demo. If you're interested, I also copied and adapted the relevant Test functions from go/src/strings/ to make sure that the function as written behaves predictably.
I think you want to use strings.Map rather than rigging things with compositions of functions. It's basically meant for this scenario: character replacement with more complex requirements than Replace and cousins can handle. The definition:
Map returns a copy of the string s with all its characters modified according to the mapping function. If mapping returns a negative value, the character is dropped from the string with no replacement.
Your mapping function can be built with a fairly simple closure:
func makeReplaceFn(toReplace rune, skipCount int) func(rune) rune {
	count := 0
	return func(r rune) rune {
		if r == toReplace && count < skipCount {
			count++
		} else if r == toReplace && count >= skipCount {
			return -1
		}
		return r
	}
}
From there, it's a very straightforward program:
strings.Map(makeReplaceFn('-', 1), "AA-BB-CC-DD-EE")
On the Playground, this produces the desired output:
AA-BBCCDDEE
Program exited.
I'm not sure whether this is faster or slower than the other solutions without benchmarking: on one hand it has to call a function for each rune in the string, while on the other hand it doesn't have to convert (and thus copy) between a []byte/[]rune and a string between each function call (though the subslicing answer by hobbs is probably the best overall).
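One way to settle the speed question is to measure it. The sketch below uses testing.Benchmark, which can be called from an ordinary program, to compare the subslicing approach with a strings.Map closure. The function names viaSlice and viaMap and the input size are assumptions chosen for illustration, and the numbers will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// viaSlice keeps the first dash using the subslicing approach.
func viaSlice(in string) string {
	afterDash := strings.Index(in, "-") + 1
	return in[:afterDash] + strings.Replace(in[afterDash:], "-", "", -1)
}

// viaMap keeps the first dash using strings.Map with a counting closure.
func viaMap(in string) string {
	seen := 0
	return strings.Map(func(r rune) rune {
		if r != '-' {
			return r
		}
		seen++
		if seen > 1 {
			return -1 // drop every dash after the first
		}
		return r
	}, in)
}

func main() {
	// Hypothetical input: the sample string repeated to make it longer.
	in := strings.Repeat("AA-BB-CC-DD-EE", 100)

	cases := []struct {
		name string
		fn   func(string) string
	}{
		{"slice", viaSlice},
		{"map", viaMap},
	}
	for _, c := range cases {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				c.fn(in)
			}
		})
		fmt.Printf("%-6s %s\n", c.name, res)
	}
}
```

Each line of output shows the iteration count and time per operation for one approach. For a real comparison, run it against data that resembles your own dataset.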
In addition, the method can be easily adapted to other scenarios (e.g. retaining every other dash), with the caveat that strings.Map can only do rune to rune mapping, and not rune to string mapping like strings.Replace does.