Currently, I have the following code:
cmd := "echo \"Hello world\"!\x00"
re := regexp.MustCompile(`[^\s"']+|"([^"]*)"|'([^']*)`)
args := re.FindAllString(cmd, -1)
fmt.Printf("%v\n", args)
This yields [echo "Hello world" !], but I want the output to be [echo "Hello world"!] (basically, a quoted section should be kept together as one item in the array, but the closing quote should not immediately start the next item in the array).
How would I go about doing this?
You are explicitly matching a double quote ("), then any number of non-quote characters ([^"]*), then a closing double quote, so of course the match ends right after the second quote. If you were to wrap that alternative with [^\s"']* (matching anything but whitespace and quote characters) on both sides, inside a repeated grouping, I think it may give you what you are looking for. Let me know if this result is satisfactory.
re := regexp.MustCompile(`[^\s"']+|([^\s"']*"([^"]*)"[^\s"']*)+|'([^']*)`)
Playground example: https://play.golang.org/p/fWWsx7dIIRd
I'm not well versed in regular expression efficiency, so pardon me if this adds too much complexity to the expression.
EDIT: One caveat to this specific expression is that a single " will break something into two results, e.g. hi"there would split into hi and there.
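To make the difference concrete, here is a small sketch that runs the question's pattern and the wrapped pattern side by side. The variable names original and wrapped are only illustrative, and the trailing \x00 from the question's input is left out to keep the printed output readable.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// original is the pattern from the question.
	original = regexp.MustCompile(`[^\s"']+|"([^"]*)"|'([^']*)`)
	// wrapped allows non-whitespace, non-quote characters on both sides
	// of a double-quoted segment, and lets that unit repeat.
	wrapped = regexp.MustCompile(`[^\s"']+|([^\s"']*"([^"]*)"[^\s"']*)+|'([^']*)`)
)

func main() {
	inputs := []string{
		`echo "Hello world"!`, // the question's input, without the trailing NUL
		`hi"there`,            // the caveat: a single unmatched double quote
	}
	for _, in := range inputs {
		// %q quotes every item, so the item boundaries are easy to see.
		fmt.Printf("input:    %q\n", in)
		fmt.Printf("original: %q\n", original.FindAllString(in, -1))
		fmt.Printf("wrapped:  %q\n", wrapped.FindAllString(in, -1))
	}
}
```

For the first input, the original pattern returns three items (echo, "Hello world", and !), while the wrapped pattern returns two (echo and "Hello world"!). For hi"there, both patterns return hi and there, which is the caveat described in the edit.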
Improved regex. This matches either a double-quoted segment followed by any trailing non-whitespace characters, or a run of non-whitespace characters. It can also handle a stray, unmatched quote.
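package main

import (
	"fmt"
	"regexp"
)

func main() {
	cmd := "echo \"Hello world\"!\x00"
	// Match a double-quoted segment plus any trailing non-whitespace,
	// or else a plain run of non-whitespace characters.
	re := regexp.MustCompile(`("[^"]+?"\S*|\S+)`)
	args := re.FindAllString(cmd, -1)
	// Printf (not Println) is needed for the %v verb to be interpreted.
	fmt.Printf("%v\n", args)
	fmt.Printf("%v\n", len(args))
}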
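As a rough check of the claims above, the following sketch runs the same pattern over a few extra inputs, including one with a stray quote. The helper name splitArgs and the extra sample inputs are assumptions added for illustration and are not part of the original answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// argRe is the pattern from the answer above.
var argRe = regexp.MustCompile(`("[^"]+?"\S*|\S+)`)

// splitArgs is a hypothetical helper that returns every match of argRe in cmd.
func splitArgs(cmd string) []string {
	return argRe.FindAllString(cmd, -1)
}

func main() {
	samples := []string{
		`echo "Hello world"!`, // quoted segment with a trailing character
		`hi"there`,            // stray, unmatched double quote
		`say "a b" "c d"`,     // two separate quoted segments
	}
	for _, cmd := range samples {
		// %q quotes every item, so the item boundaries are easy to see.
		fmt.Printf("%q -> %q\n", cmd, splitArgs(cmd))
	}
}
```

With these inputs, the first sample splits into echo and "Hello world"!, the stray-quote sample stays together as the single item hi"there, and the last sample yields say, "a b", and "c d".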