Let us say I have the following string: "Algorithms 1" by Robert Sedgewick. This is input from the terminal.
The format of this string will always be:
1. Starts with a double quote
2. Followed by characters (may contain space)
3. Followed by double quote
4. Followed by space
5. Followed by the word "by"
6. Followed by space
7. Followed by characters (may contain space)
Knowing the above format, how do I read this?
I tried using fmt.Scanf(), but that treats each space-separated word as a separate value. I looked at regular expressions, but I could not tell whether there is a function that lets me extract the values rather than just test for validity.
You should use groups (parentheses) to get out the information you want:
"([\w\s]*)"\sby\s([\w\s]+)\.
This returns two groups:
Algorithms 1
Robert Sedgewick
There should be a regex method that gets all matches out of a text, with each match carrying its groups.
In Go, I think it is FindAllStringSubmatch, which returns one slice per match holding the whole match followed by its groups (https://github.com/StefanSchroeder/Golang-Regex-Tutorial/blob/master/01-chapter2.markdown)
Test it out here: https://regex101.com/r/cT2sC5/1
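A minimal Go sketch of the group approach. It assumes the trailing `\.` is dropped from the pattern, because the sample input does not end in a period; the names `bookRe` and `findBooks` are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// bookRe is the pattern from above without the trailing `\.`
// (an assumption: the sample input has no final period).
var bookRe = regexp.MustCompile(`"([\w\s]*)"\sby\s([\w\s]+)`)

// findBooks is a hypothetical helper that returns one [title, author]
// pair for every match found in text.
func findBooks(text string) [][2]string {
	var out [][2]string
	for _, m := range bookRe.FindAllStringSubmatch(text, -1) {
		// m[0] is the whole match; m[1] and m[2] are the two groups.
		out = append(out, [2]string{m[1], m[2]})
	}
	return out
}

func main() {
	for _, b := range findBooks(`"Algorithms 1" by Robert Sedgewick`) {
		fmt.Println("Title:", b[0])
		fmt.Println("Author:", b[1])
	}
}
```

Note that `[\w\s]` only accepts letters, digits, underscores and whitespace, so a title or author containing punctuation would not match this pattern.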
The input format is so simple that you can use the character search implemented in strings.IndexRune():
s := `"Algorithms 1" by Robert Sedgewick`
s = s[1:] // Exclude first double quote
x := strings.IndexRune(s, '"') // Find the 2nd double quote
title := s[:x] // Title is between the 2 double quotes
author := s[x+5:] // Which is followed by " by ", exclude that, rest is author
Printing results with:
fmt.Println("Title:", title)
fmt.Println("Author:", author)
Output:
Title: Algorithms 1
Author: Robert Sedgewick
Try it on the Go Playground.
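The snippet above trusts the input completely: if the closing quote is missing, strings.IndexRune() returns -1 and the slice expression panics. A defensive variant might look like the sketch below; the helper name `parseByIndex` and its `ok` result are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseByIndex is a hypothetical helper; ok is false when the input
// does not look like `"title" by author`.
func parseByIndex(s string) (title, author string, ok bool) {
	const sep = `" by `
	if !strings.HasPrefix(s, `"`) {
		return "", "", false
	}
	s = s[1:]                      // exclude the first double quote
	x := strings.IndexRune(s, '"') // index of the 2nd double quote, or -1
	if x < 0 || !strings.HasPrefix(s[x:], sep) {
		return "", "", false
	}
	return s[:x], s[x+len(sep):], true
}

func main() {
	title, author, ok := parseByIndex(`"Algorithms 1" by Robert Sedgewick`)
	if !ok {
		fmt.Println("unexpected format")
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```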
Another solution would be to use strings.Split():
s := `"Algorithms 1" by Robert Sedgewick`
parts := strings.Split(s, `"`)
title := parts[1] // First part is empty, 2nd is title
author := parts[2][4:] // 3rd is author, but cut off " by "
Output is the same. Try it on the Go Playground.
If we cut off the first double quote, we can split on the separator `" by `. Doing so yields exactly 2 parts: title and author. Since we cut off the first double quote, the separator can only occur at the end of the title (the title cannot contain double quotes as per your rules):
s := `"Algorithms 1" by Robert Sedgewick`
parts := strings.Split(s[1:], `" by `)
title := parts[0] // First part is exactly the title
author := parts[1] // 2nd part is exactly the author
Try it on the Go Playground.
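On Go 1.18 or newer, the same idea can be written with strings.Cut(), which splits around the first occurrence of a separator and reports whether it was found. This is a swapped-in alternative to strings.Split(), not part of the solution above, and the helper name `parseByCut` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parseByCut is a hypothetical helper built on strings.Cut (Go 1.18+).
// ok is false when the leading quote or the `" by ` separator is missing.
func parseByCut(s string) (title, author string, ok bool) {
	if !strings.HasPrefix(s, `"`) {
		return "", "", false
	}
	// Drop the first double quote, then cut at the first `" by `.
	title, author, found := strings.Cut(s[1:], `" by `)
	if !found {
		return "", "", false
	}
	return title, author, true
}

func main() {
	title, author, ok := parseByCut(`"Algorithms 1" by Robert Sedgewick`)
	if !ok {
		fmt.Println("unexpected format")
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```

Unlike indexing into the result of strings.Split(), this does not panic when the separator is absent.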
If after all the above solutions you still want to use regexp, here's how you could do it:
Use parentheses to define the submatches you want to get out. You want 2 parts: the title between quotes and the author that follows "by". You can use regexp.FindStringSubmatch() to get the matching parts. Note that the first element in the returned slice is the text of the whole match (here, the complete input), so the relevant parts are the subsequent elements:
s := `"Algorithms 1" by Robert Sedgewick`
r := regexp.MustCompile(`"([^"]*)" by (.*)`)
parts := r.FindStringSubmatch(s)
title := parts[1] // parts[0] is the whole match; parts[1] is the title
author := parts[2] // parts[2] is the author
Try it on the Go Playground.
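Since the string comes from the terminal, it has to be read as a whole line first; fmt.Scanf() splits on spaces, but bufio.Scanner reads line by line. The sketch below combines that with the regexp above. The `^` and `$` anchors and the names `lineRe` and `parseLine` are assumptions added for illustration, and the nil check covers the case where FindStringSubmatch() finds no match.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

// lineRe is the pattern from above, anchored to the whole line
// (the anchors are an assumption, not part of the original pattern).
var lineRe = regexp.MustCompile(`^"([^"]*)" by (.*)$`)

// parseLine is a hypothetical helper; ok is false if the line
// does not match the expected format.
func parseLine(line string) (title, author string, ok bool) {
	parts := lineRe.FindStringSubmatch(line)
	if parts == nil { // no match
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	sc := bufio.NewScanner(os.Stdin)
	if !sc.Scan() { // read one full line, spaces included
		fmt.Fprintln(os.Stderr, "no input")
		return
	}
	title, author, ok := parseLine(sc.Text())
	if !ok {
		fmt.Fprintln(os.Stderr, "unexpected format")
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```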