I'm launching a function as several goroutines and using a WaitGroup so that a shared Scanner isn't closed before they all finish. The myfunc() function iterates over a file. I wanted to memory-map this file and share it between all of the goroutines rather than have the I/O chokepoint of reading from disk each time. I was told in an answer to another question that this approach would work. However, while the function worked fine standalone, it doesn't work concurrently. I am receiving the error:
panic: runtime error: slice bounds out of range
but the panic occurs when I call the Scan() method (not when I index a slice), which is confusing.
Here is an MWE:
// ... package declaration; imports; yada yada
// the actual Sizes map is much more meaningful, this is just for the MWE
var Sizes = map[int]string {
10: "Ten",
20: "Twenty",
30: "Thirty",
40: "Forty",
}
type FileScanner struct {
io.Closer
*bufio.Scanner
}
func main() {
// ... validate path to file stored in filePath variable
filePath := "/path/to/file.txt"
// get word list scanner to be shared between goroutines
scanner := getScannerPtr(&filePath)
// call myfunc() for each param passed
var wg sync.WaitGroup
ch := make(chan string)
for _, param := range os.Args[1:] {
wg.Add(1)
go myfunc(&param, scanner, ch)
wg.Done()
}
// print results received from channel
for range os.Args[1:] {
fmt.Println(<-ch) // print data received from channel ch
}
// don't close scanner until all goroutines are finished
wg.Wait()
defer scanner.Close()
}
func getScannerPtr(filePath *string) *FileScanner {
f, err := os.Open(*filePath)
if err != nil {
fmt.Fprint(os.Stderr, "Error opening file\n")
panic(err)
}
scanner := bufio.NewScanner(f)
return &FileScanner{f, scanner}
}
func myfunc(param *string, scanner *FileScanner, ch chan<-string) {
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
// ... do something with line (read only)
// ... access shared Sizes map when doing it (read only)
ch <- "some string result goes here"
}
}
I originally thought the issue was concurrent access to the shared Sizes map, but moving it inside myfunc() (and inefficiently redeclaring/redefining it every time) still resulted in the same error, which has to do with calling Scan(). I'm attempting to follow the guidance I received in this answer.
Here's the full stack trace of the panic:
panic: runtime error: slice bounds out of range
goroutine 6 [running]:
bufio.(*Scanner).Scan(0xc42008a000, 0x80)
/usr/local/go/src/bufio/scan.go:139 +0xb3e
main.crack(0xc42004c280, 0xc42000a080, 0xc42001c0c0)
/Users/dan/go/src/crypto_ctf_challenge/main.go:113 +0x288
created by main.main
/Users/dan/go/src/crypto_ctf_challenge/main.go:81 +0x1d8
exit status 2
Line 81 is:
go myfunc(&param, scanner, ch)
Line 113 is:
for scanner.Scan() {
Actually, after reviewing the Scan source, it doesn't appear to be safe for concurrent use. You can get around this by having one goroutine read from the scanner, and any number of other goroutines consume the lines and process them:
func main() {
// ... validate path to file stored in filePath variable
filePath := "/path/to/file.txt"
// get word list scanner to be shared between goroutines
scanner := getScannerPtr(&filePath)
defer scanner.Close()
// call myfunc() for each param passed
var wg sync.WaitGroup
ch := make(chan string)
lines := make(chan string)
go func() {
for scanner.Scan() {
lines <- scanner.Text()
}
close(lines)
}()
for _, param := range os.Args[1:] {
wg.Add(1)
go myfunc(param, lines, ch)
wg.Done()
}
// print results received from channel
for range os.Args[1:] {
fmt.Println(<-ch) // print data received from channel ch
}
// don't close scanner until all goroutines are finished
wg.Wait()
}
func myfunc(param string, lines <-chan string, ch chan<- string) {
for line := range lines {
line = strings.TrimSpace(line)
// ... do something with line (read only)
// ... access shared Sizes map when doing it (read only)
ch <- "some string result goes here"
}
}
Also note that there's no point in deferring the last line in a function; the whole point of defer is to call it somewhere in the body of the function and know it will be called when the function returns. Since you're using a WaitGroup to prevent the function returning until you're done with your scanner, you can safely defer the close immediately.
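To illustrate the point about defer, here is a tiny sketch. The resource type and the run function are hypothetical stand-ins for your FileScanner and main. Close is deferred immediately after the resource is created, yet it only runs once the enclosing function returns, after the work is done.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for the file-backed scanner.
type resource struct {
	events *[]string
}

// Close records that it ran so the ordering can be observed.
func (r resource) Close() error {
	*r.events = append(*r.events, "close")
	return nil
}

// run defers Close right after acquiring the resource and returns the
// order in which things actually happened.
func run() []string {
	var events []string
	func() {
		res := resource{events: &events}
		defer res.Close() // deferred first, executed last
		events = append(events, "work")
	}()
	return events
}

func main() {
	fmt.Println(run()) // prints [work close]
}
```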