I need to read a CSV file and record the locations of lines with certain values in an array, then later go back and retrieve those lines in no particular order and with good performance. In other words, I need random access.
My program uses csv.NewReader(file), but I see no way to get or set the file offset that it uses. I tried file.Seek(0, io.SeekCurrent) to return the file position, but it doesn't change between calls to reader.Read(). I also tried fmt.Printf("%+v %+v\n", reader, file) to see if anything stores the reader's file position, but I don't see it. I also don't know the best way to use the file position if I do find it.
Here's what I need to do:
file, _ := os.Open("stuff.csv")
reader := csv.NewReader(file)
// read file and record locations
for {
	line, _ := reader.Read()
	if wantToRememberLocation(line) {
		locations = append(locations, getLocation()) // need this function
	}
}
// then revisit certain lines
for {
	reader.GoToLine(locations[random]) // need this function
	line, _ := reader.Read()
	doStuff(line)
}
Is there even a way to do this with the encoding/csv package, or will I have to write my own using more primitive file I/O functions?
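Here's a solution using io.TeeReader. The TeeReader copies every byte the csv.Reader consumes into a buffer, so the length of each raw line (and therefore the offset of the next one) can be measured as the records are parsed. This example just saves all the positions and goes back and rereads some of them. It assumes each record occupies exactly one line: a quoted field containing a newline, or a blank line (which encoding/csv skips), would throw the recorded offsets out of step.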
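// Needs the bytes, encoding/csv, io, log and os packages.
// Set up variables and readers to record the position and length of each line.
type Record struct {
	Pos int64 // byte offset of the line within the file
	Len int   // length of the line in bytes, including its newline
}
records := make([]Record, 1)
var buf bytes.Buffer
var pos int64
file, err := os.Open("stuff.csv")
if err != nil {
	log.Fatal(err)
}
// Everything the csv.Reader pulls from the file is also copied into buf.
tr := io.TeeReader(file, &buf)
cr := csv.NewReader(tr)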
// Read the first row to get things started.
data, _ := cr.Read()
doStuff(data)
// The length of the current row determines the position of the next one.
lineBytes, _ := buf.ReadBytes('\n')
length := len(lineBytes)
pos += int64(length)
records[0].Len = length
records = append(records, Record{Pos: pos})
for i := 1; ; i++ {
	// Read the next CSV record; stop at io.EOF or on any other error.
	data, err := cr.Read()
	if err != nil {
		break
	}
	doStuff(data)
	// Record this line's length and the next line's position.
	lineBytes, _ = buf.ReadBytes('\n')
	length = len(lineBytes)
	pos += int64(length)
	records[i].Len = length
	records = append(records, Record{Pos: pos})
}
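// Prepare a reusable buffer and reader for individual lines.
// This assumes no line is longer than 1000 bytes.
line := make([]byte, 1000)
lineReader := bytes.NewReader(line)
// Read random lines from the file.
for {
	// The last entry in records only marks end-of-file, so i must be < len(records)-1.
	i := someLineNumber()
	// Use the original file handle to fill the byte slice with the line.
	if _, err := file.ReadAt(line[:records[i].Len], records[i].Pos); err != nil && err != io.EOF {
		break
	}
	// Rewind the byte reader and limit it to this line's bytes.
	lineReader.Reset(line[:records[i].Len])
	// A new csv.Reader is needed each time so parsing starts at the beginning.
	lineParser := csv.NewReader(lineReader)
	data, _ = lineParser.Read()
	doStuff(data)
}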
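If you are on Go 1.19 or later, csv.Reader has an InputOffset method that reports the byte offset of the end of the most recently read row, which is also where the next row begins. The sketch below uses it in place of the TeeReader bookkeeping and re-parses saved rows through an io.SectionReader. The names span, indexRecords, readRecordAt, rereadMatching and the sample data are made up for illustration; treat this as one possible approach under those assumptions, not the only way to do it.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"strings"
)

// span records where one CSV record sits in the input (illustrative type).
type span struct {
	off  int64 // byte offset where the record starts
	size int64 // bytes from off to the start of the next record
}

// indexRecords reads every record from r and returns the span of each
// record for which keep returns true. Requires Go 1.19+ for InputOffset.
func indexRecords(r io.Reader, keep func([]string) bool) ([]span, error) {
	cr := csv.NewReader(r)
	var spans []span
	for {
		start := cr.InputOffset() // end of the previous record, start of this one
		rec, err := cr.Read()
		if err == io.EOF {
			return spans, nil
		}
		if err != nil {
			return nil, err
		}
		if keep(rec) {
			spans = append(spans, span{off: start, size: cr.InputOffset() - start})
		}
	}
}

// readRecordAt re-parses a single record from src using a saved span.
func readRecordAt(src io.ReaderAt, s span) ([]string, error) {
	return csv.NewReader(io.NewSectionReader(src, s.off, s.size)).Read()
}

// sample is made-up data; one record has a quoted field with an embedded newline.
const sample = "id,name\n1,alpha\n2,\"multi\nline\"\n3,gamma\n"

// rereadMatching indexes data, then fetches the kept records in reverse
// order to show that retrieval does not depend on file order.
func rereadMatching(data string, keep func([]string) bool) ([][]string, error) {
	src := strings.NewReader(data) // an *os.File can be used the same way
	spans, err := indexRecords(src, keep)
	if err != nil {
		return nil, err
	}
	var out [][]string
	for i := len(spans) - 1; i >= 0; i-- {
		rec, err := readRecordAt(src, spans[i])
		if err != nil {
			return nil, err
		}
		out = append(out, rec)
	}
	return out, nil
}

func main() {
	notHeader := func(rec []string) bool { return rec[0] != "id" }
	recs, err := rereadMatching(sample, notHeader)
	if err != nil {
		log.Fatal(err)
	}
	for _, rec := range recs {
		fmt.Printf("%q\n", rec)
	}
}
```

With a real file, the *os.File returned by os.Open can be passed to both indexRecords and readRecordAt, since it implements io.Reader and io.ReaderAt. Because the offsets come from the CSV parser itself, records that span several lines are handled without extra work.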
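os.Open returns an *os.File, which implements io.Seeker.
Note that csv.NewReader wraps the file in a buffered reader, so the file position jumps ahead in chunks rather than tracking each record, and you should create a new csv.Reader after seeking.
You can do this to rewind the stream to the beginning: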
_, err = file.Seek(0, io.SeekStart)
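The same call works for any offset that marks the start of a record, not just 0. Here is a minimal sketch of that idea, assuming you already know the offsets you want. The helpers openSample and readFrom and the sample contents are hypothetical, added only to make the example runnable.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"os"
)

// openSample writes a small made-up CSV to a temporary file and returns
// the open file, positioned at the start.
func openSample() (*os.File, error) {
	f, err := os.CreateTemp("", "stuff-*.csv")
	if err != nil {
		return nil, err
	}
	if _, err := f.WriteString("a,b\nc,d\ne,f\n"); err != nil {
		f.Close()
		return nil, err
	}
	_, err = f.Seek(0, io.SeekStart)
	return f, err
}

// readFrom seeks f to offset and parses one record with a fresh csv.Reader.
// The offset must point at the first byte of a record.
func readFrom(f io.ReadSeeker, offset int64) ([]string, error) {
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return nil, err
	}
	// A new reader is required: an old one would still hold data it
	// buffered before the Seek.
	return csv.NewReader(f).Read()
}

func main() {
	f, err := openSample()
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// Read one record, then ask the file where it is. The csv.Reader has
	// buffered ahead, so this is not the end of the first line.
	cr := csv.NewReader(f)
	if _, err := cr.Read(); err != nil {
		log.Fatal(err)
	}
	pos, err := f.Seek(0, io.SeekCurrent)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("file position after one Read:", pos)

	// Jump to the third record (offset 8), then back to the second (offset 4).
	for _, off := range []int64{8, 4} {
		rec, err := readFrom(f, off)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(off, rec)
	}
}
```

Creating a csv.Reader per lookup allocates a new buffer each time, so for many lookups a ReadAt-based approach may be cheaper.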