Reading a file with bufio, with semi-complex sequencing through the file

There may already be questions like this, but it's not an easy thing to google. Basically, I have a file that is a set of encoded protobufs, sequenced the way they normally are per the protobuf spec.

So think of the byte values as being chunked something like this throughout the file:

[EncodeVarInt(size of protobuf struct)] [protobuf struct bytes]

So a few bytes are read one at a time to get the size, and that size is then used for one large read of the protobuf structure.

My implementation, which uses the ReadAt method of os.File, currently looks something like this:

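// Next advances to the next feature in the file and records its size and
// offset in Feat_Pos. It returns false once the end position is reached.
func (geobuf *Geobuf_Reader) Next() bool {
    if geobuf.EndPos <= geobuf.Pos {
        return false
    }
    startpos := int64(geobuf.Pos)

    // A varint byte greater than 127 has its high bit set: more bytes follow.
    for int(geobuf.Get_Byte(geobuf.Pos)) > 127 {
        geobuf.Pos += 1
    }
    geobuf.Pos += 1 // step past the last varint byte

    // Read the whole size prefix and decode it (errors are ignored here).
    sizebytes := make([]byte, geobuf.Pos-int(startpos))
    geobuf.File.ReadAt(sizebytes, startpos)
    size, _ := DecodeVarint(sizebytes)

    // Feat_Pos holds {size, offset} of the feature bytes.
    geobuf.Feat_Pos = [2]int{int(size), geobuf.Pos}
    geobuf.Pos = geobuf.Pos + int(size)

    return true
}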

// Feature reads the current geobuf feature and returns it as geojson.
func (geobuf *Geobuf_Reader) Feature() *geojson.Feature {
    // Feat_Pos is {size, offset}: read the raw feature bytes at that offset.
    a := make([]byte, geobuf.Feat_Pos[0])
    geobuf.File.ReadAt(a, int64(geobuf.Feat_Pos[1]))

    return Read_Feature(a)
}

How can I use bufio or some other chunked reading mechanism to speed up this many file ReadAt calls? Most bufio examples I've seen rely on a specific delimiter. Thanks in advance; hopefully this wasn't a horrible question.

Package bufio

import "bufio" 

type SplitFunc

SplitFunc is the signature of the split function used to tokenize the input. The arguments are an initial substring of the remaining unprocessed data and a flag, atEOF, that reports whether the Reader has no more data to give. The return values are the number of bytes to advance the input and the next token to return to the user, plus an error, if any. If the data does not yet hold a complete token, for instance if it has no newline while scanning lines, SplitFunc can return (0, nil, nil) to signal the Scanner to read more data into the slice and try again with a longer slice starting at the same point in the input.

If the returned error is non-nil, scanning stops and the error is returned to the client.

The function is never called with an empty data slice unless atEOF is true. If atEOF is true, however, data may be non-empty and, as always, holds unprocessed text.

type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

Use bufio.Scanner with a custom SplitFunc that decodes the varint size prefix and returns the protobuf struct bytes as the token.