How to efficiently insert a byte slice into a file?

I am building a simple key-value store for fun. Right now, I am looking for an efficient way to insert a slice in a file.

My current approach:

  • find the desired offset

  • store in a buffer the blocks that precede the desired insert point

  • append the byte slice to that buffer

  • append the rest of the file

  • write the buffer back to disk

The problems are:

  • It is not a given that the whole file can fit in memory

  • It is inefficient: every insert rewrites the whole file, however small the inserted slice

I have looked into the available libraries, and sadly the best match I have found, (*os.File).WriteAt, overwrites the bytes that follow the offset instead of shifting them. Example:

package main

import "os"

func main() {
    pathToFile := "./tmp"
    bufferToWrite := []byte{255, 255, 255, 255, 255}

    // Create the file if needed and open it for reading and writing.
    f, _ := os.OpenFile(pathToFile, os.O_CREATE|os.O_RDWR, os.ModePerm)
    defer f.Close()
    f.Write(bufferToWrite)

At this point the content of tmp is (as shown by xxd -g 1 -b tmp):

11111111 11111111 11111111 (x) 11111111 11111111

Let's try to insert something with offset = 3 (x):

    bufferToInsert := []byte{0, 0}
    f.WriteAt(bufferToInsert, 3)
}

Output will be:

11111111 11111111 11111111 00000000 00000000

And I want it to be:

11111111 11111111 11111111 00000000 00000000 11111111 11111111

Any ideas?

Instead of inventing your own file format, you could copy the pack file format from Git.

The basic idea is to have an index file and a data file. When you want to insert a slice, you just append it to the data file. Then you update the index file, which is usually much smaller. Note that Git's pack files are not designed for real-time updates; Git accompanies them with individual object files for that.

Or have a look at the Berkeley DB file format.