I am building a simple key-value store for fun. Right now, I am looking for an efficient way to insert a slice in a file.
My current approach:
1. Find the desired offset.
2. Store in a buffer the blocks that precede the desired insert point.
3. Append the byte slice to that buffer.
4. Append the rest of the file.
5. Write to disk.
The problems are:
- The whole file is not guaranteed to fit in memory.
- It is inefficient, since every insert rewrites the entire file.
I have looked into the available libraries, and sadly the best match I have found, the WriteAt method on os.File, overwrites the bytes that follow the offset. Example:
package main

import "os"

func main() {
	pathToFile := "./tmp"
	bufferToWrite := []byte{255, 255, 255, 255, 255}
	// Errors are ignored here for brevity.
	f, _ := os.OpenFile(pathToFile, os.O_CREATE|os.O_RDWR, os.ModePerm)
	defer f.Close()
	f.Write(bufferToWrite)
So, at this point the content of tmp will be (after $ xxd -g 1 -b tmp):
11111111 11111111 11111111 (x) 11111111 11111111
Let's try to insert something with offset = 3 (x):
	bufferToInsert := []byte{0, 0}
	f.WriteAt(bufferToInsert, 3)
}
Output will be:
11111111 11111111 11111111 00000000 00000000
And I want it to be:
11111111 11111111 11111111 00000000 00000000 11111111 11111111
Any ideas?
Instead of inventing your own file format, you could copy the pack file format from Git.
The basic idea is to have an index file and a data file. When you want to insert a slice, you just append it to the data file. Then you update the index file, which is usually much smaller. Note that Git's pack file is not designed for real-time updates, which is why it is accompanied by individual object files.
Or have a look at the Berkeley DB file format.