Best way to index a large file

I have a file of about 100 GB with one word:tag pair per line. I want to index it on word so that I can easily get the list of tags for a given word.
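For illustration (made-up placeholder data), the lines look like this:

        foo:tag1
        bar:tag2
        foo:tag3

and a lookup for foo should then return the list [tag1, tag3].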

I wanted to save this in BoltDB (mainly to check out BoltDB), but random write access is slow, so I was aiming to index the file in some other way first and then move all of it into BoltDB, without needing to check for duplicates or de/serialise the tag list.

So, for reference, if I simply read the file into memory (discarding data), I get about 8 MB/s.
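The read loop for that baseline is essentially just this (a simplified sketch using bufio and os; path is a placeholder for the actual file name):

        // Baseline: scan the file line by line and throw the data away.
        file, err := os.Open(path) // path is a placeholder
        logger.FatalErr(err)
        defer file.Close()

        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            _ = scanner.Text()
        }
        logger.FatalErr(scanner.Err())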

If I write to a BoltDB file using code such as:

        line := ""
        linesRead := 0
        for scanner.Scan() {
            line = scanner.Text()
            linesRead++
            data := strings.Split(line, ":")
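            // note: the Put below overwrites any existing value, so only the last tag seen for each word survives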

            err = bucket.Put([]byte(data[0]), []byte(data[1]))
            logger.FatalErr(err)
            // commit every N lines to batch the writes
            if linesRead % 10000 == 0 {
                err = tx.Commit()
                logger.FatalErr(err)
                tx, err = db.Begin(true)
                logger.FatalErr(err)
                bucket = tx.Bucket(name)
            }
        }
        // commit whatever is left over in the final partial batch
        err = tx.Commit()
        logger.FatalErr(err)

I get about 300 KB/s, and this isn't even complete: it doesn't append each tag to its word's list, it only stores the last occurrence. At that rate the whole 100 GB would take roughly four days, and adding the tag list plus JSON serialisation would only lower the speed further...
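For reference, the complete version I have in mind would do a read-modify-write per key inside the loop above, something like this (just a sketch, reusing data, bucket and err from that snippet, plus encoding/json):

        // Sketch: load the existing tag list for the word, append the new tag, write it back.
        var tags []string
        if existing := bucket.Get([]byte(data[0])); existing != nil {
            err = json.Unmarshal(existing, &tags)
            logger.FatalErr(err)
        }
        tags = append(tags, data[1])

        encoded, err := json.Marshal(tags)
        logger.FatalErr(err)
        err = bucket.Put([]byte(data[0]), encoded)
        logger.FatalErr(err)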

So I gave MongoDB a try:

        index := mgo.Index{
            Key: []string{"word"},
            Unique: true,
            DropDups: false,
            Background: true,
            Sparse: true,
        }
        err = c.EnsureIndex(index)
        logger.FatalErr(err)

        line := ""
        linesRead := 0
        bulk := c.Bulk()

        for scanner.Scan() {
            line = scanner.Text()
            data := strings.Split(line, ":")
            bulk.Upsert(bson.M{"word": data[0]}, bson.M{"$push": bson.M{"tags": data[1]}})
            linesRead++

            if linesRead % 10000 == 0 {
                _, err = bulk.Run()
                logger.FatalErr(err)
                bulk = c.Bulk()
            }
        }
        // run whatever operations are left in the final partial batch
        if linesRead % 10000 != 0 {
            _, err = bulk.Run()
            logger.FatalErr(err)
        }

And I get about 300 KB/s as well (though here the Upsert with $push does handle appending to the list).

I also tried a local MySQL instance (indexed on word), but it was roughly 30x slower...
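For completeness, the MySQL attempt was shaped roughly like this (a simplified sketch, not the exact code; the table definition and dsn are placeholders):

        // Rough sketch of the MySQL attempt via database/sql and the go-sql-driver/mysql driver.
        // Assumed table: CREATE TABLE tags (word VARCHAR(255), tag VARCHAR(255), KEY idx_word (word))
        db, err := sql.Open("mysql", dsn) // dsn is a placeholder
        logger.FatalErr(err)
        defer db.Close()

        stmt, err := db.Prepare("INSERT INTO tags (word, tag) VALUES (?, ?)")
        logger.FatalErr(err)
        defer stmt.Close()

        for scanner.Scan() {
            data := strings.Split(scanner.Text(), ":")
            _, err = stmt.Exec(data[0], data[1])
            logger.FatalErr(err)
        }
        logger.FatalErr(scanner.Err())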