Cloud BigTable stores multiple copies of the same row key

I am writing to BigTable using the Go library. I use the ApplyBulk method to make multiple inserts atomically. However, when I query BigTable using the ReadRows function I see multiple copies/versions with the same row key.

For instance, in the example below, I see multiple copies of the same row key with partial data, and only the last occurrence has all the columns with all the fields.

[Image: query results showing the same row key several times]

How can I ensure that only a single copy of data is stored for each row key? And how can I fetch only the latest version of rows inserted?

Code:

row_range := bigtable.PrefixRange("")

err = tbl.ReadRows(ctx, row_range, func(row bigtable.Row) bool {
// logic goes here
return true // keep reading; returning false stops the scan
}, bigtable.RowFilter(bigtable.LatestNFilter(1)))

As far as I know, that is normal. CBT stores the history of that row key, and you'll want to pass a filter to get the latest one.

bigtable.RowFilter(bigtable.LatestNFilter(1))

Update: this is how I use that filter

    rowName := "myrow#key#id" 
    row, err := bt.Table.ReadRow(ctx, rowName, bigtable.RowFilter(bigtable.LatestNFilter(1)))
    if err != nil {
        // handle error...
    }
    if row == nil {
        // check for 0 result...
    }
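To make it clearer what "latest" means here, below is a toy model in plain Go. It does not use the Bigtable client at all; the `cell` type, the `latestN` function and the sample column names are made up for illustration. It only sketches the idea that a row holds several timestamped values per column and that a "latest 1" filter keeps the newest value of each column.

```go
package main

import (
	"fmt"
	"sort"
)

// cell is a hypothetical stand-in for one timestamped value in a column.
type cell struct {
	Column    string
	Timestamp int64
	Value     string
}

// latestN keeps the n newest cells of every column in a single row,
// mimicking the effect of a "latest N versions" filter.
func latestN(cells []cell, n int) []cell {
	if n < 0 {
		n = 0
	}
	byCol := map[string][]cell{}
	var cols []string
	for _, c := range cells {
		if _, seen := byCol[c.Column]; !seen {
			cols = append(cols, c.Column)
		}
		byCol[c.Column] = append(byCol[c.Column], c)
	}
	sort.Strings(cols) // deterministic column order for the output

	var out []cell
	for _, col := range cols {
		versions := byCol[col]
		// Newest first, then cut the list down to n entries.
		sort.Slice(versions, func(i, j int) bool {
			return versions[i].Timestamp > versions[j].Timestamp
		})
		if len(versions) > n {
			versions = versions[:n]
		}
		out = append(out, versions...)
	}
	return out
}

func main() {
	// One row with two versions of "cf:name" and one of "cf:email".
	row := []cell{
		{"cf:name", 100, "old name"},
		{"cf:name", 200, "new name"},
		{"cf:email", 150, "a@example.com"},
	}
	for _, c := range latestN(row, 1) {
		fmt.Printf("%s @%d = %s\n", c.Column, c.Timestamp, c.Value)
	}
}
```

In this sketch the older "cf:name" value is dropped and the row still appears once, with one value per column.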

Update 2:

Based on your code, which uses ReadRows, it looks like you're trying to read multiple row keys, so your code should already pull the latest version of each of those rows.

If you just want the one key that you show in your image then I would just use the ReadRow method instead.

row, err := tbl.ReadRow(ctx, "1564:u2Sng4xbtG", bigtable.RowFilter(bigtable.LatestNFilter(1)))

Otherwise, I guess there could be an issue with how the data was stored in CBT, but that is a little outside my skill set. Hopefully a CBT expert can chime in for you.

I think that the error is coming from the way that ApplyBulk transforms the "insert tasks".

You can find more details in the reference documentation: https://godoc.org/cloud.google.com/go/bigtable

There seems to be a bit of confusion here:

  1. Under no circumstances should ReadRows ever return duplicate row keys. Assuming that the rows in your spreadsheet correspond to the rows that the client library passed to your callback function, this is a bug in the client library. Please open an issue at https://github.com/googleapis/google-cloud-go/issues and provide a way to reproduce it.

  2. Bigtable does allow multiple versions of cell values. It provides filters like LatestNFilter() to hide old cell values and GC rules to remove them periodically. However, this is scoped to cell values and is unrelated to row keys. In other words, Bigtable provides cell versions, not row versions.

  3. ApplyBulk is not atomic and furthermore it doesn't provide any guarantees about ordering of the mutations.
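Since ApplyBulk is not atomic, some mutations in a batch can succeed while others fail. As I understand the Go client documentation, ApplyBulk returns a slice of per-mutation errors alongside the overall error, with each entry lining up with the row key at the same index. Here is a minimal sketch of picking out the failed keys so they can be retried. It runs without the Bigtable client: the `failedKeys` helper, the `errSimulated` value and the sample row keys are hypothetical, and the error slice is faked rather than returned by a real call.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for whatever error a failed mutation would carry.
var errSimulated = errors.New("simulated mutation failure")

// failedKeys returns the row keys whose mutation failed. It assumes that
// errs[i] belongs to rowKeys[i] and that a nil entry means success.
func failedKeys(rowKeys []string, errs []error) []string {
	var failed []string
	for i, err := range errs {
		if err != nil && i < len(rowKeys) {
			failed = append(failed, rowKeys[i])
		}
	}
	return failed
}

func main() {
	rowKeys := []string{"row#1", "row#2", "row#3"}
	// Faked per-mutation result: only the second row failed.
	errs := []error{nil, errSimulated, nil}
	fmt.Println(failedKeys(rowKeys, errs)) // [row#2]
}
```

With the real client you would pass the first return value of ApplyBulk as `errs`, after checking the second one for a failure of the whole call.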