How do I translate the following threading model from C++ to Go?

In my C++ project, I have a large binary file on disk (gigabytes in size) that I read into memory for read-only calculations.

My current C++ implementation reads the entire file into memory once and then spawns threads that read from that memory to do various calculations (mutex-free, and it runs quickly). Technically, each thread only needs a small part of the file at a time, so in the future I may change this implementation to use mmap(), especially if the file gets too big. I've noticed the gommap library, so I think I'm covered going forward.

What approach should I take to translate my current C++ threading model (one large chunk of read-only memory) into a Go threading model, keeping run-time efficiency in mind?

goroutines? alternatives?

I'm sure this answer will cop a lot of heat but here goes:

You won't get a reduced running time by switching to Go, especially if your code is already mutex-free. Go doesn't guarantee efficient balancing of goroutines and won't necessarily make the best use of the available cores, and the generated code is generally slower than C++'s. Go's current strengths are clean abstractions and concurrency, not parallelism.

Reading the entire file up front isn't particularly efficient if you then have to backtrack through memory: parts of the file you won't use again until much later get dropped from the cache, only to be reloaded later. You should consider memory mapping if your platform allows it, so that pages are loaded from disk as they're required.
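A minimal sketch of what that might look like, assuming a Unix-like platform and using the standard syscall package rather than gommap (the filename is a placeholder):

package main

import (
    "fmt"
    "os"
    "syscall"
)

func main() {
    f, err := os.Open("filename")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    fi, err := f.Stat()
    if err != nil {
        panic(err)
    }

    // Map the whole file read-only. Pages are faulted in from disk
    // on first access instead of being read up front.
    data, err := syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
        syscall.PROT_READ, syscall.MAP_SHARED)
    if err != nil {
        panic(err)
    }
    defer syscall.Munmap(data)

    // data is an ordinary []byte backed by the mapping.
    fmt.Println("mapped", len(data), "bytes")
}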

If there is any intense inter-routine communication, or there are dependencies between the data, you should try to make the algorithm single-threaded. It's difficult to say without knowing more about the routines you're applying to the data, but it does sound possible that you've pulled out threads prematurely in the hope of getting a magic performance boost.

If you're unable to rely on memory mapping due to file size or other platform constraints, you should consider making use of the pread call, thereby reusing a single file descriptor and only reading as required.
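In Go, pread is exposed as ReadAt on *os.File, which is safe for concurrent use. A sketch of several goroutines sharing one descriptor (the chunk size and filename are arbitrary, and the processing step is left as a stub):

package main

import (
    "fmt"
    "io"
    "os"
    "sync"
)

func main() {
    f, err := os.Open("filename")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    fi, err := f.Stat()
    if err != nil {
        panic(err)
    }

    const chunkSize = 1 << 20 // 1 MiB per read; tune for your workload

    var wg sync.WaitGroup
    for off := int64(0); off < fi.Size(); off += chunkSize {
        wg.Add(1)
        go func(off int64) {
            defer wg.Done()
            buf := make([]byte, chunkSize)
            // ReadAt is Go's pread: it doesn't move the shared file
            // offset, so concurrent calls on one *os.File are safe.
            n, err := f.ReadAt(buf, off)
            if err != nil && err != io.EOF {
                panic(err)
            }
            _ = buf[:n] // process this chunk here
        }(off)
    }
    wg.Wait()
    fmt.Println("done")
}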

As always, the usual rule of optimization applies: you must profile. You must check that the changes you make to a working solution actually improve things. Very often you'll find that memory mapping, threading, and other shenanigans have no noticeable effect on performance whatsoever. It's also an uphill battle if you're switching away from C or C++.
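The standard runtime/pprof package makes profiling straightforward; a minimal sketch (the output filename is arbitrary):

package main

import (
    "os"
    "runtime/pprof"
)

func main() {
    f, err := os.Create("cpu.prof")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // Record a CPU profile for the lifetime of the program.
    if err := pprof.StartCPUProfile(f); err != nil {
        panic(err)
    }
    defer pprof.StopCPUProfile()

    // ... run the workload you want to measure ...
}

Inspect the result afterwards with go tool pprof cpu.prof.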

Also, you should spawn goroutines to handle each part of the file and reduce the results of the calculations through a channel. Note that on modern Go releases GOMAXPROCS already defaults to the number of available cores; call runtime.GOMAXPROCS only if you need to override that.

This program sums all the bytes in a file in multiple goroutines (without worrying about overflow).

You'll want to reimplement processChunk and aggregateResults for your case. You may also want to change the element type of the results channel; depending on what you're doing, you may not even need to aggregate the results. The chunk size and the channel's buffer size are other knobs you can tweak.

package main

import (
    "fmt"
    "io/ioutil"
)

func main() {
    data, err := os.ReadFile("filename")
    if err != nil {
        // handle this error somehow
        panic(err)
    }
    // Adjust this to control the size of chunks.
    // I've chosen it arbitrarily.
    const chunkSize = 0x10000
    // This channel's unbuffered. Add a buffer for better performance.
    results := make(chan int64)

    chunks := 0
    for len(data) > 0 {
        size := chunkSize
        if len(data) < chunkSize {
            size = len(data)
        }
        go processChunk(data[:size], results)
        data = data[size:]
        chunks++
    }

    aggregateResults(results, chunks)
}

func processChunk(chunk []byte, results chan int64) {
    sum := int64(0)
    for _, b := range chunk {
        sum += int64(b)
    }
    results <- sum
}

func aggregateResults(results chan int64, chunks int) {
    sum := int64(0)
    for chunks > 0 {
        sum += <-results
        chunks--
    }
    fmt.Println("The sum of all bytes is", sum)
}