Combine multiple maps into one map whose value for a given key is the sum of that key's values across the combined maps

I have written a program that identifies all unique words in a text document and counts how many times each of the words appears. To improve the performance of my program I am trying to break up the word counting into several goroutines that can run in parallel.

Initially, I tried using a single map that was passed by reference to each goroutine, where each goroutine would count the words in part of the document. This caused a panic because the program was trying to write to the same map from multiple goroutines simultaneously. To solve this issue, I created a mutex to prevent multiple goroutines from writing to the map at the same time. At this point, the program functioned as expected, but with no performance difference compared to the original sequential implementation of the WordCount function. On second thought, this is not surprising, given that the mutex forces the other goroutines to wait before writing to the map, hence preventing parallel computation.

Below is the code that uses a mutex to avoid the described runtime panic, but also fails to count words in parallel.

func WordCount(words []string, startWord int, endWord int, freqs map[string]int, waitGroup *sync.WaitGroup, mutex *sync.Mutex) {
    // Holding the lock for the entire loop serializes the goroutines,
    // which is why this version is no faster than the sequential one.
    mutex.Lock()
    for i := startWord; i < endWord; i++ {
        word := words[i]
        freqs[word]++
    }
    mutex.Unlock()
    waitGroup.Done()
}

func ParallelWordCount(text string) map[string]int {
    // Split text into string array of the words in text.
    text = strings.ToLower(text)
    text = strings.ReplaceAll(text, ",", "")
    text = strings.ReplaceAll(text, ".", "")
    words := strings.Fields(text)
    length := len(words)

    freqs := make(map[string]int)

    var mutex sync.Mutex
    var waitGroup sync.WaitGroup
    waitGroup.Add(2)
    defer waitGroup.Wait()

    threads := 2
    wordsPerThread := length / threads // always rounds down
    for i := 0; i < threads; i++ {
        startWord := i * wordsPerThread
        endWord := startWord + wordsPerThread
        if i == threads-1 {
            endWord = length // the last goroutine takes any leftover words
        }
        go WordCount(words, startWord, endWord, freqs, &waitGroup, &mutex)
    }

    return freqs
}

I believe that I could achieve parallel word counting if I created a local map of word frequencies for each goroutine and, in the end, combined the local frequency maps into a single map with the word counts for the entire text file. The problem I am currently facing is how to combine the local frequency maps. Concretely, I need to know how to combine multiple maps into a map whose value for a given key is the sum of the values of that key in the maps to be combined.
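Something like the following sketch is what I have in mind for the merge step (the helper name `mergeCounts` is just a placeholder):

```go
package main

import "fmt"

// mergeCounts combines frequency maps: the value for each key in the
// result is the sum of that key's values across all input maps.
func mergeCounts(maps ...map[string]int) map[string]int {
	total := make(map[string]int)
	for _, m := range maps {
		for word, count := range m {
			total[word] += count
		}
	}
	return total
}

func main() {
	a := map[string]int{"go": 2, "fast": 1}
	b := map[string]int{"go": 3}
	fmt.Println(mergeCounts(a, b)) // map[fast:1 go:5]
}
```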

To clarify the underlying logic of what I am trying to do, I have included the example below. The ConcurrentSum function returns the sum of the elements in an array by computing the lower and upper halves of the array concurrently. In my case, I want to count words in different parts of my text file in parallel and, ultimately, combine the word counts into a single map representative of the entire document.

func sum(a []int, res chan<- int) {
    var sum int
    for i := 0; i < len(a); i++ {
        sum += a[i]
    }
    res <- sum
}

// concurrently sum the array a.
func ConcurrentSum(a []int) int {
    n := len(a)
    ch := make(chan int)
    go sum(a[:n/2], ch)
    go sum(a[n/2:], ch)
    return <-ch + <-ch
}
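Applying the same pattern to word counting, each goroutine would build its own local map and send it over a channel, and the receiver would sum the maps. Below is a sketch of what I am aiming for (function names are my own, and the punctuation stripping from my original code is omitted for brevity):

```go
package main

import (
	"fmt"
	"strings"
)

// countWords builds a local frequency map for its slice of words and
// sends it on the channel; there is no shared state, so no mutex is needed.
func countWords(words []string, res chan<- map[string]int) {
	freqs := make(map[string]int)
	for _, w := range words {
		freqs[w]++
	}
	res <- freqs
}

// ParallelWordCount counts the two halves of the text concurrently,
// then merges the two local maps by summing counts per word.
func ParallelWordCount(text string) map[string]int {
	words := strings.Fields(strings.ToLower(text))
	n := len(words)
	ch := make(chan map[string]int)
	go countWords(words[:n/2], ch)
	go countWords(words[n/2:], ch)
	total := <-ch
	for word, count := range <-ch {
		total[word] += count
	}
	return total
}

func main() {
	fmt.Println(ParallelWordCount("the cat and the dog"))
}
```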

I believe you could create an array of maps, one per goroutine, and then merge them by iterating over the keys of each map. Assuming that each word is a key mapping to the number of times it was counted, the merge is a matter of adding each map's counts into a single result map. For a real performance increase, everything needs to be kept separate during the counting phase, and the integration of the maps should be done in a single thread or process. If you have the storage space, the merge itself takes at worst O(N) time, where N is the total number of entries across the maps.