String thesaurus: splitting with not-equal-to logic because there are too many possible starting characters

I have a .dat file that is a dictionary/thesaurus containing about 300k lines.

For each word, the lines below it that start with a word in brackets are the thesaurus alternatives, and the bracketed word is the type, such as noun or adjective. For example:

acceptant|1
(adj)|acceptive|receptive 
acceptation|3
(noun)|acceptance
(noun)|word meaning|word sense|sense|signified
(noun)|adoption|acceptance|espousal|blessing|approval|approving
accepted|6
(adj)|recognized|recognised|acknowledged 
(adj)|undisputed|uncontroversial |noncontroversial
(adj)|standard 
(adj)|acceptable|standard |received
(adj)|established |constituted
(adj)|received|conventional 
accepting|1
(adj)|acceptive 

So the sample above contains 4 words from the dictionary, but each word has several thesaurus entries.

I want to split the strings using:

strings.Split(dictionary, !"(")

Meaning split on anything that isn't the "(" character. This is because it's an extensive dictionary with slang, abbreviations, and whatnot, so a headword line can start with almost any character. But I can't work out how to use the not-equal-to operator.

Does anyone know how to use split with not equal to logic? Or can anyone suggest some clever alternative ideas?

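package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strings"
)

func main() {
    file, err := os.Open("dic.dat")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        line := scanner.Text()
        // Lines starting with "(" are thesaurus entries; skip them.
        if strings.HasPrefix(line, "(") {
            continue
        }
        fmt.Println(line) // headword line, e.g. "acceptant|1"
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}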

@MostafaSolati's solution can be made more efficient.

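package main

import (
    "bufio"
    "bytes"
    "fmt"
    "log"
    "os"
)

func main() {
    file, err := os.Open("dic.dat")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        // Bytes returns the current line without allocating.
        data := scanner.Bytes()
        if bytes.HasPrefix(data, []byte("(")) {
            continue
        }
        // Convert to a string only for the lines we keep.
        line := scanner.Text()
        fmt.Println(line)
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}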

Output:

acceptant|1
acceptation|3
accepted|6
accepting|1

By design, Go code is expected to be efficient. The Go standard library testing package includes a benchmark feature.

It's important to avoid unnecessary conversions and allocations. For example, converting a byte slice read from a file to a string requires an allocation and a copy.

In this case, we only need to convert the lines we keep to a string. So, for the prefix check, prefer scanner.Bytes to scanner.Text.

$ go test dict_test.go -bench=.
BenchmarkText-4      500    2486306 ns/op    898528 B/op    14170 allocs/op
BenchmarkBytes-4    1000    1489828 ns/op     34080 B/op      609 allocs/op
$

Sample benchmark data:

KEY: Aback.
SYN: Backwards, rearwards, aft, abaft, astern, behind, back.
ANT: Onwards, forwards, ahead, before, afront, beyond, afore.
=
KEY: Abandon.
SYN: Leave, forsake, desert, renounce, cease, relinquish,
discontinue, castoff, resign, retire, quit, forego, forswear,
depart from, vacate, surrender, abjure, repudiate.
ANT: Pursue, prosecute, undertake, seek, court, cherish, favor,
protect, claim, maintain, defend, advocate, retain, support, uphold,
occupy, haunt, hold, assert, vindicate, keep.
=

dict_test.go:

package main

import (
    "bufio"
    "bytes"
    "io/ioutil"
    "strings"
    "testing"
)

func BenchmarkText(b *testing.B) {
    b.ReportAllocs()
    for N := 0; N < b.N; N++ {
        file := bytes.NewReader(benchData)
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            line := scanner.Text()
            if !strings.HasPrefix(line, "KEY") {
                continue
            }
            _ = line // process line
        }
        if err := scanner.Err(); err != nil {
            b.Fatal(err)
        }
    }
}

func BenchmarkBytes(b *testing.B) {
    b.ReportAllocs()
    for N := 0; N < b.N; N++ {
        file := bytes.NewReader(benchData)
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            data := scanner.Bytes()
            if !bytes.HasPrefix(data, []byte("KEY")) {
                continue
            }
            line := scanner.Text()
            _ = line // process line
        }
        if err := scanner.Err(); err != nil {
            b.Fatal(err)
        }
    }
}

var benchData = func() []byte {
    // A Complete Dictionary of Synonyms and Antonyms by Samuel Fallows
    // http://www.gutenberg.org/files/51155/51155-0.txt
    data, err := ioutil.ReadFile(`/home/peter/dictionary.51155-0.txt`)
    if err != nil {
        panic(err)
    }
    return data
}()