Getting IP addresses from a large nfcapd binary file

I need to extract the source and destination IP addresses from an nfcapd binary file. The problem is the file's size: I understand that it is not advisable to open and read very large files (more than 1 GB) with the io or os packages.

Here is my rough first draft:

package main

import (
    "fmt"
    "time"
    "os"
    "github.com/tehmaze/netflow/netflow5"
    "log"
    "io"
    "bytes"
)

type Message interface {}

func main() {
    startTime := time.Now()
    getFile := os.Args[1]
    processFile(getFile)
    endTime := time.Since(startTime)
    log.Printf("Program executes in %s", endTime)
}

func processFile(fileName string) {
    file, err := os.Open(fileName)
    // Exit if the file cannot be opened (missing, no permission, etc.)
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }

    // Close the file when processFile returns
    defer file.Close()

    // Report a parse failure instead of silently discarding it
    if _, err := Read(file); err != nil {
        log.Println(err)
    }
}

func Read(r io.Reader) (Message, error) {
    data := [2]byte{}
    if _, err := r.Read(data[:]); err != nil {
        return nil, err
    }
    buffer := bytes.NewBuffer(data[:])
    mr := io.MultiReader(buffer, r)
    return netflow5.Read(mr)
}

I want to split the file into chunks of 24 flows and process them concurrently after reading them with the netflow package, but I cannot see how to do this without losing data at the chunk boundaries.

Please correct me if I missed something in the code or the description. I have spent a lot of time searching the web for a solution and thinking about other possible implementations.

Any help and/or advice will be highly appreciated.

The file has the following properties (output of file -I <file_name> in a terminal):

file_name: application/octet-stream; charset=binary

The output of nfdump -r <file_name> has this structure:

Date first seen          Duration Proto      Src IP Addr:Port          Dst IP Addr:Port   Packets    Bytes Flows

Each property is in its own column.

UPDATE 1: Unfortunately, it is impossible to parse the file with the netflow package, because nfcapd saves it to disk with a different binary structure. This answer was given by one of the nfdump contributors.

The only option for now is to run nfdump from the terminal inside a Go program, as pynfdump does.

Another possible solution in the future is to use gopacket.

I/O is almost always going to be the limiting factor when parsing a file, and unless there is heavy computation involved, reading a single file serially is going to be the fastest way to process it.

Wrap the file in a bufio.Reader and give it to the Read function:

file, err := os.Open(fileName)
if err != nil {
    log.Fatal(err)
}
defer file.Close()

packet, err := netflow5.Read(bufio.NewReader(file))

Once it's parsed, you can then split up the records if you need to handle the chunks separately.