I have a massive 2.5 GB CSV file with roughly 25 million records and about 20 columns. I am trying to use Go to process this monster, do some formatting, and then insert the rows into a database. I have this basic setup with channels because I figured goroutines would make it the fastest: here

The problem is that because the inserts block, my channel just gets stuffed with an insane amount of data, and before I know it my memory is out of control, so the program fails before any processing or inserting gets done.

Could someone help me out with this code so that the queue is built up from reading the file while processing and inserting happen at the same time?
You start a new goroutine for every record of your big CSV file. Each goroutine allocates a stack of roughly 2 kB, so with ~25 million records that alone is on the order of 50 GB. Starting a goroutine for everything is not recommended.
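For illustration, the pattern being described looks roughly like this; a hypothetical sketch, not the asker's actual code, with `big.csv` and `process` as placeholder names:

```go
// Hypothetical sketch of the one-goroutine-per-record pattern described
// above; not the asker's actual code.
package main

import (
	"encoding/csv"
	"io"
	"log"
	"os"
)

// process is a placeholder for formatting and inserting a single record.
func process(rec []string) { /* format + insert one record */ }

func main() {
	f, err := os.Open("big.csv") // placeholder file name
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	r := csv.NewReader(f)
	for {
		rec, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		// One goroutine per record: ~25 million goroutines, each with its
		// own stack, pile up far faster than the database can drain them.
		go process(rec)
	}
}
```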
Try to use a pipeline instead: the main goroutine reads the records and sends them through channel1.

You start e.g. 10 worker goroutines that process the records received from channel1 and send the processed values through channel2.

Then another 10 goroutines receive the values from channel2 and insert them into the database.

Here are some examples of pipelines.
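A minimal sketch of that three-stage setup, assuming a placeholder file name `big.csv` and placeholder `format`/`insert` functions for the per-record work:

```go
// Minimal pipeline sketch: one reader, 10 formatting workers, 10 insert
// workers. File name and the format/insert functions are placeholders.
package main

import (
	"encoding/csv"
	"io"
	"log"
	"os"
	"sync"
)

// format is a placeholder for whatever per-record formatting is needed.
func format(rec []string) []string { return rec }

// insert is a placeholder for the actual database insert.
func insert(rec []string) { _ = rec }

func main() {
	f, err := os.Open("big.csv") // placeholder file name
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	records := make(chan []string, 100)   // channel1: raw CSV records
	processed := make(chan []string, 100) // channel2: formatted records

	// Stage 1: a single reader goroutine feeds channel1.
	go func() {
		defer close(records)
		r := csv.NewReader(f)
		for {
			rec, err := r.Read()
			if err == io.EOF {
				return
			}
			if err != nil {
				log.Fatal(err)
			}
			records <- rec // blocks when the buffer is full, so memory stays bounded
		}
	}()

	// Stage 2: 10 workers format records and feed channel2.
	var procWG sync.WaitGroup
	for i := 0; i < 10; i++ {
		procWG.Add(1)
		go func() {
			defer procWG.Done()
			for rec := range records {
				processed <- format(rec)
			}
		}()
	}
	go func() {
		procWG.Wait()
		close(processed)
	}()

	// Stage 3: 10 workers insert the processed records into the database.
	var insWG sync.WaitGroup
	for i := 0; i < 10; i++ {
		insWG.Add(1)
		go func() {
			defer insWG.Done()
			for rec := range processed {
				insert(rec)
			}
		}()
	}
	insWG.Wait()
}
```

The small channel buffers give the reader a little headroom, but because a send blocks once the buffer is full, the reader can never run far ahead of the database inserts, so memory stays bounded no matter how big the file is.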