I'm learning Go's goroutines and decided to write a small data parser.
First, let's say we have data similar to JSON: [{id: 1, data: "text"}, {id: 2, data: "text"}, ...{id: 2000, data: "text"}]
and let's say we have a function which parses our data and inserts it into a database:
dataParser(string) error
So by running
for i := 0; i < 2000; i++ {
go dataParser(f[i])
}
we can see that the data is inserted into the database in random order. That's the basic nature of goroutines.
But let's say we have a different type of data, where each record may depend on a previous record: [{id: 1, data: "text"}, {id: 2, data: @1}, ...{id: 2000, data: @1950}]
where @<id> means we need to take the value from the record with that id.
If I just do the same:
for i := 0; i < 2000; i++ {
go dataParser(f[i])
}
I get into a situation where Go tries to insert a record that references a previous record which has not been parsed yet, i.e. Go picks record 2 before record 1 has been parsed, and I get empty data.
I tried to use sync.WaitGroup:
for i := 0; i < 2000; i++ {
wg.Add(1)
go func() {
dataParser(f[i])
wg.Wait()
wg.Done()
}()
}
First of all, it feels strange, as if I don't need goroutines here at all. wg.Wait helped: before, Go might insert record #100 before #1. But I still get into a situation where Go works on record #3 when I need the data from record #1.
So I'm confused.
Thank you for help!
I don't think this is suitable for goroutines: the 2000 elements are not independent of each other, so they cannot be processed concurrently.
Instead, you could loop over the JSON data once and fill a map, using each record's id as the key and its data as the value.
After that, you can run dataParser to insert the data into the database.