I have a goroutine which periodically checks for new files in a directory and then prints their contents. However, there is another goroutine which creates a file, writes content into it, and then saves it.
How do I ignore the files which are open in WRITE mode in a directory?
Sample Code:
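for {
	fileList, err := ioutil.ReadDir("/uploadFiles")
	if err != nil {
		// log.Fatal would exit the program, so the continue below it
		// could never run; log the error and retry after a pause instead.
		log.Println(err)
		time.Sleep(time.Second * 5)
		continue
	}
	for _, f := range fileList {
		log.Println("File : ", f.Name())
		go printContents(f.Name())
	}
	time.Sleep(time.Second * 5)
}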
In the printContents goroutine I want to ignore the files which are open in WRITE mode.
That is not how it's done.
Off the top of my head I can think of these options:
If both goroutines are working in the same program, there is little problem: make the "producer" goroutine register the names of the files it has completed modifying into some registry, and make the "consumer" goroutine read (and delete) from that registry.
In the simplest case that could be a buffered channel.
If the producer works much faster than the consumer, and you don't want to block the former for some reason, then a slice protected by a mutex would fit the bill.
If the goroutines work in different processes on the same machine but you control both programs, make the producer process communicate the same data to the consumer process via any suitable sort of IPC.
What method to do IPC is better depends on how the processes start up, interact etc.
There is a wide variety of options.
If you control both processes but do not want to mess with IPC between them (there are reasons, too), then make the producer follow best practices on how to write a file (more on this in a moment), and make the consumer use any filesystem-monitoring facility to report which files get created ("appear") once produced by the producer. You may start with github.com/fsnotify/fsnotify.
To properly write a file, the producer has to write its data to a temporary file, that is, a file located in the same directory but with a name that is well understood to indicate that the file is not done with yet. For instance, ".foobar.data.part" or "foobar.data.276gd14054.tmp" is OK for writing "foobar.data". (Other approaches exist but this one is good enough to start with.)
Once the file is ready, the producer has to rename the file from its temporary name to its "proper", final name. This operation is atomic on all sensible OSes/filesystems as long as both names are on the same filesystem (which is why the temporary file lives in the same directory), and it makes the file atomically "spring into existence" from the PoV of the consumer. For instance, inotify on Linux generates an event of type "moved to" for such an appearance.
If you don't feel like doing the proper thing yourself, github.com/dchest/safefile is a good cross-platform start.
As you can see, with this approach you know the file is done just from the fact that it was reported as having appeared.
If you do not control the producer, you may need to resort to guessing.
The simplest is to, again, monitor the filesystem for events, but this time for "file updated" events, not "file created" events. For each file reported as updated you have to remember the timestamp of the latest such event, and when a certain amount of time passes without further updates, you may declare that the producer is done with the file.
IMO this approach is the worst of all, but if you have no better options it's at least something.