Copying subdirectories using goroutines

My program copies multiple files and directories from different parts of the computer to one place.

One of the directories is very big, so copying it takes about 20-30 seconds. For now I have written this function, which copies that directory, and I start it as a goroutine:

func CopySpecificDirectory(source, dest string, quit chan int) (err error) {
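    // Open the source directory and check the error before using the handle.
    dir, err := os.Open(source)
    if err != nil {
        fmt.Printf("Error opening directory %s: %s\n", source, err)
        return err
    }
    defer dir.Close()

    // Read all directory entries.
    file, err := dir.Readdir(0)
    if err != nil {
        fmt.Printf("Error reading directory %s: %s\n", source, err)
        return err
    }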

    for _, f := range file {
        if f.IsDir() {
            copy.CopyDir(source+"\\"+f.Name(), dest+"\\"+f.Name())
        } else {
            copy.CopyFile(source+"\\"+f.Name(), dest+"\\"+f.Name())
        }
    }

    quit <- 1

    return nil
}

Main:

quit := make(chan int)
go CopySpecificDirectory(config.Location+"\\Directory", config.Destination, quit)

This only improves my program by a few seconds. Inside CopySpecificDirectory (if this is the best way) I would like to start a goroutine for each subdirectory, perhaps something like this:

c := make(chan int)
for _, f := range file {
    if f.IsDir() {
        go func() {
            copy.CopyDir(source+"\\"+f.Name(), dest+"\\"+f.Name())
            c <- 1
        }()
    } else {
        copy.CopyFile(source+"\\"+f.Name(), dest+"\\"+f.Name())
    }
}

With this approach I don't know where to wait (<- c) for every directory copy to finish.
Is this the best way? If anyone has another suggestion for the fastest way to copy a directory, I would love to hear it.

edit:

I used the approach from the sync.WaitGroup example on the website.

for _, f := range file {
    if f.IsDir() {
        wg.Add(1)
        go func() {
            defer wg.Done()
            copy.CopyDir(source+"\\"+f.Name(), dest+"\\"+f.Name())
        }()
    // more code

I have declared var wg sync.WaitGroup as global, and I do wg.Wait() in main right after I call CopySpecificDirectory.

But CopySpecificDirectory finishes before all the contents are copied. What am I doing wrong? It looks like it is not waiting for the goroutines to finish.

Use sync.WaitGroup instead of channels:

  1. Create a wait group object.
  2. Before spawning a goroutine, Add() one to it.
  3. When a goroutine is about to quit, it calls Done() on that object.
  4. In your main (waiting) code, call Wait() on that object. This function will return once all the goroutines "tracked" this way finish their execution.

Note that your program is I/O-bound, not CPU-bound. Parallel copying can save time when files go from physically different devices to (other) physically different devices. If you're just shuffling files around on the same filesystem, or all your sources are on the same filesystem, or all your destinations are on the same filesystem, you won't gain much: your goroutines will just compete for the single shared resource, the storage device, and the end result won't differ much from executing the copy operations sequentially.
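
If concurrent copies do end up competing for one device, one option is to cap how many run at once. The sketch below uses a buffered channel as a counting semaphore. runLimited, the limit value and the task function are illustrative names, not part of the original program, and whether any limit above 1 actually helps depends on the hardware.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited is a hypothetical helper that runs task for each item with at
// most limit goroutines active at a time. It returns the highest number of
// tasks that were observed running simultaneously.
func runLimited(items []string, limit int, task func(string)) int {
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	active, peak := 0, 0

	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already running
		go func(item string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when finished

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			task(item)

			mu.Lock()
			active--
			mu.Unlock()
		}(item)
	}
	wg.Wait()
	return peak
}

func main() {
	dirs := []string{"a", "b", "c", "d", "e"}
	peak := runLimited(dirs, 2, func(name string) {
		// Stand-in for an I/O-heavy copy (assumed).
	})
	fmt.Println("peak concurrency within limit:", peak <= 2)
}
```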

To give an example, the manual page for the /etc/fstab file, which describes mounted and mountable filesystems on classic Unix systems, mentions that filesystems located on the same physical medium are never checked at the same time, only sequentially, whereas filesystems located on different drives are checked in parallel. See the entry for the fs_passno parameter in the manual page.

With this approach I don't know where to wait for the copy to finish for every directory (<- c).

Instead of signaling on a channel, you could use sync.WaitGroup to coordinate all your goroutines. You call wg.Add(1) for each spawned goroutine and make them call wg.Done() when they're, well, done. Then you do wg.Wait() after spawning all of them to wait until they all finish.

As for how to speed up copying in general, there is no definite answer. It depends on a lot of factors (OS probably, filesystem, hard disk, load, etc.).

Thanks to both @kostix and @justinas for helping out. I followed their solution; the only remaining problem was that the closure inside my for loop captures the loop variable f itself, so a goroutine may see the value from a later iteration by the time it runs.

So I had to add f := f. This works now:

for _, f := range file {
    f := f
    if f.IsDir() {
        wg.Add(1)
        go func() {
            // Defer first so Done runs even if CopyDir panics.
            defer wg.Done()
            copy.CopyDir(source+"\\"+f.Name(), dest+"\\"+f.Name())
        }()
    } else {
        copy.CopyFile(source+"\\"+f.Name(), dest+"\\"+f.Name())
    }
}
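
An alternative to f := f is to pass the loop variable to the closure as an argument, so that each goroutine gets its own copy. The sketch below uses plain strings in place of the os.FileInfo values, and collectNames is a hypothetical name chosen for illustration. Since Go 1.22, modules that declare that version or later get a fresh loop variable per iteration, so either workaround is only needed on older toolchains.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collectNames is a hypothetical helper: every goroutine receives its own
// copy of the loop variable as the parameter n, so no two goroutines can
// observe the same shared variable.
func collectNames(names []string) []string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			mu.Lock()
			out = append(out, n)
			mu.Unlock()
		}(name) // name is evaluated here, once per iteration
	}
	wg.Wait()
	sort.Strings(out) // goroutine order is not deterministic, so sort
	return out
}

func main() {
	fmt.Println(collectNames([]string{"docs", "src", "bin"})) // prints [bin docs src]
}
```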