How can I efficiently handle large data arrays (over 10MiB) in Go?

I am using Go to download files from one server and, after manipulating them, send them to another server.

The file sizes vary from 1MB to 200MB.

Currently, my code is pretty simple: I am using http.Client and bytes.Buffer.
It takes a lot of time to handle the big files (100MB to 200MB), and there are a lot of them.

After some quick profiling, I see that most of the time is spent in bytes.(*Buffer).grow.
How can I create big buffers, for example 16MB ones?

What can I do to improve the efficiency of my code? Are there any general tips for handling large HTTP requests?

Edit

Let me explain exactly what I am trying to do. I have CouchDB documents (with attachments) that I am trying to copy to another CouchDB instance. The documents range in size from 30MB to 200MB. Copying tiny (2 - 10MB) CouchDB documents is really fast.

But sending a large document over the wire is really slow. I am currently profiling and trying out @Evan's answer to find out where my problem is.

Take a look at the description for bytes.NewBuffer: http://golang.org/pkg/bytes/#NewBuffer

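Sounds like you can create a byte slice with 16MB of capacity and use it to initialize the buffer. Give the slice zero length, because NewBuffer treats the contents of the slice as the buffer's initial data.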
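A minimal sketch of that idea, assuming 16MB is a reasonable upper estimate for your data. The helper name newSizedBuffer is hypothetical, not part of the standard library:

```go
package main

import "bytes"

// newSizedBuffer (hypothetical helper) returns an empty buffer whose
// backing array already has room for capacity bytes.
func newSizedBuffer(capacity int) *bytes.Buffer {
	// Length 0, capacity as requested: the buffer starts out empty and
	// does not need to grow until more than capacity bytes are written.
	return bytes.NewBuffer(make([]byte, 0, capacity))
}
```

Calling newSizedBuffer(16 << 20) gives a buffer with 16MiB reserved. Writes beyond that still work, they just trigger the usual reallocation.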

Consider that your program has no need to keep the data in memory if all it needs to do is copy it.

A strong feature of Go's standard library is its sensible use of interfaces: the Body member of http.Response implements the io.ReadCloser interface, and that satisfies the io.Reader type of the body argument of http.Client's Post method.

So you could roll like this:

  1. Perform a request for the document. You'll get an instance of http.Response back, which has the Body member of type io.ReadCloser.

    Note that at this point you haven't actually started receiving the body from the "source" server because to do that you'll have to drain the io.ReadCloser of Body.

  2. Initiate another (supposedly POST) request to send the data, and when making the request supply it that Body member obtained in the first step.

    Once this request is done piping your data, call Close() on that Body member.

Something like this:

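import (
    "io"
    "net/http"
)

// myPostType is a placeholder; use the content type of your documents.
const myPostType = "application/octet-stream"

// Pipe streams the body fetched from one URL into a POST to another.
func Pipe(from, to string) (err error) {
    src, err := http.Get(from)
    if err != nil {
        return
    }
    defer src.Body.Close() // safe even though a successful Post closes it too
    dst, err := http.Post(to, myPostType, src.Body)
    if err != nil {
        return
    }
    defer dst.Body.Close()
    // Read the reply to the end so the connection can be reused.
    _, err = io.Copy(io.Discard, dst.Body)
    return
}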

In this code, http.Post will read from src.Body and then Close() it itself.

You might add bytes.Buffer into the mix in the hope of reducing the number of syscalls performed, but don't do that unless the plain method does not work.

As @Evan already pointed out: you can choose an initial buffer size when creating a new buffer.

Since allocating buffers is so expensive (this is why your grow calls take so long: they reallocate and copy the existing data whenever the contents no longer fit), picking the right buffer size is key. The right strategy for buffer allocation depends on a lot of factors. You might choose your own method of growing buffers depending on your application profile.

You should also consider recycling your buffers to prevent heap fragmentation: http://blog.cloudflare.com/recycling-memory-buffers-in-go