I'm using the Goamz package and could use some help getting `bucket.Multi` to stream an HTTP GET response to S3.
I'll be downloading a 2+ GB file via chunked HTTP and I'd like to stream it directly into an S3 bucket.
It appears that I need to wrap `resp.Body` with something so I can pass an implementation of `s3.ReaderAtSeeker` to `multi.PutAll`.
// set up S3
auth, _ := aws.EnvAuth()
s3Con := s3.New(auth, aws.USEast)
bucket := s3Con.Bucket("bucket-name")

// make HTTP request to URL
resp, err := http.Get(export_url)
if err != nil {
    fmt.Printf("Get error %v\n", err)
    return
}
defer resp.Body.Close()

// set up multi-part upload
multi, err := bucket.InitMulti(s3Path, "text/plain", s3.Private, s3.Options{})
if err != nil {
    fmt.Printf("InitMulti error %v\n", err)
    return
}

// Need a struct that implements s3.ReaderAtSeeker:
// type ReaderAtSeeker interface {
//     io.ReaderAt
//     io.ReadSeeker
// }
rs := // Question: what can I wrap `resp.Body` in?

parts, err := multi.PutAll(rs, 5120)
if err != nil {
    fmt.Printf("PutAll error %v\n", err)
    return
}

err = multi.Complete(parts)
if err != nil {
    fmt.Printf("Complete error %v\n", err)
    return
}
Currently I get the following (expected) error when trying to run my program:
./main.go:50: cannot use resp.Body (type io.ReadCloser) as type s3.ReaderAtSeeker in argument to multi.PutAll:
io.ReadCloser does not implement s3.ReaderAtSeeker (missing ReadAt method)
You haven't indicated which package you're using to access the S3 API, but I'm assuming it's this one: https://github.com/mitchellh/goamz/.
Since your file is of a significant size, a possible solution might be to use `multi.PutPart`. This will give you more control than `multi.PutAll`. Using a `Reader` from the standard library, your approach would be:
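Something along these lines might work (untested, and continuing from your `InitMulti` call above; it assumes goamz's `multi.PutPart(n, r)` takes a part number and an `io.ReadSeeker`, and it needs `bytes` and `io` added to your imports). The idea is to buffer `resp.Body` roughly 5 MB at a time and upload each buffered chunk as its own part, since a `bytes.Reader` over an in-memory chunk is seekable even though the HTTP body is not:

// Untested sketch: read resp.Body in ~5 MB chunks and upload each chunk
// as a numbered part via multi.PutPart, which expects an io.ReadSeeker.
const partSize = 5 * 1024 * 1024 // S3's minimum part size (except for the last part)

var parts []s3.Part
buf := make([]byte, partSize)
partNum := 1
for {
    n, readErr := io.ReadFull(resp.Body, buf)
    if n > 0 {
        // bytes.NewReader turns the buffered chunk into the io.ReadSeeker
        // that PutPart requires.
        part, err := multi.PutPart(partNum, bytes.NewReader(buf[:n]))
        if err != nil {
            fmt.Printf("PutPart error %v\n", err)
            return
        }
        parts = append(parts, part)
        partNum++
    }
    if readErr == io.EOF || readErr == io.ErrUnexpectedEOF {
        break // end of the HTTP body
    }
    if readErr != nil {
        fmt.Printf("Read error %v\n", readErr)
        return
    }
}

err = multi.Complete(parts)
if err != nil {
    fmt.Printf("Complete error %v\n", err)
    return
}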
I don't have access to S3 so I can't test my hypothesis but the above could be worth exploring if you haven't already.
A simpler approach is to use http://github.com/minio/minio-go.
It implements PutObject(), which is a fully managed, self-contained operation for uploading large files. It also automatically does multipart uploads, in parallel, for more than 5MB worth of data. If no pre-defined ContentLength is specified, it will keep uploading until it reaches EOF.
The following example shows how to do this when one doesn't have a pre-defined input length, only a streaming io.Reader. In this example I have used os.Stdin as an equivalent for your chunked input.
package main

import (
    "log"
    "os"

    "github.com/minio/minio-go"
)

func main() {
    config := minio.Config{
        AccessKeyID:     "YOUR-ACCESS-KEY-HERE",
        SecretAccessKey: "YOUR-PASSWORD-HERE",
        Endpoint:        "https://s3.amazonaws.com",
    }
    s3Client, err := minio.New(config)
    if err != nil {
        log.Fatalln(err)
    }

    // Size 0 means the length is unknown; PutObject streams from the reader
    // (os.Stdin here) until EOF, switching to multipart upload as needed.
    err = s3Client.PutObject("mybucket", "myobject", "application/octet-stream", 0, os.Stdin)
    if err != nil {
        log.Fatalln(err)
    }
}
$ echo "Hello my new-object" | go run stream-object.go
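If `PutObject` accepts any `io.Reader` (as the `os.Stdin` example suggests), then in your case you should be able to pass `resp.Body` straight through instead of piping via stdin; the bucket and object names below are placeholders:

// Hypothetical adaptation: stream the chunked HTTP response body
// directly into S3; resp.Body satisfies io.Reader.
err = s3Client.PutObject("bucket-name", "myobject", "application/octet-stream", 0, resp.Body)
if err != nil {
    log.Fatalln(err)
}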