I've got a simple web scraper/spider based on goquery, which in turn uses net/http. It works great until I hit a website with too many redirects:
Get http://www.example.com/some/path.html: stopped after 10 redirects
But why? Did it redirect to itself? Did it throw me into some spider jail? I want to know which URLs I was redirected to, and in what order.
The function giving the error seems to know this, since it's checking the length of a slice of requests, but I don't really want to edit the net/http package myself.
Here's that function from http://golang.org/src/pkg/net/http/client.go:
func defaultCheckRedirect(req *Request, via []*Request) error {
	if len(via) >= 10 {
		return errors.New("stopped after 10 redirects")
	}
	return nil
}
You can pass your own CheckRedirect function to http.Client, for example:
client := &http.Client{
	// CheckRedirect runs before each redirect is followed; req is the
	// next request and via lists the requests made so far, oldest first.
	// Note the *http.Request type: outside the http package the types
	// must be qualified.
	CheckRedirect: func(req *http.Request, via []*http.Request) error {
		log.Println("redirect", req.URL)
		if len(via) >= 10 {
			return errors.New("stopped after 10 redirects")
		}
		return nil
	},
}
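For context, here is a minimal, self-contained sketch of how this fits together. The URL is just a placeholder for whatever page your spider was fetching, and dumping the whole via slice when the limit is hit is one way to get the "which URLs, in what order" answer in a single log burst rather than one line per hop:

package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
)

func main() {
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if len(via) >= 10 {
				// via is ordered oldest first, so this prints the
				// entire chain that led up to the failure.
				for i, r := range via {
					log.Printf("hop %d: %s", i, r.URL)
				}
				return errors.New("stopped after 10 redirects")
			}
			return nil
		},
	}

	// Placeholder URL; substitute the page you were scraping.
	resp, err := client.Get("http://www.example.com/some/path.html")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("final URL:", resp.Request.URL)
}

Returning any non-nil error from CheckRedirect stops the client from following further redirects, and client.Get wraps that error in a *url.Error, which is why the message you saw was prefixed with "Get http://www.example.com/some/path.html:".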