I have a Python program that checks whether a product's price on Amazon is lower than expected.
For learning purposes, and to make it more portable, I'm porting that code to Go. It's my first ever Go program.
For parsing the HTML, I'm using goquery. So far I'm just trying to retrieve the name of the product. Here's the code:
package main

import (
	"flag"
	"fmt"
	"log"

	"github.com/PuerkitoBio/goquery"
)

func main() {
	url := flag.String("url", "", "URL of the product")
	flag.Parse()

	doc, err := goquery.NewDocument(*url)
	if err != nil {
		log.Fatal(err)
	}

	name := doc.Find("#productTitle").Text()
	fmt.Println(name)
}
The problem: it returns the name in only about 1 out of 8 executions, which is definitely not correct. It's not a problem with Amazon or with the #productTitle selector, because the Python code works every single time.
What might be wrong? How can I further debug this issue? I repeat, first code ever using Go :)
I found the issue :)
There was a difference between my Python code and the Go code. In Python I was sending a real browser User-Agent header, whereas in Go the request went out with the default one for that package.
That means Amazon was actually blocking most of the attempts and returning a CAPTCHA page instead of the product page.
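For anyone hitting the same thing, here is a minimal sketch of how the request could be sent with an explicit User-Agent using only the standard library. The names newBrowserRequest, looksBlocked and browserUA are made up for this example, the User-Agent string is just a placeholder, and the check for id="productTitle" is a rough heuristic I'm assuming, not something Amazon documents.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"
)

// browserUA is a placeholder browser-like value (an assumption for this
// sketch); use whatever string your working Python code was sending.
const browserUA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"

// newBrowserRequest builds a GET request that carries the given User-Agent
// instead of Go's default one.
func newBrowserRequest(url, userAgent string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	return req, nil
}

// looksBlocked is a rough heuristic (assumption): a non-200 status or a page
// without the productTitle element probably means a CAPTCHA or block page.
func looksBlocked(status int, body string) bool {
	return status != http.StatusOK || !strings.Contains(body, `id="productTitle"`)
}

func main() {
	url := flag.String("url", "", "URL of the product")
	flag.Parse()
	if *url == "" {
		fmt.Println("usage: pass -url with the product page address")
		return
	}

	req, err := newBrowserRequest(*url, browserUA)
	if err != nil {
		log.Fatal(err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Printing the status and the heuristic result makes an intermittent
	// block visible instead of silently yielding an empty product name.
	fmt.Println("status:", resp.Status)
	fmt.Println("blocked or unexpected page:", looksBlocked(resp.StatusCode, string(body)))
}
```

To keep using goquery for the parsing, the fetched body can be handed to goquery.NewDocumentFromReader (for example wrapped in strings.NewReader) in place of goquery.NewDocument. Dumping the status and body like this is also how the original problem could have been debugged: the "empty" runs would have shown the CAPTCHA page.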