I have some webpages and I would like to get only the text that is visible to a user. Currently I'm checking for text by doing the following:
// n is a *html.Node from golang.org/x/net/html
if n.Type == html.TextNode {
    fmt.Println(n.Data)
}
The problem is that I'm getting CSS code mixed in with my text. Is there a way to get only the text? For example:
<h1> I want to get this text and all others like it </h1>
With goquery, this is really easy.
doc, err := goquery.NewDocument("http://yoursite.com")
if err != nil {
    log.Fatal(err)
}
doc.Find("h1").Each(func(i int, s *goquery.Selection) {
    yourText := s.Text() // Text returns a single string, no error
    fmt.Println(yourText)
})
Good luck!