I am new to Go and am using goquery to extract data from an HTML page. The problem is that the data I am looking for is not enclosed in an HTML tag of its own. It is plain text that follows a <br> tag. How can I extract it?
Edit: Here is the HTML code.
<div class="container">
<div class="row">
<div class="col-lg-8">
<p align="justify"><b>Name</b>Priyanka</p>
<p align="justify"><b>Surname</b>Patil</p>
<p align="justify"><b>Adress</b><br>India,Kolhapur</p>
<p align="justify"><b>Hobbies </b><br>Playing</p>
<p align="justify"><b>Eduction</b><br>12th</p>
<p align="justify"><b>School</b><br>New Highschool</p>
</div>
</div>
</div>
From this I want "Priyanka" and "12th".
Try querying for the <br> tag and getting its siblings:
http://godoc.org/github.com/PuerkitoBio/goquery#Selection.Siblings
The following does what you want:
doc.Find(".container").Find("[align=\"justify\"]").Each(func(_ int, s *goquery.Selection) {
	// The <b> element holds the label, e.g. "Name".
	prefix := s.Find("b").Text()
	// s.Text() is the label followed by the value, so strip the label.
	result := strings.TrimPrefix(s.Text(), prefix)
	fmt.Println(result)
})
Import "fmt" and "strings" at the top of your file. If you need a complete code example, check here.