I'm trying to count the email addresses found on websites in Go (Golang). The program reads a file containing URLs; for example, if I put "http://facebook.com" in the file, it should find every email address on that page. However, the result is always 0. I suspect I'm using the wrong function, but the other functions I tried gave the same result. Here is the code:
package main
import (
"bufio"
"bytes"
"fmt"
"log"
"net/http"
"os"
"regexp"
"sync"
)
func main() {
var wg sync.WaitGroup
wg.Add(1)
go emailWeb(os.Args[1], &wg)
wg.Wait()
}
func emailWeb(name string, wg *sync.WaitGroup) {
file, err := os.Open(name)
if err != nil {
log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
str := scanner.Text()
nb_arobase := numberEmail(str)
fmt.Println("URL : ", str, " nb email: ", nb_arobase)
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
(*wg).Done()
}
func numberEmail(url string) int {
count := 0
reg := regexp.MustCompile(`[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,4}`)
response, err := http.Get(url)
if err != nil {
log.Fatal(err)
} else {
str := response.Body
buf := new(bytes.Buffer)
buf.ReadFrom(str)
bodyStr := buf.String()
for i := 0; i < len(bodyStr); i++ {
if reg.MatchString(string(bodyStr[i])) {
count += 1
}
}
}
return count
}
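You're matching the regexp against each individual character of the HTTP response body. A single character can never match a pattern that requires a local part, an "@", and a domain, so the count stays at 0. Instead, run the regexp over the entire body and count the matches, for example by counting the match indexes returned by FindAllIndex: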
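// Replacement for the fetch-and-count part of numberEmail; reg is the regexp compiled in the question.
// ioutil.ReadAll requires "io/ioutil" in the import list.
resp, err := http.Get(url)
if err != nil {
log.Println(err)
return 0
}
defer resp.Body.Close()
// Read the whole response body into memory.
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
log.Println(err)
return 0
}
// FindAllIndex takes a second argument n; -1 means "return all matches".
return len(reg.FindAllIndex(body, -1))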
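To see the difference without making a network request, here is a minimal sketch that runs the question's pattern over a hard-coded sample string. The names countEmails and countPerChar and the sample text are hypothetical, introduced only for illustration; countPerChar reproduces the per-character loop from the question for comparison.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as in the question (lowercase only).
var reg = regexp.MustCompile(`[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,4}`)

// countEmails (hypothetical helper) counts matches over the whole input.
func countEmails(body []byte) int {
	// n = -1 asks for all non-overlapping matches.
	return len(reg.FindAll(body, -1))
}

// countPerChar (hypothetical helper) mimics the original loop:
// it tests one character at a time, which can never match.
func countPerChar(body string) int {
	count := 0
	for i := 0; i < len(body); i++ {
		if reg.MatchString(string(body[i])) {
			count++
		}
	}
	return count
}

func main() {
	sample := "write to alice@example.com or bob@example.org today"
	fmt.Println("whole body:", countEmails([]byte(sample)))
	fmt.Println("per character:", countPerChar(sample))
}
```

With this sample, the whole-body count is 2 and the per-character count is 0. Note that the pattern only matches lowercase letters, so addresses containing uppercase characters would be missed or truncated unless the pattern is adjusted.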