I wrote a little Go program that parses a site and retrieves all of its links along with their HTTP responses. My code works well, but I would like to add goroutines to see how they work in a recursive function.

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"regexp"
	"strings"
	"sync"
)

type linkWeb struct {
	Link string
	Code string
}

func parseLink(siteName string, arrayError []linkWeb) (arrayResult []linkWeb) {
	var mutex = &sync.Mutex{}
	var wg = sync.WaitGroup{}
	var baseSite = siteName
	site, _ := http.Get(baseSite)
	html, _ := ioutil.ReadAll(site.Body)
	errorCodeHTTP := site.Status
	mutex.Lock()
	errorArray := arrayError
	mutex.Unlock()
	allJs := regexp.MustCompile(`src="[^"]*"+`)
	allA := regexp.MustCompile(`(.)*href="[^"]*"+`)
	var resultsJs = allJs.FindAllStringSubmatch(string(html), -1)
	var resultUrls = allA.FindAllStringSubmatch(string(html), -1)
	resultsJs = append(resultsJs, resultUrls...)
	for _, linkJs := range resultsJs {
		wg.Add(1)
		go func() {
			re := regexp.MustCompile(`(href|src)(.)*="[^"]*"`)
			var execReg = re.FindAllStringSubmatch(linkJs[0], -1)
			link := regexp.MustCompile(`"(.)*"`)
			var linkCenter = link.FindAllStringSubmatch(execReg[0][0], -1)
			resultrmvbefore := strings.TrimPrefix(linkCenter[0][0], "\"")
			resultrmvafter := strings.TrimSuffix(resultrmvbefore, "\"")
			var already = 0
			mutex.Lock()
			for _, itemURL := range errorArray {
				if resultrmvafter == itemURL.Link {
					already = 1
				}
			}
			mutex.Unlock()
			if already == 0 {
				var actualState = linkWeb{resultrmvafter, "-> " + errorCodeHTTP + "\n"}
				mutex.Lock()
				errorArray = append(errorArray, actualState)
				mutex.Unlock()
				return
			} else {
				if already == 0 {
					var actualState = linkWeb{resultrmvafter, "-> " + errorCodeHTTP + "\n"}
					mutex.Lock()
					errorArray = append(errorArray, actualState)
					var arrayReturn = errorArray
					mutex.Unlock()
					parseLink(resultrmvafter, arrayReturn)
				}
			}
			wg.Done()
		}()
	}
	wg.Wait()
	return
}

func main() {
	var arrayError []linkWeb
	var resultArray = parseLink("https://www.golem.ai/", arrayError)
}

I just don't know whether it's necessary to pass my sync.WaitGroup as a function parameter, because I tested it and didn't see any change. I read the docs, but I don't know whether my problem is tied to my recursive function or to something I don't understand about Golang. Thank you very much for your help :)
There is nothing inherently special about recursion with respect to mutexes, wait groups, or other objects; it's the same as any other function call. A sync.Mutex or sync.WaitGroup carries state and must not be copied after first use, so be careful to pass them around by pointer, and that should be that. To debug this, it's often useful to printf the address of the object in the caller and the callee and check that they're the same object.
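
As a rough illustration of the pointer-passing and address-printing advice, here is a minimal sketch. The names descend and sumDepths are hypothetical and not taken from the question's code; the only point is that the %p value printed in the caller matches the one printed at every level of the recursion.

```go
package main

import (
	"fmt"
	"sync"
)

// descend is a hypothetical recursive function. It receives the mutex by
// pointer, so every level of the recursion locks the same object.
func descend(depth int, mu *sync.Mutex, total *int) {
	fmt.Printf("callee depth=%d mutex=%p\n", depth, mu)
	mu.Lock()
	*total += depth
	mu.Unlock() // release before recursing: sync.Mutex is not reentrant
	if depth > 0 {
		descend(depth-1, mu, total)
	}
}

// sumDepths creates the mutex once and hands its address to the recursion.
func sumDepths(n int) int {
	var mu sync.Mutex
	total := 0
	fmt.Printf("caller mutex=%p\n", &mu)
	descend(n, &mu, &total)
	return total
}

func main() {
	fmt.Println("total:", sumDepths(3))
}
```

If the printed addresses differ between caller and callee, the callee is working on a different mutex (a copy or a freshly created one), and the lock protects nothing shared.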
To get more specific help with your code snippet, I'd suggest you minimize it to something much simpler that demonstrates your problem: https://stackoverflow.com/help/mcve
From a quick look at your code, each call to parseLink creates a new mutex and wait group. Is this what you intended?
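
If the intent is for all recursive calls to share one mutex and one wait group, one possible shape is sketched below. It is not a fix of the question's code: the crawler type, the visit and crawl functions, and the in-memory map standing in for HTTP fetches are all assumptions made for illustration. The shared state is created once in crawl and reached through a pointer receiver, which is just another way of passing it by pointer.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// crawler holds the state shared by every recursive call. It is created
// once and only ever used through a pointer, so it is never copied.
type crawler struct {
	mu    sync.Mutex
	wg    sync.WaitGroup
	seen  map[string]bool
	links map[string][]string // hypothetical stand-in for fetched pages
}

// visit follows every link of page that has not been seen yet, each in its
// own goroutine. Add is called before the goroutine starts, and Done is
// deferred so that it runs on every return path.
func (c *crawler) visit(page string) {
	defer c.wg.Done()
	for _, next := range c.links[page] {
		c.mu.Lock()
		dup := c.seen[next]
		c.seen[next] = true
		c.mu.Unlock()
		if dup {
			continue
		}
		c.wg.Add(1)
		go c.visit(next)
	}
}

// crawl creates the shared state, starts the recursion, and waits for all
// goroutines to finish before reading the result.
func crawl(links map[string][]string, start string) []string {
	c := &crawler{seen: map[string]bool{start: true}, links: links}
	c.wg.Add(1)
	go c.visit(start)
	c.wg.Wait()
	pages := make([]string, 0, len(c.seen))
	for p := range c.seen {
		pages = append(pages, p)
	}
	sort.Strings(pages)
	return pages
}

func main() {
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/b", "/c"},
		"/b": {"/"},
	}
	fmt.Println(crawl(site, "/"))
}
```

Because each goroutine calls Add for its children before its own deferred Done runs, the counter cannot reach zero while work is still pending, and a single Wait in crawl is enough.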