Problem
I am currently running a regex grep over multiple files to find all TODOs, but the command takes several minutes to complete:
real 5m8.073s
user 0m35.593s
sys 4m17.608s
Aim
The aim is to get the number of TODOs as quickly as possible.
Attempt
Based on what I found on the internet, Golang seemed like a good candidate, so I wrote the following code.
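// check_for_todo reports whether the file at path contains a TODO comment.
// numberOfTodos is presumably a package-level counter declared elsewhere.
func check_for_todo(path string) {
	text := "//\\sTODO\\s\\d"
	b, err := ioutil.ReadFile(path)
	if err != nil {
		panic(err)
	}
	s := string(b)
	// containsTodo := strings.Contains(s, text)
	// Note: MatchString recompiles the pattern on every call.
	containsTodo, _ := regexp.MatchString(text, s)
	if containsTodo {
		// This counts files that contain a match, not individual TODOs.
		numberOfTodos++
		fmt.Println("This file contains a todo:", path)
	}
}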
Results
The search is now roughly twice as fast as the bash version:
real 2m17.050s
user 0m0.015s
sys 0m0.015s
Discussion
I have the feeling that this code is far from optimal and could be made much faster. I am now looking into channels and goroutines.
You might want to check out the optimizations made by The Silver Searcher (a.k.a. ag). It applies a number of optimizations to make code search extremely fast.
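Two ideas in that spirit carry over to Go fairly directly: prepare the regex once instead of on every file, and rule files out with a cheap literal search before running the regex at all. The sketch below is not ag's implementation, just a small Go illustration of those two ideas. The name countTodosInBytes is hypothetical, and the literal "TODO" is taken from the pattern in the question.

```go
package main

import (
	"bytes"
	"regexp"
)

var (
	// todoLiteral must appear in any text the regex can match.
	todoLiteral = []byte("TODO")
	// todoRegexp is compiled once at start-up, not once per file.
	todoRegexp = regexp.MustCompile(`//\sTODO\s\d`)
)

// countTodosInBytes (hypothetical helper) returns the number of matches in data.
func countTodosInBytes(data []byte) int {
	// Cheap literal scan first: data without "TODO" cannot match the regex.
	if !bytes.Contains(data, todoLiteral) {
		return 0
	}
	return len(todoRegexp.FindAllIndex(data, -1))
}
```

Working on the []byte returned by the file read also avoids the string(b) copy in the original function. How much the literal check helps depends on how many files contain no TODO at all, so it should be measured rather than assumed.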
Another option might be to pre-construct an index so that searches are even faster than anything performed in real time. The ag README references Exuberant Ctags, which does this and could work for extremely large code bases.