I'm looking to get a column sum for a dataframe using Gota's dataframe package in Go.
I see from here that there's a way to apply functions to series, which seems fine. The documented example is:
mean := func(s series.Series) series.Series {
	floats := s.Float()
	sum := 0.0
	for _, f := range floats {
		sum += f
	}
	return series.Floats(sum / float64(len(floats)))
}
df.Cbind(mean)
df.Rbind(mean)
That is, just remove the division to get a sum function rather than the mean. That said, if I only want to sum one column, am I stuck writing my own simple sum function, or is there something more idiomatic and built in, like R's
sum(df[,c("mycol")])
?
I'm currently working with:
sum := func(s series.Series) series.Series {
	floats := s.Float()
	sum := 0.0
	for _, f := range floats {
		sum += f
	}
	return series.Floats(sum)
}
df.Select([]string{"mycol"}).CBind(sum)
where df, after I subset it to only the column of interest, becomes:
[31x1] DataFrame
mycol
0: 8.300000
1: 8.300000
2: 16.750000
3: 9.030000
...
<float>
And I get something like:
cannot use sum (type func(series.Series) series.Series) as type dataframe.DataFrame in argument to df.Select([]string literal).CBind
Ah, a partial solution: the documentation uses Cbind/Rbind where I think Capply/Rapply was meant, as those are documented here. That said, the idiom question stands: it would be great to know whether there are built-ins I'm missing.
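For reference, here is a minimal sketch of the hand-rolled fallback for a single column, using only the standard library. The helper name sumFloats is my own and not a Gota built-in. The comment about where the slice comes from assumes that Gota's Col and Float methods return the column and its values as I understand them.

```go
package main

import "fmt"

// sumFloats adds up a slice of float64 values and returns 0 for an
// empty slice. The name is hypothetical; it is not part of Gota.
func sumFloats(values []float64) float64 {
	total := 0.0
	for _, v := range values {
		total += v
	}
	return total
}

func main() {
	// Stand-in for the first rows of "mycol" shown above. With Gota this
	// slice would presumably come from df.Col("mycol").Float() (assumption).
	mycol := []float64{8.3, 8.3, 16.75, 9.03}
	fmt.Printf("sum of mycol: %f\n", sumFloats(mycol))
}
```

Taking a plain []float64 rather than a series.Series keeps the helper independent of the dataframe library, so it returns a scalar directly instead of a one-element series.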