tl;dr: I'm looking for methods to implement a weighted random choice based on the relative magnitude of values (or functions of values) in an array in golang. Are there standard algorithms or recommendable packages for this? Is so how do they scale?
Goals
I'm trying to write 2D and 3D markov process programs in golang. A simple 2D example of such is the following: Imagine one has a lattice, and on each site labeled by index (i,j) there are n(i,j) particles. At each time step, the program chooses a site and moves one particle from this site to a random adjacent site. The probability that a site is chosen is proportional to its population n(i,j) at that time.
Current Implementation
My current algorithm, e.g. for the 2D case on an L x L lattice, is the following:
cdfpop[i L +j]=initialpopulation[i][j]
.cdfpop[i]+=cdfpop[i-1]
.Rsite
whose range is from 1 to the largest value in the cdf (this is just the last value, cdfpop[L^2-1]
), and Rhop
whose range is between 1 and 4. The first random number chooses a weighted random site, and the second number a random direction to hop inindexhop
of cdfpop
that is greater than Rsite
. The index being hopped to is either indexhop +-1
for x direction hops or indexhop +- L
for y direction hops.cdfpop
to reflect the hop process. This means subtracting one from (adding one to) all values in cdfpop
between the index being hopped from (to) and the index being hopped to (from) depending on order.Edit: Requested Pseudocode looks like:
main(){
//import population LxL array
population:= import(population array)
//turn array into slice
for i number of rows{
cdf[ith slice of length L] = population[ith row]
}
//compute cumulant array
for i number of total sites{
cdf[i] = cdf[i-1]+cdf[i]
}
for i timesteps{
site = Randomhopsite(cdf)
cdf = Dohop(cdf, site)
}
Convertcdftoarrayandsave(cdf)
}
Randomhopsite(cdf) site{
//Choose random number in range of the cummulant
randomnumber=RandomNumber(Range 1 to Max(cdf))
site = binarysearch(cdf) // finds leftmost index such that
// cdf[i] > random number
return site
}
Dohop(cdf,site) cdf{
//choose random hop direction and calculate coordinate
randomnumber=RandomNumber(Range 1 to 4)
case{
randomnumber=1 { finalsite= site +1}
randomnumber=2 { finalsite= site -1}
randomnumber=3 { finalsite= site + L}
randomnumber=4 { finalsite= site - L}
}
//change the value of the cumulant distribution to reflect change
if finalsite > site{
for i between site and finalsite{
cdf[i]--
}
elseif finalsite < site{
for i between finalsite and site{
cdf[i]++
}
else {error: something failed}
return cdf
}
This process works really well for simple problems. For this particular problem, I can run about 1 trillion steps on a 1000x 1000 lattice in about 2 minutes on average with my current set up, and I can compile population data to gifs every 10000 or so steps by spinning a go routine without a huge slowdown.
Where efficiency breaks down
The trouble comes when I want to add different processes, with real-valued coefficients, whose rates are not proportional to site population. So say I now have a hopping rate at k_hop *n(i,j) and a death rate (where I simply remove a particle) at k_death *(n(i,j))^2. There are two slow-downs in this case:
cdfpop[i*L+j]= 4 *k_hop * pop[i][j]
for i*L+j<L*L
and cdfpop[i*L+j]= k_death*math. Power(pop[i][j],2)
for L*L<=i*L+j<2*L*L
, followed by cdfpop[i]+=cdfpop[i-1]
. I would then select a random real in the range of the cdf.This problem only gets worse if I have rates calculated as a function of populations on neighboring sites -- e.g. spontaneous particle creation depends on the product of populations on neighboring sites. While I hope to work out a way to just modify the cdf without recalculation by thinking really hard, as I try to simulate problems of increasing complexity, I can't help but wonder if there is a universal solution with reasonable efficiency I'm missing that doesn't require specialized code for each random process.
Thanks for reading!