I have a table like this:
ID chance
1 1
2 2
3 4
4 1
Now I need to choose a rand() from this table
SELECT * FROM table
ORDER BY RAND()
LIMIT 1
But ID #2 has double the chances to be selected compared to ID #1 AND 4. Likewise ID #3 has four times the chances to be selected compared to ID #1 AND 4.
Somewhat similar to lottery based on chance.
This is how lottery works in some of the games. Given table similar to your example (say, we also have chance
column that indicated value-based possibility of getting this particular reward), the algorithm is:
1 + 2 + 4 + 1 = 8
).1..max
(max
is 8
in current example) inclusive.Say, we have generated number 5
. Our comparison steps are:
0 < 5 <= (0) + 1
is false so ID1 is not what we get. Left side is 0 because we start our calculations from 0.1 < 5 <= (1) + 2
is false so ID2 is not what we get.1 + 2 < 5 <= (1 + 2) + 4
is true so we get ID3.Example in JavaScript:
var rewards = [
{ id: 1, chance: 1 },
{ id: 2, chance: 2 },
{ id: 3, chance: 4 },
{ id: 4, chance: 1 }
];
function getRandomInt(min, max) {
return Math.floor(Math.random() * (max - min + 1)) + min;
}
function generate() {
var sum = 0;
var next_sum = 0;
var random = getRandomInt(1, rewards.reduce(function(pv, cv) {
return pv + cv.chance;
}, 0));
for (var i = 0; i < rewards.length; i++) {
next_sum = sum + rewards[i].chance;
if ((random > sum) && (random <= next_sum)) {
return rewards[i].id;
}
sum += rewards[i].chance;
}
}
var winnerCounts = {}, i, winner;
for (i = 0; i < 8000; i++) {
winner = generate();
winnerCounts[winner] = (winnerCounts[winner] || 0) + 1;
}
console.log("Number of times each id was selected after %d itrations", i);
console.log(winnerCounts);
</div>
Here is SQL Fiddle with a MySQL-only solution
select * from (
select id, @running_total as previous_total, @running_total := @running_total + chance AS running_total, until.rand
from (
select round(rand() * init.max) as rand from (
select sum(chance) - 1 as max from demo
) as init
) as until,
demo,
( select @running_total := 0.00 ) as vars
) as results
where results.rand >= results.previous_total and results.rand < results.running_total
The algorithm is as follows:
max
[0, max)
previous_total (initially 0)
and a current_total
of the chances encountered so far[previous_total, current_total)
Because we have an even chance to pick each number in the interval [0, sum_of_all_chances)
, then we can asign every entry as many numbers in this interval as it has chances to get picked, ensuring an even distribution.
@running_total
is just a MySQL variable and I have used ( select @running_total := 0.00 ) as vars
only as a way to give it an initial value. Also, I have used ( select round(rand() * init.max) as rand from ( select sum(chance) - 1 as max from demo ) as init ) as until
just as a way to sum up the chances and store the random number generated by MySQL's rand
function. Hope this makes the code easy to understand.
You have a list of probabilities: 1⁄8, 2⁄8, 4⁄8 and 1⁄8 and you need to choose an item based on this. The solution is to pick a number between 1 and 8 and look it up inside these ranges [1,1] [2,3] [4,7] [8,8]
.
Here is SQL query that adds lower and upper bound to each row based on chances:
SELECT
testdata.id,
testdata.chance,
SUM(IFNULL(prevdata.chance, 0)) + 1 AS lb,
SUM(IFNULL(prevdata.chance, 0)) + testdata.chance AS ub
FROM testdata
LEFT JOIN testdata AS prevdata ON testdata.ID > prevdata.ID
GROUP BY testdata.id, testdata.chance
Output:
+------+--------+------+------+
| id | chance | lb | ub |
+------+--------+------+------+
| 1 | 1 | 1 | 1 |
| 2 | 2 | 2 | 3 |
| 3 | 4 | 4 | 7 |
| 4 | 1 | 8 | 8 |
+------+--------+------+------+
The rest is straight forward. Choose a random number between 1 and SUM_OF_ALL_CHANCES:
SELECT @total := SUM(chance) FROM testdata;
SELECT @random := FLOOR(RAND() * @total) + 1;
And lookup that number in the above table:
SELECT
testdata.id,
testdata.chance,
SUM(IFNULL(prevdata.chance, 0)) + 1 AS lb,
SUM(IFNULL(prevdata.chance, 0)) + testdata.chance AS ub
FROM testdata
LEFT JOIN testdata AS prevdata ON testdata.ID > prevdata.ID
GROUP BY testdata.id, testdata.chance
HAVING @random BETWEEN lb AND ub
If you need in clear MySQL solution you can use this:
SELECT id FROM `table` ORDER BY -LOG(1-RAND())/chance LIMIT 1
Here is about choosing a random number from the exponential distribution http://www.tushar-mehta.com/publish_train/xl_vba_cases/0806%20generate%20random%20numbers.shtml
Simple code "just for test"
$sql = "SELECT id FROM `table` ORDER BY -LOG(1-RAND())/chance LIMIT 1";
$Res=array();
for ($i=0;$i<10000;$i++) {
$result = mysqli_query($db,$sql);
$row=mysqli_fetch_array($result, MYSQLI_ASSOC);
if (isset($row['id'])) {
echo "$i. => ".($row['id'])."
";
if (!isset($Res[$row['id']])) $Res[$row['id']]=0;
$Res[$row['id']]++;
} else {
echo ' error.432 ';exit;
}
}
print_r($Res);
You will see that "2" twice more often than "4" or "1". And "3" twice more often than "2"