根据该行的“权重”随机选择行

I have a table like this:

ID chance
1 1
2 2
3 4
4 1

Now I need to choose a rand() from this table

SELECT * FROM table
ORDER BY RAND()
LIMIT 1

But ID #2 has double the chances to be selected compared to ID #1 AND 4. Likewise ID #3 has four times the chances to be selected compared to ID #1 AND 4.

Somewhat similar to lottery based on chance.

This is how lottery works in some of the games. Given table similar to your example (say, we also have chance column that indicated value-based possibility of getting this particular reward), the algorithm is:

  1. Calculate total lottery value (in your example it is 1 + 2 + 4 + 1 = 8).
  2. Generate a value that is from range 1..max (max is 8 in current example) inclusive.
  3. Iterate over all items from rewards list to find the one item where sum of all previous chances is more than generated number but less or equal than

Say, we have generated number 5. Our comparison steps are:

  1. 0 < 5 <= (0) + 1 is false so ID1 is not what we get. Left side is 0 because we start our calculations from 0.
  2. 1 < 5 <= (1) + 2 is false so ID2 is not what we get.
  3. 1 + 2 < 5 <= (1 + 2) + 4 is true so we get ID3.

Example in JavaScript:

var rewards = [
  { id: 1, chance: 1 },
  { id: 2, chance: 2 },
  { id: 3, chance: 4 },
  { id: 4, chance: 1 }
];

function getRandomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

function generate() {
  var sum = 0;
  var next_sum = 0;
  var random = getRandomInt(1, rewards.reduce(function(pv, cv) {
    return pv + cv.chance;
  }, 0));

  for (var i = 0; i < rewards.length; i++) {
    next_sum = sum + rewards[i].chance;
    if ((random > sum) && (random <= next_sum)) {
      return rewards[i].id;
    }
    sum += rewards[i].chance;
  }
}

var winnerCounts = {}, i, winner;
for (i = 0; i < 8000; i++) {
  winner = generate();
  winnerCounts[winner] = (winnerCounts[winner] || 0) + 1;
}
console.log("Number of times each id was selected after %d itrations", i);
console.log(winnerCounts);

</div>

Here is SQL Fiddle with a MySQL-only solution

select * from (
  select id, @running_total as previous_total, @running_total := @running_total + chance AS running_total, until.rand
  from (
    select round(rand() * init.max) as rand from (
      select sum(chance) - 1 as max from demo
    ) as init
  ) as until,
  demo,
  ( select @running_total := 0.00 ) as vars
) as results
where results.rand >= results.previous_total and results.rand < results.running_total

The algorithm is as follows:

  1. Find out the sum of all the chances and store it in max
  2. Generate a random number in the interval [0, max)
  3. For every row, compute a previous_total (initially 0) and a current_total of the chances encountered so far
  4. Keep only the row where the generated number is in the interval [previous_total, current_total)

Because we have an even chance to pick each number in the interval [0, sum_of_all_chances), then we can asign every entry as many numbers in this interval as it has chances to get picked, ensuring an even distribution.

@running_total is just a MySQL variable and I have used ( select @running_total := 0.00 ) as vars only as a way to give it an initial value. Also, I have used ( select round(rand() * init.max) as rand from ( select sum(chance) - 1 as max from demo ) as init ) as until just as a way to sum up the chances and store the random number generated by MySQL's rand function. Hope this makes the code easy to understand.

You have a list of probabilities: 18, 28, 48 and 18 and you need to choose an item based on this. The solution is to pick a number between 1 and 8 and look it up inside these ranges [1,1] [2,3] [4,7] [8,8].

Here is SQL query that adds lower and upper bound to each row based on chances:

SELECT
    testdata.id,
    testdata.chance,
    SUM(IFNULL(prevdata.chance, 0)) + 1 AS lb,
    SUM(IFNULL(prevdata.chance, 0)) + testdata.chance AS ub
FROM testdata
LEFT JOIN testdata AS prevdata ON testdata.ID > prevdata.ID
GROUP BY testdata.id, testdata.chance

Output:

+------+--------+------+------+
| id   | chance | lb   | ub   |
+------+--------+------+------+
|    1 |      1 |    1 |    1 |
|    2 |      2 |    2 |    3 |
|    3 |      4 |    4 |    7 |
|    4 |      1 |    8 |    8 |
+------+--------+------+------+

The rest is straight forward. Choose a random number between 1 and SUM_OF_ALL_CHANCES:

SELECT @total := SUM(chance) FROM testdata;
SELECT @random := FLOOR(RAND() * @total) + 1;

And lookup that number in the above table:

SELECT
    testdata.id,
    testdata.chance,
    SUM(IFNULL(prevdata.chance, 0)) + 1 AS lb,
    SUM(IFNULL(prevdata.chance, 0)) + testdata.chance AS ub
FROM testdata
LEFT JOIN testdata AS prevdata ON testdata.ID > prevdata.ID
GROUP BY testdata.id, testdata.chance
HAVING @random BETWEEN lb AND ub

If you need in clear MySQL solution you can use this:

SELECT id FROM `table` ORDER BY -LOG(1-RAND())/chance LIMIT 1

Here is about choosing a random number from the exponential distribution http://www.tushar-mehta.com/publish_train/xl_vba_cases/0806%20generate%20random%20numbers.shtml

Simple code "just for test"

$sql = "SELECT id FROM `table` ORDER BY -LOG(1-RAND())/chance LIMIT 1";
$Res=array();
for ($i=0;$i<10000;$i++) {
    $result = mysqli_query($db,$sql);
    $row=mysqli_fetch_array($result, MYSQLI_ASSOC);
    if (isset($row['id'])) {
       echo "$i. => ".($row['id'])."
";
       if (!isset($Res[$row['id']])) $Res[$row['id']]=0;
       $Res[$row['id']]++;
    } else {
        echo ' error.432 ';exit;
    }
}

print_r($Res);

You will see that "2" twice more often than "4" or "1". And "3" twice more often than "2"