基于权重的随机选择

2022-09-28

开发中经常会有需要随机选择某个物品的需求，如果是要求概率均等的，就很容易实现：

import random

items = ['A', 'B', 'C']
n = len(items)
i = random.randint(0, n - 1)
print(items[i])

或者直接使用random.choice:

print(random.choice(items))

但是如果要求每个元素有自己的权重呢？

可以先把所有的物品的权重都加起来，设为 s
随机出一个 0 到 s 之间的数，设为 c
遍历所有的物品，每次用 c 减去物品的权重，直至 c 小于 0，此时的物品就是随机的结果

用代码表示为：

def weighted_choice(items, weights):
    s = sum(weights)
    c = random.random() * s
    for i, w in enumerate(weights):
        c -= w
        if c < 0:
            return items[i]

还有一种方法是：

遍历每个物品，给每个物品维护一个权重的累加值，记为 cur_total，并存到一个列表中，记为 totals
然后随机出一个 0 到最大的 cur_total 之间的数字，记为 c
此时因为 totals 是有序的，可以用二分查找，找出 c 落在哪个位置

用代码表示为：

import random
import bisect

def weighted_choice(items, weights):
    totals = []
    cur_total = 0

    for w in weights:
        cur_total += w
        totals.append(cur_total)

    c = random.random() * cur_total
    i = bisect.bisect_right(totals, c)
    return items[i]

自从 python3.6 以后，标准库里直接加了一个函数 random.choices 用的就是第二种方法。

所以在 python 里之后就不用自己实现了，直接用标准库。如果用别的语言，可能还是会用到。

基于权重的随机选择

参考