`random` --- Generate pseudo-random numbers

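Source code: Lib/random.py

This module implements pseudo-random number generators for various distributions.

For integers, there is uniform selection from a range. For sequences, there is uniform selection of a random element, a function to generate a random permutation of a list in place, and a function for random sampling without replacement.

On the real line, there are functions to compute uniform, normal (Gaussian), lognormal, negative exponential, gamma, and beta distributions. For generating distributions of angles, the von Mises distribution is available.

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the half-open range 0.0 <= X < 1.0. Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1. The underlying implementation in C is both fast and threadsafe. The Mersenne Twister is one of the most extensively tested random number generators in existence. However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes.

The functions supplied by this module are actually bound methods of a hidden instance of the random.Random class. You can instantiate your own instances of Random to get generators that don't share state.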
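As a minimal sketch of generators that do not share state, the following creates separate Random instances. The variable names and seed values are illustrative assumptions. Drawing from one instance does not advance the other.

```python
from random import Random

# Two independent generators; the seeds 1 and 2 are arbitrary illustrative values.
gen_a = Random(1)
gen_b = Random(2)

# A reference generator seeded like gen_b, used only for comparison.
reference_b = Random(2)

# Advance gen_a many times; gen_b keeps its own state.
for _ in range(1000):
    gen_a.random()

draws_b = [gen_b.random() for _ in range(3)]
draws_reference = [reference_b.random() for _ in range(3)]
same_as_reference = draws_b == draws_reference
print(same_as_reference)  # True: gen_b was not disturbed by gen_a
```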

Class Random can also be subclassed if you want to use a different basic generator of your own devising: see the documentation on that class for more details.

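The random module also provides the SystemRandom class, which uses the system function os.urandom() to generate random numbers from sources provided by the operating system.

Warning

The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.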
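For security-sensitive values, a sketch along the following lines uses the secrets module instead. The use case (a token plus a random pick) and the names in the list are hypothetical.

```python
import secrets

# Hypothetical use case: an unguessable token and a random pick from a list.
token = secrets.token_hex(16)          # 16 random bytes rendered as 32 hex digits
candidates = ['alice', 'bob', 'carol'] # illustrative names
winner = secrets.choice(candidates)

print(len(token), winner in candidates)
```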

See also

M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator", ACM Transactions on Modeling and Computer Simulation Vol. 8, No. 1, January 1998, pp. 3-30.

Complementary-Multiply-with-Carry recipe for a compatible alternative random number generator with a long period and comparatively simple update operations.

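Bookkeeping functions

random.seed(a=None, version=2)

Initialize the random number generator.

If a is omitted or None, the current system time is used. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).

If a is an int, it is used directly.

With version 2 (the default), a str, bytes, or bytearray object gets converted to an int and all of its bits are used.

With version 1 (provided for reproducing random sequences from older versions of Python), the algorithm for str and bytes generates a narrower range of seeds.

Changed in version 3.2: Moved to the version 2 scheme which uses all of the bits in a string seed.

Changed in version 3.11: The seed must be one of the following types: None, int, float, str, bytes, or bytearray.

random.getstate(): Return an object capturing the current internal state of the generator. This object can be passed to setstate() to restore the state.

random.setstate(state): state should have been obtained from a previous call to getstate(), and setstate() restores the internal state of the generator to what it was at the time getstate() was called.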
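A short sketch of saving and restoring generator state with getstate() and setstate(). The seed value is arbitrary and is only there to make the run repeatable.

```python
import random

random.seed(2024)               # arbitrary seed so the sketch is repeatable
saved = random.getstate()       # snapshot of the generator's internal state

first_run = [random.random() for _ in range(3)]

random.setstate(saved)          # rewind to the snapshot
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True
```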

Functions for bytes

random.randbytes(n)

Generate n random bytes.

This method should not be used for generating security tokens. Use secrets.token_bytes() instead.

New in version 3.9.
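A minimal usage sketch, assuming Python 3.9 or later. The seed and the byte count are arbitrary.

```python
import random

rng = random.Random(0)      # seeded instance; the seed value is arbitrary
blob = rng.randbytes(8)     # requires Python 3.9 or later

print(type(blob).__name__, len(blob))  # bytes 8
```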

Functions for integers

random.randrange(stop)

random.randrange(start, stop[, step])

Return a randomly selected element from range(start, stop, step). This is equivalent to choice(range(start, stop, step)), but doesn't actually build a range object.

The positional argument pattern matches that of range(). Keyword arguments should not be used because the function may use them in unexpected ways.

Changed in version 3.2: randrange() is more sophisticated about producing equally distributed values. Formerly it used a style like int(random()*n) which could produce slightly uneven distributions.

Deprecated since version 3.10: The automatic conversion of non-integer types to equivalent integers is deprecated. Currently randrange(10.0) is losslessly converted to randrange(10). In the future, this will raise a TypeError.

Deprecated since version 3.10: The exception raised for non-integer values such as randrange(10.5) or randrange('10') will be changed from ValueError to TypeError.

random.randint(a, b): Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1).
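The alias relationship can be checked with two generators that share a seed. This is a sketch: the seed and the die-roll bounds are illustrative.

```python
from random import Random

# Two generators with the same (arbitrary) seed.
r1 = Random(7)
r2 = Random(7)

rolls_randint = [r1.randint(1, 6) for _ in range(10)]      # both ends included
rolls_randrange = [r2.randrange(1, 7) for _ in range(10)]  # stop value excluded

print(rolls_randint == rolls_randrange)
print(all(1 <= n <= 6 for n in rolls_randint))
```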

random.getrandbits(k): Returns a non-negative Python integer with k random bits. This method is supplied with the Mersenne Twister generator, and some other generators may also provide it as an optional part of the API. When available, getrandbits() enables randrange() to handle arbitrarily large ranges.

Changed in version 3.9: This method now accepts zero for k.
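A brief sketch of getrandbits() and of randrange() over a range much wider than 53 bits. The seed, bit count, and range size are arbitrary choices.

```python
from random import Random

rng = Random(99)                 # arbitrary seed
k = 128
value = rng.getrandbits(k)       # non-negative int below 2**k

print(0 <= value < 2 ** k)

zero_bits = rng.getrandbits(0)   # accepted on Python 3.9+, always 0
print(zero_bits)

# randrange() can cover a range far wider than a float's 53 bits.
big = rng.randrange(10 ** 40)
print(0 <= big < 10 ** 40)
```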

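Functions for sequences

random.choice(seq): Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.

random.choices(population, weights=None, *, cum_weights=None, k=1)

Return a k sized list of elements chosen from the population with replacement. If the population is empty, raises IndexError.

If a weights sequence is specified, selections are made according to the relative weights. Alternatively, if a cum_weights sequence is given, the selections are made according to the cumulative weights (perhaps computed using itertools.accumulate()). For example, the relative weights [10, 5, 30, 5] are equivalent to the cumulative weights [10, 15, 45, 50]. Internally, the relative weights are converted to cumulative weights before making selections, so supplying the cumulative weights saves work.

If neither weights nor cum_weights are specified, selections are made with equal probability. If a weights sequence is supplied, it must be the same length as the population sequence. It is a TypeError to specify both weights and cum_weights.

The weights or cum_weights can use any numeric type that interoperates with the float values returned by random() (that includes integers, floats, and fractions but excludes decimals). Weights are assumed to be non-negative and finite. A ValueError is raised if all weights are zero.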
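The equivalence between relative and cumulative weights can be sketched as follows. The population labels and the seed are illustrative; the weights are the ones quoted above.

```python
from itertools import accumulate
from random import Random

population = ['a', 'b', 'c', 'd']          # illustrative population
weights = [10, 5, 30, 5]
cum_weights = list(accumulate(weights))    # [10, 15, 45, 50]

# Same arbitrary seed for both generators so the draws can be compared.
by_weights = Random(5).choices(population, weights=weights, k=8)
by_cum = Random(5).choices(population, cum_weights=cum_weights, k=8)

print(cum_weights)
print(by_weights == by_cum)

# Supplying both keyword arguments is rejected with TypeError.
try:
    Random(5).choices(population, weights=weights, cum_weights=cum_weights)
except TypeError:
    both_rejected = True
else:
    both_rejected = False
print(both_rejected)
```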

For a given seed, the choices() function with equal weighting typically produces a different sequence than repeated calls to choice(). The algorithm used by choices() uses floating point arithmetic for internal consistency and speed. The algorithm used by choice() defaults to integer arithmetic with repeated selections to avoid small biases from round-off error.

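New in version 3.6.

Changed in version 3.9: Raises a ValueError if all weights are zero.

random.shuffle(x)

Shuffle the sequence x in place.

To shuffle an immutable sequence and return a new shuffled list, use sample(x, k=len(x)) instead.

Note that even for small len(x), the total number of permutations of x (the factorial of its length) can quickly grow larger than the period of most random number generators. This implies that most permutations of a long sequence can never be generated. For example, a sequence of length 2080 is the largest that can fit within the period of the Mersenne Twister random number generator.

Changed in version 3.11: Removed the optional parameter random.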
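The figure of 2080 can be reproduced with exact integer arithmetic. The sketch below looks for the largest n whose factorial does not exceed the Mersenne Twister period of 2**19937-1; the variable names are illustrative.

```python
PERIOD = 2 ** 19937 - 1     # period of the Mersenne Twister

# Grow n while the number of permutations, n!, still fits within the period.
n = 1
permutations = 1            # invariant: permutations == n!
while permutations * (n + 1) <= PERIOD:
    n += 1
    permutations *= n

print(n)  # 2080
```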

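random.sample(population, k, *, counts=None)

Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement.

Returns a new list containing elements from the population while leaving the original population unchanged. The resulting list is in selection order so that all sub-slices will also be valid random samples. This allows raffle winners (the sample) to be partitioned into grand prize and second place winners (the subslices).

Members of the population need not be hashable or unique. If the population contains repeats, then each occurrence is a possible selection in the sample.

Repeated elements can be specified one at a time or with the optional keyword-only counts parameter. For example, sample(['red', 'blue'], counts=[4, 2], k=5) is equivalent to sample(['red', 'red', 'red', 'red', 'blue', 'blue'], k=5).

To choose a sample from a range of integers, use a range() object as an argument. This is especially fast and space efficient for sampling from a large population: sample(range(10000000), k=60).

If the sample size is larger than the population size, a ValueError is raised.

Changed in version 3.9: Added the counts parameter.

Changed in version 3.11: The population must be a sequence. Automatic conversion of sets to lists is no longer supported.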
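The following sketch exercises the sample() behaviours described for this function: the counts parameter (Python 3.9+), sampling from a range(), producing a shuffled copy of an immutable sequence, and the ValueError for an oversized sample. The seed and the example data are illustrative.

```python
from random import Random

rng = Random(3)   # arbitrary seed

# counts= expands the population without spelling out every repeat.
marbles = rng.sample(['red', 'blue'], counts=[4, 2], k=5)

# Sampling from a range() avoids materializing ten million integers.
picks = rng.sample(range(10_000_000), k=60)

# A tuple cannot be shuffled in place, but sample() returns a shuffled list.
cards = ('ace', 'two', 'three', 'four')
shuffled = rng.sample(cards, k=len(cards))

# Requesting more items than the population holds raises ValueError.
try:
    rng.sample(cards, k=5)
except ValueError:
    too_large = True
else:
    too_large = False

print(len(marbles), len(set(picks)), sorted(shuffled) == sorted(cards), too_large)
```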

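Real-valued distributions

The following functions generate specific real-valued distributions. Function parameters are named after the corresponding variables in the distribution's equation, as used in common mathematical practice; most of these equations can be found in any statistics text.

random.random(): Return the next random floating point number in the range 0.0 <= X < 1.0

random.uniform(a, b)

Return a random floating point number N such that a <= N <= b for a <= b and b <= N <= a for b < a.

The end-point value b may or may not be included in the range depending on floating-point rounding in the expression a + (b-a) * random().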
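The expression a + (b-a) * random() can be compared against uniform() directly. This sketch assumes two generators with the same arbitrary seed; the bounds are illustrative.

```python
from random import Random

a, b = 2.5, 10.0                 # illustrative bounds

r1 = Random(11)                  # same arbitrary seed for both generators
r2 = Random(11)

via_uniform = r1.uniform(a, b)
via_formula = a + (b - a) * r2.random()

print(via_uniform == via_formula)
print(a <= via_uniform <= b)
```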

random.triangular(low, high, mode): Return a random floating point number N such that low <= N <= high and with the specified mode between those bounds. The low and high bounds default to zero and one. The mode argument defaults to the midpoint between the bounds, giving a symmetric distribution.

random.betavariate(alpha, beta): Beta distribution. Conditions on the parameters are alpha > 0 and beta > 0. Returned values range between 0 and 1.

random.expovariate(lambd): Exponential distribution. lambd is 1.0 divided by the desired mean. (The parameter would be called "lambda", but that is a reserved word in Python.) Returned values range from 0 to positive infinity if lambd is positive, and from negative infinity to 0 if lambd is negative.
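To illustrate that lambd is the reciprocal of the desired mean, the sketch below draws many samples and compares their average with the target. The seed, target mean, and sample size are arbitrary.

```python
from random import Random
from statistics import fmean

rng = Random(42)                  # arbitrary seed
desired_mean = 5.0                # illustrative target
lambd = 1.0 / desired_mean

samples = [rng.expovariate(lambd) for _ in range(100_000)]
sample_mean = fmean(samples)

print(round(sample_mean, 1))      # close to 5.0; exact digits depend on the seed
print(min(samples) >= 0.0)        # positive lambd gives non-negative values
```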

random.gammavariate(alpha, beta)

Gamma distribution. (Not the gamma function!) The shape and scale parameters, alpha and beta, must have positive values. (Calling conventions vary and some sources define 'beta' as the inverse of the scale).

The probability distribution function is:

          x ** (alpha - 1) * math.exp(-x / beta)
pdf(x) =  --------------------------------------
            math.gamma(alpha) * beta ** alpha
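The density above translates directly into Python. In this sketch, gamma_pdf is a hypothetical helper name, and the parameters and seed are illustrative. It also checks empirically that samples from gammavariate() average close to alpha * beta, the mean of a gamma distribution with shape alpha and scale beta.

```python
import math
from random import Random
from statistics import fmean

def gamma_pdf(x, alpha, beta):
    """Density of the gamma distribution with shape alpha and scale beta."""
    return (x ** (alpha - 1) * math.exp(-x / beta)) / (math.gamma(alpha) * beta ** alpha)

# With alpha == 1 the density reduces to an exponential with mean beta.
print(math.isclose(gamma_pdf(2.0, 1.0, 4.0), math.exp(-0.5) / 4.0))

rng = Random(8)                   # arbitrary seed
alpha, beta = 3.0, 2.0            # illustrative parameters
samples = [rng.gammavariate(alpha, beta) for _ in range(100_000)]
sample_mean = fmean(samples)
print(abs(sample_mean - alpha * beta) < 0.1)
```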

random.gauss(mu=0.0, sigma=1.0)

Normal distribution, also called the Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly faster than the normalvariate() function defined below.

Multithreading note: When two threads call this function simultaneously, it is possible that they will receive the same return value. This can be avoided in three ways. 1) Have each thread use a different instance of the random number generator. 2) Put locks around all calls. 3) Use the slower, but thread-safe normalvariate() function instead.

Changed in version 3.11: mu and sigma now have default arguments.
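The first option in the multithreading note, one generator instance per thread, might look like the following sketch. The helper names, thread labels, and seeds are illustrative assumptions.

```python
import threading
from random import Random

results = {}

def draws(seed, count=5):
    """Return `count` Gaussian draws from a private generator."""
    rng = Random(seed)            # private instance: no shared gauss() state
    return [rng.gauss(0.0, 1.0) for _ in range(count)]

def worker(name, seed):
    results[name] = draws(seed)

# Thread labels and seeds are illustrative.
threads = [threading.Thread(target=worker, args=(f't{i}', i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread's draws match a single-threaded run with the same seed.
all_match = all(results[f't{i}'] == draws(i) for i in range(4))
print(all_match)
```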

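random.lognormvariate(mu, sigma): Log normal distribution. If you take the natural logarithm of this distribution, you'll get a normal distribution with mean mu and standard deviation sigma. mu can have any value, and sigma must be greater than zero.

random.normalvariate(mu=0.0, sigma=1.0): Normal distribution. mu is the mean, and sigma is the standard deviation.

Changed in version 3.11: mu and sigma now have default arguments.

random.vonmisesvariate(mu, kappa): mu is the mean angle, expressed in radians between 0 and 2*pi, and kappa is the concentration parameter, which must be greater than or equal to zero. If kappa is equal to zero, this distribution reduces to a uniform random angle over the range 0 to 2*pi.

random.paretovariate(alpha): Pareto distribution. alpha is the shape parameter.

random.weibullvariate(alpha, beta): Weibull distribution. alpha is the scale parameter and beta is the shape parameter.

Alternative Generator

class random.Random([seed])

Class that implements the default pseudo-random number generator used by the random module.

Changed in version 3.11: Formerly the seed could be any hashable object. Now it is limited to: None, int, float, str, bytes, or bytearray.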

Subclasses of Random should override the following methods if they wish to make use of a different basic generator:

seed(a=None, version=2): Override this method in subclasses to customise the seed() behaviour of Random instances.

getstate(): Override this method in subclasses to customise the getstate() behaviour of Random instances.

setstate(state): Override this method in subclasses to customise the setstate() behaviour of Random instances.

random(): Override this method in subclasses to customise the random() behaviour of Random instances.

Optionally, a custom generator subclass can also supply the following method:

getrandbits(k): Override this method in subclasses to customise the getrandbits() behaviour of Random instances.
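As a rough sketch of such a subclass, the toy class below overrides seed(), getstate(), setstate(), and random() with a 64-bit linear congruential generator. The class name, constants, and seed handling are assumptions made for illustration, and an LCG is statistically much weaker than the default generator.

```python
from random import Random

class LCGRandom(Random):
    """Toy subclass driven by a 64-bit linear congruential generator.

    Illustrative only: the name and constants are assumptions for this sketch.
    """

    _MULTIPLIER = 6364136223846793005
    _INCREMENT = 1442695040888963407
    _MASK = (1 << 64) - 1

    def seed(self, a=None, version=2):
        # Simplification: only None and int seeds are handled here.
        self._state = (0 if a is None else int(a)) & self._MASK

    def getstate(self):
        return self._state

    def setstate(self, state):
        self._state = state

    def random(self):
        # Advance the LCG and keep the top 53 bits as a float in [0.0, 1.0).
        self._state = (self._MULTIPLIER * self._state + self._INCREMENT) & self._MASK
        return (self._state >> 11) / (1 << 53)

rng = LCGRandom(123)
saved = rng.getstate()
first = [rng.random() for _ in range(3)]
rng.setstate(saved)
restored = [rng.random() for _ in range(3)]
print(first == restored)

# Inherited methods are built on the overridden random().
roll = rng.randrange(10)
letter = rng.choice('abc')
print(roll in range(10), letter in 'abc')
```

Because getrandbits() is not overridden here, the inherited integer methods fall back to an algorithm based on random().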

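class random.SystemRandom([seed]): Class that uses the os.urandom() function for generating random numbers from sources provided by the operating system. Not available on all systems. Does not rely on software state, and sequences are not reproducible. Accordingly, the seed() method has no effect and is ignored. The getstate() and setstate() methods raise NotImplementedError if called.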
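A small sketch of the SystemRandom behaviour described above, assuming a platform where os.urandom() is available. The seed value passed in is arbitrary, since it is ignored.

```python
from random import SystemRandom

sys_rng = SystemRandom()

sys_rng.seed(12345)               # accepted but ignored
value = sys_rng.random()
print(0.0 <= value < 1.0)

# State cannot be captured because there is no software state to save.
try:
    sys_rng.getstate()
except NotImplementedError:
    state_unavailable = True
else:
    state_unavailable = False
print(state_unavailable)
```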

Notes on Reproducibility

Sometimes it is useful to be able to reproduce the sequences given by a pseudo-random number generator. By re-using a seed value, the same sequence should be reproducible from run to run as long as multiple threads are not running.

Most of the random module's algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change:

If a new seeding method is added, then a backward compatible seeder will be offered.
The generator's random() method will continue to produce the same sequence when the compatible seeder is given the same seed.
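Re-using a seed can be sketched as follows. The helper name run and the seed values are illustrative, and the sketch assumes no other thread is drawing from the module-level generator at the same time.

```python
import random

def run(seed):
    """Return a few draws after seeding the module-level generator."""
    random.seed(seed)
    return [random.random() for _ in range(3)]

first = run(12345)
second = run(12345)
print(first == second)   # True: same seed, same random() sequence
```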

Examples

Basic examples:

>>> from random import *
>>> random()                          # Random float:  0.0 <= x < 1.0
0.37444887175646646

>>> uniform(2.5, 10.0)                # Random float:  2.5 <= x <= 10.0
3.1800146073117523

>>> expovariate(1 / 5)                # Interval between arrivals averaging 5 seconds
5.148957571865031

>>> randrange(10)                     # Integer from 0 to 9 inclusive
7

>>> randrange(0, 101, 2)              # Even integer from 0 to 100 inclusive
26

>>> choice(['win', 'lose', 'draw'])   # Single random element from a sequence
'draw'

>>> deck = 'ace two three four'.split()
>>> shuffle(deck)                     # Shuffle a list
>>> deck
['four', 'two', 'ace', 'three']

>>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement
[40, 10, 50, 30]

Simulations:

>>> # Six roulette wheel spins (weighted sampling with replacement)
>>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
['red', 'green', 'black', 'black', 'red', 'black']

>>> # Deal 20 cards without replacement from a deck
>>> # of 52 playing cards, and determine the proportion of cards
>>> # with a ten-value:  ten, jack, queen, or king.
>>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20)
>>> dealt.count('tens') / 20
0.15

>>> # Estimate the probability of getting 5 or more heads from 7 spins
>>> # of a biased coin that settles on heads 60% of the time.
>>> def trial():
...     return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5
...
>>> sum(trial() for i in range(10_000)) / 10_000
0.4169

>>> # Probability of the median of 5 samples being in middle two quartiles
>>> def trial():
...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
...
>>> sum(trial() for i in range(10_000)) / 10_000
0.7958

Example of statistical bootstrapping using resampling with replacement to estimate a confidence interval for the mean of a sample:

# https://www.thoughtco.com/example-of-bootstrapping-3126155
from statistics import fmean as mean
from random import choices

data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95]
means = sorted(mean(choices(data, k=len(data))) for i in range(100))
print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
      f'interval from {means[5]:.1f} to {means[94]:.1f}')

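Example of a resampling permutation test to determine the statistical significance or p-value of an observed difference between the effects of a drug versus a placebo: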

# Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
from statistics import fmean as mean
from random import shuffle

drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
observed_diff = mean(drug) - mean(placebo)

n = 10_000
count = 0
combined = drug + placebo
for i in range(n):
    shuffle(combined)
    new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
    count += (new_diff >= observed_diff)

print(f'{n} label reshufflings produced only {count} instances with a difference')
print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
print(f'hypothesis that there is no difference between the drug and the placebo.')

Simulation of arrival times and service deliveries for a multiserver queue:

from heapq import heapify, heapreplace
from random import expovariate, gauss
from statistics import mean, quantiles

average_arrival_interval = 5.6
average_service_time = 15.0
stdev_service_time = 3.5
num_servers = 3

waits = []
arrival_time = 0.0
servers = [0.0] * num_servers  # time when each server becomes available
heapify(servers)
for i in range(1_000_000):
    arrival_time += expovariate(1.0 / average_arrival_interval)
    next_server_available = servers[0]
    wait = max(0.0, next_server_available - arrival_time)
    waits.append(wait)
    service_duration = max(0.0, gauss(average_service_time, stdev_service_time))
    service_completed = arrival_time + wait + service_duration
    heapreplace(servers, service_completed)

print(f'Mean wait: {mean(waits):.1f}   Max wait: {max(waits):.1f}')
print('Quartiles:', [round(q, 1) for q in quantiles(waits)])

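See also

Statistics for Hackers, a video tutorial by Jake Vanderplas on statistical analysis using just a few fundamental concepts including simulation, sampling, shuffling, and cross-validation.

Economics Simulation, a simulation of a marketplace by Peter Norvig that shows effective use of many of the tools and distributions provided by this module (gauss, uniform, sample, betavariate, choice, triangular, and randrange).

A Concrete Introduction to Probability (using Python), a tutorial by Peter Norvig covering the basics of probability theory, how to write simulations, and how to perform data analysis using Python.

Recipes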

These recipes show how to efficiently make random selections from the combinatoric iterators in the itertools module:

import random

def random_product(*args, repeat=1):
    "Random selection from itertools.product(*args, repeat=repeat)"
    pools = [tuple(pool) for pool in args] * repeat
    return tuple(map(random.choice, pools))

def random_permutation(iterable, r=None):
    "Random selection from itertools.permutations(iterable, r)"
    pool = tuple(iterable)
    r = len(pool) if r is None else r
    return tuple(random.sample(pool, r))

def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)

def random_combination_with_replacement(iterable, r):
    "Choose r elements with replacement.  Order the result to match the iterable."
    # Result will be in set(itertools.combinations_with_replacement(iterable, r)).
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.choices(range(n), k=r))
    return tuple(pool[i] for i in indices)

The default random() returns multiples of 2⁻⁵³ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced and are exactly representable as Python floats. However, many other representable floats in that interval are not possible selections. For example, 0.05954861408025609 isn't an integer multiple of 2⁻⁵³.

The following recipe takes a different approach. All floats in the interval are possible selections. The mantissa comes from a uniform distribution of integers in the range 2⁵² ≤ mantissa < 2⁵³. The exponent comes from a geometric distribution where exponents smaller than -53 occur half as often as the next larger exponent.

from random import Random
from math import ldexp

class FullRandom(Random):

    def random(self):
        mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
        exponent = -53
        x = 0
        while not x:
            x = self.getrandbits(32)
            exponent += x.bit_length() - 32
        return ldexp(mantissa, exponent)

All real valued distributions in the class will use the new method:

>>> fr = FullRandom()
>>> fr.random()
0.05954861408025609
>>> fr.expovariate(0.25)
8.87925541791544

The recipe is conceptually equivalent to an algorithm that chooses from all the multiples of 2⁻¹⁰⁷⁴ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced, but most have to be rounded down to the nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest positive unnormalized float and is equal to math.ulp(0.0).)

See also

Generating Pseudo-random Floating-Point Values a paper by Allen B. Downey describing ways to generate more fine-grained floats than normally generated by random().
