Using Random Numbers in Numpy

Learning Objectives

  • Learn to use the numpy.random module to generate random numbers.

  • Understand what pseudo-random number generators are and the importance of setting the seed.

Sampling Random Numbers in Numpy

The numpy library has a submodule called numpy.random that provides a suite of functions for generating random(ish) numbers. This suite contains functions for generating random numbers from a base distribution, such as

  • The uniform distribution between 0 and 1,

import numpy as np

np.random.rand(10) # 10 random numbers drawn uniformly from [0, 1)
np.random.rand(2, 3) # 2x3 array of random numbers drawn uniformly from [0, 1)

  • The standard normal distribution,

np.random.randn(10) # 10 random numbers from a normal (Gaussian) distribution with mean 0 and standard deviation 1
np.random.randn(2, 3) # 2x3 array of random numbers from the same standard normal distribution

  • Random integers drawn from a specified range,

np.random.randint(1, 7, 10) # 10 random integers from 1 to 6 (the upper bound, 7, is excluded)
np.random.randint(-2, 3, (5, 2)) # 5x2 array of random integers from -2 to 2 (the upper bound, 3, is excluded)
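As a quick sanity check on the three base samplers above, the following sketch draws from each one and inspects the shape and range of the result. The variable names (u, z, rolls) are illustrative only and do not come from the course material.

```python
import numpy as np

# Uniform samples: the positional arguments give the output shape.
u = np.random.rand(2, 3)
print(u.shape)  # (2, 3)
print((u >= 0).all() and (u < 1).all())  # True: values lie in [0, 1)

# Standard normal samples are unbounded and can be negative.
z = np.random.randn(1000)
print(z.mean(), z.std())  # close to 0 and 1, but not exactly

# randint includes the lower bound and excludes the upper bound,
# so this simulates 1000 rolls of a six-sided die.
rolls = np.random.randint(1, 7, 1000)
print(rolls.min(), rolls.max())  # never below 1 or above 6
```

Because no seed is set here, the printed means change from run to run; only the shapes and bounds are guaranteed.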

Numpy also has more bespoke functions for generating random numbers from specific distributions, such as the binomial, Poisson, and exponential distributions. For example:

  • To generate 10 random numbers from a binomial distribution with 10 trials and a success probability of 0.5, we can use

np.random.binomial(10, 0.5, 10)

  • To generate 4 random numbers from a Poisson distribution with a rate of 2, we can use

np.random.poisson(2, 4)

  • To generate a 4x7 grid of random numbers from a Gaussian distribution with a mean of 2 and a standard deviation of 0.5, we can use

np.random.normal(2, 0.5, (4, 7))
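One way to convince yourself that these functions sample from the stated distributions is to draw many samples and compare the sample averages with the theoretical values. The sketch below does this for the three calls above; the sample size n is an arbitrary choice made for illustration.

```python
import numpy as np

n = 100_000  # number of samples; large so the averages settle down

b = np.random.binomial(10, 0.5, n)  # theoretical mean: trials * probability = 5
p = np.random.poisson(2, n)         # theoretical mean: rate = 2
g = np.random.normal(2, 0.5, n)     # theoretical mean 2, standard deviation 0.5

print(b.mean())           # close to 5
print(p.mean())           # close to 2
print(g.mean(), g.std())  # close to 2 and 0.5
```

Note that the second argument of np.random.normal is the standard deviation, not the variance.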

Numpy also has utilities for shuffling arrays, generating random permutations, and selecting random elements from an array. For example:

  • To shuffle an array in place, we can use

x = np.array([1, 2, 3, 4, 5])
np.random.shuffle(x)

We can also use np.random.permutation(x) to return a shuffled copy of the array.

  • To select a random element from an array, we can use

x = np.array([1, 2, 3, 4, 5])
np.random.choice(x)

The np.random.choice function also allows us to select multiple elements from an array, with or without replacement.

np.random.choice(x, 3, replace=False) # select 3 elements without replacement
np.random.choice(x, 3, replace=True) # select 3 elements with replacement
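The difference between shuffling in place and returning a shuffled copy, and between sampling with and without replacement, is easy to miss. The following sketch makes both visible; the names y, result, and picked are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# permutation returns a shuffled copy and leaves x untouched.
y = np.random.permutation(x)
print(x)  # still [1 2 3 4 5]

# shuffle reorders x itself and returns None.
result = np.random.shuffle(x)
print(result)  # None

# Without replacement, every selected element is distinct.
picked = np.random.choice(x, 3, replace=False)
print(len(set(picked)))  # 3

# Asking for more distinct elements than the array holds is an error.
try:
    np.random.choice(x, 6, replace=False)
except ValueError as err:
    print("ValueError:", err)
```

A common mistake is writing x = np.random.shuffle(x), which replaces the array with None.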

For further information on the numpy.random module, see the Numpy documentation.

Pseudo-Random Number Generators and Setting the Seed

An enterprising student might ask, “How can a computer, a deterministic machine, generate random numbers?” The answer is that, in general, it can’t. Instead, computers use pseudo-random number generators (PRNGs) to generate sequences of numbers that appear random.

A very, very, very simple PRNG

To demonstrate the concept of a PRNG, we give a very basic example: the “linear congruential generator” (LCG), which is defined by the recurrence relation

\[ X_{n+1} = (a X_n + c) \bmod m, \]

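where \(X_0\) is the “seed” of the PRNG, and \(a\), \(c\), and \(m\) are constants that define the PRNG. The sequence of numbers generated by this PRNG is deterministic and periodic: each value depends only on the previous one, and there are only \(m\) possible values, so the sequence must eventually repeat and the period is at most \(m\). The period can be very long if \(a\), \(c\), and \(m\) are chosen well. In practice, \(m\) is usually a large prime or a power of 2, and \(a\) and \(c\) are picked to make the period as long as possible. Consequently, for most purposes, the sequence appears random.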
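The recurrence can be sketched in a few lines of plain Python. The constants below (a = 5, c = 3, m = 16) are toy values chosen here purely for illustration so that the whole period fits on one line; they are not taken from the course material and are far too small for real use. The helper names lcg and take are likewise hypothetical.

```python
def lcg(seed, a=5, c=3, m=16):
    """Yield X_1, X_2, ... from X_{n+1} = (a * X_n + c) mod m, with X_0 = seed."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def take(generator, n):
    """Collect the first n values from a generator into a list."""
    return [next(generator) for _ in range(n)]


first = take(lcg(seed=0), 16)
print(first)  # [3, 2, 13, 4, 7, 6, 1, 8, 11, 10, 5, 12, 15, 14, 9, 0]

# After m = 16 steps the generator is back at the seed, so the values repeat.
print(take(lcg(seed=0), 32)[16:] == first)  # True
```

With these constants every value from 0 to 15 appears exactly once before the sequence repeats, which is the longest period possible for m = 16.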

Setting different seeds gives us different patterns of “random” numbers. However, using the same seed gives us the same sequence of “random” numbers. This is important for debugging and reproducibility: you might have seen the line

np.random.seed(0)

in the homework tests. This line sets the seed of the PRNG to 0, ensuring the test runs the same way every time.
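The effect of the seed can be checked directly. The sketch below reseeds the global generator and compares the draws. The last three lines use np.random.default_rng, a newer NumPy interface that the text above does not cover; it is shown only as an alternative way to get the same reproducibility with a local generator object.

```python
import numpy as np

np.random.seed(0)
first = np.random.rand(3)

np.random.seed(0)  # reset to the same seed
second = np.random.rand(3)

np.random.seed(1)  # a different seed
third = np.random.rand(3)

print(np.array_equal(first, second))  # True: same seed, same numbers
print(np.array_equal(first, third))   # False: different seed, different numbers

# Alternative (not used in the text above): a local generator with its own seed.
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)
print(np.array_equal(rng_a.random(3), rng_b.random(3)))  # True
```

Note that np.random.seed changes global state, so every later call to a np.random function in the same program is affected by it.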

A note: objectively, the linear congruential generator is not the best PRNG. Modern PRNGs, such as the Mersenne Twister, are much more sophisticated and have better statistical properties. However, the linear congruential generator is simple and easy to understand, making it a good starting point for understanding PRNGs.