# Animated Visualization of Random Walks in Python

The random processes are fundamental in a plethora of engineering and science domains. They are used to model complex systems like weather forecasting, which are highly unpredictable.

In this **blog post**, we will introduce the simplest random process, i.e. the
*simple random walk*,
which is the basis and foundation for many random processes. We will simulate a random walk using the *Python*
numerical libraries like *NumPy*. Moreover, we will make an animated visualization using the *Matplotlib* Animation
API in order to emphasize the random nature of this phenomenon. This is an important part because we
convey our message more easily.
So let's begin and **stay tuned**!

## Intuition

We are not in control of everything and we can't predict all outcomes although we are used thinking that way.
For example, how could we know the exact number of people in our favorite book store exactly at 12:36? We never
know since this number is random and can't be calculated based on all observations we have. It is not possible
to fit a magical deterministic formula to give us the answer. Instead, we can think of rough estimates and let the
exact number of people be pure mere of luck. Thus, the number of visitors in the book store from minute to minute
is a **random process**.

The random processes are used to model plenty of phenomena in the exact sciences and engineering. The best possible
way to start thinking of them is to play a game with
coin tossing.
Every time we flip the coin if the outcome is *head* we win 1 dollar and of the outcome is *tail* we
lose 1 dollar. Can we know how much dollars we would win or lose after 10 consecutive coin flips? Certainly no,
but what we can know is how probable some outcome is after 10 steps. This leads us to the definition of a random walk.

## Random Walk Definition

Let \( \{\epsilon_{k}\}_{k \in \mathbb{N}} \) be a sequence of independent and identically distributed (i.i.d)
discrete random variables such that \( \mathbb{P}\{ \epsilon_{k} = \pm 1 \} = {1 \over 2} \). We can think of
this in terms of the coin-tossing we had before. With a probability of \( 0.5 \), it can happen to be a *head*,
thus winning 1 dollar, and with a probability of \( 0.5 \) it can be a *tail* thus losing 1 dollar.

Then, a *simple random walk* \( \{X_{n}\}_{n \in \mathbb{N}} \) is defined as
\[ X_{n} = {\sum_{k=1}^{n} \epsilon_{k}} \]
We can think of this random variable like the cumulative sum of dollars we would have at time step \( n \). After flipping the coin
10 times, this sum might range from \( -10 \) to \( +10 \), or in general from \( -N \) to \( +N \).
Because \( N \) might be a large number we can normalize this range to be most of the time between \( -1 \) and
\( +1 \). We can construct such a thing using the
Central Limit Theorem:
\[ {X_{N} \over \sqrt{N}} \longrightarrow {\mathcal{N}(0, 1)} \]
where \( \mathcal{N} \) denotes a
Gaussian Random Variable
with given mean and variance, in this case, \( 0 \) and \( 1 \). There is a
proof showing
that the given identity always holds.

Once we know the definition of a *simple random walk*, we can implement a simulation in Python and make a
visualization of the possible outcomes.

## Random Walk in Python

By using the *NumPy* utilities we can easily simulate a *simple random walk*. Given the number of steps \( N \)
as an input argument, we can randomly generate \( N \) samples from the set \( \{ +1, -1\} \) with an equal probability
of \( 0.5 \). Then, we will only use the *cumsum* function, to give us the cumulative sum in every time step.
The Python code is given below:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | import numpy as np np.random.seed(1234) def random_walk(N): """ Simulates a discrete random walk :param int N : the number of steps to take """ # event space: set of possible increments increments = np.array([1, -1]) # the probability to generate 1 p=0.5 # the epsilon values random_increments = np.random.choice(increments, N, p) # calculate the random walk random_walk = np.cumsum(random_increments) # return the entire walk and the increments return random_walk, random_increments # generate a random walk N = 500 X, epsilon = random_walk(N) # normalize the random walk using the Central Limit Theorem X = X * np.sqrt(1./N) |

### Animating the Random Walk

*Matplotlib* library provides an
animation API that
offers a different perspective of the plotting in Python. Instead of serving static charts, we can use an eye-catching
animation to emphasize and more efficiently transfer the message we would like to send. This comes quite handy and the
effect is furthermore amplified for time-series data such as the *simple random walk*

Using this animation API we can animate one realization of a *simple random walk* as shown in the code snippet
below:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | import matplotlib.pyplot as plt import matplotlib.animation as animation plt.style.use('seaborn') #general figure options fig = plt.figure(figsize=(15, 7)) ax = plt.axes(xlim=(0, N), ylim=(np.min(X) - 0.5, np.max(X) + 0.5)) line, = ax.plot([], [], lw=2) ax.set_title('2D Random Walk', fontsize=18) ax.set_xlabel('Steps', fontsize=16) ax.set_ylabel('Value', fontsize=16) ax.tick_params(labelsize=12) # initialization function def init(): # creating an empty plot/frame line.set_data([], []) return line, # lists to store x and y axis points xdata, ydata = [], [] # animation function def animate(i): # x, y values to be plotted y = X[i] # appending new points to x, y axes points list xdata.append(i) ydata.append(y) line.set_data(xdata, ydata) return line, # call the animator anim = animation.FuncAnimation(fig, animate, init_func=init, frames=N, interval=20, blit=True) # save the animation as mp4 video file anim.save('random_walk.gif',writer='imagemagick') |

The animation function needs three essential inputs: *i)* an instance of the figure to show the animation,
*ii)* data initialization function and *iii)* a function that defines the animation logic. Then we
provide the number of frames to draw (how many times to iterate over the data) and the delay in milliseconds
between the consecutive frames. One final option we could use is the *bit blit*, i.e. to redraw only those parts
that have changed between two consecutive frames, thus leading to faster animation. The result we get is shown in
the animation below:

The full source code behind this random walk simulation and animation can be found in this GitHub repository. For more information, please follow me on Twitter.

If you liked what you just saw, it would be really helpful to subscribe to the mailing list below. You will not get spammed that's a promise! You will get updates for the newest blog posts and visualizations from time to time.

## Conclusion

In this short tutorial blog post, we simulated and animated a simple random walk using Python's Matplotlib library. This random process, although very simple is quite important and opens the way towards the other applied random processes.

This list includes but is not limited to the *Brownian Motion* and its variants like the
*Geometric Brownian Motion*, used to model systems with random exponential growth. Moreover, it is a foundation
for the *Stochastic Differential Equations* and *Stochastic Integration*. We will cover all of these
topics in the next blog posts.

## Appendix: Central Limit Theorem Numerical Simulation

In this tutorial, we claimed that the normalized random walk follows a Gaussian distribution with mean 0 and variance 1, for which there is a strong mathematical proof. Although this is a universal truth, we can still make a numerical simulation, as shown in the code snippet below:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | num_simulations = 10000 # num of simulation N = 100000 # number od steps in each simulation dt = 1./N # the time step X_norm = [0] * num_simulations # the normalized random variable # run the simulations for i in range(num_simulations): X, _ = random_walk(N) X_norm[i] = X[N - 1] * np.sqrt(dt) print('The mean is: {0}'.format(np.mean(X_norm))) print('The var is: {0}'.format(np.var(X_norm))) plt.figure(figsize=(15, 7)) plt.grid(True, which='major', linestyle='--', color='black', alpha=0.4) plt.title('Probability distribution of X_n', fontsize=18) plt.xlabel('Value for one realization of X_n', fontsize=18) plt.ylabel('Probability Density', fontsize=22) plt.hist(X_norm, 100, density=True, color='#0492C2') |

The resulting plot of this simulation is depicted in the image below:

## Leave a comment