Focus on what instead of how to plot your data
In many situations we don't have time to think about how to plot our data. To make a quick data visualization we can use the Altair Python library. Altair focuses on what to visualize instead of how to visualize it.
What
Altair is a Python library built on top of Vega-Lite. Vega-Lite is a lighter, higher-level language built on Vega. It is a visualization grammar: a declarative language for describing visualizations. We write these declarations in JSON format.
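To give a feel for what such a JSON declaration looks like, here is a minimal hand-written sketch of a Vega-Lite specification built as a plain Python dictionary. The three data rows and the column names are illustrative assumptions, not output produced by Altair; the point is only that the spec names the data, the mark and the encodings, and says nothing about drawing steps.

```python
import json

# A hand-written Vega-Lite specification as a Python dict.
# The rows below are illustrative flower measurements (an assumption
# made for this sketch), not data loaded from any library.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [
            {"sepalLength": 5.1, "sepalWidth": 3.5, "species": "setosa"},
            {"sepalLength": 7.0, "sepalWidth": 3.2, "species": "versicolor"},
            {"sepalLength": 6.3, "sepalWidth": 3.3, "species": "virginica"},
        ]
    },
    # What kind of mark to draw ...
    "mark": "point",
    # ... and which column maps to which visual channel.
    "encoding": {
        "x": {"field": "sepalLength", "type": "quantitative"},
        "y": {"field": "sepalWidth", "type": "quantitative"},
        "color": {"field": "species", "type": "nominal"},
    },
}

# Serialize the declaration to JSON text.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Altair's job is essentially to build a dictionary of this shape from Python method calls, so we rarely need to write the JSON by hand.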
Why
With high-level visualization grammars we can spend more time understanding the data. Altair is well aligned with this paradigm.
How
Using Altair for Python is quite simple. The most common pattern is to chain the following calls:
Chart(pd.DataFrame): create an instance of the Chart object using a loaded Pandas dataframe.
mark_*: with the mark_* methods we specify how to plot the data in the Pandas dataframe. We can pass various arguments to the mark_* methods.
encode: we define how to map the columns of the loaded Pandas dataframe to visual channels such as x, y and color.
With very little effort, using these three calls we can visualize the Iris dataset. The Python code is given below:
import altair as alt
from vega_datasets import data

# Load the Iris dataset as a Pandas dataframe
iris = data.iris()

# Chart -> mark -> encode, then make the result interactive
alt.Chart(iris).mark_circle(size=60).encode(
    x='sepalLength',
    y='sepalWidth',
    color='species',
    tooltip=['sepalLength', 'sepalWidth', 'petalLength', 'petalWidth']
).interactive()
With the call to interactive() we make the plot interactive. The resulting interactive plot is shown below:
We can also apply aggregate encodings to the data. For instance, we can split the data into bins and apply an average aggregation.
An example in Python is given below:
stocks = data.stocks()

# Drop GOOG with Pandas, then plot the yearly average price per symbol
alt.Chart(stocks[stocks["symbol"] != "GOOG"]).mark_line().encode(
    x=alt.X("year(date):T", title="Year", bin=True),
    y="average(price)",
    color="symbol",
)
We make a line chart using mark_line(). We split the data into bins by year: with year(date):T we select the year from the date column and mark the data as temporal.
The resulting plot is depicted below:
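As a rough, library-free sketch of what the year binning and the average(price) aggregation compute, the snippet below groups a handful of rows by symbol and year and averages the prices in each group. The rows and prices are made up for illustration (an assumption of this sketch, not the real stocks data), and this is not how Altair or Vega-Lite implement the aggregation internally.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical (symbol, date, price) rows; the prices are made up.
rows = [
    ("AAPL", date(2005, 1, 1), 10.0),
    ("AAPL", date(2005, 6, 1), 14.0),
    ("AAPL", date(2006, 1, 1), 20.0),
    ("MSFT", date(2005, 1, 1), 30.0),
    ("MSFT", date(2006, 3, 1), 26.0),
    ("MSFT", date(2006, 9, 1), 28.0),
]

# Group prices by (symbol, year). The year plays the role of year(date),
# the symbol plays the role of the color channel.
groups = defaultdict(list)
for symbol, day, price in rows:
    groups[(symbol, day.year)].append(price)

# One averaged value per symbol and year, analogous to average(price).
averages = {key: mean(prices) for key, prices in groups.items()}

for (symbol, year), avg in sorted(averages.items()):
    print(symbol, year, avg)
```

Each printed line corresponds to one point on one of the colored lines in a chart of this kind.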
Altair also supports numerous data transformations. They are all summarized here.
The source code for this work can be found in this Jupyter Notebook. If this is something you like and you would like to see similar content, you could follow me on LinkedIn or Twitter.