Focus on what instead of how to plot your data
In many situations we don’t have time to think about how to plot our data. To make a quick data visualization we can use the Altair Python library. Altair focuses on what to visualize instead of how to visualize it.
Altair is a Python library built on top of Vega-Lite. Vega-Lite is a lighter, higher-level version of Vega. It is a visualization grammar: a declarative language for describing visualizations. We write these declarations in JSON format.
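To make the idea of a JSON declaration concrete, here is a minimal sketch of a Vega-Lite specification written by hand as a Python dictionary and serialized with the standard json module. The records and the field names height and weight are invented for illustration; only the top-level keys (data, mark, encoding) come from the Vega-Lite grammar itself.

```python
import json

# A hand-written Vega-Lite specification. It states WHAT to show
# (which fields map to which channels), not HOW to draw it.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [  # made-up sample records
            {"height": 1.62, "weight": 58},
            {"height": 1.75, "weight": 72},
            {"height": 1.80, "weight": 81},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "height", "type": "quantitative"},
        "y": {"field": "weight", "type": "quantitative"},
    },
}

# Serialize the declaration to JSON text.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Altair saves us from writing such dictionaries by hand: its Python calls generate a declaration of this shape for us.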
With high-level visualization grammars we can spend more time understanding the data. Altair is well aligned with this paradigm.
Using Altair for Python is quite simple. The most common pattern is to chain the following functions:
Chart(pd.DataFrame): create an instance of the Chart object using a loaded Pandas dataframe.
mark_*: with the mark_* methods we specify how to plot the data in the Pandas dataframe. We can pass various arguments to these methods, for example size.
encode: we define how to map the columns of the loaded Pandas dataframe to visual channels such as x, y, and color.
With these three calls we can visualize the Iris dataset with little effort. The Python code is given below:
```python
import altair as alt
from vega_datasets import data

# Load the Iris dataset as a Pandas dataframe
iris = data.iris()

alt.Chart(iris).mark_circle(size=60).encode(
    x='sepalLength',
    y='sepalWidth',
    color='species',
    tooltip=['sepalLength', 'sepalWidth', 'petalLength', 'petalWidth']
).interactive()
```
The call to interactive() makes the plot interactive, so we can pan and zoom it. The resulting interactive plot is shown below:
We can also apply aggregate encodings on the data. For instance, we can split the data into bins and apply an aggregate function, such as the average, to each bin. An example in Python is given below:
```python
stocks = data.stocks()
alt.Chart(stocks[stocks["symbol"] != "GOOG"]).mark_line().encode(
    x=alt.X("year(date):T", title="Year", bin=True),
    y="average(price)",
    color="symbol",
)
```
We make a line chart using mark_line(). We split the data into bins by year. We select the year and make the data of temporal type with year(date):T, and average(price) computes the mean price within each bin.
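To see roughly what this aggregation computes, here is a plain-Python sketch that groups prices by symbol and year and averages each group. The records are invented stand-ins for the stocks data, and the grouping is an assumption about the effect of the encoding, not Altair's actual implementation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Made-up (symbol, date, price) records standing in for the stocks data.
records = [
    ("AAPL", date(2009, 1, 1), 90.0),
    ("AAPL", date(2009, 6, 1), 110.0),
    ("AAPL", date(2010, 1, 1), 200.0),
    ("IBM", date(2009, 1, 1), 80.0),
    ("IBM", date(2010, 3, 1), 120.0),
    ("IBM", date(2010, 9, 1), 130.0),
]

# Group prices by (symbol, year): one group per line color and x position.
groups = defaultdict(list)
for symbol, day, price in records:
    groups[(symbol, day.year)].append(price)

# One averaged value per group, mirroring y="average(price)".
yearly_average = {key: mean(prices) for key, prices in groups.items()}

for (symbol, year), avg in sorted(yearly_average.items()):
    print(symbol, year, avg)
```

This is only an illustration of the arithmetic; with Altair we declare the aggregate in the encoding and the computation happens when the chart is rendered.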
The resulting plot is depicted below:
Altair also supports numerous transformations of the data. They are all summarized here.
The source code for this work can be found in this Jupyter Notebook. If this is something you like and would like to see similar content you could follow me on LinkedIn or Twitter.