This piece has been updated and amended by our Technical Content Writer, Chioma Dunkley.
Scroll through the Python Package Index and you'll find libraries for practically every data visualization need—from GazeParser for eye movement research to pastalog for realtime visualizations of neural network training. And while many of these libraries are intensely focused on accomplishing a specific task, some can be used no matter what your field.
This list is an overview of 12 interdisciplinary Python data visualization libraries, from the well-known to the obscure. Mode Python Notebooks support five libraries on this list - matplotlib, Seaborn, Plotly, pygal, and Folium - and more than 60 others that you can explore on our Notebook support page.
Two histograms (matplotlib)
matplotlib is the O.G. of Python data visualization libraries. Despite being over a decade old, it's still the most widely used library for plotting in the Python community. It was designed to closely resemble MATLAB, a proprietary programming language developed in the 1980s.
Because matplotlib was the first Python data visualization library, many other libraries are built on top of it or designed to work in tandem with it during analysis. Some libraries like pandas and Seaborn are “wrappers” over matplotlib. They allow you to access a number of matplotlib’s methods with less code.
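To make the "wrapper" idea concrete, here is a minimal sketch that draws the same two overlapping histograms twice: once with direct matplotlib calls and once through pandas' `DataFrame.plot.hist`. The random data, column names, and output filename are assumptions made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data: two normally distributed columns (assumed, not from the article)
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(0, 1, 500), "b": rng.normal(2, 1, 500)})

fig, (ax_mpl, ax_pd) = plt.subplots(1, 2, figsize=(10, 4))

# Direct matplotlib: one hist() call per series, plus an explicit legend and title
ax_mpl.hist(df["a"], bins=20, alpha=0.5, label="a")
ax_mpl.hist(df["b"], bins=20, alpha=0.5, label="b")
ax_mpl.legend()
ax_mpl.set_title("matplotlib")

# pandas wrapper: a single call that draws onto a matplotlib Axes
df.plot.hist(bins=20, alpha=0.5, ax=ax_pd, title="pandas wrapper")

fig.savefig("two_histograms.png")
```

Both panels end up as ordinary matplotlib Axes, so anything the wrapper does not expose can still be adjusted with regular matplotlib methods.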
While matplotlib is good for getting a sense of the data, it's not very useful for creating publication-quality charts quickly and easily. As Chris Moffitt points out in his overview of Python visualization tools, matplotlib “is extremely powerful but with that power comes complexity.”
matplotlib has long been criticized for its default styles, which have a distinct 1990s feel. The current release, matplotlib 3.4.3, still reflects this style.
Violinplot (Michael Waskom)
Seaborn harnesses the power of matplotlib to create beautiful charts in a few lines of code. The key difference is Seaborn's default styles and color palettes, which are designed to be more aesthetically pleasing and modern. Since Seaborn is built on top of matplotlib, you'll need to know matplotlib to tweak Seaborn's defaults.
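As a rough illustration of that split, the sketch below draws a violin plot with one Seaborn call and then tweaks it with plain matplotlib methods. The DataFrame contents, labels, and filename are invented for the example, and `sns.set_theme()` assumes Seaborn 0.11 or later.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Illustrative data: three groups with different means (assumed values)
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 100),
    "value": np.concatenate([rng.normal(loc, 1.0, 100) for loc in (0, 1, 3)]),
})

sns.set_theme()  # apply Seaborn's default style and color palette
ax = sns.violinplot(data=df, x="group", y="value")

# The returned object is a matplotlib Axes, so tweaks use matplotlib's API
ax.set_title("Value distribution by group")
ax.set_ylabel("Measured value")
ax.figure.savefig("violinplot.png")
```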
Created by: Michael Waskom, available in Mode
Where to learn more: http://web.stanford.edu/~mwaskom/software/seaborn/index.html
Change in Rank (Plotnine)
Plotnine is a Python implementation of ggplot2, an R plotting system, and of concepts from The Grammar of Graphics. It's a powerful visualization package that lets you layer components to create a complete plot. For instance, you can start with axes, then add points, then a line, a trendline, etc. Because Plotnine is a functional port of ggplot2, R programmers familiar with ggplot2 will find the transition easy.
Plotnine is tightly integrated with pandas, so it's best to store your data in a DataFrame when using Plotnine.
Created by: Hassan Kibirige
Where to learn more: https://plotnine.readthedocs.io/en/stable/index.html
Interactive weather statistics for three cities (Bokeh)
Like ggplot, Bokeh is based on The Grammar of Graphics, but unlike ggplot, it's native to Python, not ported over from R. Its strength lies in the ability to create interactive, web-ready plots, which can be easily output as JSON objects, HTML documents, or interactive web applications. Bokeh also supports streaming and real-time data.
Bokeh provides three interfaces with varying levels of control to accommodate different user types. The highest level is for creating charts quickly. It includes methods for creating common charts such as bar plots, box plots, and histograms. The middle level has the same specificity as matplotlib and allows you to control the basic building blocks of each chart (the dots in a scatter plot, for example). The lowest level is geared toward developers and software engineers. It has no pre-set defaults and requires you to define every element of the chart.
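Here is a minimal sketch of the web-ready output mentioned above. It uses the `bokeh.plotting` interface, which is assumed here to correspond roughly to the middle level of control, and exports the same plot as a standalone HTML document and as a JSON-serializable object. The data values, labels, and titles are invented for the example.

```python
from bokeh.embed import file_html, json_item
from bokeh.plotting import figure
from bokeh.resources import CDN

# Illustrative data (assumed values)
days = [1, 2, 3, 4, 5]
temps = [12.0, 14.5, 13.0, 17.5, 16.0]

# Build the plot from basic glyphs: a line plus a marker at each point
p = figure(title="Daily temperature", x_axis_label="day", y_axis_label="temp (C)")
p.line(days, temps, line_width=2)
p.scatter(days, temps, size=8)

# Standalone HTML document that loads BokehJS from the CDN
html = file_html(p, resources=CDN, title="Daily temperature")

# JSON-serializable dict that a web page can embed with BokehJS
item = json_item(p)
```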
Box plot (Florian Mounier)
Like Bokeh and Plotly, pygal offers interactive plots that can be embedded in the web browser. Its prime differentiator is the ability to output charts as SVGs. As long as you're working with smaller datasets, SVGs will do you just fine. But if you're making charts with hundreds of thousands of data points, they'll have trouble rendering and become sluggish.
Since each chart type is packaged into a method and the built-in styles are pretty, it's easy to create a nice-looking chart in a few lines of code.
Line plot (Plotly)
You might know Plotly as an online platform for data visualization, but did you also know you can access its capabilities from a Python notebook? Like Bokeh, Plotly's forte is making interactive plots, but it offers some charts you won't find in most libraries, like contour plots, dendrograms, and 3D charts.
Choropleth (Andrea Cuttone)
geoplotlib is a toolbox for creating maps and plotting geographical data. You can use it to create a variety of map types, like choropleths, heatmaps, and dot density maps. You must have Pyglet (an object-oriented programming interface) installed to use geoplotlib. Nonetheless, since most Python data visualization libraries don't offer maps, it's nice to have a library dedicated solely to them.
Scatter plot with trend line (David Robinson)
Nullity matrix (Aleksey Bilogur)
Dealing with missing data is a pain. missingno allows you to quickly gauge the completeness of a dataset with a visual summary, instead of trudging through a table. You can filter and sort data based on completion or spot correlations with a heatmap or a dendrogram.
Chart grid with consistent scales (Christopher Groskopf)
Leather's creator, Christopher Groskopf, puts it best: “Leather is the Python charting library for those who need charts now and don’t care if they’re perfect.” It's designed to work with all data types and produces charts as SVGs, so you can scale them without losing image quality. Since this library is relatively new, some of the documentation is still in progress. The charts you can make are pretty basic—but that's the intention.
Created by: Christopher Groskopf
Where to learn more: https://leather.readthedocs.io/en/latest/index.html
Like Seaborn, Altair is a declarative visualization library that allows you to create aesthetically pleasing graphs and charts. But unlike Seaborn, which is based on Matplotlib, Altair is based on Vega and Vega-Lite. It is great for creating interactive visualizations easily and quickly. One downside is that its latest version still does not support pie charts, though the cofounder states that the next release should. Luckily, Matplotlib can take care of pie-chart needs.
Where to learn more: https://altair-viz.github.io/index.htm
Created by: Folium
Where to learn more: https://github.com/python-visualization/folium
Other great reads on Python data visualization
There are a ton of great evaluations and overviews of Python data visualization libraries out there. Check out some of our favorites:
- One Chart, Twelve Charting Libraries (Lisa Charlotte Rost)
- Overview of Python Visualization Tools (Practical Business Python)
- Python data visualization: Comparing 7 tools (Dataquest.io)