Matplotlib is the oldest and most widely used Python library for data visualization. It was created by neurobiologist John D. Hunter to plot electrical activity recorded in the brains of epilepsy patients, but today it is used across many fields.
When analysts and data scientists use matplotlib, they're usually using it in tandem with other Python libraries. Matplotlib is designed to work with NumPy, a numerical mathematics library, and is a core part of the SciPy stack, a group of scientific computing tools for Python.
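As a minimal sketch of how the two libraries fit together, the snippet below builds a NumPy array and passes it straight to matplotlib. The sine-wave data and the choice of the non-interactive `Agg` backend are illustrative assumptions, not something the text above prescribes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: 100 evenly spaced points over one period of a sine wave.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots()
(line,) = ax.plot(x, y)  # matplotlib accepts NumPy arrays directly
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
```

Calling `fig.savefig(...)` with a file name of your choice, or `plt.show()` in an interactive session, would then render the figure.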
Some libraries, like pandas and Seaborn, are “wrappers” over matplotlib: they allow you to access a number of matplotlib's methods with less code. For instance, pandas' .plot() combines multiple matplotlib methods into a single method, so you can plot a chart in a few lines.
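To make the wrapper idea concrete, here is a rough side-by-side sketch. The sales figures and column names are made up for illustration, and the matplotlib half is just one plausible way of spelling out a similar chart by hand, not a description of what pandas does internally.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly sales figures, invented for this example.
df = pd.DataFrame(
    {"month": ["Jan", "Feb", "Mar", "Apr"], "sales": [120, 135, 128, 150]}
)

# pandas: one call creates the figure, draws the line and adds a legend.
ax_pandas = df.plot(x="month", y="sales", title="Monthly sales")

# matplotlib: a similar chart, spelled out step by step.
fig, ax_mpl = plt.subplots()
ax_mpl.plot(df["month"], df["sales"], label="sales")
ax_mpl.set_xlabel("month")
ax_mpl.set_title("Monthly sales")
ax_mpl.legend()
```

Both calls hand back ordinary matplotlib Axes objects, so a chart started with pandas can still be adjusted afterwards with matplotlib methods.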
The large amount of code required to generate a nice-looking plot is the most common criticism of matplotlib. As a result, other visualization libraries such as Seaborn, Bokeh, and plotly have emerged.
Here are some tutorials that offer instructions in plain English while highlighting matplotlib's capabilities.
- Simple Graphing with IPython and Pandas (Chris Moffitt): A good primer on plotting with matplotlib, pandas and NumPy. Uses sales data, so you can get an idea of what it's like to explore datasets in a business scenario.
- How to make beautiful data visualizations in Python with matplotlib (Randy Olson): There are lots of examples of basic matplotlib visualizations out there, but this tutorial takes the time to show you how to make aesthetically-pleasing plots.
- Matplotlib tutorial (Nicolas P. Rougier): Goes beyond chart type examples and digs into the details of plotting and how you can customize all kinds of properties, like controlling the width and color of lines, annotating, or adding a legend.
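For a taste of the customization mentioned in the last item, the sketch below sets line width and color, adds an annotation and places a legend. It is not taken from any of the tutorials above; the data, colors, label text and coordinates are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
# Control the color, width and style of each line.
ax.plot(x, np.sin(x), color="tab:blue", linewidth=2.5, label="sin(x)")
ax.plot(x, np.cos(x), color="tab:orange", linewidth=1.0, linestyle="--", label="cos(x)")

# Annotate a point of interest with text and an arrow.
ax.annotate(
    "peak of sin(x)",
    xy=(np.pi / 2, 1.0),      # the point being annotated
    xytext=(3.0, 1.2),        # where the text is placed
    arrowprops={"arrowstyle": "->"},
)
ax.set_ylim(-1.5, 1.5)        # leave room for the annotation text

# Build the legend from the label= values given above.
legend = ax.legend(loc="lower left")
```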
For examples of the visualizations you can make with matplotlib, see this gallery.