Figure. Two plots of the same data. The first uses all the default settings in matplotlib, and the second shows what you can do with just a little extra effort. When communicating your scientific results in a publication or presentation, this effort is worth the investment.
I’m always a little surprised when someone is giving a slick slide presentation, and then shows figures made with old-school, ugly, default settings from, say, Excel or Gnuplot. With a little effort, it’s actually possible to make beautiful and high-quality figures with many plotting tools. I’m no visualization expert, but am very happy with what I can achieve using matplotlib.
See matplotlib’s documentation for plot() for all the basic options for marker shapes, colors, line styles, etc.
Here are some things to keep in mind when making figures:
- The default font size, line width, marker size, etc. are usually too small.
- If applicable, use markers to communicate the discreteness of your data.
- Avoid light colors, since they might be difficult to see in printouts or projector presentations.
- The legend should be equally useful if the figure is printed in color or grayscale.
- The automatically selected location for the legend might not be optimal.
- Label axes with units if applicable and label the figure itself with a title.
- The automatically selected ranges on the axes might not be exactly what you want.
- You may want to explore outside of the default color palette.
- Image file types come in many flavors; I like to generate PDFs of figures.
- If possible, make the order of the legend entries correspond to the (top-to-bottom, left-to-right, clockwise, counterclockwise, or other) layout of what is being plotted, e.g., if line A is above line B, then label ‘A’ should appear above label ‘B’ in the legend.
- Try to avoid using both red and green in your legend. According to Wikipedia’s page on color blindness, about 8% of males (and 0.5% of females) are color blind, and red-green color blindness is most common.
- For publications, print out your paper (with figures) to see how they look when printed in grayscale (always) and color (if applicable).
- Be creative and have fun communicating your data!
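Several of the tips above can be sketched in a few lines of matplotlib. This is only an illustration, not the contents of scipub.py: the data, labels, colors, sizes, axis limits and output file name are all invented for the example, and the specific values are a matter of taste.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Synthetic data, invented for illustration only.
x = [1, 2, 4, 8, 16]
series = {
    "Method A": [2, 4, 8, 16, 32],
    "Method B": [1, 2, 4, 8, 16],
}
# Distinct markers and line styles keep the lines distinguishable in grayscale.
# Dark colors, and no red/green pairing.
styles = {
    "Method A": dict(marker="o", linestyle="-", color="navy"),
    "Method B": dict(marker="s", linestyle="--", color="darkorange"),
}

fig, ax = plt.subplots(figsize=(6, 4))
for label, y in series.items():
    # Line width and marker size set larger than the defaults.
    ax.plot(x, y, label=label, linewidth=2.5, markersize=8, **styles[label])

# Label the axes with units and give the figure a title.
ax.set_xlabel("Input size (items)", fontsize=14)
ax.set_ylabel("Time (s)", fontsize=14)
ax.set_title("Synthetic example", fontsize=16)
ax.tick_params(labelsize=12)

# Choose the axis ranges and legend location by hand.
ax.set_xlim(0, 17)
ax.set_ylim(0, 35)
ax.legend(loc="upper left", fontsize=12)

fig.tight_layout()
fig.savefig("example_plot.pdf")  # the file extension selects the output format
```

Because "Method A" lies above "Method B" in the plot, it is also plotted first, so the legend entries appear in the same top-to-bottom order as the lines.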
If you haven’t already, check out matplotlib, including their awesome gallery and examples, which come complete with code!
Download scipub.py and the sample data. Put them in the same directory and unzip the latter.
From the command line, make both naive and effective plots of the data:
$ python scipub.py
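Do the same thing from the Python shell (execfile exists only in Python 2; under Python 3 the equivalent is exec(open('scipub.py').read())):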
>>> execfile('scipub.py')
Dependencies: NumPy, tabular, IPython and matplotlib.
Input files: Sample data (.zip containing three small text files)
Python code: scipub.py
Output files: “Before and after” naive plot and illustrative plot
Script that makes two plots of the same data contained in the F2files directory.
The first plot basically uses all the default settings in matplotlib, and the second shows what you can do with just a little extra effort.
Below is a slightly modified version of the metadata and instructions given to me. The details are not important.
“In all files in the directory F2files, the first column is stream length (x-axis) and the THIRD column is prover time (y-axis). These are the only two columns relevant to the plots.
“Each file corresponds to a different “line” in the graph. There are three files: one for the multiround prover, one for the single round prover without FFT techniques, and one for the single round prover with FFT techniques. The file names say which is which, and the legend should identify the lines accordingly.”
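The column selection described in these instructions can be sketched with the standard library alone. This is not how scipub.py reads the data; the helper name read_xy is hypothetical, and the sketch assumes the files are whitespace-delimited numeric text in which blank lines and lines starting with "#" can be skipped.

```python
def read_xy(path, x_col=0, y_col=2):
    """Return (xs, ys) read from two columns of a whitespace-delimited text file.

    Columns are zero-indexed, so the defaults pick the first and third columns.
    """
    xs, ys = [], []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and comment lines (an assumption)
            xs.append(float(fields[x_col]))
            ys.append(float(fields[y_col]))
    return xs, ys
```

Calling read_xy on one of the data files would then give the stream lengths and prover times for one line of the graph.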
Usage: python scipub.py
Author: Elaine Angelino <elaine at eecs dot harvard dot edu>
Copyright 2011
Make an effective plot in matplotlib by specifying custom settings.
Parameters
root : str
    Name of the input directory containing the data files.
flist : list of strings
    List of the names of the data files in the input directory. Their order gives the order in which we plot the lines. We want control over this in designing a legend.
fout : str
    Name of output file for matplotlib figure. Use an extension like ‘.pdf’ or ‘.png’.
legend : list of strings
    Labels to appear in the legend. We specify these by hand for greater clarity. Specify one label corresponding to the data for each file.
marker_list : list of strings
    Specify a different marker shape for plotting data for each file.
color_list : list of strings
    Specify a different color for plotting data for each file.
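A function with this parameter list might look roughly like the sketch below. The name plot_files is hypothetical and the body is a guess at the general shape, not the code in scipub.py. It assumes whitespace-delimited numeric files and uses numpy.loadtxt, where the original script may read the data differently.

```python
import os

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import numpy as np


def plot_files(root, flist, fout, legend, marker_list, color_list):
    """Plot column 0 (x) against column 2 (y) of each file, one line per file."""
    fig, ax = plt.subplots()
    # zip pairs each file with its label, marker and color, so the order of
    # flist controls both the plotting order and the legend order.
    for fname, label, marker, color in zip(flist, legend, marker_list, color_list):
        # Assumes whitespace-delimited numeric columns.
        x, y = np.loadtxt(os.path.join(root, fname), usecols=(0, 2), unpack=True)
        ax.plot(x, y, marker=marker, color=color, label=label, linewidth=2)
    ax.legend(loc="best")
    fig.savefig(fout)  # output format follows the extension of fout
    plt.close(fig)
```

A call would pass the four lists in matching order, one entry per data file, for example plot_files('F2files', names, 'plot.pdf', labels, ['o', 's', '^'], ['navy', 'black', 'darkorange']), where names and labels are lists you define for your own files.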