‘With the great power of graphs comes great responsibility’

Modern neuroscience overwhelmingly relies on empirical evidence: the collection of data through observation or experimentation. Once the data have been collected, they must be analyzed, summarized, and shared with others. Data visualization is a crucial part of these steps, particularly when researchers communicate their results in scientific publications.

Despite the central role that data visualization plays in modern neuroscience, it is rarely taught at the undergraduate or graduate level. In my own limited experience, you mostly pick it up as you go, starting in grad school. While there is nothing inherently wrong with such a hands-on, problem-based approach to learning, accurate and efficient graphical representation of data does follow a few basic but crucial principles. Simply put:

  • make sure that you indicate what variable you are plotting (e.g. are you plotting the mean or the median? did you include a scale?)
  • display the uncertainty about the plotted variable (again, indicate what you are plotting, e.g. standard deviations or percentiles)

Easy, right? Well… When Elena Allen and colleagues, from the Mind Research Network in Albuquerque, NM, surveyed 288 articles published in 2010 in six leading neuroscience journals, they found that a substantial proportion of the figures did not include all of these basic features. That was especially true of figures that attempted to represent more than two dimensions by using color or shades of gray (“3D figures” in Allen’s words): most of those did not report the uncertainty of the plotted effects. It is admittedly difficult to analyze multi-dimensional data sets and represent them in a flat two-dimensional image (hence the title of Allen and colleagues’ article: “Data Visualization in the Neurosciences: Overcoming the Curse of Dimensionality”). More disturbingly, however, only 43% of the 3D figures even labelled the dependent variable to begin with. All is not well with the simpler 2D figures, either: 30% of the figures that included some measure of uncertainty (error bars) failed to indicate which one.

‘Show more, hide less’

How can scientists produce better graphical representations of their data? The authors provide a series of recommendations in the form of a great checklist. But where the paper really shines, in my opinion, is in the brilliant case studies that it provides, which underline to what extent the structure of a dataset can be hidden when depicted with an inappropriate graphical display, and how to avoid this. The article is accessible for free here, so you can check out the figures yourself. Because the article is made available under an Elsevier user license, I’m reproducing some of the figures here. The copyright remains with Elsevier. You can find more information about the Elsevier user license here.

The first example (see their Figure 2 below) covers the venerable 2D bar plot, which the authors improve upon first with box plots and then by plotting an almost complete graphical representation of the dataset using a violin plot that is way more informative than the original. Personally, I’m also enthusiastic about bee-swarm plots, which plot every single data point. I know of an R package for bee-swarm plots (check out the great graphical examples!); unfortunately, I don’t know of any equivalent for MATLAB that looks as good.

Figure 2 from Allen et al., Neuron 2012. Made available under an Elsevier user license. Copyright Elsevier.
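
For readers who want to show every data point without leaving MATLAB, here is a minimal sketch of a jittered strip plot built from basic plotting calls. It is not a true bee-swarm layout (points are offset at random rather than packed), and the data, group labels, and jitter width are all made up for illustration.

```matlab
% Jittered strip plot: show every observation instead of only a bar.
% All data below are synthetic and purely illustrative.
rng(1);                                   % reproducible example
nPerGroup = 40;
y = [randn(nPerGroup, 1); ...
     1 + 0.5*randn(nPerGroup, 1); ...
     exp(0.6*randn(nPerGroup, 1))];       % third group is skewed on purpose
g = repelem((1:3)', nPerGroup);           % group index of each observation

jitterWidth = 0.15;                       % max horizontal offset (assumed value)
xJit = g + jitterWidth*(2*rand(size(g)) - 1);

figure;
hold on;
scatter(xJit, y, 15, 'filled', 'MarkerFaceAlpha', 0.5);
for k = 1:3
    med = median(y(g == k));
    plot(k + [-0.3 0.3], [med med], 'k-', 'LineWidth', 2);   % group median
end
hold off;
set(gca, 'XTick', 1:3, 'XTickLabel', {'A', 'B', 'C'});
xlabel('Group');
ylabel('Measured value (a.u.)');
title('Individual observations; black line = median');
```

The title states which summary statistic the black line shows, in keeping with the first principle above.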

Moving to more complex data sets, Allen and colleagues turn to EEG and event-related potentials (ERP; see their Figure 3A below). There, they suggest displaying the uncertainty around the ERP waveforms using shaded areas (don’t forget to label what measure of uncertainty you’re plotting!). This is easy to implement in MATLAB, for instance using the boundedline function, by Kelly Kearney. The authors also encourage plotting a graphical representation of the results of statistical testing on the same plot. This makes total sense, adds minimal work to preparing the figure, and should definitely be standard practice.
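
As a rough sketch of the shaded-band idea, the following uses only the built-in FILL and PLOT functions rather than boundedline itself. The waveform, noise level, and trial count are invented for the example, and the band shows ±1 standard error of the mean across trials (one choice among several).

```matlab
% Mean waveform with a shaded uncertainty band, built-in functions only.
% The "ERP" here is synthetic: a Gaussian bump plus noise on each trial.
rng(2);                                        % reproducible example
t = linspace(-0.2, 0.8, 251);                  % time in seconds (assumed epoch)
nTrials = 30;                                  % assumed number of trials
erp = 5*exp(-((t - 0.3)/0.08).^2);             % hypothetical underlying waveform
trials = repmat(erp, nTrials, 1) + 2*randn(nTrials, numel(t));

m = mean(trials, 1);                           % mean across trials
sem = std(trials, 0, 1)/sqrt(nTrials);         % standard error of the mean

bandColor = [0.2 0.4 0.8];
figure;
hold on;
% Closed polygon: upper edge left-to-right, then lower edge right-to-left.
fill([t, fliplr(t)], [m + sem, fliplr(m - sem)], bandColor, ...
     'FaceAlpha', 0.3, 'EdgeColor', 'none');
plot(t, m, 'Color', bandColor, 'LineWidth', 1.5);
hold off;
xlabel('Time (s)');
ylabel('Amplitude (\muV)');
legend({'\pm 1 SEM', 'Mean across trials'}, 'Location', 'northeast');
```

The legend entry names the uncertainty measure explicitly, so a reader never has to guess what the shaded area means.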

The last example looks truly spectacular. Allen and colleagues use both color hues and transparency to illustrate areas of the brain that undergo significant changes in activity in a task-based functional MRI dataset (see their Figure 3B below). The conventional way of representing fMRI results would be to apply a threshold to the indices of brain activity, and only show those brain regions that were beyond the threshold. However, thresholds are arbitrary, and most of the brain’s activity gets “erased” from the plot. The authors’ approach allows them to show the data’s structure in a much more thorough fashion (in this case including areas of the brain that undergo de-activation during the task, likely corresponding to the “default mode network”), without cluttering the display or making it too complicated. They also provide an example dataset and MATLAB scripts to reproduce their figure.

Figure 3 from Allen et al., Neuron 2012. Made available under an Elsevier user license. Copyright Elsevier.

‘The jet colormap must die!’

I found one minor weakness in Allen et al.’s paper: their recommendation of color scales (or colormaps). Specifically, for bipolar data, which can range from negative through zero to positive values, they suggest using a rainbow (or “jet”) colormap: negative values are mapped to progressively lighter shades of blue, moving to green for data whose value is zero, then through yellows and oranges to reds for positive data. There are several problems with this particular colormap, detailed in multiple papers and blog posts (I took this section’s title from one such post). To summarize them briefly:

  • human vision does not perceive the color changes of the jet colormap as uniform, creating artificial “borders” (called Mach bands) when continuous surfaces are plotted (see the illustration below)
  • the order of the colors is arbitrary (despite being that of the rainbow)
  • the luminance of successive colors does not increase or decrease monotonically
  • the presence of greens and reds makes it hard to interpret for people with the most common form of color vision deficiency

The bright blues and yellows of the jet colormap cause stripes to appear in the Mexican hat at the top left. This is much less apparent if the jet colormap is not used to paint over continuous surfaces, such as on the mesh at the top right. A cold-to-warm colormap does not create false stripes (bottom).

For plotting bipolar data, other colormaps can avoid the “illusory border” problem, for instance a colormap that goes from cold (blue) to warm (red) colors. In the example above, I’ve taken the cold-to-warm colormap from an excellent paper on colormaps by Kenneth Moreland, of the Sandia National Laboratories, USA.
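
To give a feel for how such a diverging colormap can be built, here is a simplified sketch that interpolates linearly in RGB between a blue, white, and a red. This is not Moreland’s method (his interpolation is done in a perceptual color space), and the endpoint colors below are my own arbitrary picks; treat it as a stand-in that illustrates the idea only.

```matlab
% Simple blue-white-red diverging colormap by linear RGB interpolation.
% NOT Moreland's colormap: endpoint colors are arbitrary illustrative picks.
n = 257;                                   % odd, so the exact midpoint is white
anchors = [0.1 0.3 0.8;                    % cold end (assumed blue)
           1.0 1.0 1.0;                    % zero (white)
           0.8 0.1 0.1];                   % warm end (assumed red)
cmap = interp1([0 0.5 1], anchors, linspace(0, 1, n));

% Mexican-hat test surface, which takes both negative and positive values.
[X, Y] = meshgrid(linspace(-4, 4, 200));
R2 = X.^2 + Y.^2;
Z = (1 - R2/2).*exp(-R2/2);

figure;
imagesc(Z);
axis image off;
colormap(cmap);
caxis(max(abs(Z(:)))*[-1 1]);              % symmetric limits keep zero at white
cb = colorbar;
ylabel(cb, 'Amplitude (a.u.)');            % label the plotted variable
```

Making the color limits symmetric around zero matters as much as the colormap itself: otherwise the neutral color no longer corresponds to a value of zero.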

Apart from this minor objection, however, I warmly recommend reading Allen et al.’s thoughtful discussion: you will likely produce better data visualizations thanks to the authors!

(And if you still need to plot bar plots, I’ve got a great MATLAB function for you!)

References

Allen, E., Erhardt, E., & Calhoun, V. (2012). Data Visualization in the Neurosciences: Overcoming the Curse of Dimensionality. Neuron, 74 (4), 603-608. DOI: 10.1016/j.neuron.2012.05.001

Moreland, K. (2009). Diverging Color Maps for Scientific Visualization (Expanded). Proceedings of the 5th International Symposium on Visual Computing, December 2009.

Grouped bar plots with error bars

I have recently attempted to use MATLAB to plot grouped bar plots (similar to the BAR(Y,'grouped') call) together with their error bars. It’s not straightforward. There are a few user-made custom functions on the File Exchange that tackle this issue, but I wasn’t all that happy with the graphical results. So I’ve made my own wrapper function that successively calls BAR, then ERRORBAR, taking care to overlay the error bars right on top of the corresponding bars. I think that the function could be useful to others, so I’ve uploaded it to the File Exchange: ERRORBAR_GROUPS produces customizable grouped bar plots with overlaid error bars.

At its most basic, this function produces bar plots similar to those obtained using MATLAB’s BAR(Y,'grouped') function call, and then overlays error bars onto the corresponding bars.
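
To illustrate the underlying BAR-then-ERRORBAR idea, here is a minimal sketch using only built-in functions. It is not the code of ERRORBAR_GROUPS, the values are made up, and the bar-spacing rule is the one commonly quoted for MATLAB’s default grouped layout, which I am assuming holds on your release (newer releases also expose the bar positions through the bar objects’ XEndPoints property).

```matlab
% Grouped bars with overlaid error bars, built-in functions only.
% Y and E are made-up values: rows are groups, columns are bars in a group.
Y = [3 5 4; 6 7 5; 2 4 6; 5 3 4];
E = [0.4 0.6 0.5; 0.7 0.5 0.6; 0.3 0.5 0.8; 0.6 0.4 0.5];
[nGroups, nBars] = size(Y);

figure;
bar(Y, 'grouped');
hold on;

% Horizontal centre of each bar under the default grouped layout
% (assumed spacing rule; verify against your MATLAB version).
groupWidth = min(0.8, nBars/(nBars + 1.5));
xCenters = zeros(nGroups, nBars);
for k = 1:nBars
    xCenters(:, k) = (1:nGroups)' - groupWidth/2 ...
                     + (2*k - 1)*groupWidth/(2*nBars);
    he = errorbar(xCenters(:, k), Y(:, k), E(:, k));
    set(he, 'LineStyle', 'none', 'Color', 'k');   % no line joining bar tops
end
hold off;
xlabel('Group');
ylabel('Mean value (a.u.)');
title('Error bars: \pm 1 SD (synthetic data)');
```

The fiddly part is computing where each bar sits within its group, which is the bookkeeping a wrapper function takes off your hands.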

ERRORBAR_GROUPS allows customizing the plot in several ways. For instance, both the width of the bars themselves and that of the error bars can be adjusted. The function allows asymmetric values for the lower and upper bounds of the error bars. The colors of the bars and error bars can also be customized. By default, ERRORBAR_GROUPS uses the function DISTINGUISHABLE_COLORS by Timothy E. Holy (which is a great feature, by the way!).

ERRORBAR_GROUPS can also pass optional property-value pairs through to both the BAR and ERRORBAR functions, making it quite versatile.

Here are some examples of what ERRORBAR_GROUPS can do.

Basic usage. Plot 3 groups with 8 bars each and their corresponding error bars.

 

Upper bounds of the error bars only.

The upper and lower bounds of the error bars need not be the same. Here is an example with the lower bounds set to be 0, effectively plotting only the upper bounds.
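
The same effect can be sketched with the built-in ERRORBAR function, which accepts separate lower and upper lengths. This is a plain-MATLAB illustration with invented numbers, not a call to ERRORBAR_GROUPS.

```matlab
% Asymmetric error bars: lower length set to 0, so only the upper part shows.
% All values are invented for illustration.
x = 1:3;
y = [4 6 5];                      % bar heights
upper = [0.8 1.1 0.6];            % length of the upper error bar
lower = zeros(size(y));           % zero-length lower error bar

figure;
bar(x, y, 0.6);                   % 0.6 = relative bar width
hold on;
he = errorbar(x, y, lower, upper);
set(he, 'LineStyle', 'none', 'Color', 'k');   % no line joining bar tops
hold off;
xlabel('Condition');
ylabel('Mean value (a.u.)');
title('Upper error bars only (synthetic data)');
```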

 

Reduce the width of the bars and of the error bars.

When plotting smaller numbers of groups and bars, it might be visually more appealing to reduce the width of the bars and of the error bars.

 

The function can pass PropertyName – PropertyValue pairs of input arguments to both the BAR and ERRORBAR functions, which allows for considerable customization!