Why Statistic Design Matters

Hi Reader, when you put together your figures, we have to talk about statistics.

Although it might seem straightforward, how you visualize them matters a lot for scientific accuracy.

Design them properly and you will have robust as well as beautiful figures.

Let's see how accuracy and beauty often go hand in hand:

Ensure Accuracy

When talking about designing figures, an essential point is how we include statistics.

How we decide to display them is crucial for scientific accuracy. In fact, the design frequently becomes as important as the proper analysis itself.

Picture: bad-looking, barely visible vs. good-looking statistics

There are two keys to properly displaying statistics.

A) Actually displaying as much information as possible.

For example, including individual data points is, in many cases, the most accurate way to communicate your data.

Which design do you like best? At a minimum, you can state the number of data points in the description, but that doesn’t help much in visualizing their distribution. The violin plot style on the far right is commonly used by Nature, but in the age of computers, there is no inherent advantage of one style over the others. The only trade-off is using simple boxplots for more clarity without showing individual data points (which becomes sensible, especially if you have more than 30 data points).
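To make that trade-off concrete, here is a minimal sketch of what a boxplot condenses your data into. It assumes the common Tukey convention (whiskers reach the most extreme points within 1.5 times the interquartile range); the function names, the made-up measurements, and the way the 30-point rule of thumb is encoded are illustrative assumptions, not a fixed standard.

```python
import statistics

def box_summary(values):
    """Reduce a dataset to the handful of numbers a boxplot displays."""
    data = sorted(values)
    # Quartiles; "inclusive" treats the data as the full population of interest.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # Tukey fences (assumed convention)
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "n": len(data),
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

def show_individual_points(values, threshold=30):
    """Rule of thumb from the text: overlay raw points up to about 30 values."""
    return len(values) <= threshold

# Hypothetical measurements, made up for illustration only.
measurements = [2.1, 2.4, 2.5, 2.7, 2.9, 3.0, 3.2, 3.3, 6.8]
summary = box_summary(measurements)

for name, value in summary.items():
    print(f"{name}: {value}")
print("overlay individual points:", show_individual_points(measurements))
```

Everything a plain boxplot shows fits into these few numbers, which is why overlaying the individual points tells the reader more as long as the sample is small enough to stay readable.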

And no worries, in today’s day and age, you won’t be charged for print or number of figure panels ; )

Don’t Mislead Unintentionally

B) Avoid biasing your viewer.

The most common biases come into existence because insufficient attention was paid to how statistics are displayed.

Ever wondered why you normally don’t see pie charts in scientific publications? Because they are suboptimal to read. A bar chart is much better for that purpose. This website has an amazing “game” that tests your ability to analyze different graphs.

Ask yourself: If someone were only able to see one graph with no other context, could they draw the same conclusion as you?

Notice how design choices such as the one-sided error bars on the left make it very difficult to assess the spread; moreover, the different geometric forms might bias our perception, with triangles “pointing” in a certain direction or some shapes appearing more similar than others. There is nothing to be said against using different shapes, but double-check whether they are appropriate.

Also consider that you might have a certain (to some extent preferred) interpretation of your data in the back of your mind, but your job is to allow readers to draw their own conclusions.

The point is that you don’t display statistics as a necessity, but as crucial information without which your data isn’t complete.

Conversely, overcrowding the figure can cause the statistics to distract from the data.

It is not uncommon for the wrong layout (in this case, plotting moving averages by differences instead of values over time) to quickly overload a graph. It may look appealing and might have made sense with two or three datasets, but in this case it is entirely overwhelming. Read more about the figure and its design here.

In essence: when in doubt, provide more statistical information rather than less. However, don't forget that clarity matters for the accuracy of our visual perception - data is not just data once we look at it.

When Beauty Serves A Purpose

Luckily, statistics and beauty don’t exclude each other - quite the opposite is true.

Given that statistics are dependent on our data, we have straightforward design choices that support this hierarchy.

This is what I mean - the confidence bands make the graph more informative while also improving its appearance. More about such tweaks in this amazing blog.

Lighter and more transparent colors, or clear yet thin lines, look great and don’t distract the eye from the main data, while still being easy to assess.

Statistics should be informative and clear, but visually secondary, supporting the data.

Adding information such as a barcode graph can help leverage the specific advantages of a given visualization technique and thereby allow the reader to gain a better understanding of the data. The graph above taken from this blog shows the dependency of cell lines on the gene FOXA1 - those to the left of the minus-1 reference line require the gene to survive. Note that for lower frequencies of cell lines, the barcode graph is advantageous, whereas for high densities, the bell curve excels.

And still, at some point, design and statistical expertise mix.

For instance, imagine we have to decide which variability measure to show. In bar graphs, you typically choose one of the following: standard deviation, standard error of the mean, or confidence intervals.

In the end, you can only choose one - and it matters:

I don’t want to go into too much detail; I think the book “Experimental Design and Data Analysis for Biologists” does an amazing job of that. In short, I would argue that the standard deviation is the most appropriate choice in most cases: the standard error of the mean is often used inappropriately, and confidence intervals, although a powerful statistical tool, are less common and not always properly interpreted.
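To see how much the choice changes the length of your error bars, here is a minimal sketch that computes all three measures for the same sample. The function name and the measurements are made up for illustration, and the confidence interval uses a normal approximation; for small samples, a t-based interval would be somewhat wider.

```python
import math
import statistics

def variability_measures(values, confidence=0.95):
    """Compute SD, SEM, and an approximate confidence interval for the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    sem = sd / math.sqrt(n)         # standard error of the mean
    # Normal approximation (assumption): z is about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sem
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci": (mean - half_width, mean + half_width),
    }

# Hypothetical measurements, made up for illustration only.
measurements = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4, 5.1, 4.7]
result = variability_measures(measurements)

print(f"mean = {result['mean']:.2f} (n = {result['n']})")
print(f"SD   = {result['sd']:.3f}")
print(f"SEM  = {result['sem']:.3f}")
print(f"95% CI = {result['ci'][0]:.2f} to {result['ci'][1]:.2f}")
```

Because the SEM is the SD divided by the square root of n, SEM error bars are always shorter than SD error bars for the same data and keep shrinking as the sample grows. That is one reason to state in the description which measure is shown.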

Remember, many scientists do not clearly understand the differences, and even fewer take the time to check which one is shown.

Therefore, you have to consider that the length of the error bars might bias your reader, depending on other factors such as the y-axis range.

Once again, notice how reading the graph in the upper right feels much more reliable than the one on the right. Similarly, while visualizing variation as a “smear” is a cool idea, it ultimately makes the graph more difficult to read. Read more here.

But no matter the design, make sure to add all key information in the description, even though this makes it longer.

Leverage Standards

If you deviate from standard formats, readers will notice.

This can be a good thing - psychological research shows that mild surprise and challenge increase attention.

But if you go too far, you risk confusing or even alarming your audience.

Avoid oddly unconventional designs like those used in this paper. First, using arched lines to denote significance is far less common than using straight lines. Second, the black error bars disappear against the black sample. However, aligning axis labels differently can be a good idea if you find them visually distracting. That said, readers’ eyes will have to search for them if they are not in their usual position. In principle, one could also combine both figures into a single figure with shared axes to avoid the problem altogether.

For example, few people include helper lines or display the exact numeric value above bar graphs, although this can be extremely helpful - especially in situations where viewers cannot simultaneously read tables.

What you shouldn’t change are functional design conventions, such as lines for significance testing, the appearance of whiskers or fundamentals:

As outlined in this book, which picks up on several pragmatic design lessons, 3D designs are suboptimal for readers. They may look fancy, but they often complicate analysis.

Of course, differentiate between contexts - presentations vs. posters vs. publications.

While the fundamental principles remain the same, formatting expectations differ. In presentations, horizontal bar graphs may be perfectly reasonable and space-saving; in manuscripts, they are less conventional.

And finally, ensure consistent design across all figures.

This matters especially when working with co-authors. Use the same line thickness, outlines, color palette, and fonts, and keep colors consistent for identical sample types across all figures.

How We Feel Today

Edited by Patrick Penndorf
Connection@ReAdvance.com
Lutherstraße 159, 07743, Jena, Thuringia, Germany

A Sign For Science

SciCom - Are Your Figures Statistically Well Designed?
