How we decide to display them is crucial for scientific accuracy. In fact, the design frequently becomes as important as the proper analysis itself.
Picture: bad-looking, barely visible vs. good-looking statistics
There are two keys to properly displaying statistics.
A) Actually displaying as much information as possible.
For example, including individual data points is, in many cases, the most accurate way to communicate your data.
Which design do you like best? Of course, you can state the number of data points in the description, but that doesn’t help much in visualizing their distribution. The violin plot style on the far right is commonly used by Nature, but in the age of computers, there is no inherent advantage of one style over the others. The only trade-off is using simple boxplots for more clarity without showing individual data points (which becomes sensible, especially if you have more than 30 data points).
And no worries, in today’s day and age, you won’t be charged for print or number of figure panels ; )
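To make the first key concrete, here is a minimal sketch of a boxplot with every individual data point overlaid. It assumes Python with matplotlib; the function name, the jitter width, and the data are made up purely for illustration and are not taken from any figure in this post.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def boxplot_with_points(groups, labels, jitter=0.08, seed=0):
    """Draw one boxplot per group and overlay all individual data points."""
    rng = random.Random(seed)  # fixed seed keeps the jitter reproducible
    fig, ax = plt.subplots(figsize=(4, 3))
    positions = list(range(1, len(groups) + 1))
    # Outlier markers are switched off because every point is drawn anyway.
    ax.boxplot(groups, positions=positions, showfliers=False)
    for pos, values in zip(positions, groups):
        # Small horizontal jitter so overlapping points stay distinguishable.
        xs = [pos + rng.uniform(-jitter, jitter) for _ in values]
        ax.scatter(xs, values, s=12, alpha=0.6, color="tab:blue", zorder=3)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    ax.set_ylabel("Measured value (a.u.)")
    return fig, ax


# Hypothetical example data, invented for this sketch only.
control = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 4.6, 5.1]
treated = [5.9, 6.4, 6.8, 7.1, 7.5, 6.1, 6.9, 7.8]
fig, ax = boxplot_with_points([control, treated], ["Control", "Treated"])
```

Calling `fig.savefig(...)` afterwards would write the figure to disk. With more than roughly 30 points per group, dropping the scatter layer and keeping only the boxes is the trade-off described above.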
Don’t Mislead Unintentionally
B) Avoid biasing your viewer.
The most common biases arise because too little attention is paid to how statistics are displayed.
Ever wondered why you normally don’t see pie charts in scientific publications? Because they are suboptimal to read. A bar chart is much better for that purpose. This website has an amazing “game” that tests your ability to analyze different graphs.
Ask yourself: If someone were only able to see one graph with no other context, could they draw the same conclusion as you?
Notice how design choices such as the one-sided error bars on the left make it very difficult to assess the spread; moreover, the different geometric forms might bias our perception, with triangles “pointing” in a certain direction or some shapes appearing more similar than others. There is nothing to be said against using different shapes, but double-check whether they are appropriate.
Also consider that you might have a certain (to some extent preferred) interpretation of your data in the back of your mind, but your job is to allow readers to draw their own conclusions.
The point is that you don’t display statistics as a necessity, but as crucial information without which your data isn’t complete.
Conversely, overcrowding the figure can cause the statistics to distract from the data.
It is not uncommon for the wrong layout to overload a graph quickly; in this case, moving averages are plotted as differences instead of values over time. It may look appealing and might have made sense with two or three datasets, but in this case it is entirely overwhelming. Read more about the figure and its design here.
In essence: when in doubt, provide more statistical information rather than less. However, don't forget that clarity matters for the accuracy of our visual perception - data is not just data once we look at it.
When Beauty Serves A Purpose
Luckily, statistics and beauty don’t exclude each other - quite the opposite is true.
Given that statistics are dependent on our data, we have straightforward design choices that support this hierarchy.
This is what I mean - the confidence bands make the graph more informative while also improving its appearance. More about such tweaks in this amazing blog.
Lighter and more transparent colors, or clear yet thin lines, look great and don’t distract the eye from the main data, while still being easy to assess.
Statistics should be informative and clear, but visually secondary, supporting the data.
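As a rough sketch of this idea, the snippet below draws a mean line with a light, transparent confidence band behind it. It assumes Python with matplotlib; the function name and the replicate values are hypothetical, and the band uses a normal approximation (z = 1.96), which is only a stand-in for whatever interval suits your data.

```python
import math
import statistics

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_mean_with_band(x, replicates, z=1.96):
    """Plot the mean of several replicate series with a transparent band.

    replicates: list of series, each with the same length as x.
    z: multiplier for the standard error (1.96 ~ 95% under a normal
       approximation; an assumption, not a recommendation).
    """
    means, lowers, uppers = [], [], []
    for values in zip(*replicates):  # iterate over time points
        m = statistics.mean(values)
        sem = statistics.stdev(values) / math.sqrt(len(values))
        means.append(m)
        lowers.append(m - z * sem)
        uppers.append(m + z * sem)
    fig, ax = plt.subplots(figsize=(4, 3))
    # The band is light and transparent, so it stays visually secondary.
    ax.fill_between(x, lowers, uppers, color="tab:blue", alpha=0.2, linewidth=0)
    # The data (the mean) gets the strongest visual weight.
    ax.plot(x, means, color="tab:blue", linewidth=2)
    return fig, ax, means


# Hypothetical replicate measurements, invented for this sketch only.
x = [0, 1, 2, 3, 4]
replicates = [
    [1.0, 2.1, 2.9, 4.2, 5.1],
    [1.2, 1.9, 3.1, 3.8, 4.9],
    [0.8, 2.0, 3.0, 4.0, 5.0],
]
fig, ax, means = plot_mean_with_band(x, replicates)
```

The only design decisions here are the `alpha` of the band and the line width of the mean; both follow the hierarchy described above, with the data in front and the statistics supporting it.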
Adding information such as a barcode graph can help leverage the specific advantages of a given visualization technique and thereby allow the reader to gain a better understanding of the data. The graph above taken from this blog shows the dependency of cell lines on the gene FOXA1 - those to the left of the minus-1 reference line require the gene to survive. Note that for lower frequencies of cell lines, the barcode graph is advantageous, whereas for high densities, the bell curve excels.
And still, at some point, design and statistical expertise mix.
For instance, imagine we have to decide which variability measure to show. In bar graphs, you typically choose one of the following: standard deviation, standard error of the mean, or confidence intervals.
In the end, you can only choose one - and it matters:
I don’t want to go into too much detail; I think the book “Experimental Design and Data Analysis for Biologists” does an amazing job. In short, I would argue that using the standard deviation is most appropriate in most cases since the standard error of the mean is often inappropriately used and confidence intervals are less common (and not always properly interpreted) although they are a powerful statistical tool.
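To see how much the three measures differ for the same sample, here is a small sketch using only the Python standard library. The function name and the sample values are made up for illustration, and the confidence interval uses a normal approximation; for small samples a t-distribution critical value would be more appropriate and gives a wider interval.

```python
import math
import statistics


def variability_measures(values, confidence=0.95):
    """Return mean, SD, SEM and an approximate confidence interval."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 in the denominator)
    sem = sd / math.sqrt(n)         # SEM shrinks as n grows; SD does not
    # Assumption: normal approximation for the critical value (about 1.96
    # for 95%). For small n, use a t critical value instead.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sem
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci": (mean - half_width, mean + half_width),
    }


# Hypothetical sample, invented for this sketch only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
result = variability_measures(sample)
for name in ("mean", "sd", "sem"):
    print(f"{name}: {result[name]:.2f}")
print(f"ci: {result['ci'][0]:.2f} to {result['ci'][1]:.2f}")
```

For the same data, the SEM error bar is always shorter than the SD error bar by a factor of the square root of n, which is exactly why the choice changes how convincing a bar graph looks.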
Remember, many scientists do not clearly understand the differences, and even fewer take the time to check which one is shown.
Therefore, you have to consider that the length of the error bars might bias your reader, depending on other factors such as the y-axis range.
Once again, notice how reading the graph in the upper right feels much more reliable than the one on the right. Similarly, while visualizing variation as a “smear” is a cool idea, it ultimately makes the graph more difficult to read. Read more here.
But no matter the design, make sure to add all key information in the description, even though this makes it longer.
Leverage Standards
If you deviate from standard formats, readers will notice.
This can be a good thing - psychological research shows that mild surprise and challenge increase attention.
But if you go too far, you risk confusing or even alarming your audience.
Avoid oddly unconventional designs like those used in this paper. First, using arched lines to denote significance is far less common than using straight lines. Second, the black error bars disappear against the black sample. However, aligning axis labels differently can be a good idea if you find them visually distracting. That said, readers’ eyes will have to search for them if they are not in their usual position. In principle, one could also combine both figures into a single figure with shared axes to avoid the problem altogether.
For example, few people include helper lines or display the exact numeric value above bar graphs, although this can be extremely helpful - especially during presentations, where viewers cannot simultaneously read tables.
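A minimal sketch of both ideas, assuming Python with matplotlib 3.4 or newer (for `Axes.bar_label`); the function name, group labels, and values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def labeled_bar_chart(labels, values):
    """Bar chart with faint horizontal helper lines and exact values on top."""
    fig, ax = plt.subplots(figsize=(4, 3))
    bars = ax.bar(labels, values, color="tab:gray")
    # Print the exact numeric value above each bar.
    annotations = ax.bar_label(bars, fmt="%.1f", padding=2)
    # Thin, light helper lines that sit behind the bars.
    ax.yaxis.grid(True, color="0.85", linewidth=0.6)
    ax.set_axisbelow(True)
    ax.set_ylabel("Measured value (a.u.)")
    return fig, ax, annotations


# Hypothetical group means, invented for this sketch only.
fig, ax, annotations = labeled_bar_chart(["A", "B", "C"], [3.2, 4.7, 6.1])
```

The helper lines are deliberately pale and thin so that they guide the eye without competing with the bars.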
What you shouldn’t change are functional design conventions, such as lines for significance testing, the appearance of whiskers or fundamentals:
As outlined in this book, which picks up on several pragmatic design lessons, 3D designs are suboptimal for readers. They may look fancy, but they often complicate analysis.
Of course, differentiate between contexts - presentations vs. posters vs. publications.
While the fundamental principles remain the same, formatting expectations differ. In presentations, horizontal bar graphs may be perfectly reasonable and space-saving; in manuscripts, they are less conventional.
And finally, ensure consistent design across all figures, especially when working with co-authors.
Use the same line thickness, outlines, color palette, and fonts, and keep colors consistent for identical sample types across all figures.
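One way to enforce this in practice is to keep the style in a single shared module that every co-author imports. The sketch below assumes Python with matplotlib; the specific values, the sample-type names, and the helper names are placeholders to adapt, not a recommended standard.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# One shared style for every figure in the manuscript (placeholder values).
SHARED_STYLE = {
    "font.family": "sans-serif",
    "font.size": 9,
    "axes.linewidth": 0.8,
    "lines.linewidth": 1.5,
}

# One fixed color per sample type, reused in every figure (hypothetical names).
SAMPLE_COLORS = {
    "control": "#4d4d4d",
    "treated": "#1f77b4",
}


def apply_shared_style():
    """Apply the shared settings to all figures created afterwards."""
    plt.rcParams.update(SHARED_STYLE)


def color_for(sample_type):
    """Look up the agreed color; an unknown sample type raises KeyError."""
    return SAMPLE_COLORS[sample_type]


apply_shared_style()
fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [1.0, 1.1, 0.9], color=color_for("control"), label="control")
ax.plot([0, 1, 2], [1.0, 1.6, 2.3], color=color_for("treated"), label="treated")
ax.legend(frameon=False)
```

Raising an error for an unknown sample type is intentional here: it forces a new color to be added in one shared place instead of being picked ad hoc in a single figure.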