# Visualizing data with raincloud plots

## Motivation

I love effective data visualization that combines the distribution of the data, individual data points, and summary statistics. I recently discovered a tool that does just this: the R package {raincloudplots}. The ‘rain’ stands for the raw data, and the ‘cloud’ stands for the data distribution. In this post, I showcase my first raincloud plot and show how to recreate it.

## Step 1: Install {raincloudplots}

First, we’ll install the raincloudplots package from GitHub and load it.

```
# Install {remotes} first if it is not already available
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github('jorvlan/raincloudplots')
library(raincloudplots)
```

## Step 2: Simulate some data

In one of my research projects, I’m planning to explore whether crowdsourced data analysis can increase the persuasiveness and credibility of scientific research. In such **multi-analyst** studies, the same dataset is given to different teams of scientists, who independently analyze it to answer the same research question. That is, are the results of multi-analyst (vs. single-analyst) studies more likely to sway someone’s prior beliefs?

## Give me some more background information on the data and research setting

Before reading the results of a scientific study, study participants will report their prior beliefs in a specific phenomenon on a scale of 0% (very unlikely) to 100% (very likely).

```
graph LR
A[Report <br> prior beliefs]
A ==>B[Read <br> single-analyst results]
A ==>C[Read consistent <br> multi-analyst results]
A ==>D[Read inconsistent <br> multi-analyst results]
E[Report <br> final beliefs]
B ==> E
C ==> E
D ==> E
style A fill:#f6d4d1,stroke:#333,stroke-width:4px
style B fill:#f6d4d1,stroke:#333,stroke-width:4px
style C fill:#f6d4d1,stroke:#333,stroke-width:4px
style D fill:#f6d4d1,stroke:#333,stroke-width:4px
style E fill:#f6d4d1,stroke:#333,stroke-width:4px
```

Afterwards, study participants will be randomly allocated to one of three groups: one group will read the results of a conventional, single-analyst study; one group will read the results of a multi-analyst study with consistent results; and one group will read the results of a multi-analyst study with inconsistent results.

After reading the study results, the study participants will again report their belief in the phenomenon on a scale of 0% (very unlikely) to 100% (very likely).
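The random allocation step can be sketched in a few lines of base R. This is only an illustration under my own assumptions: the seed, the condition labels, and the balanced design with 500 participants per group (the group size used in the simulation further down) are placeholders, not part of the actual study protocol.

```
set.seed(42) # hypothetical seed, only so the sketch is reproducible

n_participants <- 1500 # assumption: 3 groups x 500 participants
conditions <- c("single_analyst", "multi_consistent", "multi_inconsistent")

# Build a balanced vector of condition labels and shuffle it across participants
allocation <- data.frame(
  id = seq_len(n_participants),
  condition = sample(rep(conditions, length.out = n_participants))
)

# Each condition should appear equally often
table(allocation$condition)
```

Shuffling a pre-balanced vector guarantees equal group sizes, whereas drawing each label independently with `sample(conditions, n_participants, replace = TRUE)` would only give roughly equal groups.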

Below, I simulate some data that are in line with the hypotheses we’re planning to preregister: we expect that, compared to single-analyst studies, multi-analyst studies with inconsistent, highly variable results will negatively affect prior beliefs (i.e., research consumers will be less likely to believe in the reported phenomenon), while multi-analyst studies with consistent, positive results will positively affect prior beliefs.

## Give me some more information about the simulation

For the **prior beliefs** of all three participant groups, we’ll make 500 random draws per group from a normal distribution with `𝜇 = 65` and `𝜎 = 5`.

For the **final beliefs** of participants reading a *single-analyst* study, we’ll draw from a normal distribution with `𝜇 = 72` and `𝜎 = 6`. For the final beliefs of participants reading a *multi-analyst study with consistent results*, we’ll draw from a normal distribution with `𝜇 = 75` and `𝜎 = 7`.

For the remaining group (the *inconsistent multi-analyst* condition), we’ll draw from a normal distribution with `𝜇 = 55` and `𝜎 = 9`. This reflects our hypothesis that beliefs in a phenomenon will decrease after observing a multi-analyst study with highly variable results.

```
set.seed(3)
prior <- replicate(n = 3, rnorm(n = 500, mean = 65, sd = 5))
final_single <- rnorm(n = 500, mean = 72, sd = 6)
final_multi_consistent <- rnorm(n = 500, mean = 75, sd = 7)
final_multi_inconsistent <- rnorm(n = 500, mean = 55, sd = 9)
```
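Before plotting, it can be reassuring to check that the simulated data behave as intended. The following is a minimal sketch in base R that repeats the simulation and tabulates the means per condition; the object names `final` and `summary_table` are my own and not part of {raincloudplots}.

```
set.seed(3)
prior <- replicate(n = 3, rnorm(n = 500, mean = 65, sd = 5))
final_single <- rnorm(n = 500, mean = 72, sd = 6)
final_multi_consistent <- rnorm(n = 500, mean = 75, sd = 7)
final_multi_inconsistent <- rnorm(n = 500, mean = 55, sd = 9)

# Collect the final beliefs in the same column order as `prior`
final <- cbind(
  single = final_single,
  multi_consistent = final_multi_consistent,
  multi_inconsistent = final_multi_inconsistent
)
colnames(prior) <- colnames(final)

# One row per condition: mean prior, mean final, and the difference between them
summary_table <- data.frame(
  condition = colnames(final),
  prior_mean = colMeans(prior),
  final_mean = colMeans(final),
  final_sd = apply(final, 2, sd),
  row.names = NULL
)
summary_table$mean_change <- summary_table$final_mean - summary_table$prior_mean

print(summary_table, digits = 3)
```

If the simulation matches the hypotheses, `mean_change` should be positive for the single-analyst and consistent multi-analyst conditions and negative for the inconsistent one.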

## Step 3: Initialize the data-format

For this step, you’ll need to choose between several possible designs, e.g., a 1-by-1, 2-by-2, or 2-by-3 (repeated measures) raincloud. In my case, I have 3 different groups (the *single-analyst* condition, the *consistent multi-analyst* condition, and the *inconsistent multi-analyst* condition) with 2 measures each (*prior beliefs* and *final beliefs*), so I’m creating a 2 x 3 raincloud.

The {raincloudplots} function to set up the desired data format for a 2 x 3 raincloud is called `data_2x2()`. (Confusing, I know.) For other options and how to initialize them, check out the {raincloudplots} documentation.

```
# Arrays come in pairs: prior and final beliefs for each of the three groups
df_2x3 <- data_2x2(
  array_1 = prior[, 1],
  array_2 = final_single,
  array_3 = prior[, 2],
  array_4 = final_multi_consistent,
  array_5 = prior[, 3],
  array_6 = final_multi_inconsistent,
  labels = c("Prior Beliefs", "Final Beliefs"),
  jit_distance = .09,
  jit_seed = 321)
```
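If you also want to use the simulated data outside of {raincloudplots} (for example, for a quick summary or a model), a tidy long format with one row per participant and measurement can be handy. The sketch below builds such a table in base R. Note that this is not the structure returned by `data_2x2()`; the column names `id`, `condition`, `time`, and `belief` are my own choices.

```
set.seed(3)
prior <- replicate(n = 3, rnorm(n = 500, mean = 65, sd = 5))
final <- cbind(
  rnorm(n = 500, mean = 72, sd = 6), # single-analyst
  rnorm(n = 500, mean = 75, sd = 7), # multi-analyst, consistent
  rnorm(n = 500, mean = 55, sd = 9)  # multi-analyst, inconsistent
)

conditions <- c("single_analyst", "multi_consistent", "multi_inconsistent")
n <- nrow(prior)

# as.vector() stacks the matrix columns, so the condition labels repeat in blocks of n
long <- data.frame(
  id = rep(seq_len(n * 3), times = 2),
  condition = rep(rep(conditions, each = n), times = 2),
  time = rep(c("prior", "final"), each = n * 3),
  belief = c(as.vector(prior), as.vector(final))
)

head(long)

# Mean belief per condition and time point
aggregate(belief ~ condition + time, data = long, FUN = mean)
```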

## Step 4: Make it rain! 🌧

Finally, we’ll use the `raincloud_2x3_repmes()` function to create our 2x3 raincloud.

```
library(ggplot2) # provides scale_x_continuous(), annotate(), and the theme functions

colors <- rep(c("dodgerblue", "darkorange"), 3) # prior = blue, final = orange

raincloud_2x3_repmes(
  data = df_2x3,
  colors = colors,
  fills = colors,
  size = 1,
  alpha = .6,
  ort = "h") + # set to "v" for a vertical plot
  # hide the default group labels; the annotations below replace them
  scale_x_continuous(
    breaks = c(1, 2, 3),
    limits = c(0.8, 4.3),
    labels = rep("", 3)) +
  ylab("Rated Beliefs") +
  # one label per condition
  annotate(geom = "text",
           label = "Single-Analyst",
           x = 1.5, y = 42) +
  annotate(geom = "text",
           label = "Multi-Analyst: Consistent",
           x = 2.5, y = 38) +
  annotate(geom = "text",
           label = "Multi-Analyst: Inconsistent",
           x = 3.9, y = 38) +
  # color-coded legend: "Prior Beliefs vs. Final Beliefs"
  annotate(geom = "text",
           label = "Prior Beliefs",
           x = 4.2, y = 55, size = 5,
           color = "dodgerblue") +
  annotate(geom = "text",
           label = "vs.",
           x = 4.2, y = 66, size = 5) +
  annotate(geom = "text",
           label = "Final Beliefs",
           x = 4.2, y = 77, size = 5,
           color = "darkorange") +
  theme_classic() +
  theme(axis.ticks.y = element_blank(),
        axis.text = element_text(size = 9),
        axis.title.y = element_blank())
```

And we’re done! For more options and information, feel free to check out the paper cited below. A huge thank you to the {raincloudplots} package developers.❤︎

## Acknowledgements

Allen M, Poggiali D, Whitaker K et al. Raincloud plots: a multi-platform tool for robust data visualization [version 2; peer review: 2 approved]. Wellcome Open Res 2021, 4:63. DOI: 10.12688/wellcomeopenres.15191.2