Welcome! This document serves 2 purposes:
Want to try the code out yourself? You can clone the GitHub repo, or download the simulated data and the into_the_bayesian_multiverse.Rmd file to run and experiment with the code.
A multiverse analysis is an analytical tool for understanding whether study results hinge on decisions made during the analysis process. Briefly, in a multiverse analysis one identifies a set of analysis methods that are all theoretically justified, then conducts all of these analysis “specifications” in parallel. A specification curve can then be used to visualize all of the different analysis specifications at once, as well as for statistical inference.
The specification curve analyses here were particularly inspired by work from Dani Cosme and Amy Orben.
See more background on the statistics here and here.
If you’ve ever seen Spider-Man: Into the Spider-Verse, first of all – it’s a great movie. But more specifically, there are many “parallel universes” with different versions of “spider people” in each one. The “spidey-essence” is the same in all universes, but each spider-person (or pig…) is a little different based on the specifics of that universe. So, we can think of the raw data as the “spidey-essence” in a multiverse analysis, and each model specification is a different “parallel universe” of how the results of the analysis could come out when combined with a certain set of analytical decisions.
I’m not an expert, but “the multiverse” is now part of the canon of the Marvel Cinematic Universe…
Multiverse analyses are particularly useful when there are decisions we have to make during the data preparation and analysis process where we aren’t sure what the “right” answer is. If we can think of different methods for cleaning, preprocessing, excluding, transforming, or modeling the data that are theoretically justifiable (but we don’t have a certain answer for which is ‘best’), multiverse analysis allows us to run all of these analyses at the same time. This allows us to look at 2 main questions:
These analyses have been particularly useful with neuroimaging data, where there are MANY decisions to be made and we don’t have consensus on an “optimal” method. However, multiverse analyses and specification curves are equally useful in any topic of research where there are multiple theoretically justifiable ways an analysis can be done.
Simonsohn et al. (2020) describe 3 steps for specification curve analyses:

1. Define the full set of reasonable, theoretically justified specifications
2. Estimate all specifications and display the results descriptively as a specification curve
3. Conduct joint inferential statistical tests across the whole set of specifications
Because the analyses presented here are considered exploratory, we aren’t going to cover the inferential procedures in step #3. For more info on this step, including applying bootstrapping & permutation testing to spec curves, see this great tutorial from Dani Cosme.
Note: these data are fake! Under our Institutional Review Board protocol, and to keep participant identities private, we cannot share our actual data publicly. However, we’ve simulated an amygdala reactivity dataset for the purposes of creating a multiverse analysis where the code can actually be run! You can find the simulated data here.
library(tidyverse)
library(specr)
library(brms)
library(cowplot)
Here’s what is in each column:

- id - participant ID, identifies a participant across time points
- wave - the study time point (either 1, 2, or 3)
- age - participant age at the given time point, in years
- block - the temporal position of the task run relative to other tasks in the scanner (1 = first, 2 = second, 3 = third)
- motion - head motion (mean framewise displacement), which has been z-scored here
- scanner - whether the data were collected on a first MRI scanner (1 = time points 1 & 2) or a second (2 = time point 3). Both were Siemens Tim Trios
- prev_studied - whether this scan was previously analyzed in similar work by Gee et al. (2013). 1 indicates a scan was previously studied

All of the rest of the columns are measurements of amygdala reactivity to fear faces > baseline for each scan, labeled such that:

- columns with the ho prefix are from amygdala ROIs defined by the Harvard-Oxford subcortical atlas, while native prefix columns are in native space defined through Freesurfer
- right prefix columns are the right amygdala, and left are the left
- beta prefix columns denote raw beta estimates of amygdala reactivity magnitude, while tstat prefix columns denote t-statistic measurements of amygdala reactivity scaled by estimation uncertainty (the standard error)

fake_data = readr::read_csv('simulated_amygdala_reactivity.csv')
## Parsed with column specification:
## cols(
## id = col_double(),
## wave = col_double(),
## age = col_double(),
## block = col_double(),
## motion = col_double(),
## scanner = col_double(),
## prev_studied = col_double(),
## ho_right_amyg_beta = col_double(),
## ho_left_amyg_beta = col_double(),
## ho_right_amyg_tstat = col_double(),
## ho_left_amyg_tstat = col_double(),
## native_right_amyg_beta = col_double(),
## native_left_amyg_beta = col_double(),
## native_right_amyg_tstat = col_double(),
## native_left_amyg_tstat = col_double()
## )
head(fake_data[,1:8])
## # A tibble: 6 x 8
## id wave age block motion scanner prev_studied ho_right_amyg_beta
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 3 16.1 1 0.293 2 NA 0.785
## 2 2 1 15.2 3 0.08 1 1 0.375
## 3 2 2 17.3 1 -1.32 1 NA -0.168
## 4 2 3 19.4 1 0.063 2 NA -0.230
## 5 3 1 22.7 2 -0.986 1 NA 0.00908
## 6 7 1 5.83 1 1.18 1 NA -0.817
Here’s a plot displaying when each (fake) participant came in for visits. The x-axis indicates the age of the participants at each visit, and participants (on the y-axis) are ordered by the age at their first study visit (min_age). Horizontal lines connecting dots indicate the multiple study visits over time for the same participant (although some participants only completed 1 visit).
designPlot = dplyr::mutate(fake_data, wave = as.factor(wave)) %>%
dplyr::group_by(id) %>%
dplyr::mutate(min_age = min(age)) %>%
ggplot(aes(x = age, y = forcats::fct_reorder(as.factor(id), min_age))) +
geom_line(aes(group = id)) +
geom_point(aes(color = wave)) +
theme_bw() + theme(axis.text.y = element_blank(), panel.grid.major = element_blank()) +
labs(y = 'Participants', x = 'Age at visit') +
guides(color = guide_legend(title = 'Study Wave')) +
theme(text = element_text(face = 'bold'),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
designPlot
specr version

specr is a great package with a lot of nice functionality for running specification curve analyses. This wasn’t the software we used for the main analyses (because it hadn’t been released yet), but it’s probably the most straightforward way to go for regression-based specification curve analyses.

First, we’ll need to define what different ‘reasonable’ and theoretically justifiable ways there are for doing this analysis of age-related change in amygdala reactivity to fear faces > baseline. For the sake of this project, it isn’t computationally feasible to try ALL reasonable ways of doing this analysis, so we can think of this approach as attempting to sample from the space of possible analysis specifications.
- We can include covariates for scanner and block in our group-level model, or different subsets of these.
- We can use beta images and tstat images for 2 different amygdala reactivity estimates. The beta is the contrast parameter estimate and represents the point estimate of the regression coefficient; thus, the beta is the magnitude of the estimated relationship between the presence of the task stimuli and the BOLD signal. The tstat represents the estimate scaled by its uncertainty (i.e. by the standard error of the estimate), and so it is a standardized effect size measure of the relationship between the presence of the task stimuli and the BOLD signal.

First, a little data wrangling before modeling.
We create a grp variable to index whether scans were previously studied for age-related changes in amygdala reactivity in Gee et al. (2013). The not_studied group here can be considered a more independent sample from the previous study.

fake_data = dplyr::mutate(fake_data, grp = ifelse(is.na(prev_studied), 'not_studied', 'prev_studied'))
outcomes = names(fake_data)[grepl('amyg', names(fake_data))]
specr has a nice feature of allowing a custom model function. For more detailed info on this, see here.
Here, we set up a multilevel model using the lme4
package’s lmer()
function to take advantage of varying intercepts for each participant (as denoted by the (1|id)
syntax in the model formula). We also include a covariate for motion
in the custom formula itself (rather than later on in the spec curve), because we want all models to include this covariate for head motion during the scan. We also need to load the lme4
and broom.mixed
packages inside the custom model function so that specr
can use them to run the model and parse the results.
lmer_model_with_motion <- function(formula, data,...) {
require(lme4)
require(broom.mixed)
# set up the model base formula (basically specr will past all other model info in here)
formula <- paste(formula, "+ motion + (1|id)")
lme4::lmer(formula, data)
}
run_specs()
How does this run_specs() function work?

- df = fake_data: specifies the data frame to fit the model to
- x = 'age': the ‘x’ variable, or predictor of interest in the model. You can have multiple x variables if you want by specifying a vector with multiple values here
- y = outcomes: here we specify a vector of multiple outcome variables, the 8 different measures of amygdala reactivity
- controls = c('block', 'scanner'): covariates to include in the model
- model = 'lmer_model_with_motion': specifies the custom model. If you don’t need a custom model, there are also defaults, like "lm" to run OLS regression
- subsets = list(grp = 'not_studied'): subsets of the data to run the model on, expressed as a list. Here, we want to run the model specifically on the not_studied group, in addition to the full dataset
- all.comb = TRUE: this will run specifications with all possible combinations of covariates. Otherwise, it will run models with no covariates, each single covariate, and then all together. Note: random slopes are not included for these covariates by default unless we add this to the custom model

Here, 8 outcomes X 4 possible covariate specifications X 2 subsets = 64 total specifications in our multiverse
specs = specr::run_specs(df = fake_data,
x = 'age', y = outcomes,
controls = c('block', 'scanner'),
model = 'lmer_model_with_motion',
subsets = list(grp = 'not_studied'),
all.comb = TRUE)
The output of run_specs()
is a data frame with each row representing one specification from the multiverse analysis we have just run. The x
, y
, model
, controls
and subsets
columns give us information on the setup of each model, then we get a variety of statistical outputs from the fit models. Because we have 64 specifications here, the dataframe is 64 rows.
- estimate: the beta estimate in the regression model for the given x term of interest (or the ‘slope’ for a continuous predictor). Here, these beta estimates represent estimates for age-related change in amygdala reactivity, such that negative estimates indicate age-related decreases in amygdala reactivity
- std.error: the standard error about the beta estimate
- statistic: the t-statistic for the beta estimate
- conf.low: the lower bound of the 95% confidence interval for the beta estimate
- conf.high: the upper bound of the 95% confidence interval for the beta estimate
- The fit_ columns provide information on the goodness-of-fit of the model more generally (i.e. log likelihood, AIC, BIC), as well as the residual degrees of freedom

specs
## # A tibble: 64 x 18
## x y model controls effect group estimate std.error statistic conf.low
## <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 age ho_r… lmer… block fixed <NA> -0.0416 0.0145 -2.87 -0.0700
## 2 age ho_l… lmer… block fixed <NA> -0.0390 0.0146 -2.67 -0.0677
## 3 age ho_r… lmer… block fixed <NA> -0.0423 0.0142 -2.97 -0.0702
## 4 age ho_l… lmer… block fixed <NA> -0.0298 0.0135 -2.21 -0.0562
## 5 age nati… lmer… block fixed <NA> -0.0313 0.0159 -1.97 -0.0624
## 6 age nati… lmer… block fixed <NA> -0.0303 0.0156 -1.94 -0.0609
## 7 age nati… lmer… block fixed <NA> -0.0318 0.0157 -2.03 -0.0624
## 8 age nati… lmer… block fixed <NA> -0.0269 0.0142 -1.89 -0.0548
## 9 age ho_r… lmer… scanner fixed <NA> -0.0405 0.0146 -2.78 -0.0690
## 10 age ho_l… lmer… scanner fixed <NA> -0.0386 0.0147 -2.63 -0.0674
## # … with 54 more rows, and 8 more variables: conf.high <dbl>, fit_sigma <dbl>,
## # fit_logLik <dbl>, fit_AIC <dbl>, fit_BIC <dbl>, fit_REMLcrit <dbl>,
## # fit_df.residual <int>, subsets <chr>
specr::plot_specs(specs)
This type of plot can be overwhelming at first because it shows us a lot of information, but it is really helpful for looking at several different things!

Panel A shows the estimate for the x predictor (plus 95% confidence intervals) for all of our specifications. Specifications are ordered from least to greatest, and colored such that red indicates a significant negative association, gray indicates a non-significant association, and blue (not shown on this plot) indicates a significant positive association. So, here, the majority of the specifications indicate a significant negative estimate (under \(\alpha=0.05\), such that the 95% confidence interval excludes 0), suggesting a negative relationship here between age and amygdala reactivity. We can also see that the point estimates of all the models are negative, indicating that all the models agreed on the direction of age-related change, if not the significance.

In Panel B, a vertical dash (|) indicates that the choice marked on the left was made, while a blank space means it wasn’t made.

- age indicates that the x variable (the term being examined) in all models was age
- In the y section below that, dashes on a given line indicate which y variable for amygdala reactivity was used in a given specification (note that different choices for y are mutually exclusive)
- The model section indicates that lmer_model_with_motion was the model used in all specifications here
- The controls section shows which covariates were used in each specification, although we should remember here that ALL models had a covariate for head motion built into them (so ‘no covariates’ really means no covariates other than head motion)
- subsets indicates which subset of the data the model was run on, either all, or just the not_studied group

Using ho_right_amyg_tstat or ho_right_amyg_beta as the y variable tended to result in the most strongly negative estimates compared to the others (they show up towards the left side), while native_left_amyg_tstat as the y variable tended to result in the most strongly positive estimates compared to the others (they show up towards the right side).
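If we want a quick numeric companion to the plot, we can summarize the curve directly from the specs data frame (a minimal sketch using the columns shown above; the proportions are purely descriptive, and the object name spec_summary is just ours):

spec_summary = specs %>%
  dplyr::summarise(n_specs = dplyr::n(),
                   median_estimate = median(estimate),
                   prop_negative = mean(estimate < 0),
                   prop_sig_negative = mean(conf.high < 0)) # 95% CI entirely below 0
spec_summary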
specr also gives us a nice plot_decisiontree() function for summarizing what our specifications are, and how many of them there are.
specr::plot_decisiontree(specs, legend = TRUE)
specr and brms

For many analyses, we might want to use Bayesian estimation for models in our specification curve, particularly if we are getting convergence issues or singular fits with lmer. In these cases, lmer will refuse to let us fit a fully specified longitudinal model with varying terms (slopes) for age for each participant because of the high number of model parameters relative to data points. However, the brms R package can fit these models with nearly identical syntax to lmer, using Stan for fully Bayesian estimation under the hood (note: the high number of parameters relative to the number of observations is definitely still a problem, but with Bayesian estimation we can at least fit the model and quantify uncertainty for each parameter).
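As a concrete illustration of how similar the syntax is, here is the same varying-slopes model written both ways (a sketch using one simulated outcome column; these two models aren’t part of the analyses below):

# lme4 version (frequentist): may error or warn about a singular fit when the random-effects structure is rich relative to the data
m_lmer = lme4::lmer(ho_right_amyg_beta ~ age + motion + (age | id), data = fake_data)
# brms version (Bayesian, estimated with Stan): nearly identical formula syntax
m_brm = brms::brm(ho_right_amyg_beta ~ age + motion + (age | id), data = fake_data, cores = 4)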
While they have many advantages, the main drawback to Bayesian models is that they take longer to fit and require more RAM than frequentist models do.
brms model with participant-varying age terms

The brm() function from the brms package fits models using Bayesian estimation with mostly the same syntax as lme4 models. Here, we set up a custom model function, making sure to include (age|id) to allow age terms to vary for each participant. Also, inside the call to fit the model, we’ll add cores = 4 to parallelize running of each model across 4 cores. Most laptops have 4-8 cores, so this should work on most computers, and should speed up each model by a factor of about 4.
brms_model_with_motion <- function(formula, data,...) {
require(brms)
require(broom.mixed)
# set up the model base formula (basically specr will paste all other model info in here)
formula <- paste(formula, "+ motion + (age|id)")
brms::brm(formula, data, cores = 4)
}
run_specs()
We’ll run the equivalent specifications to the previous curve, only now with model = 'brms_model_with_motion'
for our custom brms
model
brms_specs = specr::run_specs(df = fake_data,
x = 'age', y = outcomes,
controls = c('block', 'scanner'),
model = 'brms_model_with_motion',
subsets = list(grp = 'not_studied'),
all.comb = TRUE)
The summary here is quite similar to when we used the lme4
model, only now the estimate
is the median estimate for the x
term (age) across posterior samples, and conf.low
and conf.high
represent the bounds of the 95% posterior interval.
brms_specs
## # A tibble: 64 x 15
## x y model controls effect component group estimate std.error conf.low
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 age ho_r… brms… block fixed cond <NA> -0.0386 0.0156 -0.0681
## 2 age ho_l… brms… block fixed cond <NA> -0.0383 0.0154 -0.0677
## 3 age ho_r… brms… block fixed cond <NA> -0.0405 0.0152 -0.0704
## 4 age ho_l… brms… block fixed cond <NA> -0.0287 0.0147 -0.0568
## 5 age nati… brms… block fixed cond <NA> -0.0264 0.0175 -0.0601
## 6 age nati… brms… block fixed cond <NA> -0.0284 0.0165 -0.0601
## 7 age nati… brms… block fixed cond <NA> -0.0294 0.0171 -0.0622
## 8 age nati… brms… block fixed cond <NA> -0.0261 0.0153 -0.0563
## 9 age ho_r… brms… scanner fixed cond <NA> -0.0385 0.0156 -0.0696
## 10 age ho_l… brms… scanner fixed cond <NA> -0.0374 0.0157 -0.0677
## # … with 54 more rows, and 5 more variables: conf.high <dbl>,
## # fit_algorithm <chr>, fit_pss <dbl>, fit_nobs <int>, subsets <chr>
With the Bayesian models including participant-varying age terms, results are largely similar to before. The main difference is that a slightly larger proportion of the 95% posterior intervals overlap with 0 than with the previous models without participant-varying age terms.
specr::plot_specs(brms_specs)
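To put a rough number on the difference described above, we can compare the proportion of specifications whose 95% interval excludes 0 under each model type (a small sketch, not part of the original analyses; the model_family label is just ours):

dplyr::bind_rows(lmer = specs, brms = brms_specs, .id = 'model_family') %>%
  dplyr::group_by(model_family) %>%
  dplyr::summarise(prop_interval_excludes_0 = mean(conf.low > 0 | conf.high < 0))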
brms and lme4 models together

We can also bind together the two output data frames with all the specifications we’ve already fit, then plot a curve across all of them to more directly compare the models. Now, in the models section of Panel B in the plot, we can tell whether the model used for each specification was the lme4 one (lmer_model_with_motion) or the brms one (brms_model_with_motion).
# columns are slightly different depending on the model type, so use rbind.fill() to deal with this
all_specs = plyr::rbind.fill(specs, brms_specs)
specr::plot_specs(all_specs)
specr
is a super useful tool, but when we were first running specification curve analyses, the package hadn’t been released yet. For the analyses in the manuscript, we used mostly functions from the tidyverse, especially the purrr package for dealing with nested data, for estimation of many Bayesian models within multiverse analyses (in fact, specr
is built using purrr
). For more background on extreme tidyverse
usage and purrr
, check out this walkthrough from Monica Thieu.
While the methods we used required writing more lines of code, one very helpful thing that specr doesn’t do automatically is save the fitted model object for each specification. Storing the fitted model objects will allow us to do a few useful things later on, such as extracting fitted model predictions, running posterior predictive checks, and comparing models with approximate leave-one-out cross-validation.
First, we’ll create a grand mean-centered version of our age
variable called age_center
. Now, a 0 for age_center
represents the grand mean age in the dataset, or an age of 12.251. We’ll use this centered age variable for modeling.
Then, we use tidyr::pivot_longer() to reshape the data into long form, such that all amygdala reactivity estimates are in 1 column called amygdala_reactivity
, with a variable called amygdala_measure
indicating which amygdala reactivity measure it is.
fake_data_long = fake_data %>%
dplyr::mutate(age_center = age - mean(age)) %>%
tidyr::pivot_longer(contains('amyg'), names_to = 'amygdala_measure', values_to = 'amygdala_reactivity')
head(fake_data_long)
## # A tibble: 6 x 11
## id wave age block motion scanner prev_studied grp age_center
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 1 3 16.1 1 0.293 2 NA not_studied 3.87
## 2 1 3 16.1 1 0.293 2 NA not_studied 3.87
## 3 1 3 16.1 1 0.293 2 NA not_studied 3.87
## 4 1 3 16.1 1 0.293 2 NA not_studied 3.87
## 5 1 3 16.1 1 0.293 2 NA not_studied 3.87
## 6 1 3 16.1 1 0.293 2 NA not_studied 3.87
## # … with 2 more variables: amygdala_measure <chr>, amygdala_reactivity <dbl>
In this step, we first group the data by amygdala_measure
, then use tidyr::nest() to collapse the data within each amygdala measure into a single cell containing a tibble object (basically a fancy dataframe). Now, our dataset has only 8 rows, with the data
column containing an entire dataset for each amygdala measure in each cell. We also add an index
variable to keep track of the data shape here.
fake_data_nested = fake_data_long %>%
dplyr::group_by(amygdala_measure) %>%
tidyr::nest() %>%
dplyr::ungroup() %>%
dplyr::mutate(., index = 1:nrow(.))
fake_data_nested
## # A tibble: 8 x 3
## amygdala_measure data index
## <chr> <list> <int>
## 1 ho_right_amyg_beta <tibble [178 × 10]> 1
## 2 ho_left_amyg_beta <tibble [178 × 10]> 2
## 3 ho_right_amyg_tstat <tibble [178 × 10]> 3
## 4 ho_left_amyg_tstat <tibble [178 × 10]> 4
## 5 native_right_amyg_beta <tibble [178 × 10]> 5
## 6 native_left_amyg_beta <tibble [178 × 10]> 6
## 7 native_right_amyg_tstat <tibble [178 × 10]> 7
## 8 native_left_amyg_tstat <tibble [178 × 10]> 8
To keep things a little more concise for the sake of this walkthrough, we’ll pare down to a total of 16 specifications for this analysis: 8 amygdala measures (right vs. left, native space vs. Harvard-Oxford, beta vs. tstat) X 2 model types (with participant-varying slopes + intercepts vs. only participant-varying intercepts)
Now, we’ll use the purrr::map() function inside dplyr::mutate(). This is a lot at once, but it allows us to make new columns with brms model objects stored in each cell. We use data as the first argument to the purrr::map() function to tell brms::brm() to fit the model to the nested dataset stored in each cell of the data column.
Since we are making 2 new columns with dplyr::mutate()
, one for each of the two model formulations (model_varying_slopes
, and model_varying_intercepts
), we’ll use tidyr::pivot_longer()
again to put all model objects into one model_object
column, with the model_type
column now indicating whether there are varying slopes or just intercepts. Note: this step takes about 15 min (on a 2019 Macbook Pro) because we are running 16 Bayesian models in a row!
nested_model_frame = fake_data_nested %>%
dplyr::group_by(amygdala_measure) %>%
dplyr::mutate(., model_varying_slopes = purrr::map(data, ~brms::brm(amygdala_reactivity ~ age_center + motion + (age|id),
data = ., cores = 4, chains = 4)),
model_varying_intercepts = purrr::map(data, ~brms::brm(amygdala_reactivity ~ age_center + motion + (1|id),
data = ., cores = 4, chains = 4))) %>%
tidyr::pivot_longer(contains('model'), names_to = 'model_type', values_to = 'model_object')
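Since this is the slow step, one optional addition (not part of the original workflow; the file name is just a placeholder) is to cache the fitted models to disk so they don’t have to be re-fit every time the document is knit:

# save the nested data frame containing all of the fitted brms models
saveRDS(nested_model_frame, 'nested_model_frame.rds')
# later, reload instead of refitting:
# nested_model_frame = readRDS('nested_model_frame.rds')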
broom.mixed::tidy()
Now that we’ve run lots of models, we can use broom.mixed::tidy()
to pull out all the coefficients for each one. This function is just like the broom::tidy() function, except that it allows us to work with multilevel models (lme4
, brms
, or rstanarm
, for example).
So, we’ll make a new dataframe called purrr_specs
where we pull the coefficients for each model (again using purrr::map()
for storing the output in individual cells). Finally, we’ll use tidyr::unnest() to ‘flatten’ the coefficient output for each model back into regular dataframe columns. In other words, we’re taking all of the coefficient information, and making 1 column for each piece of information.
purrr_specs = nested_model_frame %>%
mutate(., coefs = purrr::map(model_object, ~broom.mixed::tidy(.))) %>%
dplyr::select(., -data, -model_object) %>%
tidyr::unnest(coefs)
The output looks very similar to what we previously were working with from specr
EXCEPT that we have multiple rows for each specification here. In fact, we have 1 row for each of the coefficients in each model pulled by broom.mixed::tidy()
(we are pulling the Intercept
, age_center
, motion
fixed effects terms from the model, as well as parameter estimates for some of the variances/covariances of the participant-varying terms).
purrr_specs
## # A tibble: 96 x 11
## # Groups: amygdala_measure [8]
## amygdala_measure index model_type effect component group term estimate
## <chr> <int> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 ho_right_amyg_be… 1 model_varyin… fixed cond <NA> (Inter… 0.597
## 2 ho_right_amyg_be… 1 model_varyin… fixed cond <NA> age_ce… -0.0412
## 3 ho_right_amyg_be… 1 model_varyin… fixed cond <NA> motion -0.298
## 4 ho_right_amyg_be… 1 model_varyin… ran_p… cond id sd__(I… 0.168
## 5 ho_right_amyg_be… 1 model_varyin… ran_p… cond id sd__age 0.0127
## 6 ho_right_amyg_be… 1 model_varyin… ran_p… cond id cor__(… -0.330
## 7 ho_right_amyg_be… 1 model_varyin… ran_p… cond Resi… sd__Ob… 0.741
## 8 ho_right_amyg_be… 1 model_varyin… fixed cond <NA> (Inter… 0.593
## 9 ho_right_amyg_be… 1 model_varyin… fixed cond <NA> age_ce… -0.0419
## 10 ho_right_amyg_be… 1 model_varyin… fixed cond <NA> motion -0.299
## # … with 86 more rows, and 3 more variables: std.error <dbl>, conf.low <dbl>,
## # conf.high <dbl>
If we want to make this look almost like the specr
output, we can filter for the model term we’re looking for. In this case, age_center
is the parameter of primary interest for looking at age-related change in amygdala reactivity for the fear > baseline contrast.
purrr_specs_age = dplyr::filter(purrr_specs, term == 'age_center')
purrr_specs_age
## # A tibble: 16 x 11
## # Groups: amygdala_measure [8]
## amygdala_measure index model_type effect component group term estimate
## <chr> <int> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 ho_right_amyg_beta 1 model_varyin… fixed cond <NA> age_c… -0.0412
## 2 ho_right_amyg_beta 1 model_varyin… fixed cond <NA> age_c… -0.0419
## 3 ho_left_amyg_beta 2 model_varyin… fixed cond <NA> age_c… -0.0417
## 4 ho_left_amyg_beta 2 model_varyin… fixed cond <NA> age_c… -0.0423
## 5 ho_right_amyg_tst… 3 model_varyin… fixed cond <NA> age_c… -0.0422
## 6 ho_right_amyg_tst… 3 model_varyin… fixed cond <NA> age_c… -0.0431
## 7 ho_left_amyg_tstat 4 model_varyin… fixed cond <NA> age_c… -0.0323
## 8 ho_left_amyg_tstat 4 model_varyin… fixed cond <NA> age_c… -0.0321
## 9 native_right_amyg… 5 model_varyin… fixed cond <NA> age_c… -0.0297
## 10 native_right_amyg… 5 model_varyin… fixed cond <NA> age_c… -0.0311
## 11 native_left_amyg_… 6 model_varyin… fixed cond <NA> age_c… -0.0301
## 12 native_left_amyg_… 6 model_varyin… fixed cond <NA> age_c… -0.0311
## 13 native_right_amyg… 7 model_varyin… fixed cond <NA> age_c… -0.0304
## 14 native_right_amyg… 7 model_varyin… fixed cond <NA> age_c… -0.0319
## 15 native_left_amyg_… 8 model_varyin… fixed cond <NA> age_c… -0.0260
## 16 native_left_amyg_… 8 model_varyin… fixed cond <NA> age_c… -0.0275
## # … with 3 more variables: std.error <dbl>, conf.low <dbl>, conf.high <dbl>
Now, we want to make our descriptive specification curve plot! We’ve created a function for this study that is very heavily inspired by Dani Cosme’s spec curve R code.
It’s a chunky function with a lot of code, but overall what it does is this:
- Compile sca_decision_frame, a dataframe with 1 column for each design point coded as a binary decision, such that there is a | to indicate a ‘yes’ and NA to indicate a ‘no’ (this is important for the bottom panel). This dataframe also creates a rank variable to rank specifications by their point estimates, and an overlap_0 variable to indicate the sign of the point estimate, plus whether the 95% posterior interval overlaps 0
- Convert sca_decision_frame to long format, then group decision points into categories for visualization
- Make the top panel from sca_decision_frame using ggplot2, with the rank for each specification on the x-axis and each specification’s point estimate + 95% posterior interval on the y-axis. We also color each specification according to the sign of the point estimate and whether the 95% posterior interval overlaps 0. Also, add a horizontal dashed line and an additional point/error bar in black to indicate the median estimate across specifications
- Make the bottom panel from sca_decision_frame_long using ggplot2 as well, with specification rank again on the x-axis. This time, the y-axis is categorical, with each level representing a different decision point. Then, we use geom_text(label = choice), where choice is the variable coded as either | or NA for each binarized decision point. This will give us a little vertical tick on the plot for each row every time the choice was ‘yes’, and a blank space every time the choice was ‘no’. So, we’ll get 1 row on the y-axis for each binarized choice, then vertical ticks marking if that choice was made or not. For each choice, we’ll also plot a black point + horizontal interval representing the median rank and 25%-75% quantiles of the rank when the choice was ‘yes’

Note: this is a slightly shortened version of the function intended specifically for amygdala reactivity analyses in the current study. It will have to be somewhat modified to work more generally in other contexts.
make_sca_plot = function(specs, fork_list, plot_title, y_label, term_choice){
# compile sca decision frame
sca_decision_frame = specs %>%
ungroup %>%
dplyr::filter(term == term_choice) %>%
dplyr::arrange(estimate) %>%
mutate(., rank = 1:nrow(.),
tstat = ifelse(grepl('tstat', amygdala_measure), '|', NA),
random_slopes = ifelse(grepl('intercepts', model_type), '|', NA),
amyg_right = ifelse(grepl('right', tolower(amygdala_measure)), '|', NA),
amyg_left = ifelse(grepl('left', tolower(amygdala_measure)), '|', NA),
native_space = ifelse(grepl('native', amygdala_measure), '|', NA),
overlap_0 = case_when(
conf.low < 0 & conf.high < 0 ~ 'neg_y',
conf.low < 0 & estimate < 0 & conf.high > 0 ~ 'neg_n',
conf.low < 0 & estimate > 0 & conf.high > 0 ~ 'pos_n',
conf.low > 0 & conf.high > 0 ~ 'pos_y',
))
# median model
median_model_frame = sca_decision_frame %>%
summarise(estimate = median(estimate),
conf.low = median(conf.low), conf.high = median(conf.high),
rank= median(rank))
# order factor levels for overlap_0 variable
sca_decision_frame$overlap_0 = factor(sca_decision_frame$overlap_0,
levels = c("neg_y", "neg_n", "pos_n", "pos_y"))
# convert to long, assign decision types
sca_decision_frame_long = sca_decision_frame %>%
tidyr::pivot_longer(names_to = 'fork', values_to = 'choice', all_of(fork_list)) %>%
mutate(decisiontype = case_when(
grepl('amyg', fork) ~ 'Amygdala\n Roi',
fork == 'native_space' ~ 'Amygdala\n Roi',
fork %in% c('random_slopes', 'model_type') ~ 'Group-Level\nModel',
fork %in% c('tstat') ~ 'Subject-Level\nModel'
))
# get average rank of each amygdala_measure by beta estimate
sca_decision_frame_long_ranks = sca_decision_frame_long %>%
dplyr::filter(choice == '|') %>%
dplyr::group_by(fork) %>%
summarise(mean_rank = -1*mean(rank))
# join ranks with decision frame
sca_decision_frame_long = left_join(sca_decision_frame_long, sca_decision_frame_long_ranks)
# rename variables to be human-interpretable
sca_decision_frame_long$fork =
dplyr::recode(sca_decision_frame_long$fork,
'tstat' = 'use tstats (vs. beta estimates)',
'native_space' = 'native space (vs. mni space)',
'model_type' = 'random intercepts only (vs. random slopes)',
'amyg_right' = 'right amygdala',
'amyg_left' = 'left amygdala')
# reorder forks by mean rank
sca_decision_frame_long$fork_ordered = reorder(sca_decision_frame_long$fork,
sca_decision_frame_long$mean_rank)
# color palette (RColorBrewer 'Set1') to code the following:
# red = negative, distinct from 0
# blue = negative, not distinct from 0
# green = positive, not distinct from 0
# purple = positive, distinct from 0
if('neg_y' %in% sca_decision_frame$overlap_0){
my_colors <- RColorBrewer::brewer.pal(4, "Set1")[1:4]
}else if('neg_n' %in% sca_decision_frame$overlap_0){
my_colors <- RColorBrewer::brewer.pal(4, "Set1")[2:4]
}else if('pos_n' %in% sca_decision_frame$overlap_0){
my_colors <- RColorBrewer::brewer.pal(4, "Set1")[3:4]
}else{
my_colors <- RColorBrewer::brewer.pal(4, "Set1")[4:4]
}
# recode overlap 0 markings for informative legend
sca_decision_frame$overlap_0 = dplyr::recode(sca_decision_frame$overlap_0,
'neg_y' = '-, 95% PI excluding 0',
'neg_n' = '-, 95% PI including 0',
'pos_n' = '+, 95% PI including 0',
'pos_y' = '+, 95% PI excluding 0')
# summary for lower plot (median + IQR)
decision_summary = sca_decision_frame_long %>%
group_by(decisiontype, choice, fork, fork_ordered) %>%
summarise(n = n(), median_rank = median(rank),
lwr_rank = quantile(rank, .25),
upr_rank = quantile(rank, .75)) %>%
dplyr::filter(choice =='|')
# top panel
sca_top =
ggplot(sca_decision_frame, aes(x = rank, y = estimate, color = overlap_0)) +
geom_hline(yintercept = 0, lty = 1, color = 'black') +
geom_hline(yintercept = median(sca_decision_frame$estimate), color = 'black', lty = 2) +
geom_point(alpha = .5) +
geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0, lwd = .15, alpha = .5) +
geom_point(data = median_model_frame,aes(x = rank, y = estimate),
color = 'black') +
geom_errorbar(data = median_model_frame,
aes(x = rank, y = estimate, ymin = conf.low, ymax = conf.high),
color = 'black', width = 0) +
labs(x = '', y = y_label, title = plot_title) +
theme_classic() +
theme(legend.position = 'top', legend.title = element_blank()) +
scale_color_manual(values = my_colors)
# lower panel
sca_bottom =
ggplot(sca_decision_frame_long, aes(x = rank, y = fork_ordered, color = overlap_0)) +
geom_text(aes(label = choice), alpha = .4) +
facet_grid(rows = vars(decisiontype), drop = TRUE,
scales = 'free_y', space = 'free_y') +
geom_point(data = decision_summary, aes(x = median_rank, y = fork_ordered),
color = 'black') +
geom_errorbarh(data = decision_summary,
aes(x = median_rank, y = fork_ordered, xmin = lwr_rank, xmax = upr_rank),
color = 'black', width = 0) +
labs(x = "Analysis specifications ranked by beta estimates", y = "Decision Points") +
theme_bw() +
theme(legend.title = element_text(size = 10),
legend.text = element_text(size = 8),
axis.text = element_text(color = "black", size = 8),
legend.position = "none",
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
strip.text.y = element_text(size = 8)) +
scale_color_manual(values = my_colors)
# put together using cowplot
sca_panel = cowplot::plot_grid(sca_top, sca_bottom, ncol = 1, align = "v", axis = 'lr',
labels = c('A', 'B'))
# return a list including formatted data, as well as plots
return(list('sca_decision_frame' = sca_decision_frame,
'sca_decision_frame_long' = sca_decision_frame_long,
'sca_top' = sca_top,
'sca_bottom' = sca_bottom,
'sca_panel' = sca_panel))
}
brms spec curve

First, we run make_sca_plot() to generate the plot object.
age_center_spec_curve = make_sca_plot(specs = purrr_specs,
fork_list = c('tstat', 'random_slopes', 'amyg_right', 'amyg_left', 'native_space'),
y_label = 'Age-related change in amyg reactivity\nFear > Neutral Faces',
term_choice = 'age_center',
plot_title = 'Custom plot for simulated data spec curve')
Then display the sca_panel
. Although this is a slightly different set of specifications to the previous curves, we can see that the results are pretty similar – most specifications show negative age-related change estimates.
age_center_spec_curve$sca_panel
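Because make_sca_plot() also returns the underlying sca_decision_frame, we can pull simple summaries straight from it, for example the median age-related change estimate and how many of the 16 posterior intervals exclude 0 (a small sketch; the column names follow the function output above):

age_center_spec_curve$sca_decision_frame %>%
  dplyr::summarise(median_estimate = median(estimate),
                   n_excluding_0 = sum(conf.low > 0 | conf.high < 0))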
Here, we can also plot a specification curve for the regression intercept term. In this case, the intercept represents the average estimated amygdala reactivity to fear faces > baseline for a participant of age_center=0
(age ~12.3 years) and motion=0
(average head motion amounts). We can do this by setting term_choice = '(Intercept)'
and re-running make_sca_plot()
intercept_spec = make_sca_plot(specs = purrr_specs,
fork_list = c('tstat', 'random_slopes', 'amyg_right', 'amyg_left', 'native_space'),
y_label = 'Estimated Amygdala Reactivity at Age 12.3\nFear > Neutral Faces',
term_choice = '(Intercept)',
plot_title = 'Custom plot for the spec curve for the model *intercepts*')
This plot shows that all specifications estimate positive average amygdala reactivity for a participant at age 12.3 and mean head motion levels, with 95% posterior intervals excluding 0.
intercept_spec$sca_panel
brms multiverse

Because we’ve saved the fitted model objects into a nested data frame for each specification in our custom multiverse, we can also extract fitted predictions for each model. Here, we’ll do this to visualize each model’s predictions for average amygdala reactivity (for fear faces > baseline) as a function of age.
First, we’ll define a grid of predictor values for which to make predictions. Here, we’ll set motion = 0 (average head motion across the dataset), and age_center = -8:10 to be able to generate a prediction for a range of ages.
pred_grid = expand.grid(age_center = -8:10, motion = 0) %>%
dplyr::mutate(age = age_center + mean(fake_data$age))
head(pred_grid)
## age_center motion age
## 1 -8 0 4.251169
## 2 -7 0 5.251169
## 3 -6 0 6.251169
## 4 -5 0 7.251169
## 5 -4 0 8.251169
## 6 -3 0 9.251169
Now, we can use purrr::map()
again with stats::fitted() to generate fitted values (estimates of the linear predictor) for each specification.
- We use newdata = pred_grid to specify which values of the predictors to generate predictions for, and re_formula = NA to indicate not to worry about participant-varying random effects here
- As with broom.mixed::tidy(), the predictions for each specification are nested into cells of the model_preds column. We use tidyr::unnest(model_preds) to flatten them back out and get a data frame (in long form) of predictions for each model

purrr_specs_preds = nested_model_frame %>%
dplyr::mutate(model_preds = purrr::map(model_object, ~stats::fitted(., newdata = pred_grid, re_formula = NA) %>%
cbind(pred_grid, .))) %>%
dplyr::select(-data, -model_object) %>%
tidyr::unnest(model_preds)
For each specification, and for each level of age_center
, we get an Estimate
(the median posterior estimate), as well as the upper/lower bounds of the 95% posterior interval.
head(purrr_specs_preds)
## # A tibble: 6 x 10
## # Groups: amygdala_measure [1]
## amygdala_measure index model_type age_center motion age Estimate Est.Error
## <chr> <int> <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 ho_right_amyg_be… 1 model_vary… -8 0 4.25 0.926 0.123
## 2 ho_right_amyg_be… 1 model_vary… -7 0 5.25 0.885 0.111
## 3 ho_right_amyg_be… 1 model_vary… -6 0 6.25 0.844 0.0987
## 4 ho_right_amyg_be… 1 model_vary… -5 0 7.25 0.803 0.0875
## 5 ho_right_amyg_be… 1 model_vary… -4 0 8.25 0.762 0.0774
## 6 ho_right_amyg_be… 1 model_vary… -3 0 9.25 0.720 0.0688
## # … with 2 more variables: Q2.5 <dbl>, Q97.5 <dbl>
Now that we’ve extracted the fitted predictions for each model, we can plot the group-average ‘regression line’ for each one.
ggplot(data = purrr_specs_preds, aes(x = age, y = Estimate,
group = interaction(amygdala_measure, model_type),
color = model_type)) +
geom_hline(yintercept = 0, lty = 2) +
geom_line() +
theme_bw() +
labs(y = 'Estimated Amygdala Reactivity\nFear > Baseline')
We can also plot each specification here as a separate panel, and include the 95% posterior interval for the fitted model predictions at each level of age
. This is just about possible with 16 specifications, but many more than that and we probably won’t have room on the screen to view them all.
ggplot(data = purrr_specs_preds, aes(x = age, y = Estimate)) +
geom_hline(yintercept = 0, lty = 2) +
geom_line() +
geom_ribbon(aes(ymin = `Q2.5`, ymax = `Q97.5`), alpha = .3) +
theme_bw() +
labs(y = 'Estimated Amygdala Reactivity\nFear > Baseline',
title = 'Panel plot with fitted model predictions for each specification') +
facet_grid(rows = vars(model_type), cols = vars(amygdala_measure)) +
theme(strip.text = element_text(size = 6))
As a check of a model’s fit, it is often helpful to look at the fitted predictions overplotted on top of the raw data points. We can do that by filtering our predictions to just look at one single specification (amygdala_measure == 'ho_right_amyg_beta'
, model_type == 'model_varying_slopes'
), then plotting with ggplot()
with age on the x-axis and estimated amygdala-reactivity on the y-axis, along with the raw data used by that specification.
one_spec_preds = dplyr::filter(purrr_specs_preds, amygdala_measure == 'ho_right_amyg_beta', model_type == 'model_varying_slopes')
ggplot(fake_data, aes(x = age, y = ho_right_amyg_beta)) +
geom_hline(yintercept = 0, lty = 2) +
geom_point() +
geom_line(aes(group = id), alpha = .1) +
geom_line(data = one_spec_preds, aes(x = age, y = Estimate)) +
geom_ribbon(data = one_spec_preds, aes(x = age, y = Estimate, ymin = `Q2.5`, ymax = `Q97.5`), alpha = .3) +
theme_bw() +
labs(y = 'Estimated Amygdala Reactivity\nFear > Baseline',
title = 'Model predictions & raw data for one specification',
subtitle = 'Harvard-Oxford right amygdala betas, brms model with varying slopes')
This is a nice check of our model fits! However, the more specifications we have, the more difficult it is to look at so many plots for individual specifications like this. To that end, for this study we have made an interactive multiverse explorer Shiny app for browsing model fits across specifications.
In addition to looking at the coefficients and predictions from our different specifications, we may also want to perform posterior predictive checks for each one.
What is a posterior predictive check?

- We draw a “simulated dataset” from the model’s posterior distribution, then compare the overall distribution to the real observed data (see Gabry et al., 2019).
- Basically, a posterior predictive check is helpful in answering the question “is the model giving us a distribution of predictions that matches the real data?”. If the distributions match, this doesn’t mean the model is good per se (the individual predictions could still be very bad), but if the distributions don’t match, then we know that the model is definitely not doing a good job at describing the data generation process.
To get these graphical summaries of the posterior predictive check, we use the pp_check() function for brms
models (I know…). Here, we’ll first combine this with purrr::map()
to draw 10 samples from the posterior predictive distribution for all observations in the real dataset.
pp_check_grid = nested_model_frame %>%
dplyr::mutate(chex = purrr::map(model_object, ~brms::pp_check(., nsamples = 10)))
# Make a new column to label all specifications
nested_model_frame = mutate(nested_model_frame,
spec = paste0(amygdala_measure, ', ', model_type))
Now, we can use cowplot::plot_grid()
again to make a panel of all 16 graphical posterior predictive checks, one for each model in our specification curve. We can use the labels
argument to label each plot for each specification. Here, \(y\) represents the distribution of actual observations, and each \(y_{rep}\) distribution is one set of the posterior draws.
cowplot::plot_grid(plotlist = pp_check_grid$chex,
labels = as.vector(nested_model_frame$spec),
label_size = 4, label_x = 0, hjust = 0)
Again, this is a more descriptive analysis – it allows us to visually compare which specifications result in a model posterior distribution that most closely matches the observed data.
We might often want to get a specification curve for the relative model ‘goodness-of-fit’, or an approximation of predictive performance, for all specifications. We can use brms::loo() to perform approximate leave-one-out (LOO) cross-validation for each model. So, here, we’ll use purrr::map() again with brms::loo() to get these estimates.
loo_cv_grid = nested_model_frame %>%
dplyr::mutate(loo_cv = purrr::map(model_object, ~brms::loo(.)$estimates))
Now, we can unnest our ELPD estimates as well as their standard errors, then plot them for each specification.
# unnest and wrangle elpd estimates into the data frame
loo_unnest = unnest(loo_cv_grid, loo_cv)
loo_cv_grid$elpd_loo = as.numeric(loo_unnest$loo_cv[,1][row.names(loo_unnest$loo_cv)=='elpd_loo'])
loo_cv_grid$elpd_loo_se = as.numeric(loo_unnest$loo_cv[,2][row.names(loo_unnest$loo_cv)=='elpd_loo'])
The raw value of ELPD by itself doesn’t mean much intrinsically, but since higher (or less negative) ELPD is better, we can use this to compare expected model predictive performance. So, we can see here that model predictive performance is mostly impacted here by the amygdala measure used, although for most measures, the models with varying intercepts have slightly higher (better) ELPD than models with varying slopes.
ggplot(loo_cv_grid, aes(x = fct_reorder(spec, elpd_loo), y = elpd_loo)) +
geom_point() +
geom_errorbar(aes(ymin = elpd_loo - elpd_loo_se, ymax = elpd_loo + elpd_loo_se)) +
coord_flip() +
theme_bw() +
labs(x = 'Specification', y = 'elpd_loo')
While the values of elpd_loo
don’t tell us so much intrinsically, the relative ranking of the values tells us which models (i.e. the ones with larger elpd_loo
, towards the right of the plot) have better expected predictive performance.
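If we want a more formal head-to-head comparison for a single amygdala measure, we can also pass loo objects for its two models to loo_compare() (a sketch, not part of the original analyses; this re-runs brms::loo() because only the estimates were stored above):

# compare the varying-slopes vs. varying-intercepts models for one amygdala measure
ho_right_models = dplyr::filter(nested_model_frame, amygdala_measure == 'ho_right_amyg_beta')
loo_slopes = brms::loo(ho_right_models$model_object[ho_right_models$model_type == 'model_varying_slopes'][[1]])
loo_intercepts = brms::loo(ho_right_models$model_object[ho_right_models$model_type == 'model_varying_intercepts'][[1]])
loo::loo_compare(loo_slopes, loo_intercepts)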
Though we can’t share the real data used in the current study, hopefully these walkthroughs of multiverse analyses using publicly available simulated data have been helpful! Overall, we find that specification curves can be super useful as a way to examine the robustness of results when we’re faced with many analysis decision points. For anyone who is interested, good luck traveling the multiverse!
Questions? Feel free to reach out to paul.bloom@columbia.edu
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] cowplot_1.0.0 brms_2.12.0 Rcpp_1.0.7 specr_0.2.2
## [5] forcats_0.4.0 stringr_1.4.0 dplyr_1.0.3 purrr_0.3.3
## [9] readr_1.3.1 tidyr_1.0.2 tibble_3.1.2 ggplot2_3.3.3.9000
## [13] tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] readxl_1.3.1 backports_1.1.5 plyr_1.8.5
## [4] igraph_1.2.4.2 splines_3.6.2 crosstalk_1.0.0
## [7] rstantools_2.0.0 inline_0.3.15 digest_0.6.27
## [10] htmltools_0.4.0 viridis_0.5.1 rsconnect_0.8.16
## [13] fansi_0.5.0 magrittr_2.0.1 graphlayouts_0.7.1
## [16] modelr_0.1.5 RcppParallel_4.4.4 matrixStats_0.55.0
## [19] xts_0.12-0 prettyunits_1.1.1 colorspace_2.0-1
## [22] rvest_0.3.5 ggrepel_0.8.1 haven_2.2.0
## [25] xfun_0.19 callr_3.4.2 crayon_1.4.1
## [28] jsonlite_1.6.1 lme4_1.1-21 zoo_1.8-7
## [31] glue_1.4.2 polyclip_1.10-0 gtable_0.3.0
## [34] V8_3.0.1 pkgbuild_1.0.6 rstan_2.21.1
## [37] abind_1.4-5 scales_1.1.1 mvtnorm_1.1-0
## [40] DBI_1.1.0 miniUI_0.1.1.1 viridisLite_0.4.0
## [43] xtable_1.8-4 stats4_3.6.2 StanHeaders_2.21.0-1
## [46] DT_0.11 htmlwidgets_1.5.1 httr_1.4.1
## [49] threejs_0.3.3 RColorBrewer_1.1-2 ellipsis_0.3.2
## [52] pkgconfig_2.0.3 loo_2.2.0 farver_2.1.0
## [55] dbplyr_1.4.2 utf8_1.2.1 tidyselect_1.1.0
## [58] labeling_0.4.2 rlang_0.4.11 reshape2_1.4.3
## [61] later_1.0.0 munsell_0.5.0 cellranger_1.1.0
## [64] tools_3.6.2 cli_2.5.0 generics_0.0.2
## [67] broom_0.7.4 ggridges_0.5.2 evaluate_0.14
## [70] fastmap_1.0.1 yaml_2.2.1 processx_3.4.2
## [73] knitr_1.30 fs_1.3.1 tidygraph_1.2.0
## [76] ggraph_2.0.4 nlme_3.1-142 mime_0.9
## [79] xml2_1.2.2 compiler_3.6.2 bayesplot_1.7.1
## [82] shinythemes_1.2.0 rstudioapi_0.10 curl_4.3
## [85] reprex_0.3.0 tweenr_1.0.1 stringi_1.4.6
## [88] ps_1.3.2 Brobdingnag_1.2-6 lattice_0.20-38
## [91] Matrix_1.2-18 nloptr_1.2.1 markdown_1.1
## [94] shinyjs_1.1 vctrs_0.3.8 pillar_1.6.1
## [97] lifecycle_1.0.0 bridgesampling_1.0-0 httpuv_1.5.2
## [100] R6_2.5.0 promises_1.1.0 gridExtra_2.3
## [103] codetools_0.2-16 boot_1.3-23 colourpicker_1.0
## [106] MASS_7.3-51.4 gtools_3.8.1 assertthat_0.2.1
## [109] withr_2.4.2 shinystan_2.5.0 parallel_3.6.2
## [112] hms_0.5.3 grid_3.6.2 coda_0.19-3
## [115] minqa_1.2.4 rmarkdown_2.1 ggforce_0.3.1
## [118] shiny_1.4.0 lubridate_1.7.4 base64enc_0.1-3
## [121] dygraphs_1.1.1.6