#Overview In this demo, I will show how to model intensive longitudinal data of distinguishable dyads using a Bayesian approach.

For those less familiar with dyadic analysis, it is an extremely interesting and important type of analysis that is useful for cases where we are interested in examining multiple outcomes with meaningful grouping. A classic example of this is the case of married couples, where X and Y variables are measured repeatedly for each partner in a couple. As may seem obvious, when working with individuals who are grouped in dyads, we cannot treat the individuals in the dyad as being separate individuals. They are nonindependent and by virtue of their relationship, they are going to show associations in their data that would not be expected for two randomly paired individuals.

The “distinguishable” part of distinguishable dyads indicates that the individuals within a dyad have some differentiating feature that they carry with them for the duration of the study. Returning to the earlier example, heterosexual married couples are an example of a distinguishable dyad: There is a wife and a husband, and these roles are fixed throughout the study. Parent and child would be another example, or supervisor and employee.

The analysis of dyadic intensive longitudinal data, such as those collected from daily diary studies or laboratory studies with intensively sampled measurements, is discussed in the book Intensive Longitudinal Methods: An Introduction to Diary and Experience Sampling Research by Niall Bolger and J-P Laurenceau (2013). The purpose of this demo is to show how their example analysis from Chapter 8 can be implemented in a Bayesian framework using the brms package in R.

brms offers some key advantages to this kind of analysis. First, as discussed in Chapter 8, most multilevel software programs can only handle a single outcome. But, dyadic data are multivariate: There is an outcome measure for each partner. Typically, this issue can be circumvented by stacking one’s dataset and using dummy coded variables to estimate effects separately for each dyad member within one model. brms can handle true multivariate models, however, which bypasses the need for stacking the data, thereby saving some time at the data processing stage. Second, currently available R packages for multilevel modeling using maximum likelihood estimation do not presently allow one to estimate all features of a dyadic multilevel model. This comes in in the error structures. As discussed in Chapter 8, it is advisable to use what is known as an UN@AR(1) error structure, which combines both an autocorrelation part and a cross-partner correlation part on the residuals. The nlme package can estimate an AR(1) error structure OR correlated residuals between dyad members, but not both. The lme4 package does not allow not allow the user to specify error structures. brms, however, can do both. And as we will see in a moment, it is also advantageous because it can generate separate AR(1) estimates for each partner; to my knowledge, this is not currently supported in nlme.

Now that we have some background, on to the analysis!


Dyads Chapter (Ch. 8) Example from Bolger & Laurenceau (2013)

We will draw on the example from Chapter 8 of Bolger & Laurenceau’s wonderful book.


#Load Packages

library(dplyr)
library(data.table)
library(brms)


Read in Data

We can pull in the data directly from the book’s website.

c <- as.data.frame(
  fread("curl http://www.intensivelongitudinal.com/ch8/ch8R.zip | tar -xf- --to-stdout *dyads.csv")
)

head(c)
##   coupleid personid time     time7c gender female male reldis wrkstrs
## 1        1        1    0 -1.5000000      1      1    0   3.03       3
## 2        1        1    1 -1.3571429      1      1    0   4.62       3
## 3        1        1    2 -1.2142857      1      1    0   2.85       3
## 4        1        1    3 -1.0714286      1      1    0   6.40       4
## 5        1        1    4 -0.9285714      1      1    0   2.54       1
## 6        1        1    5 -0.7857143      1      1    0   5.16       2
##     wrkstrsc  wrkstrscb  wrkstrscw
## 1  0.0095238 -0.3238095  0.3333333
## 2  0.0095238 -0.3238095  0.3333333
## 3  0.0095238 -0.3238095  0.3333333
## 4  1.0095238 -0.3238095  1.3333333
## 5 -1.9904762 -0.3238095 -1.6666667
## 6 -0.9904762 -0.3238095 -0.6666667


Reformat Data

As discussed above, brms can estimate “true” multivariate models, so there is no need to stack the dataset and use dummies for female and male, as with nlme and lme4. But, we still need to do some pre-processing of the data for use with brms: The data needs to be arranged such that there is one row per timepoint per dyad, with separate columns for male and female variables


##A. Subset out female partner observations and rename columns with prefix “f_” First, we will subset out the data for the female partner. We will then relabel the columns, adding the prefix “f_” to most variable names to indicate that they refer to the female partner.

ch8_f <- subset(c, female == 1)
colnames(ch8_f) <- c("coupleid", "f_personid", "time", 
                     "time7c", "gender", "female", "male", "f_reldis",
                     "f_wrkstrs", "f_wrkstrsc", "f_wrkstrscb", "f_wrkstrscw")
head(ch8_f)
##   coupleid f_personid time     time7c gender female male f_reldis f_wrkstrs
## 1        1          1    0 -1.5000000      1      1    0     3.03         3
## 2        1          1    1 -1.3571429      1      1    0     4.62         3
## 3        1          1    2 -1.2142857      1      1    0     2.85         3
## 4        1          1    3 -1.0714286      1      1    0     6.40         4
## 5        1          1    4 -0.9285714      1      1    0     2.54         1
## 6        1          1    5 -0.7857143      1      1    0     5.16         2
##   f_wrkstrsc f_wrkstrscb f_wrkstrscw
## 1  0.0095238  -0.3238095   0.3333333
## 2  0.0095238  -0.3238095   0.3333333
## 3  0.0095238  -0.3238095   0.3333333
## 4  1.0095238  -0.3238095   1.3333333
## 5 -1.9904762  -0.3238095  -1.6666667
## 6 -0.9904762  -0.3238095  -0.6666667


##B. subset out male partner observations and rename columns with prefix “m_”, also remove duplicated columns that we don’t need We will do something similar for the male partner’s data. Note that we have also used the pipe (%>%) from the dplyr package to also indicate that we want to drop the variables time7c, gender, male, and female from our dataset. The reason for dropping these variables is that we are going to bind the columns from the female partners’ data and the columns from the male partners’ data together in a next step. This would lead to duplicated columns that we don’t actually need for our analysis.

ch8_m <- subset(c, male == 1) %>% dplyr::select(-time7c, -gender, -male, -female)
colnames(ch8_m) <- c("coupleid", "m_personid", "time", 
                     "m_reldis", "m_wrkstrs", "m_wrkstrsc", "m_wrkstrscb", "m_wrkstrscw")
head(ch8_m)
##    coupleid m_personid time m_reldis m_wrkstrs m_wrkstrsc m_wrkstrscb
## 22        1          2    0     4.46         3  0.0095238  -0.1333333
## 23        1          2    1     4.88         3  0.0095238  -0.1333333
## 24        1          2    2     4.58         3  0.0095238  -0.1333333
## 25        1          2    3     4.49         1 -1.9904762  -0.1333333
## 26        1          2    4     5.04         3  0.0095238  -0.1333333
## 27        1          2    5     4.87         3  0.0095238  -0.1333333
##    m_wrkstrscw
## 22   0.1428571
## 23   0.1428571
## 24   0.1428571
## 25  -1.8571429
## 26   0.1428571
## 27   0.1428571


##C. Merge dataframes with female data and male data. Next, we will merge the two partial datasets together–the data for female partners and for the male partners. We will merge the data by coupleid, which is shared by couple members, and by time, to preserve information about the time dependency.

ch8 <- merge(ch8_f, ch8_m, by = c("coupleid", "time"))
head(ch8)
##   coupleid time f_personid      time7c gender female male f_reldis f_wrkstrs
## 1        1    0          1 -1.50000000      1      1    0     3.03         3
## 2        1    1          1 -1.35714286      1      1    0     4.62         3
## 3        1   10          1 -0.07142857      1      1    0     5.70         2
## 4        1   11          1  0.07142857      1      1    0     5.89         1
## 5        1   12          1  0.21428571      1      1    0     4.90         3
## 6        1   13          1  0.35714286      1      1    0     4.86         4
##   f_wrkstrsc f_wrkstrscb f_wrkstrscw m_personid m_reldis m_wrkstrs m_wrkstrsc
## 1  0.0095238  -0.3238095   0.3333333          2     4.46         3  0.0095238
## 2  0.0095238  -0.3238095   0.3333333          2     4.88         3  0.0095238
## 3 -0.9904762  -0.3238095  -0.6666667          2     4.36         2 -0.9904762
## 4 -1.9904762  -0.3238095  -1.6666667          2     4.91         3  0.0095238
## 5  0.0095238  -0.3238095   0.3333333          2     4.91         3  0.0095238
## 6  1.0095238  -0.3238095   1.3333333          2     3.72         4  1.0095238
##   m_wrkstrscb m_wrkstrscw
## 1  -0.1333333   0.1428571
## 2  -0.1333333   0.1428571
## 3  -0.1333333  -0.8571429
## 4  -0.1333333   0.1428571
## 5  -0.1333333   0.1428571
## 6  -0.1333333   1.1428571


Fit Model

Now that are our data are prepped and formatted correctly, we are now ready for the analysis.


##D. Set up brms formulas The first step is to set up the brms formulas, with the bf() function. This will specify the models that we want to fit. As you can see below, we will set up two brms formulas: one for female partner and one for male partner.

For each partner, we are modeling relationship dissatisfaction as a function of an intercept term, time (centered), within-person centered work stressors, and between-person centered work stressors as fixed effects. We also model random intercepts and random slopes of time and work stressors for each subject. Note that the latter variables were already centered accordingly when we got the data from the book’s websites.

A difference in syntax compared to most brms models (and other multilevel packages) is the |p| term. |p| indicates that we want to model correlations of random effects (the choice of the letter p is arbitrary). To get cross-partner correlations, the letter inside || should be the same in both formulas. For our purposes, those cross-partner correlations are potentially very interesting and important (e.g., do female partners who show stronger effects of work stressors on relationship dissatisfaction have male partners who also show stronger effects of work stressors on relationship dissatisfaction?).

As a finale note, I have modeled the (fixed) intercept with the code O + intercept rather than writing in 1 or nothing at all (the default). This won’t make a difference for our purposes, but this method of modeling the intercept can be useful in cases where we want to set common priors for intercepts and slopes. We are not setting priors here (we use the default priors, which range from -\(\infty\) to +\(\infty\)), so it won’t make a difference in terms of the estimates. But just mentioning it here to clarify.

f_model <- bf(f_reldis ~ 0 + intercept + time7c + f_wrkstrscw + f_wrkstrscb + (time7c + f_wrkstrscw |p| coupleid))

m_model <- bf(m_reldis ~ 0 + intercept + time7c + m_wrkstrscw + + m_wrkstrscb + (time7c + m_wrkstrscw |p| coupleid))


##E. Plug in the two bfs into a brm model. Now that we have specified our brms formulas, we can put them into a model usign the brm() function.

By default, multivariate models in brms model correlated residuals, which is useful since this is a dyadic analysis and it is entirely plausible that there will be other variables that explain the relationship between dyad members’ outcomes that we didn’t not explicitly model.

We can also specify an AR(1) error structure, given by the autocor = cor_ar(~ time7c | coupleid, p = 1). The p = 1 part indicates the lag. For our purposes, this part is unnecessary to specify, as cor_ar defaults to AR(1) already.

As a note, the model sometimes produces warnings about adding more iterations. For a paper, I would probably rerun the analysis and increase the number of iterations. But, this is a demo and the results are extremely similar to those reported in the Bolger & Laurenceau book, so it is not essential for our purposes.

fit <- brm(f_model + m_model, data = ch8, 
           seed = 111, # set a seed to ensure reproducibility
           iter = 8000, # number of iterations
           autocor = cor_ar(~ time7c | coupleid, p = 1))


Be advised that this model might take awhile (several hours) to run! You can save your model object so you don’t have to run the model from scratch each time you start a new R session. It will automatically save the model object .rds file to your working directory unless you set another location as part of the filename.

saveRDS(fit, "my_brms_model.rds")

# To load the model object in a fresh R session:
fit <- readRDS("my_brms_model.rds")


Onto the results!

fit
##  Family: MV(gaussian, gaussian) 
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity 
## Formula: f_reldis ~ time7c + f_wrkstrscw + f_wrkstrscb + (time7c + f_wrkstrscw | p | coupleid) 
##          autocor ~ arma(time = time7c, gr = coupleid, p = 1, q = 0, cov = FALSE)
##          m_reldis ~ time7c + m_wrkstrscw + +m_wrkstrscb + (time7c + m_wrkstrscw | p | coupleid) 
##          autocor ~ arma(time = time7c, gr = coupleid, p = 1, q = 0, cov = FALSE)
##    Data: ch8 (Number of observations: 2100) 
##   Draws: 4 chains, each with iter = 8000; warmup = 4000; thin = 1;
##          total post-warmup draws = 16000
## 
## Correlation Structures:
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## ar_freldis[1]    -0.00      0.02    -0.05     0.05 1.00    13673    11705
## ar_mreldis[1]     0.02      0.02    -0.03     0.06 1.00    15772    11841
## 
## Group-Level Effects: 
## ~coupleid (Number of levels: 100) 
##                                              Estimate Est.Error l-95% CI
## sd(freldis_Intercept)                            0.98      0.08     0.85
## sd(freldis_time7c)                               0.07      0.05     0.00
## sd(freldis_f_wrkstrscw)                          0.12      0.04     0.04
## sd(mreldis_Intercept)                            1.04      0.08     0.90
## sd(mreldis_time7c)                               0.04      0.03     0.00
## sd(mreldis_m_wrkstrscw)                          0.17      0.03     0.11
## cor(freldis_Intercept,freldis_time7c)           -0.05      0.31    -0.64
## cor(freldis_Intercept,freldis_f_wrkstrscw)       0.39      0.19     0.00
## cor(freldis_time7c,freldis_f_wrkstrscw)         -0.18      0.35    -0.77
## cor(freldis_Intercept,mreldis_Intercept)         0.25      0.10     0.05
## cor(freldis_time7c,mreldis_Intercept)            0.13      0.30    -0.49
## cor(freldis_f_wrkstrscw,mreldis_Intercept)      -0.00      0.20    -0.40
## cor(freldis_Intercept,mreldis_time7c)           -0.04      0.33    -0.68
## cor(freldis_time7c,mreldis_time7c)              -0.07      0.38    -0.75
## cor(freldis_f_wrkstrscw,mreldis_time7c)          0.10      0.37    -0.63
## cor(mreldis_Intercept,mreldis_time7c)           -0.04      0.33    -0.66
## cor(freldis_Intercept,mreldis_m_wrkstrscw)      -0.03      0.15    -0.32
## cor(freldis_time7c,mreldis_m_wrkstrscw)          0.11      0.33    -0.60
## cor(freldis_f_wrkstrscw,mreldis_m_wrkstrscw)     0.30      0.25    -0.24
## cor(mreldis_Intercept,mreldis_m_wrkstrscw)       0.16      0.15    -0.13
## cor(mreldis_time7c,mreldis_m_wrkstrscw)          0.04      0.36    -0.66
##                                              u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(freldis_Intercept)                            1.14 1.00     2010     3961
## sd(freldis_time7c)                               0.17 1.00     1757     4890
## sd(freldis_f_wrkstrscw)                          0.19 1.00     3206     3371
## sd(mreldis_Intercept)                            1.21 1.00     4970     8237
## sd(mreldis_time7c)                               0.12 1.00     4200     6970
## sd(mreldis_m_wrkstrscw)                          0.23 1.00     6375     7027
## cor(freldis_Intercept,freldis_time7c)            0.58 1.00    17964     9147
## cor(freldis_Intercept,freldis_f_wrkstrscw)       0.74 1.00     8650     8469
## cor(freldis_time7c,freldis_f_wrkstrscw)          0.57 1.00     1988     4350
## cor(freldis_Intercept,mreldis_Intercept)         0.44 1.00     4298     7248
## cor(freldis_time7c,mreldis_Intercept)            0.69 1.02      333      545
## cor(freldis_f_wrkstrscw,mreldis_Intercept)       0.39 1.00      934     1985
## cor(freldis_Intercept,mreldis_time7c)            0.61 1.00    20721    11166
## cor(freldis_time7c,mreldis_time7c)               0.68 1.00     9351    11188
## cor(freldis_f_wrkstrscw,mreldis_time7c)          0.75 1.00    13586    12355
## cor(mreldis_Intercept,mreldis_time7c)            0.62 1.00    19812    11199
## cor(freldis_Intercept,mreldis_m_wrkstrscw)       0.28 1.00    12212    11845
## cor(freldis_time7c,mreldis_m_wrkstrscw)          0.70 1.00     1194     1903
## cor(freldis_f_wrkstrscw,mreldis_m_wrkstrscw)     0.74 1.00     2829     3619
## cor(mreldis_Intercept,mreldis_m_wrkstrscw)       0.44 1.00    14675    13864
## cor(mreldis_time7c,mreldis_m_wrkstrscw)          0.70 1.00     2327     6231
## 
## Population-Level Effects: 
##                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## freldis_Intercept       4.65      0.10     4.44     4.85 1.00     1309     2570
## mreldis_Intercept       5.09      0.11     4.88     5.30 1.00     4016     6597
## freldis_time7c         -0.02      0.03    -0.08     0.03 1.00    19847    12168
## freldis_f_wrkstrscw     0.16      0.03     0.11     0.21 1.00    12405    10671
## freldis_f_wrkstrscb     0.58      0.45    -0.34     1.43 1.00     1716     2906
## mreldis_time7c          0.01      0.02    -0.03     0.06 1.00    20004    12199
## mreldis_m_wrkstrscw     0.11      0.03     0.06     0.16 1.00    13575    12142
## mreldis_m_wrkstrscb    -0.14      0.45    -1.03     0.74 1.00     4315     7184
## 
## Family Specific Parameters: 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_freldis     1.00      0.02     0.97     1.03 1.00    12065    11320
## sigma_mreldis     0.87      0.01     0.85     0.90 1.00    15533    11944
## 
## Residual Correlations: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## rescor(freldis,mreldis)     0.07      0.02     0.03     0.12 1.00    19601
##                         Tail_ESS
## rescor(freldis,mreldis)    11617
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).


As seen from the model output, coefficients with the prefix freldis_ correspond to effects on the female partner’s outcome, and those with the prefix mreldis_ correspond to effects on the male partner’s outcomes. Normally, effects for female and male partners would be indicated through the use of dummy variables, but because of brms can estimate multivariate models, with two (or more), outcomes, we can get these separate effects automatically from our model set-up.

We also see that we have estimates of model parameters that are noteworthy for dyadic intensive longitudinal models, such as AR(1) errors, separate residual standard deviations, and a residual correlation between dyad members’ outcomes. Of note, other programs, such as SAS, that also allow one to model these parameters typically generate one estimate of autocorrelation that is pooled across partners. So an advantage of brms is that we can get separate AR(1) estimates for each partner.

We also see that, thanks to the |p| part of our code in the bf() objects, we also get correlations among random effects across partners. For instance, cor(freldis_f_wrkstrscw,mreldis_m_wrkstrscw) gives us the correlation between the female partner’s effect of work stressors on her relationship dissatisfaction and the male partner’s effect of work stressors on his relationship dissatisfaction.


#Fit Model with Maximum Likelihood (nlme) to compare How do our results compare to what we get using traditional maximum likelihood estimation?

Here is the model using the stacked data approach, with dummy variables, in the nlme package (code drawn from this excellent tutorial from the QuantDev group at Penn State). We can get separate error terms for each partner and a residual correlation, but not an AR(1) error structure.

Another difference is that the model did not converge with random slopes for time included, so they have been omitted here.

library(nlme)
fit_nlme <- lme(fixed=reldis ~ male + female + male:time7c + female:time7c +
               male:wrkstrscw + female:wrkstrscw +
               male:wrkstrscb + female:wrkstrscb + -1, 
             control=list(maxIter=1000), data=c,
             random=~male + female + male:wrkstrscw + female:wrkstrscw + -1| coupleid, 
             weights=varIdent(form = ~1 | gender), 
             corr=corCompSymm(form = ~1 | coupleid/time))
summary(fit_nlme)
## Linear mixed-effects model fit by REML
##   Data: c 
##        AIC      BIC    logLik
##   12096.16 12229.31 -6027.078
## 
## Random effects:
##  Formula: ~male + female + male:wrkstrscw + female:wrkstrscw + -1 | coupleid
##  Structure: General positive-definite, Log-Cholesky parametrization
##                  StdDev    Corr                
## male             1.0157904 male   female ml:wrk
## female           0.9600378  0.263              
## male:wrkstrscw   0.1678950  0.178 -0.019       
## female:wrkstrscw 0.1250147  0.007  0.485  0.502
## Residual         0.9981039                     
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | coupleid/time 
##  Parameter estimate(s):
##        Rho 
## 0.07320727 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | gender 
##  Parameter estimates:
##        1        0 
## 1.000000 0.873824 
## Fixed effects:  reldis ~ male + female + male:time7c + female:time7c + male:wrkstrscw +      female:wrkstrscw + male:wrkstrscb + female:wrkstrscb + -1 
##                      Value Std.Error   DF  t-value p-value
## male              5.085720 0.1038242 4093 48.98397  0.0000
## female            4.647561 0.0989240 4093 46.98114  0.0000
## male:time7c       0.011615 0.0222292 4093  0.52251  0.6013
## female:time7c    -0.024921 0.0252926 4093 -0.98532  0.3245
## male:wrkstrscw    0.109398 0.0255077 4093  4.28885  0.0000
## female:wrkstrscw  0.159484 0.0251352 4093  6.34507  0.0000
## male:wrkstrscb   -0.141904 0.4388417 4093 -0.32336  0.7464
## female:wrkstrscb  0.623305 0.4276930 4093  1.45737  0.1451
##  Correlation: 
##                  male   female ml:tm7 fml:t7 ml:wrkstrscw fml:wrkstrscw
## female            0.253                                                
## male:time7c       0.015  0.001                                         
## female:time7c     0.001  0.018  0.071                                  
## male:wrkstrscw    0.115 -0.012 -0.004 -0.001                           
## female:wrkstrscw  0.003  0.234  0.001 -0.014  0.163                    
## male:wrkstrscb   -0.095 -0.003  0.002  0.001 -0.004        0.001       
## female:wrkstrscb  0.003  0.097  0.000 -0.002  0.001       -0.004       
##                  ml:wrkstrscb
## female                       
## male:time7c                  
## female:time7c                
## male:wrkstrscw               
## female:wrkstrscw             
## male:wrkstrscb               
## female:wrkstrscb -0.029      
## 
## Standardized Within-Group Residuals:
##           Min            Q1           Med            Q3           Max 
## -3.3784814878 -0.6500702762  0.0004838121  0.6501003315  3.3326568390 
## 
## Number of Observations: 4200
## Number of Groups: 100


How do our results using nlme compare to those we got using brms? We generally see that the fixed effects are largely the same across the two analyses. The residual standard deviations and residual correlation are also very similar. As noted above, random effects for time could not be included in the nlme version due to convergence issues, and we don’t have the AR(1) error structure. We could have specified the AR(1) error structure but would have had to sacrifice the residual correlation component in order to do so.

A trade-off is that Bayesian models can take a very long time to run (i.e., several hours, or for very complex models, even a whole day). But in my opinion, the added benefits of the Bayesian analysis make it worth the extra time and effort, and this is the route I would most likely go myself for analyzing dyadic intensive longitudindal data.


View .Rmd source code
updated June 14, 2020

Back to top

footer2.knit

 

The material above reflects the best of my knowledge on this topic. Please be sure to check your results and code carefully.