---
output:
 # html_document
  pdf_document
title: "Ch 7: Vector Autoregressive Models with Time-Varying AR Coefficients in dynr"
author: "Sy-Miin Chow"
date: "June 5, 2021"
toc: true
toc_depth: 2
number_sections: true
---
  
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE)
```

```{r, echo = TRUE}
#Start fresh, set display options and load required libraries
rm(list=ls())
set.seed(12345678)
options(scipen = 999, digits=5)
library(mvtnorm)
library(dynr)
library(mgcv)
library(quantmod)
```

# Background
## Introduction
This example uses simulated data inspired by the empirical illustration in Chen, Chow, Hammal, Cohn, \& Messinger (2021). In their example, a vector autoregressive model (VAR(1)) of order 1 with time-varying autoregressive (AR) coefficients was used to represent the interactive dynamics between mother and infant during the face-to-face/still-face (FFSF) protocol. Baby's and mother's AR coefficients are modeled as varying within person  over time and across individuals based on the operating FFSF episode at the moment, whether the mother was detected to be smiling, and the infant's level of attachment. Here, we create a simpler simulation example with one time-covariate, $x_{it}$, that drives the changes in the dyad members' AR coefficients.

## For first-time users of dynr
The dynr package compiles C code in response to user input, so more setup is required for the dynr package than for many others, including installation of the GSL library and C compiler. Steps for installation and automated installer for Windows can be downloaded here:
\href{https://dynrr.github.io/predownload.html}{https://dynrr.github.io/predownload.html}

## Reference
Chen, M., Chow, S-M., Hammal, Z., Cohn, J. & Messinger, D. (2021). A Person- and time-varying vector autoregressive model to capture interactive infant-mother head movement dynamics. Multivariate Behavioral Research. DOI:10.1080/00273171.2020.1762065

Chow, S-M. *Zu, J., Shifren, K. & Zhang, G. (2011). Dynamic factor analysis models with time-varying parameters. Multivariate Behavioral Research, 46(2), 303-339.

Gates, K., Chow, S-M., & Molenaar, P.C.M. Analysis of Intraindividual Variation. Systems Approaches to Human Process Analysis. New York, NY: Taylor & Francis.

Molenaar PC, Sinclair KO, Rovine MJ, Ram N, Corneal SE. Analyzing developmental processes on an individual level using nonstationary time series modeling. Developmental Psychology. 45: 260-71. PMID 19210007 DOI: 10.1037/A0014170

Ou, L., Hunter, M. D., & Chow, S.-M. (2019). What’s for dynr: A Package for Linear and Nonlinear Dynamic Modeling in R. The R Journal. https://journal.r-project.org/archive/2019/RJ-2019-012/index.html

Dynr development team. Dynamic modeling in R (dynr) website. \href{https://dynrr.github.io/}{https://dynrr.github.io/}

Dynr development team. User Installation Instructions Manual for dynr.

\href{https://cran.r-project.org/web/packages/dynr/vignettes/InstallationForUsers.pdf}{https://cran.r-project.org/web/packages/dynr/vignettes/InstallationForUsers.pdf}

# Simulated Data Generation
## Initial set-up
Here we define the number of time points, participants, true parameter values, other data simulation settings, and various place holders. The true data generation model is expressed as:
$$Baby_{it} = \alpha_{b,i,t-1}Baby_{it} + \alpha_{bm} Mom_{i,t-1} + \zeta_{b,it}$$
$$Mom_{it} = \alpha_{m,i,t-1}Mom_{it} + \alpha_{mb} Baby_{i,t-1} + \zeta_{m,it}$$
$$\alpha_{m,it}=\alpha_{10} + \alpha_{11}x_{it}$$
$$\alpha_{b,it}=\alpha_{20} + \alpha_{21}x_{it}.$$

```{r, echo = TRUE}
nt      <- 200 #Total number of time points
npad      <- 0 #Occasions to throw out to wash away the effects of initial condition
np        <- 50 #Total number of participants
ne        <- 4 #Number of latent variables
ny        <- 2 #Number of manifest variables
nx        <- 1 #Number of covariates
psi       <- matrix(c(1, .5, 0,0,  # Process noise variance-covariance matrix
                      .5, 1.2,0,0,
                      0,0,.0001,0,
                      0,0,0,.0001),   
                    ncol = ne, byrow = T)
alpha10 = 0.5; alpha11 = .3
alpha20 = 0.4; alpha21 = .2 

#Factor loading matrix
lambda    <- matrix(c(1,0,0,0,
                      0,1,0,0), 
                    ncol = ne, byrow = TRUE)
theta     <- diag(.0001, ncol = ny, nrow = ny)  # Measurement error variances. 
cr12 = 0
cr21 = -0.2
a0        <- c(0,0,alpha10,alpha20)
P0        <- matrix(c(1,0,0,0,
                      0,1,0,0,
                      0,0,.0001,0,
                      0,0,0,.0001),
                    ncol=ne,byrow=TRUE) #Initial covariance matrix for the trend elements
yall <- matrix(0,nrow = nt*np, ncol = ny+nx)
etaall <- matrix(0,nrow = nt*np, ncol = ne)

```
## Data generation
Data generation starts here. 
```{r, echo = TRUE}
for (p in 1:np){
# Set up matrix for contemporaneous variables. 
  etaC      <- matrix(0, nrow = ne, ncol = nt + npad) 
  etaC[,1]  <- a0 + chol(P0)%*%rnorm(ne)
  
# Latent variable residuals.
  zeta      <- mvtnorm::rmvnorm(nt+npad, mean = rep(0,ne), sigma = psi)
  
# Measurement errors.
  epsilon   <- rmvnorm(nt+npad, mean = rep(0,ny), sigma = theta)  

 TV_cov = c(rep(0,nt/2), rep(1,nt-nt/2))
  intercept = matrix(rep(0,ne),ncol=1)
# Generate latent factor scores
  for (i in 2:(nt+npad)){
    etaC[1,i] = etaC[3,i-1]*etaC[1,i-1] + cr12*etaC[2,i-1] + zeta[i,1]
    etaC[2,i] = etaC[4,i-1]*etaC[2,i-1] + cr21*etaC[1,i-1] + zeta[i,2]
    etaC[3,i] = alpha10 + alpha11*TV_cov[i] + zeta[i,3]
    etaC[4,i] = alpha20 + alpha21*TV_cov[i] + zeta[i,4]
   }#End of loop over nt

  eta <- t(etaC[,(npad+1):(npad+nt)])

  # generate observed series
  y   <- matrix(0, nrow = nt, ncol = ny)
  for (i in 1:nrow(y)){
    y[i, 1:ny] <- lambda %*% eta[i, ] + epsilon[i, ]
  }
  yall[(1+(p-1)*nt):(p*nt),1:ny] = y
  yall[(1+(p-1)*nt):(p*nt),(ny+1):(ny+nx)] = TV_cov
  etaall[(1+(p-1)*nt):(p*nt),] = eta
} #End of p loop

yall = cbind(rep(1:np,each=nt),rep(1:nt,np),yall)
etaall = cbind(rep(1:np,each=nt),rep(1:nt,np),etaall)
colnames(etaall) = c("ID","time",paste0("LV",1:ne))
etaall = data.frame(etaall)
colnames(yall) = c("ID","time",paste0("y",1:ny),paste0("cov",1:nx))
yall = data.frame(yall)
#Optionally: You could write out the data files and true latent variable scores
#file0 = paste0('TV_VAR.txt')
#file1 = paste0('TV_VARTrueLVs.txt')
#write.table(yall,file0,row.names=FALSE,col.names=FALSE,sep=",")
#write.table(etaall,file1,row.names=FALSE,col.names=FALSE,sep=",")

```
## Plots of simulated data
Plots of the latent variables corresponding to baby, mom, and the TV AR coefficients.
```{r, echo = TRUE}
par(mfrow=c(2,2))
par(cex.axis=1.3,cex.lab=2,mgp=c(2.0,.5,0),cex.main=2)
interaction.plot(etaall$time,etaall$ID,etaall$LV1,legend=F,lty=1,
                 ylab="Dyad member 1",xlab="Time",
                 main=("(A)"),xaxt = "n")

par(cex.axis=1.3,cex.lab=2,mgp=c(2.0,.5,0),cex.main=2)
interaction.plot(etaall$time,etaall$ID,etaall$LV2,legend=F,lty=1,
                 ylab="Dyad member 2",xlab="Time",
                 main=("(B)"),xaxt = "n")

par(cex.axis=1.3,cex.lab=2,mgp=c(2.0,.5,0),cex.main=2)
interaction.plot(etaall$time,etaall$ID,etaall$LV3,legend=F,lty=1,
                 ylab="ar1",
                 xlab="Time",main=("(C)"),xaxt = "n")

par(cex.axis=1.3,cex.lab=2,mgp=c(2.0,.5,0),cex.main=2)
interaction.plot(etaall$time,etaall$ID,etaall$LV4,legend=F,lty=1,
                 ylab="ar2",
                 xlab="Time",main=("(D)"),xaxt = "n")
```

## Code for dynr model - Random walk model for the TV AR coefficients
Here we start preparing the ``recipes'' for a dynr model. The TV AR coefficients are added to the latent variable vector. So now, there are four latent variables. In addition, we use a random walk model to approximate the over-time dynamics of the AR coefficients as:
$$\alpha_{m,it}=\alpha_{m,i,t-1} + \zeta_{\alpha m,it}$$
$$\alpha_{b,it}=\alpha_{b,i,t-1} + \zeta_{\alpha b,it}$$
$$\alpha_{b,it}=\alpha_{20} + \alpha_{21}x_{it}.$$
The random walk model is a convenient approximation model. It includes as a special case the time-invariant model, a model positing that a time-varying paramter (TVP) simply remains invariant over time. This is obtained when the process noise variances, $Var(\zeta_{\alpha b,it})$, and $Var(\zeta_{\alpha m,it})$, are equal to zero.
```{r}
dynrdata <- dynr.data(yall, id="ID", time="time", 
                               observed=c("y1","y2"))

# Measurement model for linking latent states to observed variables
meas <- prep.measurement(
  values.load=matrix(c(1,0,0,0,
                       0,1,0,0),ncol=4,byrow=T), 
  params.load=matrix("fixed",ncol=4,nrow=2),
  state.names=c("baby","mom","arb","arm"),
  obs.names=c("y1","y2")
)

# Initial conditions for the latent variables at time 1 befor
initial <- prep.initial(
  values.inistate=c(0,0,.5,.5),
  params.inistate=c('fixed', 'fixed','alpha10','alpha20'), 
  values.inicov=diag(c(rep(1,2),rep(.1,2))),
  params.inicov=diag('fixed',4)) 

# Process noise and measurement error variances 
mdcov <- prep.noise(
  values.latent=matrix(c(.5,0.1,0,0,
                         0.1,.5,0,0,
                         0,0,0.1,0,
                         0,0,0,0.1), ncol=4,byrow=T), 
  params.latent=matrix(c('zv_baby','cov_bm',0,0,
                         'cov_bm','zv_mom',0,0,
                         0,0,'zv_arb',0,
                         0,0,0,'zv_arm'), ncol=4,byrow=T),
  values.observed=diag(c(0,0)), 
  params.observed=diag(c('fixed','fixed'),2)) 

#Dynamic formula - using a random walk (RW) model for arb and arm
#Note that a special case of the RW model arises when
#the process noise variances for the TVPs are equal to zero. In this case we get time-invariant parameters. See
#arb_{it} = arb_{i,t-1} + processNoise_arb_{it}
#arm_{it} = arm_{i,t-1} + processNoise_arm_{it}
formula =list(
  list(
    baby~arb*baby+crmb*mom,
    mom~ arm*mom+crbm*baby,
    arb~arb,
    arm~arm 
  ))

# A dynr formula object with starting values for parameter optimization
dynm  <- prep.formulaDynamics(formula=formula,
                              startval=c(crmb=.03,crbm=.03
                               ), isContinuousTime=FALSE) 

# Combine all the model components specifed above into one dynr model object
dynrmodel <- dynr.model(dynamics=dynm, measurement=meas,
                           noise=mdcov, initial=initial, data=dynrdata,
                           outfile="SSMTVP.c")

# Run the parameter optimization with filtering and smoothing for the states
modelRes <- dynr.cook(dynrmodel,verbose=FALSE)

# Result summary
summary(modelRes)
```

## Now fitting the true model for the TVPs
Now we fit the true model for the TVPs to the same data set. The time-varying covariate, $x_{1it}$, is included as a predictor in the dynamic functions for the TVPs.
```{r}
# Specify dynr structure with correct covariate 
dynrdata.cov <- dynr.data(yall, id="ID", time="time", 
                      observed=c("y1","y2"),
                      covariates = c("cov1"))

#Free up initial means and variances for the TVPs
initial.cov <- prep.initial(
  values.inistate=c(0,0,.5,.5),
  params.inistate=c('fixed', 'fixed','alpha100','alpha200'), 
  values.inicov=diag(c(rep(1,2),rep(.1,2))),
  params.inicov=diag(c('fixed','fixed','valpha1','valpha2'))) 


#Dynamic formula
formula.cov =list(
  list(
    baby~arb*baby+crmb*mom,
    mom~ arm*mom+crbm*baby,
    arb~alpha10 + alpha11*cov1,
    arm~alpha20 + alpha21*cov1
  ))

# A dynr formula object with starting values for parameter optimization
dynm.cov  <- prep.formulaDynamics(formula=formula.cov,
                              startval=c(crmb=.03,crbm=.03,
                                         alpha10 = .5, alpha11 = .01,
                                         alpha20 = .5, alpha21 = .01
                              ), isContinuousTime=FALSE) 

#Let's take a look at the Jacobian - matrix of differentiations
#of dynamic functions with respect to each latent variable
jacob_g = matrix(as.character(unlist(dynm@jacobian[[1]])),ne,ne,byrow=TRUE)
#baby_{it} = arb_{i,t-1}*baby_{i,t-1} + crbm*mom_{i,t-1}+processNoise_{baby,it}
#arb_{it} = arb_{i,t-1} + processNoise_{arb,it}
jacob_g

#Try to get this by yourself using symbolic differention function in R:
D(expression(arb*baby+crbm*mom),'baby')

# Combine all the model components specifed above into one dynr model object
dynrmodel.cov <- dynr.model(dynamics=dynm.cov, measurement=meas,
                        noise=mdcov, initial=initial.cov, 
                        data=dynrdata.cov,
                        outfile="SSMTVPcov.c")

# Run the parameter optimization with filtering and smoothing for the states
modelRes.cov <- dynr.cook(dynrmodel.cov,verbose = FALSE)
summary(modelRes.cov)
```

##Comparisons of estimates of the TVPs from both models.
Now let's take a look at some plots!
```{r}
#Pick a random i
plot(1:nt,etaall[etaall$ID==1,"LV3"],lwd=3,lty=1,col="blue",
     type="l",
     ylab="True vs. estimated AR1 w/o covariate",xlab="Time",
     ylim=c(0.4,1))
title(main="Without use of true covariate")
for (i in 1:1){
  lines(1:nt, modelRes@eta_smooth_final[3,etaall$ID==i],lwd=2,lty=3)
  legend('topleft',c('Estimated','True'),lty=c(3,1),lwd=c(2,3),
         col = c("black","blue"),bty="n",cex=1.2)
}

plot(1:nt,etaall[etaall$ID==1,"LV3"],lwd=3,lty=1,col="blue",
     type="l",
     ylab="True vs. estimated AR1 with covariate",xlab="Time",
     ylim=c(0.4,1))
title(main="With use of true covariates")
for (i in 1:1){
  lines(1:nt, modelRes.cov@eta_smooth_final[3,etaall$ID==i],lwd=2,lty=3)
  legend('topleft',c('Estimated','True'),lty=c(3,1),lwd=c(2,3),
         col = c("black","blue"),bty="n",cex=1.2)
}
```

## Generalized Additive Modeling (GAM) Approach
Here we consider an alternative approach, generalized additive modeling (GAM), which can be used to obtain smooth estimates of the trajectories of the TVPs via penalized splines. Here we use the gam function from the mgcv package.

```{r}
# Created lag-1 variables 
yall$bL=unlist(by(yall$y1,yall$ID,Lag,k=1))
yall$mL=unlist(by(yall$y2,yall$ID,Lag,k=1))

# Run a GAM with:
# time-varying intercepts: s(Time.agg)
# time-varying AR: s(time,by=bL) for y_baby, for example
# time-varying CR: s(time,by=mL) for y_baby, for example
# 20 basis functions: k=20

#Here I omit the smooth for the cr and make it linear
gam_biv = mgcv::gam(list(y1~-1+ mL + s(time,k=20)+
                    s(time,by=bL,k=20),#+s(time,by=mL,k=20),
                    y2~-1+s(time,k=20)+
                    s(time,by=mL,k=20)+bL),#+s(time,by=bL,k=20)),
             family=mvn(d=2),data=yall)

gam.check(gam_biv) #Check if 20 basis functions is sufficient; increase as needed

# Result summary
summary(gam_biv)

# Process noise variance-covariance
solve(crossprod(gam_biv$family$data$R))
```

Let's compare results from the state-space and GAM approach. The GAM approach provides a reasonable approximation of the TVP trajectories despite not using the time-varying covariate, $x_{1it}$. The TVP estimates from the two models are generally getting at slightly different things: the state-space approach estimates the trajectory of each individual's TVP over time; the GAM estimates provide an overall trajectory for each TVP characterizing \textit{all individuals} over time. Their corresponding differences in standard error estimates reflect these differences. The confidence intervals for the TVP estimates from the state-space approach are very narrow in this case because the true mechanisms of change for the TVPs are deterministic, i.e., there are no process noises.
```{r}
#Plot summary
plot(gam_biv,pages=1,shade=TRUE)

#Overlay the GAM and state-space estimates of the AR 1
#coefficient for baby
par(mfrow=c(1,1),cex.axis=1.2,cex.lab=1.5)
plot(gam_biv,shade=TRUE,select=2,ylim=c(.3,1),
     ylab="Smooth of AR_1 coef for baby",
     xlab="Time")
lines(1:nt, modelRes.cov@eta_smooth_final[3,yall$ID==1],col='blue',
      lwd=3,lty=3)
#Get standard error of the latent variable estimate from
#the state-space approach as sqrt(var(ar_1)|all observed data)
se_stsp = sqrt(modelRes.cov@error_cov_smooth_final[3,3,yall$ID==1])
lines(1:nt, modelRes.cov@eta_smooth_final[3,yall$ID==1]-2*se_stsp,
      col='blue',lwd=1,lty=3)
lines(1:nt, modelRes.cov@eta_smooth_final[3,yall$ID==1]+2*se_stsp,
      col='blue',lwd=1,lty=3)
legend('topleft',c('GAM','State-space'),lty=c(1,3),lwd=c(1,3),
       col = c("black","blue"),bty="n",cex=1.2)
```