1 Introduction
In this paper, we aim to examine the association between mortality and the longitudinal measurements of several biomarkers among hemodialysis patients. We are specifically focusing on the data obtained from a 5year (January 2007December 2011) cohort of 109,718 hemodialysis patients who were treated for dialysis in dialysis clinics in the United States. The dataset was studied in detail by Ravel et al. (2015)
. In such studies, it is common to measure multiple longitudinal biomarkers during the study follow up. Collected longitudinal measures on each subject tend to be correlated and temporally dependent as they are taken on the same subject. One could of course model the trajectory of each biomarker independently from other collected biomarkers. However, by simultaneously modeling the trajectory of all biomarkers, one can take the correlation between the different biomarkers into account. This could lead to a more precise model of biomarker trajectories. More specifically, when some biomarkers are measured less frequently compared to the other biomarkers (e.g., if they are difficult or expensive to measure), simultaneously modeling biomarkers can be particularly useful as one can gain more precision in estimating less frequent biomarkers by taking the correlation between all biomarkers into account and by borrowing information from the higher frequency biomarkers to better predict the lower frequency ones. Finally, a joint longitudinalsurvival model with a flexible longitudinal component, which is capable of modeling the trajectory of multiple biomarkers simultaneously, can be used to test the association between the survival outcome and the longitudinal biomarkers.
Throughout this paper, we shall use the word ”joint” when we model longitudinalsurvival data simultaneously. We reserve the word ”simultaneous” to refer to modeling multiple longitudinal measures at a same time rather than modeling each biomarker trajectory independently from others.
1.1 Some Related Methods
Simultaneously modeling longitudinal biomarkers can be considered in the context of a multivariate temporal process model that can be used both for inferential purposes as well as prediction and interpolation purposes. In order to develop such model, specification of a valid crosscovariance function is necessary. A valid crosscovariance function is required to lead to a valid positivedefinite covariance matrix for any number of time points and at any choice of these time points
(Gelfand and Banerjee, 2010). A common crosscovariance function is to use separable crosscovariance function construction that shall be explained in more detail in Section 3.The question of describing the correlation between multiple longitudinal measures has been addressed from a different perspective in geostatistics literature with a focus on spatial data. Bernardo et al. (1998) and Berger et al. (2003) described the kernel convolution technique for creating stationary and nonstationary spatial processes. Majumdar and Gelfand (2007) proposed producing crosscovariance functions by using convolution of covariance functions. Multivariate models in geostatistics started with Matheron (1973) and by introduction of some new concepts including crossvariogram and coKriging. Gelfand and Banerjee (2010) defined coKriging as a spatial prediction method that uses both the information of the process being considered as well as the information from other related processes. CoKriging is commonly addressed in the context of the linear models as these models are easily interpreted. These models that are commonly known as linear model of coregionalization (LMC) have been considered widely in the literature including Grzebyk and Wackernagel (1994), Schmidt and Gelfand (2003), and Ver Hoef, Cressie and Barry (2004).
In a different work, we proposed a flexible joint longitudinalsurvival Modeling framework for quantifying the association between a longitudinal biomarker and Survival Outcomes (Akhavan et al., 2018). We are now interested in extending our method to be able to model multiple biomarkers simultaneously. In Section 2, we will show how a Gaussian process model can be extended to a multivariate Gaussian process to model multiple longitudinal biomarkers simultaneously. In Section 3, using our proposed multivariate Gaussian process, we build a joint longitudinalsurvival model capable of relating longitudinal biomarkers to survival outcome. In Section 4, we use several simulation studies to evaluate our proposed model. Section 5 presents our data analysis results.
2 Multivariate Gaussian Process
Consider the function that relates an input space to an output space . As an alternative to an explicit functional assumption on , one can assume a Gaussian process prior on . One may consider time as the input space and the space of a longitudinal measure as the output space and may use Gaussian process prior as a prior on all plausible functions relating time to the longitudinal measure at time .
In the setting above, is considered as a univariate function with one output at each time point . In general, however,
can be a multivariate function with a vector of outputs at each time
. A multivariate function will require a multivariate Gaussian process prior. One can consider a more general frame work of the following formwhere is a vector of outputs at time , is a multivariate function with a multivariate Gaussian process prior with a mean vector function and a crosscovariance matrix function .
Consider a multivariate longitudinal vector at time with the dimension . Without loss of generality and for simplicity of the notations, we assume a multivariate longitudinal random vector with mean zero, . A crosscovariance function between two generic time points and is a matrix function with dimension where the element of this matrix is defined as
(2.1) 
where and are . Equation (2.1) indicates that the crosscovariance matrix function can be defined as
(2.2) 
Consider arbitrary time points . At each time point , is a vector. Concatenating n such output vectors, one can define an vector , where . Random vector is meanzero with a covariance matrix with the dimension . is a block matrix where each block is the crosscovariance matrix corresponding to time and for all outputs at each time point.
As a covariance matrix, has to be symmetric and positive definite. As Gelfand and Banerjee (2010) showed, this requires the covariance matrix function to satisfy the two following conditions of

,

for any integer value and for any arbitrary collection of ”n” time points:
.
A multivariate process is stationary if the crosscovariance matrix function depends only on the time difference between and , where we can write . Multivariate process is called isotropic if the crosscovariance matrix function depends only on the absolute difference between and , where we can write . Yadrenko (1987) showed that under isotropic condition, a covariance function will form a valid crosscovariance matrix function if and only if the function for a positivedefinite measure has a crossspectral representation of the form
Despite the existence of many methods proposed in the literature, when crosscovariance functions are unknown, specification of a valid crosscovariance function based on the observed data is a very difficult task. One common approach to specify a valid crosscovariance function is to use a class of covariance functions known as separable crosscovariance structures that shall be introduced in the next section.
Now consider as a vector of longitudinal variables all measured at time . The crosscovariance matrix function is then a matrix where it’s element is equal to that was defined in equation (2.1). One can assume that the covariance between the longitudinal variables at each timepoint remains the same at all time points and can be specified with a positive definite covariance matrix . The correlation between measured values at time and measured values at time can then be expressed using a univariate correlation function . Given this specification, one can write
(2.3) 
where is of size and represents the covariance matrix between the elements of that remain the same at all time points , and represents the correlation between measures at time and measures at time .
Now consider time points where at each time point , we observe a stack of
longitudinal random variables
. Consider as the vertical stack of longitudinal vectors, each element of that vector is an observed at a time point . can be defined as(2.4) 
where the element of is represented with and is equal to , the notation represents the Kronecker product, and is the nontemporal covariance function between the longitudinal variables that is assumed to remain the same across time points ’s.
By using a separable crosscovariance function and given that matrix is positive definite by definition, with a positive definite matrix , it’s guaranteed that will be positive definite. Furthermore, by using a separable crosscovariance structure, the determinant of the crosscovariance function, , and the inverse of the crosscovariance function,, will become computational convenient to deal with. In particular, by using the properties of the Kronecker product, one can show that the determinant of the crosscovariance can be written as , where and are the determinant of the matrix and the determinant of the , respectively. Also, the inverse of the crosscovariance function can be written as . We shall use the class of separable crosscovariance functions to setup our proposed model.
3 Joint Model
In this section, we first start by introducing the likelihood specification of our joint model. We then continue with introducing the longitudinal component of the model and the survival component. Our proposed method can work for any number of longitudinal biomarkers, however, for the sake of simplicity of the notations, we consider only two longitudinal biomarkers. We refer to the first longitudinal biomarker with and to the second biomarker with . We denote survival outcome with . refers to how many subjects are being followed up in the study. and refer to the number of longitudinal biomarker 1 measures and biomarker 2 measures obtained for subject at time points and , respectively. Also, associated with each subject, there is an observed survival time, and event indicator , where and denote the true event time and the censoring time for subject , respectively. Further, we make the common assumption that is independent of for all , .
We define the contribution of each subject to the joint model likelihood as the multiplication of the likelihood function of the longitudinal measures for that subject and her/his timetoevent likelihood that is conditioned on her/his longitudinal measures. Let , , and denote the longitudinal likelihood contribution, the conditional survival likelihood contribution, and the joint likelihood contribution for subject . One can write the joint longitudinalsurvival likelihood function as
(3.1) 
In what follows, we explain the components of this model.
3.1 Longitudinal Component
We motivate the development of the multivariate Gaussian process model of two longitudinal biomarkers by first considering the following simple model for a single subject
(3.2) 
For simplicity of the notation, we assume that both biomarkers are measured simultaneously, that means at each time point , we get to obtain measures on both biomarkers. Our method is not limited to this assumption and once readers are introduced with the model, we shall extend the notation to a model with no such assumption. We define as the number of longitudinal measures per biomarker. These measures are obtained at an arbitrary time points . In the equation (3.2), and are vectors of longitudinal biomarker 1 measurements and longitudinal biomarker 2 measurements, each of size respectively. We shall stack and together into a column vector that is of size . is a vector of repeated random intercepts corresponding to biomarker 1 and is a vector of repeated random intercepts corresponding to biomarker 2, each of size . Also, we shall stack and into a column vector of size . The model in equation (3.2) assumes biomarker 1 and biomarker 2 are independent and each biomarker has its own measurement error matrix. We consider and , where and are shared parameters across all subjects.
By adding a stochastic component that is indexed by time to the model in equation (3.2), we can relax the independence assumption between biomarkers and we can also extend the model to capture nonlinear patterns over time. Specifically, we consider a stochastic vector, , that is a realization from a multivariate Gaussian process prior, , that is mean zero and has a separable crosscovariance function. Thus for subject , , where is a column vector that includes a stack of and , where and .
We characterize the covariance function, , using a separable crosscovariance structure as
(3.3) 
where is a 2 by 2 matrix that characterizes the the covariance between the two biomarkers and is assumed to be timeinvariant, and is a temporal covariance matrix. The matrix is shared across all subjects with elements and
characterizing marginal variances of the first and the second biomarker processes respectively, and with the
element characterizing the covariance between the two processes. In specific, one can decompose the covariance matrix into the following form(3.4) 
where and are squareroot of the within biomarker 1 and biomarker 2 variances and is a correlation matrix. Commonly, is set to equal 1 where in that case, it’s assumed that will also capture the within biomarker 1 variability and is treated as a parameter indicating the relative biomarker variability between biomarker 2 and biomarker 1. Hence, we define
We rewrite the equation (3.5) as
(3.5) 
is the covariance matrix characterizing how longitudinal measures change over time. Under the separable crosscovariance structure, we assume both biomarkers share the same covariance structure for changes in their values over time. We characterize as an matrix with elements that is defined as
(3.6) 
where the hyperparameter
controls the correlation length, and controls the height of oscillations (Banerjee et al. 2004), and and are two different time points. For notational simplicity, we define . We can extend the longitudinal model in equation (3.2) to the flexible model below(3.7) 
where is a stochastic vectors sampled from a Gaussian process prior of the form
(3.8) 
In the model defined by equation (3.7), and are assumed to be common across all subjects. Also, we assume the correlation length is fixed and hence, the subjectspecific parameter will have the role of capturing the withinsubject volatility of the longitudinal biomarkers. Finally, the longitudinal component of our proposed joint model can be written as
(3.9)  
where and are random intercepts associated with biomarker 1 and biomarker 2, respectively. is a column vectors of size that is a stack of and each repeated times. Finally, matrix was decomposed based on the equation (3.5).
3.2 Survival Component
Our goal is to quantify the association between the longitudinal biomarkers of interest and the timetoevent outcomes by directly adjusting for biomarkers measured values in a survival component of our proposed joint model. While usually biomarkers are measured on a discrete labvisit basis (ex. every month), the event of interest happens on a continuous basis. While common frequentist models use the socalled ”lastobservationcarried” forward, by jointly modeling longitudinalsurvival data, one can properly impute biomarker measures at each individual’s event time. In particular and from the Bayesian modeling perspective, in each MCMC iteration, given the sampled parameters for each individual and by using the flexible multivariate Gaussian process in the longitudinal component of the model, there exists posterior trajectories of biomarkers for each individual. Our method, then, considers the posterior mean of those trajectories as the proposed trajectory for each individual’s biomarker values over time at that iteration. The posterior mean trajectories of our biomarkers of interest, then, can be used to impute timedependent biomarker covariates inside the survival component of the model.
In order to quantify the association between two longitudinal biomarkers, which are modeled simultaneously using the longitudinal component of the model, and the timetoevent outcomes, we define our survival component by using a multiplicative hazard model with the general form of
where is the event time for subject , denotes a baseline hazard function, is a vector of baseline covariates, are longitudinal covariates from the longitudinal component of the model, and and are regression coefficients interpretable as the log relative risk of ”death” per every unit increase of their corresponding covariates.
We consider a Weibull distribution for the survival component to allow for loglinear changes in the baseline hazard function over time. Thus we assume
(3.10) 
that means
(3.11) 
Weibull distribution is available in closed form and can be evaluated computationally efficiently. Under this parameterization of the Weibull distribution, covariates can be incorporated into the model by defining , where are coefficients associated with baseline survival covariates , and are longitudinal coefficients associated with longitudinal covariates .
In particular, we are interested in a model that directly includes the two longitudinal biomarker values at time as a covariate inside the survival model. Hence, we define our model as
(3.12) 
with
where is a common shape parameter shared with all subjects. is a subject specific coefficient in the model which allows subjectspecific baseline hazard. and are baseline covariates and their corresponding coefficients, respectively. Finally, and are coefficients linking the longitudinal biomarker1 value at time and longitudinal biomarker2 value at time to the hazard for mortality, respectively.
In order to fit a fully joint longitudinalsurvival model, at each iteration of the MCMC and for a timepoint , predicted biomarker1 and biomarker2 values for individual are of the following form
where is a vector where its first element is the predicted value of the first biomarker and its second element is the predicted value for the second biomarker. is a column vectors of size that represents the observed biomarker values, where its first elements are the observed biomarker1 values and the remaining elements are the observed biomarker2 values. is column vector of size that includes all time points at which values of the biomarkers were observed. is the time at which by using the posterior trajectory of biomarkers at each MCMC iteration, we want to impute a predicted value per biomarker. Given our proposed longitudinal component setup, is distributed bivariate Normal with mean and with covariance matrix that are of the following forms
(3.13) 
(3.14) 
where
(3.15)  
where and were defined earlier in Section 3.1. is defined similarly as except that is replaced with . is a column vector size 2 by 1 with random intercept of the first biomarker, , as its first element and random intercept of the second biomarker, , as it’s second element. is a column vector of size , where the first elements are all the random intercept for the first biomarker and the remaining elements are all the random intercept for the second biomarker.
In order to avoid an explicit distributional assumption on the survival times, we specify our survival model as an infinite mixture of Weibull distributions mixed on the parameter. To do so, we use the Dirichlet process mixture of Weibull distributions defined as
(3.16)  
(3.17)  
(3.18) 
where is a fixed parameter, is a subjectspecific mean parameter from a distribution with a DP prior, is the concentration parameter of the DP and is the base distribution. By using the Dirichlet process prior on the distribution of , we allow patients with similar baseline hazards to cluster together which subsequently provides a stronger likelihood to estimate the baseline hazards. For other coefficients in the survival model, we assume a multivariate normal prior as
where is a set of coefficients associated with the baseline survival covariates, and are coefficients associated with value of the first and the second biomarkers respectively, is a prior variance for each coefficient, and
is the identity matrix.
For the shared shape parameter , we consider a logNormal prior, , and specify the prior on the concentration parameter of our DP model to be .
Finally, since the hazard function includes timevarying covariates, evaluation of the log likelihood that involves integration of the hazard function over time is done using the rectangular integration discussed in detail in our previous paper (Akhavan et al., 2018). More details on posterior sampling and implementation of our method are provided in Appendix A.
4 Simulation Studies
In this section, in order to evaluate our proposed model, we provide two types of simulations. Type I simulations (4.1) only consider the longitudinal component of the model and are aimed to show the benefit of simultaneously modeling multiple longitudinal biomakers as opposed to modeling biomakers each separately. We shall refer to the former with multi and to the latter with uni. The second type of simulations, type II (4.2), are aimed to evaluate the performance of the proposed joint longitudinalsurvival model.
4.1 Multivariate Gaussian Process Model vs. Multiple Univariate Gaussian Processes
One natural question in modeling multiple biomarkers is whether there is any benefit in modeling these biomakers simultaneously (multi) as opposed to modeling each biomaker process independently (uni). Type I simulations are design to evaluate our proposed multi model vs. an alternative uni model.
We generate synthetic data for two biomakers of albumin () and BMI (). Albumin and BMI values are simulated off of the multivariate Gaussian process model for 100 subjects and for each subject 60 albumin and 60 BMI values. The multivariate Gaussian model is of the following form (more details in the appendix A.1):
(4.1) 
The values are simulated from the distribution. and are simulated from the and the distributions, respectively. Measurement errors and are both set to be equal to .
To explore the benefit of simultaneously modeling biomakers, we compare our proposed methodology here against an alternative modeling framework (Akhavan et al., 2018) that models biomarker processes independently. Our general comparison scheme is to randomly remove some of the obtained simulated biomarker values and treat them as missing. Then we use our fitted models in order to predict missing biomaker values. Models then are compared in terms of the prediction mean error (MSE) of the missing values. we consider the following three real world scenarios:

Scenario 1: Comparison of the models in terms of the prediction MSE and as a function of correlation between biomakers and the amount of time overlap between missing biomarkers.

Scenario 2: Here we tackle the real world scenario where on biomarker is observed less frequently compared to the other biomarker (ex. one biomarker is more expensive or more difficult to measure).

Scenario 3: Is our proposed multivariate model capable of handling biomarker processes with different volatility given that our methodology assumes common values across biomarkers with as a common measure of volatility?
4.1.1 Scenario 1
Under this scenario, we remove one third of the biomarker values and treat them as missing. These 20 values are removed in the three following ways:

When one biomarker is missing, the other biomarker is also missing (i.e. 100% missing overlap).

When one biomarker is missing, the other biomarker is observed (i.e. 0% missing overlap).

Half of the times both biomarkers are missing and half of the times, when one is missing, the other biomarker is observed (i.e. 50% missing overlap)
Table 3 shows the simulation results from 100 generated datasets where the multi and uni models are compared in terms of the average prediction MSE and as a function of biomarker correlations and missing values % overlap. Based on the simulation result, by taking the betweenbiomarker correlations into account, multi model leads to a smaller overall MSE compared to the uni model. In Table 3, %Dec. indicates the percentage decrease in MSE comparing the multi model and the uni model. As expected, as the correlation between the two processes increases, the multi model leads to a smaller MSE compared to the uni model. When the percent overlap between missing values decreases, indicating when one biomarker is missing, the other biomarker is more likely to be observed, the information from one biomarker can help the multi model better predict the missing biomarker, and hence, will lead to a smaller MSE.
4.1.2 Scenario 2
Under this simulation scenario, we consider the real world scenario where one biomarker, due to the cost of the procedure or the difficulty of obtaining biomarker measures, is measured less frequently compared to the other biomarker. We generate synthetic data for two biomarkers each with 60 measurements and at five different correlation levels between the two biomarker processes. We then randomly remove measurements from the first biomarker at the 20%, 50%, and 80% rate. Models are then compared in terms of the prediction MSE of the missing values. As the results in Table 3 show, the proposed multivariate model leads to lower MSEs compared to the univariate model. As expected, as the correlation between the two biomarker processes increases, multi model better takes advantage of the information from one biomarker to predict the other and led to a smaller MSE.
Correlation  100% Overlap  50% Overlap  0% Overlap  

Multi  Uni  %Dec.  Multi  Uni  %Dec.  Multi  Uni  %Dec.  
0.1  0.237  0.243  2.5%  0.202  0.206  1.9%  0.207  0.212  2.4% 
0.3  0.201  0.207  2.9%  0.215  0.227  5.3%  0.206  0.217  5.1% 
0.5  0.209  0.217  3.7%  0.209  0.228  9.3%  0.196  0.227  13.7% 
0.7  0.207  0.217  4.6%  0.189  0.228  17.1%  0.185  0.235  21.3% 
0.9  0.194  0.204  4.9%  0.158  0.219  27.9%  0.139  0.237  41.4% 
Correlation  20% Freq.  50% Freq.  80% Freq  

Multi  Uni  %Dec.  Multi  Uni  %Dec.  Multi  Uni  %Dec.  
0.1  0.410  0.420  2.4%  0.281  0.286  1.8%  0.181  0.185  2.2% 
0.3  0.391  0.423  7.6%  0.248  0.261  5.0%  0.188  0.197  4.6% 
0.5  0.343  0.413  17.0%  0.226  0.264  14.4%  0.165  0.187  11.8% 
0.7  0.319  0.407  21.6%  0.222  0.266  16.6%  0.152  0.188  19.2% 
0.9  0.294  0.397  26.0%  0.190  0.245  22.5%  0.126  0.189  33.3% 
Correlation  100% Overlap  50% Overlap  0% Overlap  

Multi  Uni  %Dec.  Multi  Uni  %Dec.  Multi  Uni  %Dec.  
0.1  0.263  0.270  2.6%  0.250  0.260  3.9%  0.258  0.267  3.1% 
0.3  0.247  0.256  3.6%  0.252  0.263  4.1%  0.249  0.265  5.9% 
0.5  0.264  0.275  3.8%  0.232  0.250  6.9%  0.252  0.277  9.1% 
0.7  0.236  0.247  4.4%  0.254  0.282  10.2%  0.207  0.243  14.7% 
0.9  0.231  0.245  5.9%  0.211  0.254  16.8%  0.214  0.275  22.4% 
4.1.3 Scenario 3
In building our proposed multivariate longitudinal model, we assumed a separable crosscovariance structure where the model assumes that different longitudinal processes share the same temporal covariance. In particular, our model assumes a shared volatility measure across different biomarkers. Obviously, by using an independent univariate Gaussian processes model (Uni) one can estimate different values for each biomaker process. We, however, claim that despite a shared across longitudinal biomarkers in our proposed multivariate model (Multi), the model is still robust as the model can capture the withinbiomarker variances through the covariance matrix (equation (3.5)). To show this, we simulate two longitudinal biomarkers with different volatility measure values. Table 3 supports our claim as our proposed multi model still leads to smaller MSEs compared to the uni model.
4.2 Simulation Studies on the Proposed Joint Multivariate LongitudinalSurvival Model
Our second type of simulations are focused on evaluating the performance of our proposed joint multivariate longitudinalsurvival model. We simulated 200 datasets that resembled the real data on endstage renal disease patients that was obtained from the United States Renal Data System (USRDS). Each dataset included subjects. We simulated longitudinal trajectories for the two biomarkers of albumin () and BMI (), with within subject albumin and within subject BMI measures. Both biomarkers are generated from a joint multivariate Gaussian process with a highcorrelation of between the processes. In particular, biomarker measures for each subject are simulated from
where
is a stack of two subject specific random intercepts for the two processes. Subjectspecific random intercepts for the albumins are simulated from the Normal distribution
and the random intercepts for the BMIs are simulated from the Normal distribution . and are both set to . is considered to be equal to , where ’s are simulated from the uniform distribution. is the distance matrix of the form , with a fixed of 0.1 and time points ’s that are sampled uniformly from a followup time with a maximum of 15 months. The matrix is the covariance matrix of the two biomarkers with a high correlation of between the processes and with scaled variances for each biomarker process. Survival times are generated from a Weibull distribution of the formwith the shape parameter of the distribution, , set to 1.5 and the logscale parameter that is of the form
where were sampled from an equally weighted mixture of two Normal distributions of and . is fixed to 0.5 and
is fixed to 0.3. Censoring times are generated from a uniform distribution and independent of the survival times
with parameters that ensure a censoring rate.In order to evaluate the performance of our proposed model, we also fit a lastobservation carried forward proportional hazard Cox model as well as an alternative joint longitudinalsurvival model that models biomarkers independently (Akhavan et al., 2018).
Albumin and BMI random intercepts are assumed to have the priors and , respectively. ’s are assumed to have the prior. Both and had the prior. We assumed matrix to be of the form , where is a matrix with the diagonal elements of 1 and and is the correlation matrix between the two biomarkers that is of size . We consider a prior on and an prior on . We put the prior on the Weibull shape parameter . We also consider independent priors on and . are assumed to be distributed according to an unknown distribution with the Dirichlet process prior. We consider the prior on and we consider to be the standard Normal distribution .
Table 4 shows the simulation results. As expected, the last observation carried forward Cox model leads to estimates that are shrunk towards 0 as this model is blind to the differential subjectspecific baseline hazards that are induced by subjectspecific values. The estimates under the Cox model are marginalized over all subjects and due to noncollapsibility of these models, the estimates are shrunk toward the null. Further, this model carries the most recent longitudinal measures forward to the event time where as the longitudinal measures at the event time might be quite different from the most recent measures. This is also caused the estimates to shrink towards 0. Next, the joint model with univariate longitudinal biomarkers provides estimates that are closer to the true values compared to the Cox model as the model is capable of detecting baseline subjectspecific values as well as providing a good prediction of the two biomarkers at time by flexibly modeling the trajectory of the biomarkers. Third, our joint multivariate longitudinalsurvival model proposed here is capable of modeling multiple biomarkers simultaneously by taking the correlation between the biomarkers and hence, leads to even closer coefficient estimates compared to our joint univariate longitudinalsurvival model.
Covariate of  True Conditional  LOCF Cox  Uni. Joint Model  Multi Joint Model  

Interest  Estimand  Mean  SD  MSE  Mean  SD  MSE  Mean  SD  MSE 
Albumin(t)  0.5  0.242  0.089  0.077  0.443  0.124  0.012  0.481  0.101  0.009 
BMI(t)  0.3  0.135  0.055  0.027  0.278  0.127  0.003  0.287  0.109  0.003 
Coefficient estimates under different models along with the corresponding standard deviation and meansquared error values per estimated coefficient.
5 Application of the Proposed Joint Multivariate Longitudinal Survival Model to DaVita Data on Hemodialysis Patients
In this section, we apply our proposed joint multivariate longitudinalsurvival model to a specific study of hemodialysis patients discussed in introduction. Here, we focus on a subset of data where every patient has at least 8 longitudinal measures of albumin and at least 8 longitudinal measures of calcium. This subset includes hemodialysis patients with an overall censoring rate in the data is 25.6%.
Although the longitudinal albumin and calcium biomarkers are supposed to be measured during every lab visit, however, we noticed that in our study cohort, on average at 12.4% of lab visits neither of these two biomarkers were measured. We also noticed that on average in 26.8% of lab visits there were no measured albumin biomarker and in 15.1% lab visits, there were no measured calcium biomarker.
With the aim of testing the association between mortality and the value of the biomarkers of interest, albumin and calcium, among hemodialysis patients, we analyze the data once using lastobservation carried forward Cox model, another time using a univariate joint longitudinalsurvival model (Uni. Joint Model), and finally using our recent multivariate joint longitudinalsurvival model (Multi. Joint Model). In order to adjust for potential confounder factors, other than longitudinal albumin and calcium measures, we also include age, sex, race, a baseline measure of phosphorus, and a baseline measure of iron.
Unlike the last observation carried forward Cox model that uses the most recent albumin and calcium biomarker values as the values of these biomarkers at each event time, our proposed joint longitudinalsurvival models flexibly model the trajectory of these biomarkers over time and at each event time, the models impute the most relevant biomarker values according to the trajectories of those biomarkers. Further, unlike the Cox model that marginalize covariate effects across all subjects, our proposed joint models are capable of detecting differential subjectspecific baseline hazards. Our univariate joint longitudinalsurvival method models the longitudinal albumin and calcium processes independently of each other. Our proposed multivariate joint longitudinalsurvival model introduced in this paper, however, models the two processes simultaneously. Simultaneously modeling the two processes will allow a better longitudinal trajectory specification as the model can borrow information from measurements of one process when measurements of the other process are missing.
Table 5 shows the results of analyzing the data using the three models of lastobservation carried forward Cox, our proposed univariate joint longitudinalsurvival model, our proposed multivariate joint longitudinalsurvival model. The results from all three models consistently show that age and albumin value at the time of death are significant risk factors of mortality among hemodialysis patients. The results from the Cox model show that each 1 g/dL decrement in albumin level corresponds to 3.4 times higher risk of death. The risk of death per each 1 g/dL decrement in albumin is estimated to be 4.2 and 5.1 times higher under our proposed univariate joint model and multivariate joint model, respectively. Further, compared to the univariate joint model, the multivariate joint model leads to 32% and 36% reduction in the 95% credible region of the estimated effect of every one unit decrement in albumin and calcium, respectively. This observation was expected as the multivariate joint model by simultaneously modeling albumin and calcium trajectories, estimate those trajectories with higher precision compared with the univariate joint model.
LOCF Cox Model  Uni. Joint Model  Multi. Joint Model  
No. of  No. of  Relative Risk  Relative Risk  Relative Risk  
Covariates  Cases  Deaths  (95% CI)  (95% CR)  (95% CR)  
Age (10y)  929  691  1.33 (1.20,1.48)  .001  1.50 (1.33,1.68)  1.56 (1.35,1.76) 
Sex  
Men  546  412  1.0  1.0  1.0  
Women  383  279  1.02 (0.78,1.32)  0.90  1.01 (0.72,1.41)  1.01 (0.73,1.41) 
Race  
White  489  337  1.0  1.0  
Black  264  204  1.79 (0.90,3.58)  0.11  1.76 (0.82,3.69)  1.78 (0.84,3.76) 
Hispanic  118  102  0.71 (0.40,1.25)  0.23  0.71 (0.39,1.26)  0.73 (0.37,1.28) 
Other  58  48  0.55 (0.28,1.11)  0.10  0.57 (0.20,1.18)  0.58 (0.20,1.18) 
Phosphorus  
(mg/dL)  929  691  1.07 (0.99,1.17)  0.08  1.10 (0.99,1.23)  1.10 (0.99,1.23) 
Iron  
(g/dL)  929  691  0.99 (0.98,0.99)  0.002  0.98 (0.94,1.02)  0.98 (0.94,1.02) 
Serum albumin(t)  
(1g/dL decrement)  929  691  3.36 (2.64,4.29)  0.0001  4.17 (2.78,5.72)  5.11 (3.86,6.29) 
Calcium  
(mg/dL)  929  691  1.09 (0.87,1.37)  0.45  1.19 (0.72,1.86)  1.27 (0.92,1.69) 
6 Discussion
When monitoring the health of subjects, often times multiple risk factors are measured over time. Collected longitudinal risk factors are often correlated with each other as they are measures taken on the same subject. Modeling these longitudinal risk factors simultaneously where the correlation between the risk factors are taken into account can be beneficial, specially when there exists differential measuring density in the collected risk factors. Further, the association between the collected risk factors and the survival outcomes is often the practitioners’ primary interest. In this paper, we proposed a joint longitudinalsurvival modeling framework with a longitudinal component capable of modeling multiple longitudinal processes simultaneously with the correlation between those processes taken into account. Our modeling framework is robust to common distributional assumptions as by using the Bayesian nonparameteric Gaussian process and Dirichlet process techniques, we avoid common functional and distributional assumptions in the model.
We used synthetic data in order to show the benefit of simultaneously modeling the trajectories of multiple longitudinal processes using our proposed multivariate longitudinal model as opposed to separate independent longitudinal models each modeling the trajectory of one longitudinal process independently from other longitudinal processes. Our findings show that a multivariate model has more precision in estimating the underlying trajectories of the longitudinal risk factors. Next, using synthetic data we showed that our proposed joint multivariate longitudinalsurvival model performs better in terms of meansquared error of the estimated survival coefficients compared to the modeling framework we introduced elsewhere where the longitudinal biomarkers modeled independently (Akhavan et al., 2018).
Our proposed modeling framework has some limitations. Our modeling framework is limited to the proportional hazards models only. Further, our method is computationally demanding and may not be scalable as number of subjects and withinsubject measurements increase. In future, our modeling framework can be extended by relaxing the proportional hazard assumption on the survival component. Also, one by using alternatives to the conventional MCMC techniques, including variational methods can make our modeling framework more computationally efficient.
In order to test the association between the longitudinal albumin and calcium biomarkers and mortality among hemodialysis patients, we used data on 929 hemodialysis patients. We analyzed the data using three models of lastobservation carried forward Cox model, the univariate joint longitudinalsurvival model we proposed elsewhere (Akhavan et al., 2018), and the multivariate longitudinalsurvival model that was proposed in this paper. While the results are consistent across all models, our proposed multivariate joint model that is capable of modeling the trajectory of longitudinal biomarkers with higher precision, leads to stronger estimated biomarker with higher precision for the estimated effect.
Acknowledgements
Babak Shahbaba was supported by the NIH grant R01AI107034.
References
 Akhavan et al. (2018) [author] Akhavan, S.S., Holsclaw, T.T., Shahbaba, B.B. Gillen, D. L.D. L. (2018). A Flexible Joint LongitudinalSurvival Model for Analysis of EndStage Renal Disease Data. ArXiv eprints.

Berger et al. (2003)
[author] Berger, JM Bernardo MJ Bayarri JOJ. B. M. B. J., Dawid, APA., Smith, D Heckerman AFMD. H. A. West, MM. (2003). Markov chain Monte Carlobased approaches for inference in computationally intensive inverse problems. In Bayesian Statistics 7: Proceedings of the Seventh Valencia International Meeting 181. Oxford University Press.
 Bernardo et al. (1998) [author] Bernardo, JMJ., Berger, JOJ., Dawid, APA., Smith, AFMA. et al. (1998). NonStationary Spatial Modeling.
 Flaxman et al. (2015) [author] Flaxman, SethS., Gelman, AndrewA., Neill, DanielD., Smola, AlexA., Vehtari, AkiA. Wilson, Andrew GordonA. G. (2015). Fast hierarchical Gaussian processes. Manuscript in preparation.
 Gelfand and Banerjee (2010) [author] Gelfand, Alan EA. E. Banerjee, SudiptoS. (2010). Multivariate spatial process models. Handbook of Spatial Statistics 495–515.
 Gelman et al. (2014) [author] Gelman, AndrewA., Carlin, John BJ. B., Stern, Hal SH. S. Rubin, Donald BD. B. (2014). Bayesian data analysis 2. Chapman & Hall/CRC Boca Raton, FL, USA.

Grzebyk and
Wackernagel (1994)
[author] Grzebyk, MichelM. Wackernagel, HansH. (1994). Multivariate analysis and spatial/temporal scales: real and complex models. In Proceedings of the XVIIth International Biometrics Conference 1 19–33. Citeseer.
 Majumdar and Gelfand (2007) [author] Majumdar, AnandamayeeA. Gelfand, Alan EA. E. (2007). Multivariate spatial modeling for geostatistical data using convolved covariance functions. Mathematical Geology 39 225–245.

Matheron (1973)
[author] Matheron, GeorgesG. (1973). The intrinsic random functions and their applications. Advances in applied probability 439–468.
 Neal (2011) [author] Neal, Radford M.R. M. (2011). Handbook of Markov Chain Monte Carlo. Chapman & Hall CRC.
 Ravel et al. (2015) [author] Ravel, VanessaV., Streja, ElaniE., Molnar, Miklos ZM. Z., Rezakhani, SepidehS., Soohoo, MelissaM., Kovesdy, Csaba PC. P., KalantarZadeh, KamyarK. Moradi, HamidH. (2015). Association of aspartate aminotransferase with mortality in hemodialysis patients. Nephrology Dialysis Transplantation gfv310.
 Schmidt and Gelfand (2003) [author] Schmidt, Alexandra MA. M. Gelfand, Alan EA. E. (2003). A Bayesian coregionalization approach for multivariate pollutant data. Journal of Geophysical Research: Atmospheres 108.

Ver Hoef, Cressie and
Barry (2004)
[author] Ver Hoef, Jay MJ. M., Cressie, NoelN. Barry, Ronald PaulR. P. (2004). Flexible spatial models for kriging and cokriging using moving averages and the Fast Fourier Transform (FFT). Journal of Computational and Graphical Statistics 13 265–282.
 Yadrenko (1987) [author] Yadrenko, MIM. (1987). Correlation theory of stationary and related random functions, Vol I, Basic results.
Appendix A Implementation
Consider the joint longitudinalsurvival likelihood function, , introduced in equation 3.1. Let be a vector of all model parameters with the joint prior distribution . The posterior distribution of the parameter vector can be written as
(A.1) 
where and denote longitudinal and timetoevent data respectively, and is the joint model likelihood function (equation 3.1).
The posterior distribution of the parameters in our proposed joint model is not available in closed form. Hence, samples from the posterior distribution of the model parameters are obtained via Markov Chain Monte Carlo (MCMC) methods. In particular, we use the Hamiltonian Monte Carlo (Neal, 2011) to draw samples from the posterior distribution. Prior distributions on parameters of the joint model were explained in details under the longitudinal and survival component specification, and we assume independence among model parameters in the prior (ie. is the product of the prior components specified previously). We provide further detail on less standard techniques for sampling from the posterior distribution when using a multivariate GP prior and we explain how to evaluate the survival portion of the likelihood function when timevarying covariates are incorporated into the model.
a.1 Evaluation of the Longitudinal Likelihood
Consider equation (3.7) and equation (3.8) where we introduced a flexible longitudinal model to simultaneously model multiple longitudinal biomarkers by using the Gaussian process prior. By marginalizing over in equation (3.7), one can show
(A.2) 
In order to sample from the posterior distribution of the parameters of the joint longitudinalsurvival model introduced in Section 3
, at each iteration of the MCMC, we need to compute the log posterior probability. Computing the log posterior probability involves evaluation of
and . This requires a memory space of and a computation time of per subject . Consider matrix , where the element of is . can be precomputed prior to starting the MCMC process. Further, by using the eigenvalue decomposition technique, one may make the calculation of the matrix determinant and inverse of the covariance matrix more computationally efficient. Using a similar idea proposed by Flaxman et al. (2015), we propose the following fast multivariate Gaussian process computation approach. We start precomputing the matrix. Also, we can precompute the eigenvale decomposition of this matrix prior to starting the MCMC process. Consider an eigenvalue decomposition of the following formwhere is a matrix of eigenvectors and is a diagonal matrix of eigenvalues. For a scalar , the eigenvalue decomposition of is of the form
At each iteration of the MCMC, one can obtain the eigenvalue decomposition of the matrix ,that is of the form
where is a matrix of eigenvectors and is a diagonal matrix of eigenvalues. One can then compute efficiently the logdeterminant of the crosscovariance matrix, , as
(A.3) 
and the inverse of the crosscovariance matrix, , can be efficiently computed as
(A.4) 
In equation (A.4), computation of the inverse of the term in the middle is very easy as it’s a diagonal matrix. Using our proposed efficient computation technique introduced here, we noticed a 30 times faster computation speed in our simulations.
a.2 Evaluation of the Survival Likelihood
We evaluate the survival likelihood using piecewise integration. Consider the survival time for subject that is denoted by and is distributed according to a Weibull distribution with shape parameter and scale parameter , where
Comments
There are no comments yet.