Double resampling for meta-analysis with random effects
SAS Program for Sample Size Calculation in Stratified Case-Cohort Designs
MATLAB Codes for Estimation and Inference in Some Semiparametric Models

Joint analysis of longitudinal data with recurrent events and terminal event
Description
This program analyzes longitudinal data and recurrent events, both subject to
a terminal event. Transformation models are adopted for the intensity function
of the recurrent events and the hazard rate function of the terminal event.
The NPMLE is calculated using a recursive formula within the EM algorithm. The
variance estimates are obtained by inverting the observed information matrix.
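As a minimal illustration of variance estimation by inverting the observed
information, the Python sketch below uses a deliberately simple parametric
stand-in (an exponential model with right censoring), not the semiparametric
NPMLE computed by this program. The function names and the toy data are
hypothetical.

```python
import math

def exp_loglik(rate, times, events):
    """Log-likelihood of a right-censored exponential sample."""
    d = sum(events)                 # number of observed events
    total = sum(times)              # total follow-up time
    return d * math.log(rate) - rate * total

def fit_exponential(times, events):
    """Return the MLE of the rate and its variance estimate."""
    d = sum(events)
    rate_hat = d / sum(times)       # closed-form maximizer
    # Observed information = minus the second derivative of the
    # log-likelihood at the MLE, which here equals d / rate^2.
    info = d / rate_hat ** 2
    return rate_hat, 1.0 / info

def numeric_information(rate, times, events, h=1e-4):
    """Finite-difference check of the observed information."""
    f = lambda r: exp_loglik(r, times, events)
    return -(f(rate + h) - 2.0 * f(rate) + f(rate - h)) / h ** 2

times = [2.0, 3.5, 1.2, 4.0, 0.8]   # toy follow-up times (made up)
events = [1, 0, 1, 1, 0]            # 1 = event, 0 = censored
rate_hat, var_hat = fit_exponential(times, events)
```

With several parameters the same idea applies, with the scalar information
replaced by the negative Hessian matrix and the reciprocal by a matrix inverse.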
Reference
A manuscript by Kim, Zeng, Li and Chambless (available upon request).
Data Input
A sample data set is provided. 1) Data file "Ydata.RData" consists of:
"ID"= subject id, "Y"= repeated measurements, and covariates "intercept",
"Male" and "Age".
2) Another data file "Edata.RData" consists of:
"ID"= subject id, "time"= observed recurrent and terminal event times,
"DeltaR"= indicator 1=recurrent, 0=terminal/censoring, "DeltaT"= indicator
1=terminal 0=censoring, and covariates "Male" and "Age".
Use of Codes
"main.R" is the main program: it reads the data, runs the function programs
("func.R"), and prints a reportable table along with information on which
transformation model has been used. You can change the transformation class
(LOG or BoxCox) and parameters (rH and rG) in "func.R" before you run
"main.R". Make sure that "func.R" is in the working directory (or adjust the
path) so that the call "source('func.R')" can find it.

Analysis of cross-hazard survival data with power transformation
Description
This program analyzes survival data in which the hazard functions may cross
each other. A power-transformation model is used for the hazard rate function.
The NPMLE is calculated using a recursive formula, and the variance estimates
are obtained by inverting the observed information matrix.
Reference
A general theory and one example are given in Zeng and Lin (2006, JRSSB).
Data Input
The data contain the event variable, censoring indicator, the list of covariates
for baseline hazard and the list of covariates for power function.
Use of Codes
"main.m" is the main program; "loglik.m" is the function for computing
the likelihood function; "recursive.m" is for recursively calculating
the nuisance parameter; "NR.m" is the Newton-Raphson iteration for
the profile likelihood function; "simulation.m" is the program for simulating
data.
Sample Data
Available upon request.

Analysis of bivariate survivals using transformation models
Description
This program analyzes bivariate (essentially multivariate) survival data
using flexible transformation models. Different transformations may be used
for different types of event, and random effects are shared among events. The
QEM algorithm is used for computing the NPMLE, and the variance estimates are
obtained by inverting the observed information matrix from the Louis formula.
Reference
A general theory and one example are given in Zeng and Lin (2006, JRSSB).
Data Input
The data contain the survival time, the censoring indicator, and the list of
covariates for the first type of event, followed by the same list for the
second type of event.
Use of Codes
"main.m" is the main program; "Estep.m" and "Mstep.m" are respectively the
E-step and M-step of the QEM algorithm; "Covest.m" is the function for
obtaining the variance estimates; "Gtransformation.m" is the function defining
the transformations; and "simulation.m" is the program for generating data.
Sample Data
Available upon request.

Analysis of recurrent event using transformation models
Description
This program analyzes recurrent event data with dropout. The Andersen-Gill
intensity model is used with a subject-specific frailty in the model. The
transformation models give a wide class of flexible models capturing the
complex intensity process. A combination of the EM algorithm (treating random
effects as missing data) and the recursive calculation is used to derive the
NPMLE, and the variance estimates are obtained by inverting the observed
information matrix using the Louis formula.
Reference
A general theory is given in Zeng and Lin (2006, JRSSB).
Data Input
The data contain the ID variable, the event variable, the at-risk indicator,
the list of covariates for fixed effects and the ones for random effects.
Use of Codes
"main.m" is the main program; "Estep.m" and "Mstep.m" are respectively the
E-step and M-step of the QEM algorithm; "covest.m" is the function for
obtaining the variance estimates; "Gfun.m" is the function defining the
transformations; "recursive.m" is the recursive function used in the M-step;
and "simulation.m" is the program for generating data.
Sample Data
Available upon request.

Analysis of nonparametrically transformed repeated measurements
and survival data
Description
This program fits a joint model of repeated measurements and survival data.
In particular, the repeated measurements follow a mixed effects model after
an unknown transformation. The NPMLE is calculated using an optimization
routine, and the variance estimates are obtained by inverting the observed
information matrix.
Reference
One example is given in Zeng and Lin (2006, JRSSB).
Data Input
The data contain ID, repeated measurements, covariates for repeated measurements,
survival times, censoring indicator and covariates for survival event.
Use of Codes
"main.m" is the main program; "gauher.m" is the function for obtaining the
Gauss-Hermite abscissae and weights; "fgh.m" is the function providing the
objective function value, its gradient and its Hessian matrix;
"GTtransform.m" defines the transformation for the survival event;
"GRtransform.m" defines the model structure for the repeated measurements;
and "simulation.m" is the program for simulating data.
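To show how Gauss-Hermite abscissae and weights of the kind supplied by
"gauher.m" are typically used, the Python sketch below approximates an
expectation over a normal random effect. It hard-codes the standard
three-point rule; the number of nodes used by the actual program is not
stated here, and the function name is hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
WEIGHTS = [math.sqrt(math.pi) / 6.0,
           2.0 * math.sqrt(math.pi) / 3.0,
           math.sqrt(math.pi) / 6.0]

def normal_expectation(g, sigma):
    """Approximate E[g(b)] for b ~ N(0, sigma^2)."""
    # Substituting b = sqrt(2) * sigma * x turns the normal density
    # into exp(-x^2) / sqrt(pi), which matches the rule's weight.
    total = sum(w * g(math.sqrt(2.0) * sigma * x)
                for x, w in zip(NODES, WEIGHTS))
    return total / math.sqrt(math.pi)

# Second moment of a N(0, 2^2) random effect.
second_moment = normal_expectation(lambda b: b * b, 2.0)
```

A three-point rule integrates polynomials up to degree five exactly; likelihood
contributions are not polynomial, so more nodes are normally needed in practice.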
Sample Data
Available upon request.

Analysis of repeated measurements and survival event using Cox model
Description
This program analyzes data with repeated measurements and a terminal event.
The former can be usual longitudinal outcomes such as CD4 count, blood
pressure or quality of life. The latter can be a survival time or a dependent
censoring time, and can itself be subject to right censoring. The method
uses one model for each outcome: a linear mixed effects model for the
repeated measurements and the proportional hazards model for the terminal
event. Shared random effects account for the dependence between the two types
of outcomes. The estimation is based on nonparametric maximum likelihood and
is implemented with the EM algorithm. The variance estimates are obtained
using either the profile likelihood function or the Louis formula.
Reference
One application is
in Zeng and Cai (2005, Lifetime Data Analysis). A general theory
is given in Zeng and Cai (2005, Annals of Statistics).
Data Input
Two data sets are input. The first one includes subject ID, observed repeated
measurements, covariate matrix for fixed effects and covariate matrix for
random effects; the second one includes subject ID, the observed event time,
the censoring indicator, covariates for fixed effects and covariate matrix
for random effects. "Position" records the starting position of each subject
in the first data set and "nsub" records the number of repeated measurements
for each subject.
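The sketch below shows one way "Position" and "nsub" could be derived from
the ID column of the first data set. It assumes the rows are sorted by subject
and that positions are 1-based, as in MATLAB indexing; both are assumptions,
and the function name and IDs are made up for illustration.

```python
def index_subjects(ids):
    """Return (position, nsub) for subject IDs grouped by subject."""
    position, nsub = [], []
    previous = object()             # sentinel that matches no real ID
    for row, subject in enumerate(ids, start=1):
        if subject != previous:
            position.append(row)    # first row of a new subject (1-based)
            nsub.append(0)
            previous = subject
        nsub[-1] += 1               # one more measurement for this subject
    return position, nsub

ids = [101, 101, 101, 102, 103, 103]    # toy ID column
position, nsub = index_subjects(ids)
```

Here subject 101 starts at row 1 with three measurements, subject 102 at row 4
with one, and subject 103 at row 5 with two.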
Use of Codes
"main.m" is the main program; "Eest.m" and "Mest.m"
are the E-step and M-step in the EM algorithm respectively;
"coxphm.m" is the Cox model for fitting the survival event only;
"likVec.m" calculates the log-likelihood vector for given
parameters; "profEest2.m" is the E-step in calculating the profile
likelihood function and "profCov2.m" is the program for computing
the variance using the profile likelihood function.
Sample Data
Available upon request.

Analysis of correlated events using proportional odds model
Description
A commonly used alternative to the proportional hazards model is the
proportional odds model. This program analyzes correlated data using the
latter with random effects. Suitable data often show survival curves in two
groups that become close to each other over time. Nonparametric maximum
likelihood estimation is used, and the variance estimation is based on the
inverted observed information matrix. An optimization search algorithm is
used for the computation.
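The Python sketch below illustrates why the proportional odds model suits
groups that draw together over time: the odds ratio stays constant while the
hazard ratio shrinks toward one. It omits the random effects, uses a single
binary covariate, and assumes a baseline cumulative odds function H(t) = t
purely for illustration; the function names are hypothetical.

```python
import math

def po_survival(t, beta, z, cum_odds=lambda t: t):
    """Survival under a proportional odds model without random effects."""
    theta = math.exp(beta * z)
    return 1.0 / (1.0 + theta * cum_odds(t))

def po_hazard_ratio(t, beta, cum_odds=lambda t: t):
    """Hazard ratio of z = 1 versus z = 0 at time t."""
    theta = math.exp(beta)
    H = cum_odds(t)
    # hazard(t | z) = theta_z * H'(t) / (1 + theta_z * H(t)); H'(t) cancels.
    return theta * (1.0 + H) / (1.0 + theta * H)

def failure_odds(t, beta, z):
    """Odds of having failed by time t."""
    s = po_survival(t, beta, z)
    return (1.0 - s) / s

beta = math.log(2.0)                # odds ratio of 2 between the groups
ratios = [po_hazard_ratio(t, beta) for t in (0.0, 1.0, 10.0, 100.0)]
```

The hazard ratio starts at the odds ratio when H(t) = 0 and decreases toward
one as H(t) grows, whereas under proportional hazards it would stay fixed.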
Reference
The method and examples are given in Zeng, Lin and Yin (2005, JASA).
Data Input
The data include subject ID, covariate matrix, events and censoring
indicator.
Use of Codes
The provided codes are from one simulation study. "main.m" is the main
program. "fgh.m" is the function evaluating the likelihood and its
derivatives. "simulation.m" is the program for simulating data.
Sample Data
Available upon request.

Analysis of cure survival events using transformation model
Description
This program analyzes survival events using a broad class of models when some
subjects are observed to be cured. Suitable data often show a survival curve
that reaches a plateau after long-term follow-up. The models for fitting the
data are transformed promotion time models, and they can be any linear
transformation model, including the proportional hazards model and the
proportional odds model. The algorithm for computing the nonparametric maximum
likelihood estimates is based on a recursive formula to profile the
log-likelihood function. The variance estimate comes from the profile
likelihood function.
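To make the plateau concrete, the Python sketch below evaluates the standard
promotion time cure model, S(t | z) = exp(-exp(beta * z) * F(t)) with F a
proper distribution function. This is only the untransformed proportional
hazards case, not the general transformed class of the reference, and the
choice F(t) = 1 - exp(-t) and the function names are assumptions made for
illustration.

```python
import math

def promotion_time_survival(t, beta, z, F=lambda t: 1.0 - math.exp(-t)):
    """Population survival S(t | z) = exp(-exp(beta * z) * F(t))."""
    return math.exp(-math.exp(beta * z) * F(t))

def cure_fraction(beta, z):
    """Limit of the survival curve as t grows, since F(t) tends to 1."""
    return math.exp(-math.exp(beta * z))

beta = 0.5
curve = [promotion_time_survival(t, beta, 1.0) for t in (0.0, 1.0, 5.0, 50.0)]
plateau = cure_fraction(beta, 1.0)
```

The curve starts at one, decreases, and levels off at the cure fraction
instead of dropping to zero.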
Reference
The method and examples are given in Zeng, Yin and Ibrahim (2006, JASA).
Data Input
The data include observed events, censoring indicator and covariates.
The transformation for fitting data should be specified as well.
Use of Codes
The provided codes were used for analyzing a real data set from a cancer
study. "main.m" is the main program. "Gfun.m" is the function evaluating the
transformation. "GHbetaalpha.m" computes the derivatives of the profile
likelihood function. "recursive.m" is the recursive program for profiling
the baseline hazard function. The details can be found in the given
reference.
Sample Data
Available upon request.

Analysis of recurrent events and terminal event
Description
This program analyzes recurrent events with a terminal event. The latter
occurs when subjects drop out prematurely or are censored administratively.
The Andersen-Gill model and the proportional hazards model can be used to
model these two processes. Shared random effects account for their
dependence. The estimation is based on nonparametric maximum likelihood and
is implemented using the EM algorithm. The inference is based on the profile
likelihood function or the variance estimated by the Louis formula.
Reference
The manuscript is available
upon request.
Data Input
"ID": subject id;
"Y": observed recurrent events and terminal event; "DeltaR":
indicator 1=recurrent, 0=terminal; "X": covariate matrix; "tau":
terminal event; "DeltaT": censoring indicator for terminal event;
"ZR": covariate matrix for random effects in the recurrent event
model; "ZT": covariate matrix for random effects in the terminal
event model.
Use of Codes
The codes are from a simulation study. "main.m" is the main program;
"simulate.m" is the program for simulating data; "Estep.m" and "Mstep.m" are
the respective E-step and M-step of the EM algorithm; "CovEst.m" is
the program for covariance estimation.
Sample Data
Available upon request.

Description
This program analyzes the haplotype effect and the haplotype-environment
interaction in a nested case-control or case-cohort study. The proportional
hazards model is used to model the relationship between haplotype and
onset of disease/cancer. Since only the genotype is observed instead
of the haplotype itself, the EM algorithm is used to derive estimation
and inference for this missing covariate problem. The haplotype frequencies
are allowed to follow either Hardy-Weinberg equilibrium or disequilibrium.
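The missing-haplotype idea can be sketched in Python with the classical EM
estimate of haplotype frequencies from unphased genotypes. The sketch assumes
two biallelic SNPs and Hardy-Weinberg equilibrium and involves no disease
model, so it is far simpler than the program described here; the function
names and genotype coding (count of the "1" allele at each SNP) are
hypothetical.

```python
from itertools import product

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]   # alleles at SNP 1 and SNP 2

def compatible_pairs(genotype):
    """Ordered haplotype pairs whose allele counts match the genotype."""
    return [(h, k) for h, k in product(HAPLOTYPES, repeat=2)
            if (h[0] + k[0], h[1] + k[1]) == tuple(genotype)]

def em_haplotype_freq(genotypes, n_iter=200):
    """EM estimate of haplotype frequencies, assuming Hardy-Weinberg."""
    freq = {h: 0.25 for h in HAPLOTYPES}        # uniform starting values
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for g in genotypes:
            pairs = compatible_pairs(g)
            # E-step: posterior probability of each compatible ordered pair.
            probs = [freq[h] * freq[k] for h, k in pairs]
            total = sum(probs)
            for (h, k), p in zip(pairs, probs):
                counts[h] += p / total
                counts[k] += p / total
        # M-step: expected haplotype counts divided by 2 * sample size.
        freq = {h: c / (2.0 * len(genotypes)) for h, c in counts.items()}
    return freq

# Toy sample: the last subject is heterozygous at both SNPs, so the
# haplotype pair is ambiguous and is resolved through the frequencies.
genotypes = [(0, 0)] * 3 + [(2, 2)] * 3 + [(1, 1)]
freq = em_haplotype_freq(genotypes)
```

Only the double heterozygote is ambiguous with two SNPs; with more SNPs the
number of compatible pairs grows quickly, which is why screening for the most
frequent haplotypes helps reduce the computational burden.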
Reference
The method and its application to a real data set are described in Lin,
Zeng and Millikan (2005, Genetic Epidemiology).
Data Input
Input variables include subject id, observed
events, event censoring indicator, genotype for sampled subjects,
indicator for sampled subjects, covariate matrix.
Use of Codes
The main program is "program.m". These codes were used for analyzing one
real data set, so the variables in "program.m" bear the real names from that
data set. "preselect.m" is the screening program that selects the most
frequent haplotypes for analysis to reduce the computational burden.
"Hapfreq.m" and "HapfreqEM.m" are used to estimate haplotype frequencies.
"Estep.m", "Mstep.m" and "profcovest.m" are the programs for the EM algorithm
and variance estimation.
Sample Data
Available upon request.