Statistical analysis of haplotype-disease association
The file cohort.dat, shown below, contains simulated data from a cohort study of 5000 individuals genotyped at five SNPs. The observation time and event indicator are specified in the columns titled “Time” and “Status”, respectively. The “Smoking” column contains environmental covariate data, and the columns SNP1-SNP5 represent the five SNP sites. Missing SNP values are indicated by ‘9’.
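Before loading a data file, it can help to check how many SNP values carry the missing code. The following is a minimal sketch and not part of HAPSTAT. It assumes a whitespace-delimited text file with a header row and SNP columns whose names start with "SNP"; the function name `count_missing` and the sample rows are hypothetical.

```python
import io

def count_missing(handle, missing_code="9", snp_prefix="SNP"):
    """Count missing-value codes per SNP column.

    Assumes whitespace-delimited text with a header row. This layout is an
    assumption for illustration, not a documented HAPSTAT format.
    """
    header = handle.readline().split()
    snp_idx = [i for i, name in enumerate(header) if name.startswith(snp_prefix)]
    counts = {header[i]: 0 for i in snp_idx}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for i in snp_idx:
            if fields[i] == missing_code:
                counts[header[i]] += 1
    return counts

# Tiny made-up sample, for illustration only.
sample = io.StringIO(
    "Time Status Smoking SNP1 SNP2\n"
    "3.2 1 0 0 9\n"
    "5.0 0 1 9 9\n"
    "1.7 1 1 2 1\n"
)
print(count_missing(sample))  # {'SNP1': 1, 'SNP2': 2}
```

For a file that marks missing values differently, pass the relevant symbol as `missing_code`.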
Select the tab labeled Frequencies in the left panel. In the right panel, select Hardy-Weinberg disequilibrium, check both the Cohort and Subcohort samples, and click on Calculate. The results are displayed on the left.
Select the tab labeled Additive effects. To estimate dominant effects under Hardy-Weinberg disequilibrium, change the Assumptions settings by selecting the Dominant model and Hardy-Weinberg disequilibrium from the dropdowns. Click the toolbar icon to activate the Select effects dialog. Change Threshold to 0.01 and the sample to Cohort, then click on Calculate to display the results.
The results for the cohort data using the options shown in Figures 4.1 and 4.2 are given in cohort.out.
The file cross-sectional.dat, shown below, contains simulated data from a cross-sectional study of 5000 individuals genotyped at six SNPs. Approximately 5% of SNP values are missing. The column titled “Trait” contains disease-related trait data. The columns “Age”, “Gender” and “Exposure” contain environmental data and the columns SNP1-SNP6 represent the six SNP sites. Missing SNP values are denoted by ‘*’.
Click the icon to create two genes, with SNP1-SNP3 as Gene 1 and SNP4-SNP6 as Gene 2. Select the trait data and the three environmental variables, and click Continue.
Select the tab labeled Additive effects in the left panel. Click the toolbar icon to activate the Select effects dialog and add the interactions 001×101, 001×101×Age, 001×101×Gender and 001×101×Exposure. Click on Calculate to display the results.
Results are provided in the file cross-sectional.out.
The file longitudinal.dat contains simulated data from a longitudinal study of 1000 individuals with a quantitative trait measured at five regular time points and with five SNPs. The column titled “Subject” provides the identifier of the individual. The trait is specified in the column titled “Weight”. The “Time” covariate specifies the time point at which the measurement was taken. Columns SNP1-SNP5 represent the five SNP sites.
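The layout described above has one row per subject and time point, so a quick completeness check is to group rows by subject and count the time points. This is a minimal sketch and not part of HAPSTAT. It assumes the rows have already been read into dictionaries keyed by column name; the function name `measurements_per_subject` and the sample values are hypothetical.

```python
from collections import defaultdict

def measurements_per_subject(rows, id_column="Subject", time_column="Time"):
    """Map each subject identifier to the sorted list of its time points."""
    times = defaultdict(list)
    for row in rows:
        times[row[id_column]].append(float(row[time_column]))
    return {subject: sorted(points) for subject, points in times.items()}

# Made-up records, for illustration only.
rows = [
    {"Subject": "1", "Time": "1", "Weight": "61.2"},
    {"Subject": "1", "Time": "2", "Weight": "61.9"},
    {"Subject": "2", "Time": "1", "Weight": "74.5"},
]
expected_points = 2  # would be 5 for the study described above
by_subject = measurements_per_subject(rows)
incomplete = [s for s, pts in by_subject.items() if len(pts) != expected_points]
print(incomplete)  # ['2']
```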
Select the tab labeled Additive effects and click on Calculate to obtain the results. Click the toolbar icon to activate the Select effects dialog and select the tab labeled Random effects. Add the “Time” covariate to the random effects selection.
Results are provided in the file longitudinal.out.
The file case-control.2.dat contains the same data as case-control.dat, with the exception of SNP3, which we will treat as untyped. Select the menu option File»Open»Case-control data file to open the file case-control.2.dat. Next, select the menu option File»Open»External data file»Trio data file to open the file trio.dat. Select SNP1-SNP5 as Gene variables from trio.dat.
Click the tab labeled case-control.2.dat. Select the column titled “Status” as the disease status and the columns “Age” and “Gender” as environmental variables. Click the icon twice to create three genes. Select columns “SNP1” and “SNP2” as Gene 1 variables. Because SNP3 is present in the external file but not in the study file, type “SNP3” in the variable box for Gene 2. Select columns “SNP4” and “SNP5” as Gene 3 variables.
Click the toolbar icon to activate the Select effects dialog. When external data are present, HAPSTAT determines frequencies based on all trios or unrelated individuals by default. The default sample may be changed via the dropdown menu adjacent to the Threshold box. To exclude external data from the analysis, uncheck the box labeled Use external data.
In the Select effects dialog, remove the environmental effects by clicking on the Environment heading and then the Remove button. Remove all effects and interactions for Gene 1 and Gene 3 by clicking on the Gene 1 and Gene 3 headings, each followed by the Remove button. Remove the interactions 0*Age and 0*Gender from Gene 2.
We can compare the result of the untyped SNP analysis with the typed SNP analysis of the data in case-control.dat.
Select the tab trio.dat and close the file via the menu option File»Close or the corresponding icon on the toolbar. Select the menu option File»Open»Case-control data file to open the file case-control.dat. Select the column titled “Status” as the disease status and the columns “Age” and “Gender” as environmental variables. Click the icon twice to create three genes. Select columns “SNP1” and “SNP2” as Gene 1 variables, column “SNP3” as the Gene 2 variable and columns “SNP4” and “SNP5” as Gene 3 variables. Click Continue to proceed. Click the toolbar icon to activate the Select effects dialog and remove effects as in the untyped SNP analysis.
The file unrelated.dat, shown below, contains simulated data for 1000 unrelated individuals genotyped at five SNPs.
Select the menu option File»Open»External data file»Unrelated data file to open the file unrelated.dat. Select the columns SNP1-SNP5 as Gene variables. Select the menu option File»Open»Case-control data file to open the file case-control.2.dat and select variables from the file as in the trio example. Click the toolbar icon to activate the Select effects dialog and remove variables from the selection as in the trio example.