
Examples

Cohort data

The file cohort.dat, shown below, contains simulated data from a cohort study of 5000 individuals genotyped at five SNPs. The observation time and event indicator are specified in the columns titled “Time” and “Status”, respectively. The “Smoking” column contains environmental covariate data, and the columns SNP1-SNP5 represent the five SNP sites. Missing SNP values are indicated by ‘9’.

cohort.dat:   Example cohort data file for HAPSTAT input.
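Before loading a file into HAPSTAT, it can help to check how many genotypes are missing at each SNP. The following Python sketch is one way to do that, under assumptions that the text above does not state: the file is whitespace-delimited, its first row holds the column names, and genotypes are coded 0/1/2. The function name count_missing is hypothetical and the sample rows are invented for illustration; only the column titles and the missing code ‘9’ come from the description above.

```python
import io

# Invented excerpt in the layout described above (NOT taken from cohort.dat).
# Assumptions: whitespace-delimited, header row, genotypes coded 0/1/2.
SAMPLE = """\
Time Status Smoking SNP1 SNP2 SNP3 SNP4 SNP5
4.2 1 0 0 1 9 2 0
7.9 0 1 1 9 0 0 1
3.1 1 1 2 0 0 9 9
"""

MISSING = "9"  # missing-genotype code stated in the text


def count_missing(handle, prefix="SNP", missing=MISSING):
    """Return {column name: number of missing genotypes} for SNP columns."""
    header = handle.readline().split()
    snp_cols = [i for i, name in enumerate(header) if name.startswith(prefix)]
    counts = {header[i]: 0 for i in snp_cols}
    for line in handle:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        for i in snp_cols:
            if fields[i] == missing:
                counts[header[i]] += 1
    return counts


missing_counts = count_missing(io.StringIO(SAMPLE))
print(missing_counts)
```

To run the same check on the real file, pass an open file handle, for example open("cohort.dat"), in place of the io.StringIO object, provided the layout assumptions hold.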

Select the tab labeled Frequencies in the left panel. In the right panel, select Hardy-Weinberg disequilibrium, check both the Cohort and Subcohort samples, and click on Calculate. The results are displayed on the left; see Figure 4.1.

Select the tab labeled Additive effects. To estimate dominant effects under Hardy-Weinberg disequilibrium, change the Assumptions settings by selecting the Dominant model and Hardy-Weinberg disequilibrium from the dropdowns. Click the icon to activate the Select effects dialog. Change Threshold to 0.01 and the sample to Cohort. Click on Calculate to obtain the display in Figure 4.2.

The results for the cohort data using the options shown in Figures 4.1 and 4.2 are given in cohort.out.

Cross-sectional data

The file cross-sectional.dat, shown below, contains simulated data from a cross-sectional study of 5000 individuals genotyped at six SNPs. Approximately 5% of SNP values are missing. The column titled “Trait” contains disease-related trait data. The columns “Age”, “Gender” and “Exposure” contain environmental data and the columns SNP1-SNP6 represent the six SNP sites. Missing SNP values are denoted by ‘*’.

cross-sectional.dat:   Example cross-sectional data file for HAPSTAT input.

Click the icon to create two genes, with SNP1-SNP3 as Gene 1 and SNP4-SNP6 as Gene 2. Select the trait data and the three environmental variables and click Continue.

Select the tab labeled Additive effects in the left panel. Click the toolbar icon to activate the Select effects dialog and add the interactions 001×101, 001×101×Age, 001×101×Gender and 001×101×Exposure. Click on Calculate to obtain the display shown in Figure 4.3.

Results are provided in the file cross-sectional.out.

Longitudinal data

The file longitudinal.dat contains simulated data from a longitudinal study of 1000 individuals genotyped at five SNPs, with a quantitative trait measured at five regular time points. The column titled “Subject” provides the identifier of the individual. The trait is specified in the column titled “Weight”. The “Time” covariate specifies the time point at which the measurement was taken. Columns SNP1-SNP5 represent the five SNP sites.

longitudinal.dat:   Example longitudinal data file for HAPSTAT input.
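Because each individual contributes several rows, one per time point, it can be useful to group the rows by “Subject” and confirm that every individual has the expected five measurements. The Python sketch below illustrates this under assumptions not stated in the text: a whitespace-delimited file with a header row and numeric “Time” values. The function name measurements_by_subject is hypothetical, and the sample rows are invented and deliberately incomplete.

```python
from collections import defaultdict

# Invented rows in the layout described above (NOT taken from longitudinal.dat).
# Assumptions: whitespace-delimited, header row, numeric Time values.
SAMPLE = """\
Subject Weight Time SNP1 SNP2 SNP3 SNP4 SNP5
1 61.2 1 0 1 0 2 0
1 61.9 2 0 1 0 2 0
2 70.4 1 1 0 0 1 1
2 71.0 2 1 0 0 1 1
2 71.3 3 1 0 0 1 1
"""

EXPECTED_TIME_POINTS = 5  # number of time points stated in the text


def measurements_by_subject(text):
    """Map each subject ID to a time-sorted list of (time, weight) pairs."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    subj = header.index("Subject")
    time = header.index("Time")
    trait = header.index("Weight")
    grouped = defaultdict(list)
    for line in lines[1:]:
        fields = line.split()
        grouped[fields[subj]].append((float(fields[time]), float(fields[trait])))
    return {s: sorted(rows) for s, rows in grouped.items()}


by_subject = measurements_by_subject(SAMPLE)
# Subjects whose number of rows differs from the expected number of time points.
incomplete = sorted(s for s, rows in by_subject.items()
                    if len(rows) != EXPECTED_TIME_POINTS)
print(incomplete)
```

In the invented sample both subjects are flagged, since neither has five rows; on a complete file the list would be empty.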

Select the tab labeled Additive effects and click on Calculate to obtain the result shown in Figure 4.4. Click the icon to activate the Select effects dialog and select the tab labeled Random effects. Add the “Time” covariate to the random effects selection as shown in Figure 4.5. Click on Calculate to obtain the display in Figure 4.6.

Results are provided in the file longitudinal.out.

Analysis of untyped SNPs with external trio data

The file case-control.2.dat contains the same data as case-control.dat, except that SNP3 is omitted and treated as untyped. Select the menu option File»Open»Case-control data file to open the file case-control.2.dat. Next, select the menu option File»Open»External data file»Trio data file to open the file trio.dat. Select SNP1-SNP5 as Gene variables from trio.dat as shown in Figure 1.4.

Click the tab labeled case-control.2.dat. Select the column titled “Status” as the disease status and the columns “Age” and “Gender” as environmental variables. Click the icon twice to create three genes. Select columns “SNP1” and “SNP2” as Gene 1 variables. As SNP3 is present in the external file but not in the study file, we type “SNP3” in the variable box for Gene 2. Select columns “SNP4” and “SNP5” as Gene 3 variables. See Figure 4.7 for illustration. Click Continue to proceed.
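The variable name typed into the Gene 2 box is simply a SNP that appears in the external file but not in the study file. As a rough illustration, the Python sketch below finds such names by comparing two lists of column names. The function name untyped_snps is hypothetical, and the column lists are assumptions based on the names mentioned in this section; the actual files may contain additional columns or a different column order.

```python
def untyped_snps(study_header, external_header, prefix="SNP"):
    """Return SNP columns present in the external file but absent from the study file."""
    study = {name for name in study_header if name.startswith(prefix)}
    return [name for name in external_header
            if name.startswith(prefix) and name not in study]


# Assumed column names, following the description of case-control.2.dat and trio.dat.
study_cols = ["Status", "Age", "Gender", "SNP1", "SNP2", "SNP4", "SNP5"]
external_cols = ["SNP1", "SNP2", "SNP3", "SNP4", "SNP5"]

untyped = untyped_snps(study_cols, external_cols)
print(untyped)
```

With these assumed headers the only name returned is SNP3, which matches the variable entered manually for Gene 2.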

Click the icon to activate the Select effects dialog. When external data is present, HAPSTAT will determine frequencies based on all trios or unrelated individuals by default. The default sample may be changed via the dropdown menu adjacent to the Threshold box. To exclude external data from analysis, uncheck the box labeled Use external data.

In the Select effects dialog, remove the environmental effects by clicking on the Environment heading followed by the Remove button. Remove all effects and interactions for Gene 1 and Gene 3 by clicking on the Gene 1 heading followed by the Remove button, and then doing the same for the Gene 3 heading. Remove the interactions 0*Age and 0*Gender from Gene 2. See Figure 4.8. Click OK on the Select effects dialog. Click on Calculate to obtain the display in Figure 4.9. Results are provided in trio.out.

We can compare the result of the untyped SNP analysis to the typed SNP analysis of the data in case-control.dat. Select the tab trio.dat and close the file via the menu option File»Close or the icon on the toolbar. Select the menu option File»Open»Case-control data file to open the file case-control.dat. Select the column titled “Status” as the disease status and the columns “Age” and “Gender” as environmental variables. Click the icon twice to create three genes. Select columns “SNP1” and “SNP2” as Gene 1 variables, column “SNP3” as the Gene 2 variable and columns “SNP4” and “SNP5” as Gene 3 variables. Click Continue to proceed. Click the icon to activate the Select effects dialog and remove effects as in Figure 4.8. Click OK on the Select effects dialog followed by Calculate. The result is shown in Figure 4.10.

Analysis of untyped SNPs with external data of unrelated individuals

The file unrelated.dat, shown below, contains simulated data for 1000 unrelated individuals genotyped at five SNPs. Select the menu option File»Open»External data file»Unrelated data file to open the file unrelated.dat. Select the columns SNP1-SNP5 as Gene variables. Select the menu option File»Open»Case-control data file to open the file case-control.2.dat and select variables from the file as shown in Figure 4.7. Click Continue to proceed. Click the icon to activate the Select effects dialog and remove effects from the selection as shown in Figure 4.8. Click Calculate to obtain the result shown in Figure 4.11. Results are provided in unrelated.out.

unrelated.dat:  External data file of unrelated individuals.