statistical analysis of haplotype-disease association
HAPSTAT estimates the effects of haplotypes, environmental covariates, and haplotype-environment interactions through regression modeling. For quantitative traits, the linear regression model is used. For binary traits, the logistic regression model is used, and the regression parameters are log odds ratios. For age-at-onset data, the Cox proportional hazards model is used, and the regression parameters are log hazard ratios.
The mode of inheritance can be additive, dominant, recessive or codominant. Under the additive model, two copies of a causal haplotype have twice the effect on the trait of a single copy. Under the dominant model, one copy and two copies have the same effect. Under the recessive model, only two copies of the causal haplotype affect the trait. Under the codominant model, the effect of two copies can differ arbitrarily from that of a single copy. In HAPSTAT, the codominant effects are decomposed into additive and recessive components.
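The modes of inheritance described above can be read as different ways of coding the number of copies of a haplotype as regression covariates. The following Python sketch is illustrative only: the function name `code_haplotype` and the mode labels are assumptions, not part of HAPSTAT, and the codominant case simply returns an additive and a recessive component to mirror the decomposition mentioned in the text.

```python
from typing import Tuple


def code_haplotype(copies: int, mode: str) -> Tuple[float, ...]:
    """Map the number of copies (0, 1 or 2) of a haplotype to covariate values.

    Hypothetical helper for illustration; not HAPSTAT's internal coding.
    """
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1 or 2")

    additive = float(copies)                    # effect scales with copy number
    dominant = 1.0 if copies >= 1 else 0.0      # one or two copies act alike
    recessive = 1.0 if copies == 2 else 0.0     # only two copies matter

    if mode == "additive":
        return (additive,)
    if mode == "dominant":
        return (dominant,)
    if mode == "recessive":
        return (recessive,)
    if mode == "codominant":
        # Two covariates: an additive component plus a recessive component.
        return (additive, recessive)
    raise ValueError("unknown mode: " + mode)
```

With this coding, a codominant model with coefficients `b_add` and `b_rec` gives a single copy the effect `b_add` and two copies the effect `2 * b_add + b_rec`, so the two-copy effect is not tied to the one-copy effect.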
To estimate haplotype effects, select the tab labeled Additive effects in the
left panel. The available options are displayed in the right panel. The
additive genetic model is set by default; changing this setting in the
options panel changes the label of the selected tab accordingly.
After you click Calculate, the results are displayed on the left;
see
HAPSTAT uses the EM and Newton-Raphson algorithms to estimate haplotype effects.
The convergence criteria are the same as those used for the estimation of
haplotype frequencies described in the previous section. The maximization is
taken over all parameters in the likelihood. The default tolerance is
Select the additive, recessive, dominant or codominant mode of inheritance from
the left dropdown. Use the right dropdown to estimate haplotype effects under
Hardy-Weinberg equilibrium (default) or disequilibrium. For Hardy-Weinberg
disequilibrium, HAPSTAT will return an estimate for the inbreeding coefficient
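To make the role of an inbreeding coefficient concrete, the sketch below uses a common one-parameter model of Hardy-Weinberg disequilibrium, in which the probability of the ordered haplotype pair (k, l) is (1 - rho) * pi_k * pi_l, plus rho * pi_k when k equals l. This is an assumption for illustration: the text above does not spell out HAPSTAT's parameterization, and the function name, the symbol `rho` and the example frequencies are hypothetical.

```python
from itertools import product


def diplotype_probs(freqs, rho=0.0):
    """Probabilities of ordered haplotype pairs under a one-parameter
    inbreeding model (assumed here; rho = 0 gives Hardy-Weinberg equilibrium).

    freqs: dict mapping haplotype label -> population frequency (sums to 1).
    """
    probs = {}
    for (k, pk), (l, pl) in product(freqs.items(), repeat=2):
        p = (1.0 - rho) * pk * pl
        if k == l:
            p += rho * pk  # excess homozygosity due to inbreeding
        probs[(k, l)] = p
    return probs


# Made-up haplotype frequencies, for illustration only.
example_freqs = {"00": 0.5, "01": 0.3, "11": 0.2}
hwe = diplotype_probs(example_freqs)            # equilibrium
hwd = diplotype_probs(example_freqs, rho=0.1)   # disequilibrium
```

Under this model the pair probabilities still sum to one for any `rho`, and a positive `rho` moves probability from heterozygous to homozygous pairs.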
The box labeled Effects is a static display of the main effects and interactions selected for estimation. By default, HAPSTAT selects the haplotype with the highest frequency in the default sample and all covariates, as well as the interactions between them. The selected haplotypes are compared to a reference group, which includes all unselected haplotypes.
To change the default selection, click the icon
on the toolbar to activate the Select
effects dialog, shown in
The haplotypes whose frequencies are no greater than the value specified by Threshold are removed from calculation. The default threshold is given by
where n is the total sample size. For case-control and cohort studies, frequencies are determined by the sample chosen from the adjacent dropdown. For case-control studies, the control sample is chosen by default; for cohort studies, the subcohort is the default. The default values of the Select effects dialog when using external data are discussed in the External data section under Examples.
The haplotypes above the threshold along with their frequencies are listed in the Gene panel. The Reference dropdown lists haplotypes whose frequencies are below the threshold along with those haplotypes that are above the threshold but are not selected for effects estimation. Covariates are listed in the Environment panel.
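The thresholding rule can be sketched as a simple partition of the estimated haplotype frequencies. In this Python sketch the function name, the frequencies and the threshold value 0.05 are all placeholders chosen for illustration; in particular, 0.05 is not claimed to be HAPSTAT's default threshold.

```python
def split_by_threshold(freqs, threshold):
    """Partition haplotypes by estimated frequency.

    Haplotypes with frequency strictly greater than the threshold are kept
    as candidates for effect selection; those with frequency no greater
    than the threshold are set aside.
    """
    above = {h: f for h, f in freqs.items() if f > threshold}
    not_above = {h: f for h, f in freqs.items() if f <= threshold}
    return above, not_above


# Made-up frequencies and a placeholder threshold, for illustration only.
example_freqs = {"100": 0.55, "010": 0.30, "110": 0.10, "011": 0.05}
candidates, set_aside = split_by_threshold(example_freqs, threshold=0.05)
```

Note that a haplotype whose frequency equals the threshold is set aside, matching the "no greater than" wording above.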
To add a main effect, click the desired variable in the Gene or Environment list and then click the Add button. The selected variable then appears in the Effects panel under the heading Gene or Environment, respectively. To add an interaction, select the appropriate variables from the Gene and Environment lists and click the Add button. You can select multiple variables from the Environment list by holding the Shift or Ctrl key. To remove a specific effect from the selection, click that effect in the Effects panel and then click the Remove button. Clicking on a heading in the Effects panel will remove all associated effects.
In
For longitudinal studies, the user can specify both fixed and random effects through the Select effects dialog. See the Longitudinal data section under Examples for further detail.
Consider the multiple gene selection illustrated in
to activate the Select effects dialog.
Frequencies are estimated jointly over all genes, and haplotypes whose
frequencies in the joint distribution are no greater than the Threshold are
excluded from computation. For each gene, the haplotypes and their frequencies
from the marginal distribution are listed in the corresponding Gene panel.
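The relationship between the joint and marginal distributions can be sketched as follows: the marginal frequency of a haplotype of one gene is the sum of the joint frequencies of all haplotype combinations that contain it. The function name and the joint frequencies in this Python sketch are made up for illustration and are not HAPSTAT output.

```python
from collections import defaultdict


def marginal_frequencies(joint, gene_index):
    """Sum joint frequencies over all other genes.

    joint: dict mapping a tuple of haplotypes (one per gene) -> frequency.
    gene_index: position in the tuple of the gene whose marginal is wanted.
    """
    marginal = defaultdict(float)
    for haplotypes, freq in joint.items():
        marginal[haplotypes[gene_index]] += freq
    return dict(marginal)


# Hypothetical joint distribution over (Gene A haplotype, Gene B haplotype).
example_joint = {
    ("100", "11"): 0.4,
    ("100", "10"): 0.1,
    ("010", "11"): 0.3,
    ("010", "10"): 0.2,
}
gene_a = marginal_frequencies(example_joint, 0)
gene_b = marginal_frequencies(example_joint, 1)
```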
In the Select effects dialog, select haplotype 100 from
Gene A and haplotype 11 from Gene B to add the gene-gene interaction
100×11 to the Effects selection. Clicking Calculate
gives the result in
HAPSTAT can be used to analyze the effects of individual SNPs by treating each
SNP as a separate gene. By using the linkage disequilibrium information of
multiple SNPs to infer the missing SNP values, HAPSTAT provides efficient
estimation of SNP effects in the presence of missing data.
In the left panel, HAPSTAT displays the estimates of regression parameters and
their standard errors, together with the Wald statistics and two-sided p-values.
The lower panel displays the log-likelihood
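As a rough guide to reading the output, a Wald statistic and its two-sided p-value can be computed from an estimate and its standard error as sketched below. The sketch assumes the statistic is the ratio of the estimate to its standard error, referred to the standard normal distribution; HAPSTAT may report an equivalent form (for example, the squared ratio), and the function name is hypothetical.

```python
import math


def wald_test(estimate, std_error):
    """Return (z, p): the Wald z statistic and its two-sided p-value.

    Assumes z = estimate / std_error is approximately standard normal
    under the null hypothesis of no effect.
    """
    z = estimate / std_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

For a binary trait, exponentiating the estimate converts a log odds ratio into an odds ratio; for age-at-onset data the same transformation gives a hazard ratio.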
You may change the decimal precision of the displayed results via the menu
option Settings»Precision or the icon on the toolbar. To change the decimal
precision for an individual column, right-click on the column header and
select Precision from the drop-down menu. In the Precision dialog box, enter
the number of digits to follow the decimal point for fixed notation (default)
or the maximum number of significant digits for scientific notation. The
default precision is 4.
Select the menu option File»Save to save the effects estimates. To save both
frequency and effect estimates, select the menu option File»Save All or click
the icon on the toolbar. The results for the case-control data using the
options shown in Figures 3.1-3.8 are given in case-control.out.