
Frequency estimation

Navigation

HAPSTAT estimates the joint haplotype frequencies of all the SNPs included in the analysis. Select the tab labeled Frequencies in the left panel; see Figure 2.1. The options available to the user are located in the right panel. After you click on Calculate, your results will display on the left.

Convergence criteria

HAPSTAT uses an EM algorithm to estimate haplotype frequencies. The algorithm terminates when the number of EM iterations exceeds the value specified by Iterations or when the change in parameter values between successive iterations satisfies the following inequality:

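$$\max_k |\delta_{k,i}| < \varepsilon,$$

where $\varepsilon$ denotes the specified value of Tolerance,

$$\delta_{k,i} = \begin{cases} \theta_{k,i} - \theta_{k,i-1} & \text{if } |\theta_{k,i-1}| < 0.01, \\ \dfrac{\theta_{k,i} - \theta_{k,i-1}}{\theta_{k,i-1}} & \text{otherwise,} \end{cases}$$

and $\theta_{k,i}$ is the estimate of parameter $k$ at iteration $i$. By default, $\varepsilon = 10^{-6}$ and the number of iterations is 2000.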
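The stopping rule can be illustrated with a minimal Python sketch. It is not HAPSTAT's code: the function names and the `update` callback (standing in for one EM step) are hypothetical, and only the 0.01 threshold, the default tolerance, and the default iteration limit come from the text above.

```python
def max_change(theta_new, theta_old, small=0.01):
    """Largest |delta_k| over all parameters for one iteration."""
    deltas = []
    for new, old in zip(theta_new, theta_old):
        diff = new - old
        # Absolute change when the previous value is near zero,
        # relative change otherwise.
        deltas.append(abs(diff) if abs(old) < small else abs(diff / old))
    return max(deltas)


def has_converged(theta_new, theta_old, tolerance=1e-6):
    """True when the largest change falls below the tolerance."""
    return max_change(theta_new, theta_old) < tolerance


def run_until_converged(update, theta0, tolerance=1e-6, max_iterations=2000):
    """Apply `update` (a hypothetical stand-in for one EM step) until the
    change criterion is met or the iteration limit is reached."""
    theta = list(theta0)
    for iteration in range(1, max_iterations + 1):
        theta_next = update(theta)
        if has_converged(theta_next, theta, tolerance):
            return theta_next, iteration, True
        theta = theta_next
    return theta, max_iterations, False
```

Switching to an absolute change for parameters close to zero avoids dividing by a very small previous estimate, which would otherwise inflate the relative change for rare haplotypes.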

Assumptions

Use the dropdown to estimate frequencies assuming the population is in Hardy-Weinberg equilibrium (default) or disequilibrium. For Hardy-Weinberg disequilibrium, HAPSTAT returns an estimate for the inbreeding coefficient (ρ).
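To make the role of ρ concrete, the sketch below uses a common one-parameter inbreeding model for the probability of an ordered haplotype pair. This parameterization is an assumption made for illustration; this page does not state the exact model HAPSTAT fits, and the function name is hypothetical.

```python
def diplotype_probability(pi, k, l, rho=0.0):
    """Probability of the ordered haplotype pair (k, l).

    Assumed model (not confirmed by this page):
        P(k, l) = (1 - rho) * pi[k] * pi[l] + rho * pi[k] if k == l
        P(k, l) = (1 - rho) * pi[k] * pi[l]               otherwise
    With rho = 0 this reduces to Hardy-Weinberg equilibrium.
    """
    base = (1.0 - rho) * pi[k] * pi[l]
    return base + (rho * pi[k] if k == l else 0.0)
```

Under this assumed model, a positive ρ shifts probability toward homozygous pairs and a negative ρ shifts it away, while the probabilities over all ordered pairs still sum to one.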

Samples

For cross-sectional and longitudinal studies, HAPSTAT automatically estimates frequencies based on all individuals. For a case-control study, you may estimate haplotype frequencies in the combined case-control sample or in cases and controls separately; the default is the control sample. For a cohort study, check Cohort to estimate haplotype frequencies based on all genotyped cohort members. Under case-cohort or nested case-control designs, the genotyped individuals are not representative of the entire cohort. HAPSTAT therefore also estimates haplotype frequencies based on all genotyped controls and a random sample of cases, chosen so that the proportion of cases used for estimation equals the proportion of controls that are genotyped. This option is referred to as Subcohort.
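The Subcohort selection can be sketched as follows. This is an illustration only, not HAPSTAT's implementation: the function name and arguments are hypothetical, and it assumes that every case in the cohort is genotyped, so that the sampling fraction for cases can be applied directly to the genotyped cases.

```python
import random


def subcohort_sample(genotyped_controls, genotyped_cases,
                     n_controls_in_cohort, seed=None):
    """Return all genotyped controls plus a random sample of cases.

    Assumption: all cases in the cohort are genotyped, so the fraction
    of cases kept is matched to the fraction of controls genotyped.
    """
    # Proportion of the cohort's controls that were genotyped.
    fraction = len(genotyped_controls) / n_controls_in_cohort
    # Keep the same proportion of cases, rounded to a whole number.
    n_cases = round(fraction * len(genotyped_cases))
    rng = random.Random(seed)
    sampled_cases = rng.sample(list(genotyped_cases), n_cases)
    return list(genotyped_controls) + sampled_cases, fraction
```

For example, with 100 genotyped controls out of 400 controls in the cohort, a quarter of the cases would be drawn, so the resulting sample has the same case-to-control ratio as the full cohort.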

When incorporating external data, additional options are available. The option to estimate frequencies based on all families or unrelated individuals is referred to as Trio or Unrelated, respectively. You may also choose to estimate haplotype frequencies of the samples available for a particular study in combination with the external data. Multiple selections are permitted. The HAPSTAT frequency panel when including external file trio.dat is shown in Figure 2.2.

Summary

The results of the frequency estimation are summarized in the lower panel of the HAPSTAT display. In the rare event that the computation fails, an error status message is shown. It may then be necessary to increase the maximum number of iterations (Iterations) or to relax the convergence criterion by choosing a larger value of Tolerance.

Sorting

Right-click the header of the column you wish to sort by and select Sort ascending or Sort descending. The rows of all columns are reordered accordingly.

Filtering

To display frequencies above a certain threshold, right click on the column header and select Filter. In the dialog box, specify the desired threshold and frequency sample. Select Show all to disable the filter.

Precision

You may change the decimal precision of the displayed frequency values via the menu option Settings»Precision or the icon on the toolbar. In the Precision dialog box, enter the number of digits to follow the decimal point for fixed notation (default) or the maximum number of significant digits for scientific notation. The default precision is 4.
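The difference between the two notations can be illustrated with Python's format specifiers. This sketch only mirrors the description above; the function name is hypothetical, and HAPSTAT's exact output formatting (for example, the exponent width) may differ.

```python
def format_frequency(value, precision=4, notation="fixed"):
    """Format a frequency estimate in fixed or scientific notation.

    fixed:      `precision` digits after the decimal point
    scientific: `precision` significant digits in total
    """
    if notation == "fixed":
        return f"{value:.{precision}f}"
    if notation == "scientific":
        # One digit precedes the decimal point, so precision - 1 follow it.
        return f"{value:.{precision - 1}e}"
    raise ValueError(f"unknown notation: {notation}")


print(format_frequency(0.0123456))                # 0.0123
print(format_frequency(0.0123456, 4, "scientific"))  # 1.235e-02
```

Scientific notation is the more informative choice for rare haplotypes, whose frequencies can round to zero at a small fixed precision.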

Saving

To save frequency estimates, select the menu option File»Save or click the icon on the toolbar. Navigate to the desired directory and enter a file name or choose an existing one. Overwrite and append options are supported for existing files. Selecting the menu option File»Save All or the toolbar icon will save results of all open tabs. HAPSTAT result files are in text format and open with common word processing software.
