`lm()`

, `anova()`

) in R Matrix eQTL calculations are based on the linear regression model. The following R code samples confirm that for various scenarios and randomly generated data the estimates, statistics, and p-values produced by Matrix eQTL match those produced by the existing linear regression function `lm`

in R. All five test code files are also available via R command:

• `demo( package="MatrixEQTL" )`

- Test code for simple linear regression.
- Test code for linear regression with covariates.
- Test code for heteroskedastic linear regression with covariates.
- Test code for heteroskedastic ANOVA with covariates.
- Test code for heteroskedastic 5-category ANOVA with covariates.
- Test code for heteroskedastic linear regression with covariates, testing an interaction term.

The artificial dataset used in performance testing is generated to have a strongly correlated first gene-SNP pair. Both R and Matlab version of Matrix eQTL produce for following results for it:

no covariates | 10 covariates | |||

Additive model | t = 23.816 | pv = 3.768e-096 | t = 23.927 | pv = 1.386e-096 |

ANOVA model | F = 284.479 | pv = 5.423e-095 | F = 287.112 | pv = 2.031e-095 |

We ran the analysis in Plink with and without covariates. The output file for the first gene (`Assoc.txt.P1.qassoc`

) shows results matching those above.

CHR | SNP | BP | NMISS | BETA | SE | R2 | T | P |

1 | Snp_00001 | 1 | 840 | 1.006 | 0.04224 | 0.4036 | 23.82 | 3.768e-096 |

The output for the model with covariates (`Linear.txt.P1.assoc.linear`

) also agrees with the output of Matrix eQTL.

CHR | SNP | BP | A1 | TEST | NMISS | BETA | STAT | P |

1 | Snp_00001 | 1 | C | ADD | 840 | 1.006 | 23.93 | 1.386e-096 |

The analysis with no covariates was performed using the following command in R:

`snpMatrix::snp.rhs.estimates( d ~ 1, family = "Gaussian", snp.data = snps1 )`

The output for the first SNP agrees with the previous results (although the statistic is called z-value, not t-statistic).

Model | Y-variable | Parameter | Estimate | S.E. | z-value |

Snp_00001 | d | Snp_00001 | 1.006 | 0.042242 | 23.816 |

The analysis with 10 covariates was performed using a similar command:

`snpMatrix::snp.rhs.estimates( d ~ cvrt, family = "Gaussian", snp.data = snps1 )`

Again, the results agree with those presented before.

Model | Y-variable | Parameter | Estimate | S.E. | z-value |

Snp_00001 | d | Snp_00001 | 1.0063 | 0.042057 | 23.927 |

Although FastMap does not support covariates, it can estimate both additive and ANOVA models.
It output files contain `-log`

._{10}(p-value)

For the additive linear model, the p-value is 10^{ -95.423908} = 3.76784E-96.

For the ANOVA model, the p-value is 10^{ -94.265731} = 5.42337E-95.

R/qtl and Merlin use other models so I can not directly match their output.