\name{anova2}
\alias{anova2}
\title{ Two-way ANOVA }
\description{
    Two-way ANOVA for one continuous response and two categorical predictors
    (factors). It is designed to handle a large number of responses and
    predictors quickly.
}
\usage{
anova2(me, mm, cuts, cuts.direction, output, minLen = 5, binary.marker=FALSE, coef=FALSE)
}
\arguments{
  \item{me}{ matrix of responses, one response per row. In eQTL analysis this
   matrix holds the gene expression data. Only numerical values and NA are
   allowed.
  }
  \item{mm}{ matrix of predictors, one predictor per row. In eQTL analysis this
   matrix holds the marker genotype data. Only the values -1, 0, 1 and NA are
   allowed.
  }
  \item{cuts}{ a vector of four cut-off values, in this order: the ANOVA
    p-value cut-offs for the first main effect, the second main effect and
    the interaction effect, followed by the p-value cut-off for the
    Chi-square test of independence between the two markers.
  }
  \item{cuts.direction}{ a vector of length 4 giving the direction of the
    inequality used when filtering results by the p-value cut-offs in the
    argument \code{cuts}. Each element is either -1 or 1: -1 means ``smaller
    than'' and 1 means ``bigger than''. For example, if
    \code{cuts = c(0.01, 0.01, 0.001, 0.05)} and
    \code{cuts.direction = c(1, 1, -1, 1)}, the filtering rule requires the
    p-values of the two main effects to be bigger than 0.01, the p-value of
    the interaction effect to be smaller than 0.001, and the p-value of the
    Chi-square test to be bigger than 0.05.
  }
  \item{output}{ name of the output file }
  \item{minLen}{ minimum number of observations in one factor level. It must
    be a positive integer. Cases in which some factor level has no
    observations (the model matrix is singular) are skipped. }
  \item{binary.marker}{ logical; whether the genotype of each marker is binary }
  \item{coef}{ logical; whether to output the coefficients of the linear model }
}
\details{
  This algorithm is similar to the one used in R, but some unnecessary steps
  are skipped to save time. The algorithm decomposes the variance by QR
  decomposition. First, it computes the QR decomposition of the model matrix.
  It then projects the gene expression y onto the successive orthogonal
  subspaces generated by the QR decomposition, t(Q) \%*\% y, and collects the
  projections into three groups: main effect 1, main effect 2, and
  interaction. The QR decomposition is also used to calculate the residuals.
  Finally, the F-statistics and p-values are calculated.
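
  The steps above can be sketched in base R as follows. This is a minimal
  illustration under stated assumptions, not the code used inside
  \code{anova2}: the helper name \code{seq.anova.sketch} and its arguments
  are hypothetical, and it handles a single response and a single pair of
  factors.
  \preformatted{
# Hypothetical helper, for illustration only (not part of this package).
# y: numeric response; f1, f2: the two predictors, same length as y.
seq.anova.sketch = function(y, f1, f2) {
  f1 = as.factor(f1)
  f2 = as.factor(f2)
  X = model.matrix(~ f1 * f2)      # intercept, f1, f2 and f1:f2 columns
  grp = attr(X, "assign")          # 0 = intercept, 1 = f1, 2 = f2, 3 = f1:f2
  qrX = qr(X)
  stopifnot(qrX$rank == ncol(X))   # this sketch rejects singular model matrices
  qty = as.vector(qr.qty(qrX, y))  # projections t(Q) y
  p = ncol(X)
  eff = qty[seq_len(p)]            # components in the model space
  rss = sum(qty[-seq_len(p)]^2)    # residual sum of squares
  df.res = length(y) - p
  keep = grp > 0                   # drop the intercept
  ss = as.vector(tapply(eff[keep]^2, grp[keep], sum))  # sequential sums of squares
  df = tabulate(grp[keep])         # degrees of freedom per term
  Fs = (ss / df) / (rss / df.res)  # F-statistics
  pv = pf(Fs, df, df.res, lower.tail = FALSE)
  names(pv) = c("P1", "P2", "P12")
  pv
}
  }
  For a full-rank model matrix, the p-values returned by this sketch should
  agree with the Pr(>F) column of \code{anova(lm(y ~ f1 * f2))}.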
}
\value{
    Returns 1 on success and 0 otherwise. The ANOVA results are written
    to the output file.
    The output file contains at least seven columns:
    \item{M1\_ID}{ The corresponding row number of the 1st marker in the input 
      marker data matrix mm }
    \item{M2\_ID}{ The corresponding row number of the 2nd marker in the input 
      marker data matrix mm }
    \item{GENE\_ID}{ The corresponding row number of the gene in the input gene 
      expression matrix me }
    \item{MSE\_MLE}{ The maximum likelihood estimate of the mean square error,
      which is used by the \code{get.lod} function }
    \item{P1}{ ANOVA P-value of 1st main effect, corresponding to marker M1\_ID }
    \item{P2}{ ANOVA P-value of 2nd main effect, corresponding to marker M2\_ID }
    \item{P12}{ ANOVA P-value of interaction effect }
    If \code{coef=TRUE}, additional columns holding the coefficients are
    also written to the output file.
}
\author{ Wei Sun \email{sunwei@stat.ucla.edu} }
\note{ 
    This ANOVA test uses sequential variance decomposition, so the order of
    the predictors matters when the two predictors are correlated. A
    Chi-square test is used to filter out marker pairs that are similar.
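
    The order dependence can be seen with base R's \code{lm} and
    \code{anova}, which also use a sequential decomposition. The data below
    are made up for illustration, and the sketch does not call
    \code{anova2}.
    \preformatted{
# Hypothetical unbalanced data: a and b are correlated two-level factors.
a = factor(c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1))
b = factor(c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1))
y = c(1.1, 0.4, 0.9, 1.6, 2.3, 0.2, 2.8, 3.1, 2.2, 3.6, 2.9, 3.3)
tab.ab = anova(lm(y ~ a * b))  # a is fitted first, then b given a
tab.ba = anova(lm(y ~ b * a))  # b is fitted first, then a given b
tab.ab["a", "Sum Sq"]          # sum of squares for a, ignoring b
tab.ba["a", "Sum Sq"]          # sum of squares for a, adjusted for b
    }
    The two sums of squares for \code{a} differ because \code{a} and
    \code{b} are correlated. The interaction row is the same in both tables.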
}

\seealso{ \code{\link{anova1}}, \code{\link{get.lod}} }
\examples{
data(yeast.me)
data(yeast.mm)

me = yeast.me
mm = yeast.mm
cuts = c(1e-2, 1e-2, 1e-3, 1e-5)
cuts.direction = c(1, 1, -1, 1)
a = anova2(me, mm, cuts, cuts.direction, "1g2mbAnova.txt", 5, TRUE)
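
# Illustration only (not part of anova2): one way to read the filtering rule
# described for the cuts and cuts.direction arguments. The helper
# passes.filter and the p-values below are hypothetical. It is an assumption
# here that the inequalities are strict and that a result is retained when
# all four conditions hold.
passes.filter = function(pvals, cuts, cuts.direction) {
  # direction 1 requires p > cut-off, direction -1 requires p < cut-off
  all(ifelse(cuts.direction == 1, pvals > cuts, pvals < cuts))
}
ex.cuts = c(0.01, 0.01, 0.001, 0.05)
ex.direction = c(1, 1, -1, 1)
# made-up p-values: main effect 1, main effect 2, interaction, Chi-square test
passes.filter(c(0.20, 0.35, 1e-4, 0.50), ex.cuts, ex.direction)  # TRUE
passes.filter(c(0.20, 0.35, 0.02, 0.50), ex.cuts, ex.direction)  # FALSE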
}
\keyword{ methods }
