genoCNV {genoCN} | R Documentation |
extract genotype and copy number information for copy number variations, which are inheritable DNA polymorphisms observed in normal tissues
genoCNV(snpNames, chr, pos, LRR, BAF, pBs, sampleID, Para=NULL, fixPara=FALSE, cnv.only=NULL, outputSeg = TRUE, outputSNP = 1, outputTag = sampleID, Ds = c(1e+09, 1e+10, rep(1e+06, 7)), pBs.alpha = 0.001, loh = FALSE, min.tp = 0.01, max.diff = 0.1, epsilon = 0.002, K = 10, maxIt = 300, traceIt = 5)
snpNames |
a vector of SNP names. SNPs must be ordered by chromosomal location |
chr |
chromosomes of all the SNPs specified in snpNames |
pos |
positions of all the SNPs specified in snpNames |
LRR |
Log R Ratio of all the SNPs specified in snpNames |
BAF |
B Allele Frequency of all the SNPs specified in snpNames |
pBs |
population frequency of the B allele for all the SNPs specified in snpNames |
sampleID |
symbol/name of the studied sample. Only one sample is studied each time |
Para |
a list of initial parameters for the HMM. If Para is NULL, the default initial parameters, init.Para.CNA, are used |
fixPara |
if fixPara is TRUE, the parameters in Para are fixed, and are used directly to calculate posterior probabilities |
cnv.only |
a vector indicating the CNV-only probes, for which only the Log R Ratio is considered. If it is NULL, there are no CNV-only probes |
outputSeg |
whether to output the information of copy number altered segments |
outputSNP |
if outputSNP is 0, do not output SNP specific information (genotype, copy number and the corresponding posterior probability); if outputSNP is 1, output the information of the SNPs that are within copy number altered regions; if outputSNP is 2, output the information of all the SNPs |
outputTag |
the prefix of the output files. Copy number altered segments are written to the file outputTag_segment.txt, and SNP information is written to the file outputTag_SNP.txt |
Ds |
parameter for the transition probabilities of the HMM. A vector of length N, where N is the number of states in the HMM |
pBs.alpha |
pBs.alpha is the lower limit of population B allele frequency, and the upper limit is 1 - pBs.alpha |
loh |
Whether we use the copy-number-neutral loss of heterozygosity state for CNV studies. |
min.tp |
the minimum transition probability.
max.diff |
Due to the normalization procedure, the BAF may not be symmetric. Take the state (AAA, AAB, ABB, BBB) as an example. Ideally, the mean values of the normal components AAB and ABB, denoted by mu1 and mu2, respectively, should satisfy mu1 = 1 - mu2 if the BAF is symmetric. However, this may not hold after normalization. The difference between mu1 and (1 - mu2) is restricted by this parameter, max.diff.
epsilon |
see explanation of K |
K |
epsilon and K are used to specify the convergence criterion. We say estimate.para has converged if, for K consecutive updates, the maximum change in the parameter estimates between adjacent steps is smaller than epsilon |
maxIt |
the maximum number of iterations of the EM algorithm to estimate parameters |
traceIt |
if traceIt is an integer n, the running time is printed every n iterations of the EM algorithm. If traceIt is 0 or negative, no tracing information is printed.
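The stopping rule described for epsilon, K, and maxIt can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's internal code: run_until_converged and update_fn are hypothetical names, update_fn stands in for one EM update, and the parameters are treated as a plain numeric vector.

```r
# Illustrative sketch of the stopping rule described for epsilon, K and maxIt.
# `run_until_converged` and `update_fn` are hypothetical, not part of genoCN.
run_until_converged <- function(theta, update_fn, epsilon = 0.002, K = 10,
                                maxIt = 300) {
  n_small <- 0  # consecutive updates whose maximum change was below epsilon
  for (it in seq_len(maxIt)) {
    theta_new <- update_fn(theta)
    max_change <- max(abs(theta_new - theta))
    # Reset the counter whenever a single update changes too much.
    n_small <- if (max_change < epsilon) n_small + 1 else 0
    theta <- theta_new
    if (n_small >= K) {
      return(list(theta = theta, iterations = it, converged = TRUE))
    }
  }
  # maxIt reached without K consecutive small updates.
  list(theta = theta, iterations = maxIt, converged = FALSE)
}

# Toy update that halves the distance to a fixed point at c(1, 2).
toy_update <- function(theta) theta + 0.5 * (c(1, 2) - theta)
fit <- run_until_converged(c(0, 0), toy_update)
```

Requiring K consecutive small updates, rather than a single one, guards against stopping on an update that happens to be small while the estimates are still drifting.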
results are written into output files
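Following the description of outputTag, the two output file names can be derived from the tag. The sketch below assumes the files are written to the current working directory and are tab-delimited with a header row; neither assumption is stated on this page, and output_files and read_if_present are hypothetical helpers.

```r
# Hypothetical helper: build the output file names implied by outputTag.
output_files <- function(outputTag) {
  c(segment = paste0(outputTag, "_segment.txt"),
    snp     = paste0(outputTag, "_SNP.txt"))
}

# Hypothetical helper: read a file only if it exists.
# Assumes tab-delimited text with a header row (not confirmed by this page).
read_if_present <- function(path) {
  if (!file.exists(path)) return(NULL)
  read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
}

files <- output_files("sample1")
segments <- read_if_present(files[["segment"]])  # NULL if no such file yet
```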
Wei Sun and Zhengzheng Tang
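As an illustration of the input requirements, the sketch below sorts a toy SNP table by chromosome and position, then truncates the population B allele frequencies to the interval [pBs.alpha, 1 - pBs.alpha]. The data values and column names are made up, and whether genoCNV applies the pBs.alpha limits internally in exactly this way is an assumption; the truncation is shown only to make the meaning of the bounds concrete.

```r
# Toy SNP table with made-up values; column names are hypothetical.
snps <- data.frame(
  name = c("rs3", "rs1", "rs4", "rs2"),
  chr  = c(2, 1, 2, 1),
  pos  = c(500, 2000, 100, 1000),
  LRR  = c(0.02, -0.45, 0.10, -0.50),
  BAF  = c(0.51, 0.00, 1.00, 0.98),
  pB   = c(0.30, 0.0001, 0.9999, 0.60),
  stringsAsFactors = FALSE
)

# Order by chromosome, then by position within each chromosome.
snps <- snps[order(snps$chr, snps$pos), ]

# Truncate population B allele frequencies to [pBs.alpha, 1 - pBs.alpha].
pBs.alpha <- 0.001
snps$pB <- pmin(pmax(snps$pB, pBs.alpha), 1 - pBs.alpha)
```

The columns of the sorted table can then be supplied as snpNames, chr, pos, LRR, BAF, and pBs.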