[BioC] imputing missing data for 70mer array platform, need advice

Thu Jan 18 01:34:03 CET 2007

Hello,
If this has been discussed in the archives, my apologies, but I 
couldn't find it. I am comparing two array CGH datasets: one 
generated by Nimblegen, which is very complete, and one I generated 
myself on a 70mer array with over 10,000 elements, which has 3-4 
replicates for three species for which I also have Nimblegen data. 
I have calculated corrected p-values for the Nimblegen set using 
multtest and would like to do the same for the 70mer set, but I have 
issues with missing data. For the 70mer set I have already calculated 
p-values in the program ACUITY, using t-tests (testing for variance) 
that filter out or disregard the missing data.

I want to compare the corrected p-values obtained after imputing the 
missing data with those from the filtered dataset, to see how 
different the results are.

My question: for a 70mer array with one oligo per open reading frame, 
which method of data imputation is statistically best? I looked over 
the knn method in the package impute (mostly recommended for 
expression data) and impute.lowess in the package aCGH, which, from 
what I can tell, may be optimized for high-density arrays; my 
apologies if that is not the case.
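
To make the kNN idea concrete, here is a minimal sketch in Python. It 
is not the implementation used by the impute package, and the 
function name knn_impute, the parameter k, and the toy values are 
made up for illustration. It assumes each row is one array element 
and each column one hybridization, and it fills a missing value with 
the average of that column in the k most similar rows, where 
similarity is the mean squared difference over the columns both rows 
have observed.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows with None entries filled from the k nearest rows."""

    def distance(a, b):
        # Mean squared difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # no overlap, so the rows cannot be compared
        return sum((x - y) ** 2 for x, y in shared) / len(shared)

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        # Rank every other row by its distance to the current row.
        ranked = sorted(
            (distance(row, other), idx)
            for idx, other in enumerate(rows) if idx != i
        )
        for j in missing:
            # Use the k nearest comparable rows that have column j observed.
            donors = [rows[idx][j] for d, idx in ranked
                      if d < math.inf and rows[idx][j] is not None][:k]
            if donors:  # leave the gap if no row can donate a value
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


# Toy log-ratios (hypothetical): the first element is missing replicate 3.
toy = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [10.0, 10.0, 10.0],
]
filled = knn_impute(toy, k=2)
print(filled[0])
```

With only 3-4 replicates per species, each row has very few columns 
from which to judge similarity, which is part of why I am unsure the 
approach suits this platform.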

Does anyone have any recommendations about which method for imputing 
data I should try for a 70mer platform? Thank you for your time.

Sincerely,
Betty Gilbert
-- 
Betty Gilbert
lgilbert at berkeley.edu
Taylor Lab
Plant and Microbial Biology
321 Koshland Hall
U.C. Berkeley
Berkeley, Ca 94720