[BioC] how to import some specific columns into R

Martin Morgan mtmorgan at fhcrc.org
Thu Apr 12 14:35:07 CEST 2012


On 04/12/2012 03:05 AM, narges [guest] wrote:
>
> hi
> I have a huge counts file containing tags and all the related samples (129 samples) and I want to load only some columns(samples) to DGEList object not the result(counts) of all samples.
> I have used scan but it does not work.
> Thanks alot
> Best Regards
>
>
>   -- output of sessionInfo():
>
>> scan ("count.txt", what = list (0,NULL,0),skip=2,flush=TRUE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>    scan() expected 'a real', got 'ENSG00000000005'
> #############################

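scan() fails here because the first element of 'what' is numeric (0),
whereas the first column of the file holds identifiers such as
ENSG00000000005. If you had a text file with a character column followed
by 1000 numeric columns, and you wanted to read only the character
column and the 2nd, 10th and 100th numeric columns, you might say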

reals <- rep(list(NULL), 1000)
reals[c(2, 10, 100)] <- list(numeric())
what <- c(list(character()), reals)

you'd then end up with something like

 > head(what)
[[1]]
character(0)

[[2]]
NULL

[[3]]
numeric(0)

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

and this list would be passed as the 'what' argument to scan, which
skips every field whose entry is NULL. An additional 'trick' is to name
the elements of 'what'; these names are then used as the names of the
elements returned by scan.
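
As a rough, self-contained sketch of the scan() approach (the file
contents, the column names s1..s5 and the choice of columns 2 and 4 are
made up for illustration; adjust 'skip' and 'sep' to the real file):

```r
# Hypothetical toy file: one identifier column and 5 count columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
    "gene\ts1\ts2\ts3\ts4\ts5",
    "ENSG00000000003\t1\t2\t3\t4\t5",
    "ENSG00000000005\t6\t7\t8\t9\t10"
), tmp)

keep <- c(2, 4)                       # numeric columns to read
reals <- rep(list(NULL), 5)           # NULL entries are skipped by scan()
reals[keep] <- list(numeric())
what <- c(list(character()), reals)
names(what) <- c("gene", paste0("s", 1:5))

# skip = 1 drops the header line of the toy file
res <- scan(tmp, what = what, skip = 1, sep = "\t", quiet = TRUE)

# Drop the skipped (NULL) fields and bind the rest into a matrix
res <- res[!vapply(res, is.null, logical(1))]
counts <- do.call(cbind, res[-1])
rownames(counts) <- res$gene
```

Here 'counts' is a numeric matrix with one column per selected sample
and the identifiers as row names.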

>   d<- read.table("count.txt", colClasses = c(rep("character",1),rep("integer",2), rep("null",126)),header=TRUE)
> Error in methods::as(data[[i]], colClasses[i]) :
>    no method or default for coercing "character" to "null"

The colClasses entry for a column to be skipped is "NULL" (upper case),
rather than "null" or "NA", so along the lines of

reals <- rep("NULL", 1000)
reals[c(2, 10, 100)] <- "numeric"
colClasses <- c("character", reals)

and

 > head(colClasses)
[1] "character" "NULL"      "numeric"   "NULL"      "NULL"      "NULL"
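
A minimal sketch of the same idea with read.table(), again on a made-up
toy file (column names s1..s5 and the selected columns are assumptions
for illustration only):

```r
# Hypothetical toy file: one identifier column and 5 count columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
    "gene\ts1\ts2\ts3\ts4\ts5",
    "ENSG00000000003\t1\t2\t3\t4\t5",
    "ENSG00000000005\t6\t7\t8\t9\t10"
), tmp)

keep <- c(2, 4)                       # numeric columns to read
classes <- rep("NULL", 5)             # "NULL" columns are dropped
classes[keep] <- "numeric"
colClasses <- c("character", classes)

d <- read.table(tmp, header = TRUE, sep = "\t", colClasses = colClasses)

# Numeric matrix of the selected samples, identifiers as row names
counts <- as.matrix(d[, -1])
rownames(counts) <- d$gene
```

A matrix like 'counts' could then be passed on, for instance as the
'counts' argument of edgeR's DGEList().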

Martin

>> d<- read.table("count.txt", colClasses = c(rep("character",1),rep("integer",2), rep("NA",126)),header=TRUE)
> Error in methods::as(data[[i]], colClasses[i]) :
>    no method or default for coercing "character" to "NA"
>
> --
> Sent via the guest posting facility at bioconductor.org.


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


