[R] TCGA expression data: plotting ....

baumeist mark.baumeister7 at gmail.com
Thu Oct 6 20:32:41 CEST 2011


Hi,

I am new to R.
I am trying to figure out how to graph expression data from the TCGA
database.
If I understand correctly, the expression data I have downloaded come from a
microarray, the AgilentG4502A platform.

I've had trouble reading the level I, level II, and gene expression
analysis data into R using

>dat<-read.table("C:\\file.txt",  header=T, row.names=1)

for example:

> dat1<-read.table("C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt", header=T, row.names=1)

> dat<-read.table("C:\\unc.edu__AgilentG4502A_07_3__TCGA-A6-2674-01A-02R-0821-07__gene_expression_analysis.txt", header=T, row.names=1)

In all cases I get the error:
 "more columns than column names"
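
For reference, a minimal sketch of one common way this error arises, assuming the files are tab-delimited and some fields contain spaces (an assumption; the actual TCGA files may differ, for example by carrying extra header blocks). With the default sep="", read.table splits on any whitespace, so data lines can end up with more fields than the header line. The temporary file and its contents below are made up for illustration.

```r
# Hypothetical stand-in for a tab-delimited file whose fields contain spaces.
tmp <- tempfile(fileext = ".txt")
writeLines(c("probe\tvalue\tnote",
             "A_23_P1\t0.5\tgood spot here",
             "A_23_P2\t1.5\tgood spot here"), tmp)

# Count fields per line under whitespace splitting vs. an explicit tab.
n_ws  <- count.fields(tmp, sep = "")
n_tab <- count.fields(tmp, sep = "\t")

# Default separator: the data lines have more fields than the header.
err <- tryCatch(
  read.table(tmp, header = TRUE, row.names = 1),
  error = function(e) conditionMessage(e)
)

# Explicit tab separator: every line has the same number of fields.
ok <- read.table(tmp, header = TRUE, row.names = 1, sep = "\t")
```

Running count.fields() on the first few lines of a real file is a quick way to see whether the header and the data rows disagree before choosing sep, skip, or fill.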

I have only been able to read in the level II data with the code:

> dat2<-read.table("C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt_lmean.out.logratio.probe.tcga_level2.data.txt", header = TRUE, as.is = TRUE, sep="\t")

So this is what I am working with.

I can see that the dimensions of this data are 

> dim(dat2)
[1] 90798     2

I assume that this is one patient with expression (intensity) data for a
large number of genes, but I don't know.
When I print "dat2" to the screen it looks like this:

49995        A_23_P67323                          -0.427
49996        A_23_P67330                         -0.3275
49997        A_23_P67332                          -0.409
49998        A_23_P67339                          3.2955
49999        A_23_P67355                           1.205
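
One way to check that assumption is to look at the structure of the data frame rather than the raw printout. The sketch below uses a made-up three-row stand-in (dat2_demo, a hypothetical name) with the same column names and the values printed above; it is not the real file.

```r
# Made-up stand-in for dat2: column names and values taken from the post.
dat2_demo <- data.frame(
  Hybridization.REF = c("A_23_P67323", "A_23_P67330", "A_23_P67332"),
  TCGA.A6.2674.01A.02R.0821.07 = c(-0.427, -0.3275, -0.409),
  stringsAsFactors = FALSE
)

str(dat2_demo)                               # column types and first values
n_samples <- ncol(dat2_demo) - 1             # columns besides the probe IDs
n_probes  <- length(unique(dat2_demo[[1]]))  # distinct probe identifiers
summary(dat2_demo[[2]])                      # range of the measured values
```

With two columns, one holding probe identifiers and one named after a single TCGA barcode, the file would hold one sample measured across many probes.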

If I try to plot the data with the following code:

> names(dat2)
[1] "Hybridization.REF"            "TCGA.A6.2674.01A.02R.0821.07"


> x<-c("Hybridization.REF")
> y<-("TCGA.A6.2674.01A.02R.0821.07")


> plot(x,y,type='p',xlab='Hybridization.REF',ylab='TCGA.A6.2674.01A.02R.0821.07',main='plot')  


I get the error:

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
> 
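
For reference, a minimal sketch of why this call fails and one possible alternative. Here x and y hold the column names as character strings, not the columns themselves, so plot() tries to coerce the strings to numbers and gets NA. The data frame dat2_demo is a made-up stand-in for dat2, and the as.numeric() step assumes the value column may have been read as character (as.is = TRUE keeps it that way if any non-numeric entry is present).

```r
# Made-up stand-in for dat2: same column names as in the post, few rows.
dat2_demo <- data.frame(
  Hybridization.REF = c("A_23_P67323", "A_23_P67330", "A_23_P67332"),
  TCGA.A6.2674.01A.02R.0821.07 = c("-0.427", "-0.3275", "-0.409"),
  stringsAsFactors = FALSE
)

# A quoted column name is a one-element character vector, not the column.
x <- "Hybridization.REF"
x_as_number <- suppressWarnings(as.numeric(x))  # coercion yields NA

# Extract the column from the data frame and make sure it is numeric.
vals <- as.numeric(dat2_demo[["TCGA.A6.2674.01A.02R.0821.07"]])

# A single sample has no natural x variable, so plot the values by row
# index, or look at their distribution instead.
plot(vals, type = "p", xlab = "probe index", ylab = "log ratio", main = "plot")
hist(vals, xlab = "log ratio", main = "distribution of probe values")
```

The probe ID column is text, so it cannot serve as a numeric x axis; plotting against the row index or using a histogram or boxplot avoids that.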

I am really not sure how to plot this data, partly because I'm not sure what
the level II data represents.

Can anyone tell me what the level II data represents and what type of
plotting functions I might use?

Thanks in advance,
MAB
