[BioC] splicing in Affymetrix

Christos Hatzis christos.hatzis at nuverabio.com
Mon May 5 22:17:21 CEST 2008


Hi Balazs,

Does this do what you need?

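# read the table copied to the clipboard ("clipboard" works on Windows)
dat <- read.table("clipboard", header=TRUE)
# probe set ID = row name without the trailing probe pair number
dat$pset <- gsub("([0-9])+$", "", rownames(dat))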

# probe set means and number of probes
dat.sum <- with(dat, aggregate(dat[, 1:2], by=list(pset), FUN="mean"))
dat.sum$n <- with(dat, aggregate(dat[, 1], by=list(pset), FUN="length"))[, 2]

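# scaled intensities: look up each row's probe set mean by name, so the
# result does not depend on the row order of dat
idx <- match(dat$pset, dat.sum$Group.1)
dat$SAMPLE1.SCALED <- dat$SAMPLE1/dat.sum$SAMPLE1[idx]
dat$SAMPLE2.SCALED <- dat$SAMPLE2/dat.sum$SAMPLE2[idx]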
dat 
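
A more compact variant, as a sketch only: base R's ave() returns the group mean aligned with each row, so no separate summary table is needed and the row order does not matter. The six rows below are copied from your example table to keep the snippet self-contained; the loop over sample names is just my assumption about how you might extend this to more samples.

```r
# Six rows copied from the example table in the original message
dat <- data.frame(
  SAMPLE1 = c(20, 1122.83448, 477.13448, 65.33448, 20, 1664.53448),
  SAMPLE2 = c(119.1413, 1503.4413, 1868.9413, 20, 20, 993.9413),
  row.names = c("1007_s_at1", "1007_s_at5", "1007_s_at6",
                "1487_at1", "1487_at2", "1487_at4")
)

# probe set ID = row name without the trailing probe pair number
dat$pset <- sub("[0-9]+$", "", rownames(dat))

# ave() gives the probe set mean for every row (FUN defaults to mean),
# so each probe pair is divided by the mean of its own probe set
for (s in c("SAMPLE1", "SAMPLE2")) {
  dat[[paste0(s, ".SCALED")]] <- dat[[s]] / ave(dat[[s]], dat$pset)
}
dat
```

By construction the scaled values average to 1 within each probe set, which is a quick sanity check on the result.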

-Christos

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
 


> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch 
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of 
> Dr Balazs Gyorffy
> Sent: Monday, May 05, 2008 3:34 PM
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] splicing in Affymetrix
> 
> Hi All!
>    
>   I have a table (see attached shortened example). Here the 
> columns represent the samples (two here) and the rows the 
> probe pairs. 16 probe pairs represent a probe set (however, 
> some probe sets contain only 11 or 14 probe pairs!). I would 
> like to divide each probe pair by the average of its probe set. 
> 
>   How can I do this?
>    
> (I am looking for splice variants, therefore the actual 
> expression of the gene is not important, but rather the 
> expression of a given probe pair compared to the whole probe 
> set. I cannot go back to the raw data as the table already 
> represents pre-processed data.)
> 
>   Thank you,
>   Balazs
>   
> ----------------------
>    
>    SAMPLE1 SAMPLE2
> 1007_s_at1 20 119.1413
> 1007_s_at2 20 20
> 1007_s_at3 20 20
> 1007_s_at4 20 20
> 1007_s_at5 1122.83448 1503.4413
> 1007_s_at6 477.13448 1868.9413
> 1007_s_at7 29.83448 458.9413
> 1007_s_at8 20 136.6413
> 1007_s_at9 20 305.3413
> 1007_s_at10 373.53448 1567.3413
> 1007_s_at11 340.83448 596.6413
> 1007_s_at12 68.83448 293.6413
> 1007_s_at13 184.83448 673.9413
> 1007_s_at14 20 20
> 1007_s_at15 20 72.6413
> 1007_s_at16 20 124.1413
> 1487_at1 65.33448 20
> 1487_at2 20 20
> 1487_at3 20 20
> 1487_at4 1664.53448 993.9413
> 1487_at5 1981.13448 1566.8413
> 1487_at6 20 20
> 1487_at7 38.53448 20
> 1487_at8 20 20
> 1487_at9 20 20
> 1487_at10 20 20
> 1487_at11 91.33448 20
> 1487_at12 581.33448 800.4413
> 1487_at13 49.03448 200.9413
> 1487_at14 20 80.6413
>    
>   --------------------------------------
> 
> 
> -------------------------
> Balazs GYORFFY MD, PhD
> Children's Hospital Boston Informatics Program Harvard-MIT 
> Health Sciences and Technology 300 Longwood Avenue, Boston, 
> USA Enders 150.6
> Tel: +1 617 919 2654
> 


