[R] R functions

sujitha viritha.k at gmail.com
Fri Sep 23 15:16:43 CEST 2011


Hi group,
Here is the code I have so far:

>m<-read.table("test.txt",sep='\t',header=TRUE,colClasses=c('character','integer','integer','numeric','numeric')) 
>s<-data.frame(c(rle(m$Sample1)[[2]],rle(m$Sample2)[[2]]),c(rle(m$Sample1)[[1]],rle(m$Sample2)[[1]])) 
> names(s)=c("Values","Probes")
>G=1
> for(i in 1:length(s$Probes)){
+ if(G==1){first<-unique(m$Chr[G:s$Probes[i]])
+ second<-min(m$Start[G:s$Probes[i]])
+ third<-max(m$End[G:s$Probes[i]])
+ c<-cbind(first,second,third,s$Values[i],s$Probes[i])
+ print (c)
+ G=(G+s$Probes[i])}
+ else if((G-1) < length(m$Sample1)) {
+ first<-unique(m$Chr[G:(G+s$Probes[i]-1)])
+ second<-min(m$Start[G:(G+s$Probes[i]-1)])
+ third<-max(m$End[G:(G+s$Probes[i]-1)])
+ c<-cbind(first,second,third,s$Values[i],s$Probes[i])
+ print (c)
+ G=(G+s$Probes[i])}
+ else {
+ G=1
+ first<-unique(m$Chr[G:s$Probes[i]])
+ second<-min(m$Start[G:s$Probes[i]])
+ third<-max(m$End[G:s$Probes[i]])
+ c<-cbind(first,second,third,s$Values[i],s$Probes[i])
+ print (c)
+ G=(G+s$Probes[i])}
+ }
The output is:
     first  second    third             
[1,] "chr2" "9896633" "14404502" "0" "4"
     first  second     third                 
[1,] "chr2" "14421718" "16048724" "-0.43" "4"
     first  second     third             
[1,] "chr2" "37491676" "37703009" "0" "2"
     first  second    third            
[1,] "chr2" "9896633" "9896690" "0" "2"
     first  second     third                 
[1,] "chr2" "14314039" "16048724" "-0.35" "6"
     first  second     third             
[1,] "chr2" "37491676" "37703009" "0" "2"

I need two modifications to this code:
1) This is just a small part of the file (with 2 samples); my actual file
has 150 samples. How do I write the rle function call for that many columns?
2) How do I store all the printed c values in a single data frame?
Thanks,
Suji
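
One possible way to handle both points, sketched on the toy data quoted in this thread. The helper name summarise_runs is hypothetical, and the sketch assumes that every column other than Chr, Start and End is a sample and that a run of equal values never crosses a chromosome boundary. It loops over the sample columns with lapply() and binds the per-sample results into one data frame instead of printing them.

```r
# Toy data copied from the test file in this thread; column names follow
# the ones used in the code above (Chr, Start, End, Sample1, Sample2).
m <- data.frame(
  Chr     = rep("chr2", 10),
  Start   = c(9896633, 9896639, 14314039, 14404467, 14421718,
              16031710, 16036178, 16048665, 37491676, 37702947),
  End     = c(9896683, 9896690, 14314098, 14404502, 14421777,
              16031769, 16036237, 16048724, 37491735, 37703009),
  Sample1 = c(0, 0, 0, 0, -0.43, -0.43, -0.43, -0.43, 0, 0),
  Sample2 = c(0, 0, -0.35, -0.35, -0.35, -0.35, -0.35, -0.35, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise the runs of equal values in one sample.
summarise_runs <- function(values, chr, start, end, sample_id) {
  r     <- rle(values)
  last  <- cumsum(r$lengths)       # row index where each run ends
  first <- last - r$lengths + 1    # row index where each run starts
  data.frame(
    Sample = sample_id,
    Chr    = chr[first],           # assumes a run stays on one chromosome
    Start  = mapply(function(a, b) min(start[a:b]), first, last),
    End    = mapply(function(a, b) max(end[a:b]), first, last),
    Values = r$values,
    Probes = r$lengths,
    stringsAsFactors = FALSE
  )
}

# Assumption: every column that is not Chr/Start/End holds one sample.
sample_cols <- setdiff(names(m), c("Chr", "Start", "End"))

pieces <- lapply(seq_along(sample_cols), function(k) {
  summarise_runs(m[[sample_cols[k]]], m$Chr, m$Start, m$End, sample_id = k)
})

# One data frame with a row per run and per sample.
result <- do.call(rbind, pieces)
print(result)
```

With 150 sample columns the same code should apply unchanged, since sample_cols is derived from the column names rather than typed out by hand.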



"Hi group, 
I am trying to write code to do the following. 
This is what the test file looks like: 
Chr start end sample1 sample2 
chr2 9896633 9896683 0 0 
chr2 9896639 9896690 0 0 
chr2 14314039 14314098 0 -0.35 
chr2 14404467 14404502 0 -0.35 
chr2 14421718 14421777 -0.43 -0.35 
chr2 16031710 16031769 -0.43 -0.35 
chr2 16036178 16036237 -0.43 -0.35 
chr2 16048665 16048724 -0.43 -0.35 
chr2 37491676 37491735 0 0 
chr2 37702947 37703009 0 0 

Now I want to summarize the values like 
Sample Chr Start End Values Probes 
1 chr2 9896633 14404502 0 4 
1 chr2 14421718 16048724 -0.43 4 
1 chr2 37491676 37703009 0 2 
2 chr2 9896633 9896690 0 2 
2 chr2 14314039 16048724 -0.35 6 
2 chr2 37491676 37703009 0 2 

Here the Start for the first line is the smallest start among the
consecutive rows that share the same value (4 rows), and the End is the
largest end among those rows. Values is the single value those rows have
in common. 
Can I get some ideas or suggestions for doing this? I am new to serious
programming in R. "
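
The rule described above (smallest start, largest end, and row count for each stretch of equal values) can be sketched for a single sample as follows. This is only an illustration on the quoted test data: the vector names are made up for the example, and rle() plus tapply() is one of several ways to do the grouping.

```r
# Values copied from the test file above; vector names are illustrative.
start   <- c(9896633, 9896639, 14314039, 14404467, 14421718,
             16031710, 16036178, 16048665, 37491676, 37702947)
end     <- c(9896683, 9896690, 14314098, 14404502, 14421777,
             16031769, 16036237, 16048724, 37491735, 37703009)
sample1 <- c(0, 0, 0, 0, -0.43, -0.43, -0.43, -0.43, 0, 0)

# rle() finds the stretches of consecutive equal values.
runs <- rle(sample1)

# Label every row with the number of the run it belongs to.
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Smallest start and largest end within each run.
run_start <- as.vector(tapply(start, run_id, min))
run_end   <- as.vector(tapply(end, run_id, max))

summary1 <- data.frame(Start  = run_start,
                       End    = run_end,
                       Values = runs$values,
                       Probes = runs$lengths)
print(summary1)
```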




