[R] FW: Index out SNP position

David L Carlson dcarlson at tamu.edu
Fri Jan 4 21:50:15 CET 2013


I think you mean between column 1 and 2 of A? Why is 36003918 not
included? It is clearly between 35838396 and 36151202 in the first row of A.
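
To make that check concrete, here is a minimal sketch that lists, for each position in B, which rows of A contain it. It assumes A and B exactly as defined in the original post at the bottom of this thread; the names lo, hi and hits are introduced here for illustration only.

```r
# A and B as given in the original post
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

# Smaller and larger coordinate of each row, whatever the column order
lo <- pmin(A[, 1], A[, 2])
hi <- pmax(A[, 1], A[, 2])

# For each position, the row numbers of A whose interval strictly contains it
hits <- lapply(as.vector(B), function(b) which(b > lo & b < hi))
names(hits) <- as.vector(B)
hits
```

An empty integer vector in hits means the position falls in none of the intervals.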

My earlier solution should work fine. Just create a new matrix Ax that has
the columns switched so that the start is always column 1 and use that to
identify the ones you want to select. That way you are not modifying B. This
will be faster than checking the order of the columns in A each time you
process a line from B.
 
> Ax <- t(apply(A, 1, function(x) c(min(x), max(x))))
> indx <- sapply(1:nrow(B), function(i) any(B[i]>Ax[,1] &
     B[i]<Ax[,2]))
> SNP <- B[indx]
> SNP
[1] 36003918 35838399 35838589
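
The same selection can also be sketched without apply() or sapply(). This is only an illustrative alternative, assuming A and B as defined in the original post at the bottom of this thread; pmin(), pmax(), outer() and rowSums() are base R functions swapped in here, and the names lo, hi, b and inside are made up for the example.

```r
# A and B as given in the original post
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

# Put the smaller coordinate first without modifying A or B
lo <- pmin(A[, 1], A[, 2])
hi <- pmax(A[, 1], A[, 2])

# inside[i, j] is TRUE when B[i] lies strictly between lo[j] and hi[j]
b <- as.vector(B)
inside <- outer(b, lo, ">") & outer(b, hi, "<")

# Keep the positions that fall inside at least one interval
SNP <- b[rowSums(inside) > 0]
SNP
```

Note that outer() builds a matrix with one row per position and one column per interval, so for very large inputs the sapply() version above may use less memory.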

 --------------------
 David C
 
> From: JiangZhengyu [mailto:zhyjiang2006 at hotmail.com]
> Sent: Friday, January 04, 2013 9:03 AM
> To: dcarlson at tamu.edu
> Subject: RE: [R] Index out SNP position
> 
> Hi David,
> 
> Given B <- matrix(c(36003918,35838399,35838589,36262559),ncol = 1)
> 
> 36262559 and 36003918 do not fall between row 1 and 2 of A & that's
> what I want to exclude.
> 
> 35838399 is in [1,] 35838396 36151202
> 35838589 is in [2,] 35838674 35838584
> 
> 
> > From: dcarlson at tamu.edu
> > To: zhyjiang2006 at hotmail.com; r-help at r-project.org
> > CC: sarah.goslee at gmail.com
> > Subject: RE: [R] Index out SNP position
> > Date: Fri, 4 Jan 2013 08:31:05 -0600
> >
> > So given A
> >
> > > cbind(A, apply(A, 1, diff))
> >          [,1]     [,2]    [,3]
> > [1,] 35838396 36151202  312806
> > [2,] 35838674 35838584     -90
> > [3,] 36003908 35838674 -165234
> > [4,] 36004090 36003908    -182
> > [5,] 36150188 36003992 -146196
> >
> > Row 1 is start/end and rows 2 through 5 are end/start, so you only
> > want to exclude nucleotides that fall between start/end in row 1,
> > ignoring rows 2 through 5, which are end/start? Given your sample
> > matrix A, which rows do you want to include/exclude?
> >
> > David C
> >
> > From: JiangZhengyu [mailto:zhyjiang2006 at hotmail.com]
> > Sent: Thursday, January 03, 2013 6:36 PM
> > To: dcarlson at tamu.edu; r-help at r-project.org
> > Cc: sarah.goslee at gmail.com
> > Subject: RE: [R] Index out SNP position
> >
> > Hi David,
> >
> > Thanks for your reply!
> >
> > But what if I cannot change the positions of each row pair in A?
> > Sorry I did not make it very clear.
> >
> > The two columns in A represent start-and-end or end-and-start
> > positions of a gene. The one column in B is the single nucleotide
> > position. I am trying to index out all the single nucleotides that
> > fall between the start and end region of a gene.
> >
> > Jiang
> >
> >
> > > From: dcarlson at tamu.edu
> > > To: dcarlson at tamu.edu; zhyjiang2006 at hotmail.com; r-help at r-project.org
> > > CC: sarah.goslee at gmail.com
> > > Subject: RE: [R] Index out SNP position
> > > Date: Thu, 3 Jan 2013 16:35:30 -0600
> > >
> > > I missed the fact that the columns are not consistently smaller/larger:
> > >
> > > > A <- t(apply(A, 1, function(x) c(min(x), max(x))))
> > > > A
> > >          [,1]     [,2]
> > > [1,] 35838396 36151202
> > > [2,] 35838584 35838674
> > > [3,] 35838674 36003908
> > > [4,] 36003908 36004090
> > > [5,] 36003992 36150188
> > > > indx <- sapply(1:nrow(B), function(i) any(B[i]>A[,1] & B[i]<A[,2]))
> > > > SNP <- B[indx]
> > > > SNP
> > > [1] 36003918 35838399 35838589
> > >
> > > -------
> > > David
> > >
> > >
> > > > -----Original Message-----
> > > > From: David L Carlson [mailto:dcarlson at tamu.edu]
> > > > Sent: Thursday, January 03, 2013 4:23 PM
> > > > To: 'JiangZhengyu'; 'r-help at r-project.org'
> > > > Subject: RE: [R] Index out SNP position
> > > >
> > > > Something like this?
> > > >
> > > > > indx <- sapply(1:nrow(B), function(i) any(B[i]>A[,1] & B[i]<A[,2]))
> > > > > SNP <- B[indx]
> > > > > SNP
> > > > [1] 36003918 35838399 35838589
> > > >
> > > > ----------------------------------------------
> > > > David L Carlson
> > > > Associate Professor of Anthropology
> > > > Texas A&M University
> > > > College Station, TX 77843-4352
> > > >
> > > > > -----Original Message-----
> > > > > From: r-help-bounces at r-project.org
> > > > > [mailto:r-help-bounces at r-project.org] On Behalf Of JiangZhengyu
> > > > > Sent: Thursday, January 03, 2013 3:55 PM
> > > > > To: r-help at r-project.org
> > > > > Subject: [R] Index out SNP position
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Dear R experts,
> > > > >
> > > > > > I have 2 matrices, A & B. I am trying to index B against A: find
> > > > > > the B rows that fall between col 1 and 2 of A and put them into a
> > > > > > new vector SNP. I made the code below, but I cannot think of a
> > > > > > right way to do it. Could anyone help me with the code?
> > > > > >
> > > > > > Thanks,
> > > > > > Jiang
> > > > >
> > > > > > A <- matrix(c(35838396,35838674,36003908,36004090,36150188,
> > > > > >               36151202,35838584,35838674,36003908,36003992), ncol = 2)
> > > > > > B <- matrix(c(36003918,35838399,35838589,36262559), ncol = 1)
> > > > > > nr=nrow(A)
> > > > > > rn=nrow(B)
> > > > > > for (i in 1:nr)
> > > > > > {
> > > > > > for (j in 1:rn){if (B[i,1]<=A[j,1] && B[i,1]>=A[j,2]){SNP[i]=B[i,1]}}
> > > > > > }
> > > > >
> > > > > [[alternative HTML version deleted]]
> > > > >
> > > > > ______________________________________________
> > > > > R-help at r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > > > and provide commented, minimal, self-contained, reproducible code.