[R] For column values-Quality control

Bansal, Vikas vikas.bansal at kcl.ac.uk
Sat Jul 9 00:46:29 CEST 2011


Yes sir, you are right. After this I use this code to convert the ASCII characters in column V10 to decimal numbers:

dfa$V10=lapply(dfa[,4], function(c) as.numeric(charToRaw(c)))    

Now you will get output something like this:

V7 V8             V9                                                       V10
  0  1              G                                                        82
  0  1              CGT                                             c(90, 92, 96)
  0  1              GA                                                 c(78, 92)
  0  1              GAG                                             c(90, 92, 92)
  0  1              G                                                        88
  0  1              A                                                        96
  0  1              ATT                                             c(90, 96, 92)
  0  1              T                                                        94
  0  1              C                                                        97
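
To make the conversion step concrete, here is a minimal sketch on a toy data frame. The quality strings are made up so that their ASCII codes match the first three rows above; they are not taken from summary.txt. It uses dfa$V10 instead of dfa[,4], which assumes V10 is the fourth column as in the original call.

```r
# Toy data: column names follow the post; the quality strings are invented
# so that their ASCII codes reproduce the first three rows shown above.
dfa <- data.frame(V9  = c("G", "CGT", "GA"),
                  V10 = c("R", "Z\\`", "N\\"),
                  stringsAsFactors = FALSE)

# charToRaw() returns the raw bytes of a string; as.integer() turns them
# into ASCII codes. lapply() returns a list, so V10 becomes a list column
# holding one integer vector per row.
dfa$V10 <- lapply(dfa$V10, function(q) as.integer(charToRaw(q)))

str(dfa$V10)
```

Because V10 is now a list column, print() shows multi-element rows as c(...), which is why the table above looks the way it does.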

After this I am facing the following problem:

The values in column V10 correspond, position by position, to the bases (A, C, G, T) in column V9. I want to keep only those bases whose score is more than 91, so the output for the above should be:

V7 V8             V9                                                       V10
  0  1              GT                                             c(90, 92, 96)
  0  1              A                                                 c(78, 92)
  0  1              AG                                             c(90, 92, 92)
  0  1              A                                                        96
  0  1              TT                                             c(90, 96, 92)
  0  1              T                                                        94
  0  1              C                                                        97

The first row should be deleted because its only score, 82, is below 91 (the row with score 88 goes for the same reason). In the second row, C should be deleted because its score in column V10 (90) is below 91.
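
One possible way to express this filter, offered only as a sketch and not as the answer given on the list: compare each score vector with the cutoff, use the resulting logical vector to pick characters out of V9, and drop rows where nothing is left. The toy data copies the first six rows of the table above; the names cutoff and keep are made up for illustration. Unlike the desired output shown above, this sketch also trims V10 so that bases and scores stay aligned; leave that line out to keep V10 as it was.

```r
# Toy data mirroring the first six rows of the table above.
dfa <- data.frame(V7 = 0, V8 = 1,
                  V9 = c("G", "CGT", "GA", "GAG", "G", "A"),
                  stringsAsFactors = FALSE)
dfa$V10 <- list(82, c(90, 92, 96), c(78, 92), c(90, 92, 92), 88, 96)

cutoff <- 91  # keep bases whose score is strictly greater than this

# One logical vector per row: TRUE where the score passes the cutoff.
keep <- lapply(dfa$V10, function(s) s > cutoff)

# Split each V9 string into single bases, keep the passing ones, re-join.
dfa$V9 <- mapply(function(bases, k) {
  paste(strsplit(bases, "")[[1]][k], collapse = "")
}, dfa$V9, keep, USE.NAMES = FALSE)

# Optional: trim the scores the same way so V9 and V10 stay aligned.
dfa$V10 <- mapply(function(s, k) s[k], dfa$V10, keep,
                  SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Drop rows in which no base survived the filter.
dfa <- dfa[nzchar(dfa$V9), ]
dfa
```

This assumes every V9 string has exactly as many characters as its V10 entry has scores; rows where the lengths differ would need checking first.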


Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: David Winsemius [dwinsemius at comcast.net]
Sent: Friday, July 08, 2011 11:37 PM
To: Bansal, Vikas
Cc: r-help at r-project.org
Subject: Re: [R] For column values-Quality control

I get something entirely different when I execute that input command
with the attached file:

This is what I see as the first 14 lines for a displayed value for dfa:

 > dfa
     V7 V8  V9  V10
1    0  1   G    `
2    0  1   T    a
3    0  1   C    a
4    0  1   A    a
5    0  1   G    _
6    0  1   G    Z
7    0  1   C    ^
8    0  1   C   \\
9    0  1   A    Z
10   0  1   T    a
11   0  1   g    ^
12   0  1   A   \\
13   0  1   C    _
14   0  1   G    a

If this is different than what you see when you type dfa after input
of that file in that manner then you should consider alternative
methods of communicating an unambiguous representation of your dfa
object.... as I have detailed in prior private messages.
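
A common way on this list to share an object unambiguously is dput(), which prints R code that recreates the object. The thread does not say whether this is the method referred to above, so treat the following as an illustrative sketch on a made-up two-row stand-in for dfa.

```r
# Made-up stand-in for the poster's data frame (all columns character,
# as they would be after read.table(..., colClasses = "character")).
dfa <- data.frame(V7 = c("0", "0"), V8 = c("1", "1"),
                  V9 = c("G", "CGT"), V10 = c("R", "Z\\`"),
                  stringsAsFactors = FALSE)

# dput() writes an R expression that rebuilds the object; that text can be
# pasted into an email instead of a printed table.
txt <- capture.output(dput(dfa))
cat(txt, sep = "\n")

# A reader can recreate the object from the pasted text.
dfa2 <- eval(parse(text = txt))
all.equal(dfa, dfa2)
```

For a large object, dput(head(dfa, 20)) keeps the message short while still showing the exact column types and contents.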

--

David.

On Jul 8, 2011, at 6:10 PM, Bansal, Vikas wrote:

>
> Dear all,
>
> I am really sorry for not giving the input file and for not
> explaining my problem well in my mail.
>
> I have a file, summary.txt (attached). We can read
> this file using:
>
> dfa=read.table("summary.txt",fill=TRUE,colClasses = "character",header=TRUE)
>
> In column V10 I have ASCII characters, which I converted into decimal
> numbers using this code:
>
> dfa$V10=lapply(dfa[,4], function(c) as.numeric(charToRaw(c)))
>
> Now I have a data frame dfa with columns something like this:
>
> V7 V8             V9                                                       V10
>  0  1              G                                                        82
>  0  1              CGT                                             c(90, 92, 96)
>  0  1              GA                                                 c(78, 92)
>  0  1              GAG                                             c(90, 92, 92)
>  0  1              G                                                        88
>  0  1              A                                                        96
>  0  1              ATT                                             c(90, 96, 92)
>  0  1              T                                                        94
>  0  1              C                                                        97
>
> The values in column V10 correspond to the A, C, G, T in column V9. I want
> only those whose score is more than 91, so the output of the above should be:
>
> V7 V8             V9                                                       V10
>  0  1              GT                                             c(90, 92, 96)
>  0  1              A                                                 c(78, 92)
>  0  1              AG                                             c(90, 92, 92)
>  0  1              A                                                        96
>  0  1              TT                                             c(90, 96, 92)
>  0  1              T                                                        94
>  0  1              C                                                        97
>
> Can you please tell me the solution.
>
> Thanking you,
> Warm Regards
> Vikas Bansal
> Msc Bioinformatics
> Kings College London
> <summary.txt>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT


