[BioC] no method for coercing this S4 class to a vector

Martin Morgan mtmorgan at fhcrc.org
Mon Mar 11 18:34:10 CET 2013


Hi --

On 03/11/2013 07:50 AM, Martin Morgan wrote:
> Hello Martin
>
>       Thanks a lot for your help. For me, if I try without as.character it
> doesn't work.

Please show an entire session! There is something that you are doing that I have 
not understood. Here is the content of the file that I have, test.R

library(Biostrings)

fn1 <- function(N) {  ## better: function(N, seq2)
     i <- 1:N          ## better: i <- seq_len(N)
     substr(seq2[i], nchar(seq2[i]) - 2, nchar(seq2[i]))
}

seq2 <- readDNAStringSet("rosalind_grph.txt")
fn1(10)

sessionInfo()

I can run it from within R with

   source("test.R", echo=TRUE, max=Inf)

provided the paths to test.R and rosalind_grph.txt are correct, and get


 > source("test.R", echo=TRUE, max=Inf)

 > library(Biostrings)

 > fn1 <- function(N) {  ## better: function(N, seq2)
+     i <- 1:N          ## better: i <- seq_len(N)
+     substr(seq2[i], nchar(seq2[i]) - 2, nchar(seq2[i]))
+ }

 > seq2 <- readDNAStringSet("rosalind_grph.txt")

 > fn1(10)
Rosalind_4489 Rosalind_0393 Rosalind_3575 Rosalind_5369 Rosalind_0109
         "GGC"         "CTC"         "AAT"         "ATT"         "GAG"
Rosalind_6578 Rosalind_8695 Rosalind_9187 Rosalind_2515 Rosalind_0826
         "TGG"         "TGG"         "CGA"         "CGC"         "AGT"

 > sessionInfo()
R version 2.15.2 Patched (2012-12-23 r61401)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Biostrings_2.26.3  IRanges_1.16.6     BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2

What do you get?
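
The '## better:' comments in test.R could be followed along these lines. This is only a rough sketch: 'last3' is a made-up name, and the character vector is invented data standing in for the sequences so that the snippet runs without Biostrings. It should accept the DNAStringSet from readDNAStringSet() the same way fn1 does, but I have not shown that here.

```r
# Hypothetical helper: last three characters of the first N sequences.
# 'seqs' is passed in explicitly instead of being looked up as a global.
last3 <- function(seqs, N = length(seqs)) {
    # seq_len(0) is empty, whereas 1:0 would give c(1, 0)
    i <- seq_len(min(N, length(seqs)))
    n <- nchar(seqs[i])
    substr(seqs[i], n - 2, n)
}

# Plain character vector standing in for the sequences (made-up data)
x <- c(seq_a = "ACGTGGC", seq_b = "TTACTC")
last3(x, 2)
```

Passing the data as an argument means the function no longer depends on whatever 'seq2' happens to be in the global environment when it is called.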

>       But as you show, it works for you.
>       I realize that you are using format="fastq" and I'm using format="fasta".
>       I attached the file. But I don't want to use any more of your time; you
> have already helped me a lot.
>       I'm going to try to convert to fastq to see if I can get the same result
> as you.

You have successfully read in your data, so the input format is not an issue.

You indicated that

   nchar(seq2[1])

works at the command line. There is no reason why the same command would not 
work inside a function

   fn1 = function() nchar(seq2[1])

Maybe you are writing the function in a package and have not specified the
NAMESPACE file correctly, or running the function in a new R session where you
have not loaded Biostrings, or loading another package that is somehow
interfering with how nchar() works, or defining seq2 to be something other than
the data used at the command line?
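
A rough way to check some of these possibilities from inside the session where the function fails might look like the following. 'check_session' is a made-up helper name, not part of Biostrings, and it uses only base R and utils, so it is a sketch of the idea rather than a complete diagnosis.

```r
# Hypothetical helper: report whether a package is attached, whether an
# object exists in the global environment, and what class that object has.
check_session <- function(obj_name = "seq2", pkg = "Biostrings") {
    found <- exists(obj_name, envir = globalenv())
    list(
        pkg_attached   = paste0("package:", pkg) %in% search(),
        obj_exists     = found,
        obj_class      = if (found) class(get(obj_name, envir = globalenv()))
                         else NA_character_,
        # search-path entries that provide something called nchar
        nchar_found_in = utils::find("nchar")
    )
}

str(check_session())
```

If 'pkg_attached' is FALSE, or 'obj_class' is not the class you see at the command line, that would point to one of the situations listed above.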

Martin

>
> thank you very much
>
> Thiago Maia
>
> On 03/11/2013 07:06 AM, Thiago Maia wrote:
>> function () {
>>     print(nchar(as.character(seq2[1])))
>> }
>>
>> to execute
>> fn1()
>>
>>
>> actually the function became like this
>> function (N) {
>>      for(i in as.numeric(1:N)) {
>>
>> print(substr(seq2[i],nchar(as.character(seq2[i]))-2,nchar(as.character(seq2[i]))))
>>
>>      }
>> }
>
> Hi Thiago -- but the as.character() calls are not necessary
>
> function (N) {
>        for(i in as.numeric(1:N)) {
>            print(substr(seq2[i], nchar(seq2[i])-2, nchar(seq2[i])))
>        }
> }
>
>
> and remember that R is 'vectorized', so
>
> fn1 <- function(N) {  ## better: function(N, seq2)
>        i <- 1:N          ## better: i <- seq_len(N)
>        substr(seq2[i], nchar(seq2[i]) - 2, nchar(seq2[i]))
> }
>
> so after
>
>      library(Biostrings)
>      example(readDNAStringSet)
>
> we have
>
>    > seq2 <- readDNAStringSet(filepath, format="fastq")
>    > fn1(10)
> HWI-EAS88_1_1_1_1001_499  HWI-EAS88_1_1_1_898_392  HWI-EAS88_1_1_1_922_465
>                       "TGT"                    "GAC"                    "GGT"
>     HWI-EAS88_1_1_1_895_493  HWI-EAS88_1_1_1_953_493  HWI-EAS88_1_1_1_868_763
>                       "AAA"                    "CTT"                    "GCG"
>     HWI-EAS88_1_1_1_819_788  HWI-EAS88_1_1_1_801_123  HWI-EAS88_1_1_1_885_419
>                       "CTT"                    "CTT"                    "GTT"
>     HWI-EAS88_1_1_1_941_477
>                       "AAT"


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


