[BioC] no method for coercing this S4 class to a vector
Martin Morgan
mtmorgan at fhcrc.org
Mon Mar 11 18:34:10 CET 2013
Hi --
On 03/11/2013 07:50 AM, Martin Morgan wrote:
> Hello Martin
>
> Thanks a lot for your help. For me, if I try without as.character it
> doesn't work.
Please show an entire session! There is something that you are doing that I have
not understood. Here is the content of the file that I have, test.R
library(Biostrings)
fn1 <- function(N) {    ## better: function(N, seq2)
    i <- 1:N            ## better: i <- seq_len(N)
    substr(seq2[i], nchar(seq2[i]) - 2, nchar(seq2[i]))
}
seq2 <- readDNAStringSet("rosalind_grph.txt")
fn1(10)
sessionInfo()
I can run it from within R with
source("test.R", echo=TRUE, max=Inf)
provided the paths to test.R and rosalind_grph.txt are correct, and get
> source("test.R", echo=TRUE, max=Inf)
> library(Biostrings)
> fn1 <- function(N) { ## better: function(N, seq2)
+ i <- 1:N ## better: i <- seq_len(N)
+ substr(seq2[i], nchar(seq2[i]) - 2, nchar(seq2[i]))
+ }
> seq2 <- readDNAStringSet("rosalind_grph.txt")
> fn1(10)
Rosalind_4489 Rosalind_0393 Rosalind_3575 Rosalind_5369 Rosalind_0109
"GGC" "CTC" "AAT" "ATT" "GAG"
Rosalind_6578 Rosalind_8695 Rosalind_9187 Rosalind_2515 Rosalind_0826
"TGG" "TGG" "CGA" "CGC" "AGT"
> sessionInfo()
R version 2.15.2 Patched (2012-12-23 r61401)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.26.3 IRanges_1.16.6 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2
What do you get?
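The two "better:" comments in test.R can be illustrated with a minimal sketch. It makes some assumptions: `last3` is a hypothetical name, and a plain named character vector stands in for the DNAStringSet so that the example runs without Biostrings or the input file. Passing the data as an argument means the function no longer depends on whatever `seq2` happens to be in the calling session, and `seq_len(N)` is empty when N is 0, whereas `1:N` would give `c(1, 0)`.

```r
## Stand-in data (an assumption): a named character vector rather than
## the DNAStringSet read from rosalind_grph.txt.
seqs <- c(s1 = "AAATAAA", s2 = "AAATCCC", s3 = "AAATTTT")

## Hypothetical variant of fn1: the sequences are an explicit argument.
last3 <- function(x, N = length(x)) {
    i <- seq_len(N)          # integer(0) when N is 0; 1:0 would be c(1, 0)
    n <- nchar(x[i])         # vectorized: one length per sequence
    substr(x[i], n - 2, n)   # last three characters of each sequence
}

first_two <- last3(seqs, 2)
none <- last3(seqs, 0)
print(first_two)
print(length(none))
```

With a real DNAStringSet the same body should apply once Biostrings is attached, since the session above shows substr() and nchar() working on `seq2[i]` directly.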
> But as you show, it works for you.
> I realize that you are using format="fastq" and I'm using format="fasta".
> I attached the file. But I don't want to use your time any more, you
> already helped me a lot.
> I'm going to try to convert to fastq to see if I can get the same result
> as you.
You have successfully read in your data, so the input format is not an issue.
You indicated that
nchar(seq2[1])
works at the command line. There is no reason why the same command would not
work inside a function
fn1 = function() nchar(seq2[1])
Maybe you are writing the function in a package and have not specified the
NAMESPACE file correctly, or running the function in a new R session when you
have not loaded Biostrings, or loading another package that is somehow
interfering with how nchar() works, or defining seq2 to be something other than
the data used at the command line?
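Several of these possibilities can be checked in the session where the function fails. The following is only a sketch using base R: `check_session` is a hypothetical helper name, and the toy S4 class `Toy` is an assumption that merely stands in for an S4 object whose methods are not available to nchar(), as could happen with a DNAStringSet if Biostrings were not attached.

```r
library(methods)

## Hypothetical helper: report what the current session actually sees.
check_session <- function(obj_name = "seq2", pkg = "Biostrings") {
    found <- exists(obj_name, envir = globalenv())
    list(
        pkg_attached   = paste0("package:", pkg) %in% search(),
        obj_exists     = found,
        obj_class      = if (found) class(get(obj_name, envir = globalenv()))
                         else NA_character_,
        nchar_found_in = utils::find("nchar")   # attached places defining nchar
    )
}

## Toy S4 class (an assumption, not a Biostrings class): base nchar()
## has no way to turn it into a character vector, so the call fails.
setClass("Toy", representation(x = "character"))
toy <- new("Toy", x = "ACGT")
res <- tryCatch(nchar(toy), error = function(e) conditionMessage(e))
print(res)
```

If `pkg_attached` is FALSE, or `obj_class` is not what readDNAStringSet() returned at the command line, that would point to one of the causes listed above.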
Martin
>
> thank you very much
>
> Thiago Maia
>
> On 03/11/2013 07:06 AM, Thiago Maia wrote:
>> function () {
>> print(nchar(as.character(seq2[1])))
>> }
>>
>> to execute
>> fn1()
>>
>>
>> actually the function became like this
>> function (N) {
>> for(i in as.numeric(1:N)) {
>>
>> print(substr(seq2[i],nchar(as.character(seq2[i]))-2,nchar(as.character(seq2[i]))))
>>
>> }
>> }
>
> Hi Thiago -- but the as.character() are not necessary
>
> function (N) {
> for(i in as.numeric(1:N)) {
> print(substr(seq2[i], nchar(seq2[i])-2, nchar(seq2[i])))
> }
> }
>
>
> and remember that R is 'vectorized', so
>
> fn1 <- function(N) { ## better: function(N, seq2)
> i <- 1:N ## better: i <- seq_len(N)
> substr(seq2[i], nchar(seq2[i]) - 2, nchar(seq2[i]))
> }
>
> so after
>
> library(Biostrings)
> example(readDNAStringSet)
>
> we have
>
> > seq2 <- readDNAStringSet(filepath, format="fastq")
> > fn1(10)
> HWI-EAS88_1_1_1_1001_499 HWI-EAS88_1_1_1_898_392 HWI-EAS88_1_1_1_922_465
> "TGT" "GAC" "GGT"
> HWI-EAS88_1_1_1_895_493 HWI-EAS88_1_1_1_953_493 HWI-EAS88_1_1_1_868_763
> "AAA" "CTT" "GCG"
> HWI-EAS88_1_1_1_819_788 HWI-EAS88_1_1_1_801_123 HWI-EAS88_1_1_1_885_419
> "CTT" "CTT" "GTT"
> HWI-EAS88_1_1_1_941_477
> "AAT"
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793