[Rd] resampling from string when it runs across multiple lines
Dimitris Rizopoulos
Dimitris.Rizopoulos at med.kuleuven.be
Mon Mar 24 19:10:36 CET 2008
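Try this:
# read the block of letters as text and convert it to a character matrix
y <- as.matrix(read.table(textConnection(
"A C G T T G C A G C
A C G F F F F F F G
A C G S S S S S G A
A C G T T G C A G G
A B B B B B B A G T"
), stringsAsFactors = FALSE))
# a matrix is a vector with dimensions, so length(y) counts single cells
ind <- sample(length(y), 20, replace = TRUE)
# index by cell position to get 20 single characters
y[ind]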
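A short sketch of why this works, using a made-up toy data frame (the names df, m, picked and tf are only for illustration): sample() draws from the elements of its argument, and the elements of a data frame are its columns, whereas a matrix is a plain vector with a dim attribute, so its elements are the single cells.

```r
# Toy data (hypothetical): a 2 x 3 data frame of single characters.
df <- data.frame(a = c("A", "C"), b = c("G", "T"), c = c("T", "G"),
                 stringsAsFactors = FALSE)

length(df)            # a data frame is a list of columns, so this counts columns
m <- as.matrix(df)    # character matrix with the same contents
length(m)             # a matrix is a vector with dims, so this counts cells

set.seed(1)           # only to make the draw reproducible
ind <- sample(length(m), 4, replace = TRUE)
picked <- m[ind]      # four single characters, drawn cell by cell
picked

# Caveat: read.table() reads a column that holds only T and F as logical;
# colClasses = "character" keeps such a column as text.
tf <- read.table(text = "T F\nF T", colClasses = "character")
as.matrix(tf)
```

By contrast, sample(df, 4, replace = TRUE) would return a data frame made of four resampled columns, which is the behaviour described in the original question.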
I hope it helps.
Best,
Dimitris
P.S. It would be best to send this kind of e-mail to R-help rather than
R-devel; see http://www.r-project.org/mail.html for more information
about the different R mailing lists.
----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm
Quoting Suraaga Kulkarni <suraaga.kulkarni at gmail.com>:
> Hi,
>
> I need to resample from a long string, which is written in many lines with
> carriage-return marks at the end of each line. Perhaps because the data
> looks like a matrix, using the code: sample(data, 25, replace=T) gives me 25
> columns of characters from the data because it is resampling whole columns.
> What I would like it to do is to treat the data as a vector that has just
> been spread across many lines, and pick single characters from random
> positions in randomly chosen lines.
>
> I am reproducing a sample dataset, the command and the output here:
>
>> y
>      X..1. X..2. X..3. X..4. X..5. X..6. X..7. X..8. X..9. X..10.
> [1,] A     C     G     T     T     G     C     A     G     C
> [2,] A     C     G     F     F     F     F     F     F     G
> [3,] A     C     G     S     S     S     S     S     G     A
> [4,] A     C     G     T     T     G     C     A     G     G
> [5,] A     B     B     B     B     B     B     A     G     T
>
>> sample(y, 20, replace=T)
>      X..9.   X..4.   X..2.   X..7.   X..9..1 X..3.   X..3..1 X..9..2 X..9..3 X..4..1
> [1,] G       T       C       C       G       G       G       G       G       T
> [2,] F       F       C       F       F       G       G       F       F       F
> [3,] G       S       C       S       G       G       G       G       G       S
> [4,] G       T       C       C       G       G       G       G       G       T
> [5,] G       B       B       B       G       B       B       G       G       B
>
>      X..3..2 X..8.   X..9..4 X..3..3 X..6.   X..7..1
> [1,] G       A       G       G       G       C
> [2,] G       F       F       G       F       F
> [3,] G       S       G       G       S       S
> [4,] G       A       G       G       G       C
> [5,] B       A       G       B       B       B
>
>      X..6..1 X..3..4 X..7..2 X..10.
> [1,] G       G       C       C
> [2,] F       G       F       G
> [3,] S       G       S       A
> [4,] G       G       C       G
> [5,] B       B       B       T
>
> I wanted to try the bootstrap approach (since that's what I am doing -
> resampling with replacement) but that requires a "statistic" and I don't
> know what sense that makes for character data.
>
> Any help will be greatly appreciated.
>
> Thanks,
>
> S.
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm