[R] coding logic and syntax in R
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Dec 24 09:09:18 CET 2003
On Wed, 24 Dec 2003, Pravin wrote:
> I am a beginner in R programming and recently heard about this mailing list.
> Currently, I am trapped into a simple problem for which I just can't find a
> solution. I have a huge dataset (~81,000 observations) that has been
BTW, that is quite a small dataset these days: not even 10 million is `huge'.
> analyzed and the final result is in the form of 0 and 1(one column).
>
> I need to write a code to process this column in a little complicated way.
> These 81,000 observations are actually 9,000 sets (81,000/9).
> So, in each set whenever zero appears, rest all observations become zero.
>
> For example;
>
> If the column has:
>
> 111110111111011111111111111111111....
>
> The output should look like:
>
> 111110000111000000111111111111111...
Let me see if I understand you. This was really
111110111
111011111
111111111
111111...
and you want
111110000
111000000
111111111
111111...
So let's treat it as a matrix (extending to 4 complete sets):
x <- as.numeric(strsplit("111110111111011111111111111111111011", NULL)[[1]])
xx <- matrix(x, ncol=9, byrow=TRUE)
Then a simple loop
for(i in 2:9) xx[,i] <- xx[,i] & xx[,i-1]
gives me the second matrix, which I can read out as a vector as
as.vector(t(xx))
[1] 1 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
or in what I understand as your format
paste(t(xx), collapse="")
[1] "111110000111000000111111111111111000"
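If this is needed more than once, the steps above could be wrapped in a small function. What follows is only a minimal sketch: the name zero_after_first_zero and the argument set_size are made up here for illustration, and it assumes the length of the input is an exact multiple of the set size.

```r
# Hypothetical helper (name and arguments are illustrative, not from the post).
# Within each consecutive block of `set_size` values, every value after the
# first 0 is set to 0.
zero_after_first_zero <- function(x, set_size = 9) {
    stopifnot(length(x) %% set_size == 0)            # assumes complete sets only
    xx <- matrix(x, ncol = set_size, byrow = TRUE)   # one set per row
    # carry a zero rightwards, one column at a time
    for (i in seq_len(set_size)[-1]) xx[, i] <- xx[, i] & xx[, i - 1]
    as.vector(t(xx))                                 # back to the original order
}

x <- as.numeric(strsplit("111110111111011111111111111111111011", NULL)[[1]])
paste(zero_after_first_zero(x), collapse = "")
```

The loop runs over the 9 columns rather than the 9000 sets, so each step is a vectorized operation on a whole column.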
Doing this with 81000 random 0/1's took a fraction of a second.
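A rough way to repeat that check is sketched below. The seed and the use of sample() to generate the data are assumptions made for illustration, and the timing will of course depend on the machine. The result is also compared against a running product along each row, which is a different route to the same answer since the product stays at 0 once a 0 has been met.

```r
set.seed(1)                                  # arbitrary seed, for repeatability
x <- sample(0:1, 81000, replace = TRUE)      # 9000 sets of 9 random 0/1 values

elapsed <- system.time({
    xx <- matrix(x, ncol = 9, byrow = TRUE)  # one set per row
    for (i in 2:9) xx[, i] <- xx[, i] & xx[, i - 1]
    res <- as.vector(t(xx))
})
print(elapsed)

# Cross-check: cumprod() along each row. apply() returns one column per set,
# so as.vector() reads the sets out in their original order.
alt <- as.vector(apply(matrix(x, ncol = 9, byrow = TRUE), 1, cumprod))
stopifnot(all(res == alt))
```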
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595