[R] beginner programming question

Thu Dec 18 16:53:20 CET 2003

On Wed, 17 Dec 2003, Tony Plate wrote:

> Another way to approach this is to first massage the data into a more
> regular format.  This may or may not be simpler or faster than other
> solutions suggested.

You could also use the reshape() function (with direction="long") to do
the massaging.
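
As a rough sketch of what that could look like: the data frame below just
retypes the toy example from the quoted message, and the names hh, person,
long, sp and res are made up here for illustration. It also assumes that
sex code 1 means male, which is what the desired ageh/agew table in the
original question implies.

```r
# Toy data retyped from the quoted message (wide: one row per household).
x <- data.frame(
  rel1 = c(1, 4, 1, 4),  rel2 = c(3, 1, 4, NA),  rel3 = c(NA, 3, 4, NA),
  age0 = c(25, 35, 39, 45),
  age1 = c(23, 67, 40, 70), age2 = c(2, 34, 59, NA), age3 = c(NA, 10, 60, NA),
  sex0 = c(1, 2, 1, 2),
  sex1 = c(2, 2, 2, 2),  sex2 = c(1, 1, 2, NA),  sex3 = c(NA, 2, 1, NA)
)
x$hh <- seq_len(nrow(x))   # hypothetical household id, used as idvar

# Wide -> long: one row per (household, person 1..3).
# age0 and sex0 are not in 'varying', so they are repeated on every row.
long <- reshape(x, direction = "long",
                varying = list(c("rel1", "rel2", "rel3"),
                               c("age1", "age2", "age3"),
                               c("sex1", "sex2", "sex3")),
                v.names = c("rel", "age", "sex"),
                timevar = "person", times = 1:3, idvar = "hh")

# Keep only spouses (rel == 1); the is.na() guard drops missing persons.
sp <- long[!is.na(long$rel) & long$rel == 1, ]

# Assumption: sex code 1 = male. Put the husband's age first.
resp_male <- sp$sex0 == 1
res <- data.frame(hh   = sp$hh,
                  ageh = ifelse(resp_male, sp$age0, sp$age),
                  agew = ifelse(resp_male, sp$age, sp$age0))
res <- res[order(res$hh), ]
row.names(res) <- NULL
res
```

On the toy data this gives one row per couple, with ageh/agew of 25/23,
34/35 and 39/40, the same as the table asked for in the original question.
Passing varying as a list (rather than a plain vector of names) keeps
reshape() from having to guess how the columns group together.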

	-thomas

>  > x <- read.table("clipboard", header=TRUE)
>  > x
>    rel1 rel2 rel3 age0 age1 age2 age3 sex0 sex1 sex2 sex3
> 1    1    3   NA   25   23    2   NA    1    2    1   NA
> 2    4    1    3   35   67   34   10    2    2    1    2
> 3    1    4    4   39   40   59   60    1    2    2    1
> 4    4   NA   NA   45   70   NA   NA    2    2   NA   NA
>  > nn <- c("rel","age0","age","sex0","sex")
>  > xx <- rbind("colnames<-"(x[,c("rel1","age0","age1","sex0","sex1")], nn),
> +  "colnames<-"(x[,c("rel2","age0","age2","sex0","sex2")], nn),
> +  "colnames<-"(x[,c("rel3","age0","age3","sex0","sex3")], nn))
>  > xx
>     rel age0 age sex0 sex
> 1    1   25  23    1   2
> 2    4   35  67    2   2
> 3    1   39  40    1   2
> 4    4   45  70    2   2
> 11   3   25   2    1   1
> 21   1   35  34    2   1
> 31   4   39  59    1   2
> 41  NA   45  NA    2  NA
> 12  NA   25  NA    1  NA
> 22   3   35  10    2   2
> 32   4   39  60    1   1
> 42  NA   45  NA    2  NA
>  >
>  > rbind(subset(xx, xx$rel==1 & (xx$sex0==1 | xx$sex0==xx$sex))[,c("age0","age")],
> +  subset(xx, xx$rel==1 & xx$sex==1 & xx$sex0!=xx$sex)[,c("age","age0")])
>     age0 age
> 1    25  23
> 3    39  40
> 21   35  34
>  >
>
> hope this helps,
>
> Tony Plate
>
> PS.  To advanced R users: Is the above usage of the "colnames<-" function
> within an expression regarded as acceptable or as undesirable programming
> style? -- I've rarely seen it used, but it can be quite useful.
>
> At Wednesday 09:28 PM 12/17/2003 +0200, Adrian Dusa wrote:
> >Hi all,
> >
> >The last e-mails about beginners gave me the courage to post a question;
> >from a beginner's perspective, there are a lot of questions that I'm
> >tempted to ask. But I'm trying to find the answers either in the
> >documentation, in the 15 or so free books I have, or in the
> >help archives (I have often found similar questions posted in the past).
> >
> >Being a (still current) user of SPSS, I'd like to be able to do
> >everything in R. I've learned that the best way of doing it is to
> >struggle and find a solution no matter what, refraining from doing it
> >with SPSS. I've become more and more aware of the almost unlimited
> >possibilities that R offers and I'd like to completely switch to R
> >whenever I think I'm ready.
> >
> >I have a (rather theoretical) programming problem for which I have found
> >a solution, but I feel it is a rather poor one. I wonder if there's some
> >other (more clever) solution, using (maybe?) vectorization or
> >subscripting.
> >
> >A toy example would be:
> >
> >rel1 rel2 rel3 age0 age1 age2 age3 sex0 sex1 sex2 sex3
> >   1    3   NA   25   23    2   NA    1    2    1   NA
> >   4    1    3   35   67   34   10    2    2    1    2
> >   1    4    4   39   40   59   60    1    2    2    1
> >   4   NA   NA   45   70   NA   NA    2    2   NA   NA
> >
> >where rel1...3 states the kinship with the respondent (person 0):
> >code 1 means husband/wife, code 4 means parent and code 3 means
> >child.
> >
> >I would like to get the age for husbands (code 1) in a first column and
> >wife's age in the second:
> >
> >ageh agew
> >  25   23
> >  34   35
> >  39   40
> >
> >My solution uses *for* loops and *if*s checking for code 1 in each
> >element in the first 3 columns, then checking in the last three columns
> >for husband's code, then taking the corresponding age in a new matrix.
> >I've learned that *for* loops are very slow (and indeed with my dataset
> >of some 2000 rows and 13 columns for kinship it takes quite a long time).
> >
> >I found the "Looping" chapter in "S poetry" very useful (it did save me
> >from *for* loops a couple of times, thanks!).
> >
> >Any hints would be appreciated,
> >
> >Adrian
> >
> >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >Adrian Dusa (adi at roda.ro)
> >Romanian Social Data Archive (www.roda.ro <http://www.roda.ro/> )
> >1, Schitu Magureanu Bd.
> >76625 Bucharest sector 5
> >Romania
> >
> >
> >Tel./Fax:
> >
> >+40 (21) 312.66.18\
> >
> >+40 (21) 312.02.10/ int.101
> >
> >______________________________________________
> >R-help at stat.math.ethz.ch mailing list
> >https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle