[R] index question
Bob Green
bgreen at dyson.brisnet.org.au
Fri Dec 28 13:24:28 CET 2007
I was hoping for some advice regarding indexing.
My data frame has 27 variables of interest, all with the prefix "pre":
[7] "Decision" "MHCDate" "pre01" "pre01111" "pre012" "pre013"
[13] "pre02" "pre02111" "pre02114" "pre0211" "pre0212" "pre029"
[19] "pre03a" "pre0311" "pre0312" "pre03" "pre04" "pre05"
[25] "pre06" "pre07" "pre08" "pre09" "pre10" "pre11"
[31] "pre12" "pre13" "pre14" "pre15" "pre16"
I want to combine these variables into new variables, using the
following criteria :
(1) create a single variable PRE, when any of the 27 'pre' variables
have a value >= '1'
(2) create a variable HOM, when any of the pre01, pre01111, pre012,
pre013 variables have a value >= '1'
(3) create a variable ASS, when any of the pre02, pre02111, pre02114,
pre0211, pre0212, pre029 variables have a value >= '1'
(4) create a variable SEX, when any of the pre03a, pre0311, pre0312,
pre03 variables have a value >= '1'
(5) create a variable VIO, when any of the pre01 to pre06 variables
have a value >= '1'
(6) create a variable SERASS: if pre02111 or pre02114 is >= '1', assign a
value of 1; if pre0211 is >= '1', assign a value of 2; if pre0212 is
>= '1', assign a value of 3; if pre029 is >= '1', assign a value of 4;
everything else = 0. If a case meets more than one condition, pre02111
prevails over pre02114, pre02114 prevails over pre0211, pre0211 prevails
over pre0212, and pre0212 prevails over pre029.
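One possible reading of criterion (6) is a priority-ordered recode: start everyone at 0, then assign the codes from lowest to highest priority so that later assignments overwrite earlier ones. The sketch below is only an illustration under assumptions: the columns involved are taken to be pre02111, pre02114, pre0211, pre0212 and pre029 from the names listing, they are assumed to be numeric, and `toy` and `hit` are made-up names standing in for the real data and a small helper.

```r
# Toy stand-in for the real data; values are made up for illustration.
toy <- data.frame(
  pre02111 = c(1, 0, 0, 0, 0, 0),
  pre02114 = c(0, 2, 0, 0, 0, 0),
  pre0211  = c(1, 0, 1, 0, 0, 0),
  pre0212  = c(0, 0, 1, 1, 0, 0),
  pre029   = c(0, 0, 0, 1, 3, 0)
)

# Hypothetical helper: TRUE where the value is 1 or more, treating NA as FALSE.
hit <- function(x) !is.na(x) & x >= 1

# Start at 0 and assign from lowest to highest priority, so that
# higher-priority categories overwrite lower-priority ones.
SERASS <- rep(0, nrow(toy))
SERASS[hit(toy$pre029)]  <- 4
SERASS[hit(toy$pre0212)] <- 3
SERASS[hit(toy$pre0211)] <- 2
SERASS[hit(toy$pre02111) | hit(toy$pre02114)] <- 1

SERASS
```

Because pre02111 and pre02114 both map to 1, their relative priority does not change the result in this sketch.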
I believe I can generate new variables (1) - (5) using code such as:
> ASS <- (reoffend$pre02 | reoffend$pre02111 | reoffend$pre02114 |
+ reoffend$pre0211 | reoffend$pre0212 | reoffend$pre029 >= '1')
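A note on the expression above: in R, `>=` binds more tightly than `|`, so only the last variable is compared with '1', while the others are simply treated as TRUE when non-zero. Comparing against the string '1' also makes R compare the values as character strings rather than numbers. A minimal sketch of an alternative follows, assuming the "pre" columns are numeric; `toy` and `any_ge1` are hypothetical names, and treating NA as "not >= 1" is an assumption rather than something stated in the post.

```r
# Toy stand-in for the real data; column names follow the post, values are made up.
toy <- data.frame(
  Decision = c(1, 2, 3, 1),
  pre01    = c(0, 0, 2, NA),
  pre02    = c(0, 1, 0, 0),
  pre029   = c(0, 0, 0, 0),
  pre03a   = c(0, 0, 0, 0)
)

# Hypothetical helper: 1 if any of the named columns is >= 1 in that row, else 0.
# NA cells are treated as "not >= 1" (an assumption).
any_ge1 <- function(data, cols) {
  as.integer(rowSums(data[, cols, drop = FALSE] >= 1, na.rm = TRUE) > 0)
}

# Select every column whose name starts with "pre" instead of typing them all.
pre_cols <- grep("^pre", names(toy), value = TRUE)

toy$PRE <- any_ge1(toy, pre_cols)
toy$ASS <- any_ge1(toy, c("pre02", "pre029"))
toy$SEX <- any_ge1(toy, "pre03a")
toy[, c("PRE", "ASS", "SEX")]
```

Selecting columns by name pattern rather than by position (such as 9:35) keeps the code working if the column order changes.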
I have three questions:
1. If this is correct, what is the most efficient way to generate (1)
without having to type all the variable names? The following does not
work:
> PRE <- reoffend [,9:35], >= '1'
2. I am unsure how to generate variable (6), SERASS.
3. I wanted to exclude cases with a reoffend$Decision value of 3,
using the code below. However, I received an error and warnings saying
that NAs were produced, even though the raw variable has no NAs.
> MHT.decision <- reoffend[reoffend$Decision >= '2',]
> table(MHT.decision)
Error in vector("integer", length) : vector size cannot be NA
In addition: Warning messages:
1: NAs produced by integer overflow in: pd * (as.integer(cat) - 1L)
2: NAs produced by integer overflow in: pd * nl
> table(reoffend$Decision)
   1    2    3
1136  445   66
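The error in the transcript above most likely arises because table() was given the whole data frame, which asks R to cross-tabulate every column at once; the "NAs" in the warnings refer to integer overflow in that computation, not to missing data. Note also that `>= '2'` keeps Decision values 2 and 3 rather than dropping 3. A minimal sketch of one way to drop those rows and tabulate a single column, assuming Decision is numeric; `toy`, `kept` and `counts` are made-up names for illustration.

```r
# Toy stand-in for the real data; values are made up for illustration.
toy <- data.frame(
  Decision = c(1, 1, 2, 3, 2, 3),
  pre01    = c(0, 1, 0, 2, 0, 0)
)

# Keep rows whose Decision is not 3; which() drops any NA comparisons
# instead of turning them into all-NA rows.
kept <- toy[which(toy$Decision != 3), ]

# Tabulate one column rather than the whole data frame.
counts <- table(kept$Decision)
counts
```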
Any assistance is much appreciated,
Bob Green