[R] Extracting part of a factor
Sarah Goslee
sarah.goslee at gmail.com
Fri Mar 4 22:32:37 CET 2016
You're not saving the result of mutate(). You're just printing it to the screen.
Try instead:
test <- mutate(test, place = substr(test$subject, 1,3))
test$place <- as.factor(test$place) # or factor() if you'd rather
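
The "replacement has 0 rows" error in the quoted message appears to follow from the same cause: since the mutate() result was never assigned, test has no place column, test$place is NULL, and factor(NULL) has length zero. A minimal base-R sketch of that chain, using a made-up two-row data frame (df and msg are illustrative names, not from the thread):

```r
# Hypothetical two-row stand-in for the poster's data; no 'place' column exists.
df <- data.frame(subject = factor(c("001-002", "002-003")))

is.null(df$place)          # a missing column comes back as NULL
length(factor(df$place))   # factor(NULL) is a zero-length factor

# Assigning a zero-length value to a column of a 2-row data frame is an error;
# capture the message instead of stopping the script.
msg <- tryCatch({
  df$place <- factor(df$place)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

Once the column really exists (because the mutate() result was saved), the same factor() call has six values to work with and succeeds.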
This is why we ask for reproducible examples with data and code.
Look through the following and see if you understand.
test <- structure(list(subject = structure(1:6, .Label = c("001-002",
"002-003", "003-004", "004-005", "005-006", "006-007"), class = "factor"),
group = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("boys",
"girls"), class = "factor"), wk1 = c(2L, 7L, 9L, 5L, 2L,
1L), wk2 = c(3L, 6L, 4L, 7L, 6L, 4L), wk3 = c(4L, 5L, 6L,
8L, 3L, 7L), wk4 = c(5L, 4L, 1L, 9L, 8L, 4L)), .Names = c("subject",
"group", "wk1", "wk2", "wk3", "wk4"), class = "data.frame", row.names = c(NA,
-6L))
> str(test)
'data.frame': 6 obs. of 6 variables:
$ subject: Factor w/ 6 levels "001-002","002-003",..: 1 2 3 4 5 6
$ group : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
$ wk1 : int 2 7 9 5 2 1
$ wk2 : int 3 6 4 7 6 4
$ wk3 : int 4 5 6 8 3 7
$ wk4 : int 5 4 1 9 8 4
> mutate(test, place = substr(test$subject, 1,3))
  subject group wk1 wk2 wk3 wk4 place
1 001-002  boys   2   3   4   5   001
2 002-003  boys   7   6   5   4   002
3 003-004  boys   9   4   6   1   003
4 004-005 girls   5   7   8   9   004
5 005-006 girls   2   6   3   8   005
6 006-007 girls   1   4   7   4   006
> str(test)
'data.frame': 6 obs. of 6 variables:
$ subject: Factor w/ 6 levels "001-002","002-003",..: 1 2 3 4 5 6
$ group : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
$ wk1 : int 2 7 9 5 2 1
$ wk2 : int 3 6 4 7 6 4
$ wk3 : int 4 5 6 8 3 7
$ wk4 : int 5 4 1 9 8 4
test <- mutate(test, place = substr(test$subject, 1,3))
test$place <- as.factor(test$place)
> str(test)
'data.frame': 6 obs. of 7 variables:
$ subject: Factor w/ 6 levels "001-002","002-003",..: 1 2 3 4 5 6
$ group : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
$ wk1 : int 2 7 9 5 2 1
$ wk2 : int 3 6 4 7 6 4
$ wk3 : int 4 5 6 8 3 7
$ wk4 : int 5 4 1 9 8 4
$ place : Factor w/ 6 levels "001","002","003",..: 1 2 3 4 5 6
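
The two steps can also be folded into one call, as long as the result is still assigned. A minimal sketch, assuming base R's transform() as a stand-in for mutate() (both return a modified copy rather than changing the data frame in place). The cut-down data frame and the level order lv are illustrative only; passing explicit levels to factor() follows the suggestion in the quoted message below.

```r
# Cut-down, illustrative version of the example data (subject and group only).
test <- data.frame(
  subject = factor(c("001-002", "002-003", "003-004",
                     "004-005", "005-006", "006-007")),
  group   = factor(rep(c("boys", "girls"), each = 3))
)

# Without assignment the modified copy is discarded and 'test' is unchanged.
invisible(transform(test, place = substr(subject, 1, 3)))
"place" %in% names(test)   # FALSE

# Assign the result, converting to a factor in the same step.
# 'lv' is a hypothetical level order chosen purely for illustration.
lv <- c("006", "005", "004", "003", "002", "001")
test <- transform(test, place = factor(substr(subject, 1, 3), levels = lv))

str(test$place)
```

With dplyr loaded, the equivalent pattern would be test <- mutate(test, place = factor(substr(subject, 1, 3))). Inside mutate() the column can be referred to by its bare name.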
On Fri, Mar 4, 2016 at 4:13 PM, KMNanus <kmnanus at gmail.com> wrote:
> Here’s where I’m stumped -
>
> When I call mutate(test, place = substr(test$subject, 1,3)) to create a
> place variable, I get this, with place as a character variable.
>
>   subject  group   wk1   wk2   wk3   wk4 place
>    (fctr) (fctr) (int) (int) (int) (int) (chr)
> 1 001-002   boys     2     3     4     5   001
> 2 002-003   boys     7     6     5     4   002
> 3 003-004   boys     9     4     6     1   003
> 4 004-005  girls     5     7     8     9   004
> 5 005-006  girls     2     6     3     8   005
> 6 006-007  girls     1     4     7     4   006
>
> When I call test$place <- factor(test$place), I receive the message "Error in
> `$<-.data.frame`(`*tmp*`, "place", value = integer(0)) :
> replacement has 0 rows, data has 6".
>
> If I call mutate this way - mutate(test, place =
> factor(substr(test$subject,1,3))), I get the same output as above but when I
> call class(test$place), I get NULL and the variable disappears.
>
> I can’t figure out why.
>
> Ken
>
>
> On Mar 4, 2016, at 3:46 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
> I much prefer the factor function over the as.factor function for converting
> character to factor, since you can set the levels in the order you want them
> to be.
>
> On March 4, 2016 10:07:27 AM PST, Sarah Goslee <sarah.goslee at gmail.com>
> wrote:
>>
>> As everyone has been telling you, as.factor().
>> If you like the mutate approach, you can call as.factor(test$subject)
>> to convert it.
>>
>> Here's a one-liner with reproducible data.
>>
>>
>> testdata <- structure(list(subject = structure(1:6, .Label = c("001-002",
>> "002-003", "003-004", "004-005", "005-006", "006-007"), class = "factor"),
>> group = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("boys",
>> "girls"), class = "factor"), wk1 = c(2L, 7L, 9L, 5L, 2L,
>> 1L), wk2 = c(3L, 6L, 4L, 7L, 6L, 4L), wk3 = c(4L, 5L, 6L,
>> 8L, 3L, 7L), wk4 = c(5L, 4L, 1L, 9L, 8L, 4L)), .Names = c("subject",
>> "group", "wk1", "wk2", "wk3", "wk4"), class = "data.frame", row.names =
>> c(NA,
>> -6L))
>>
>> testdata$subject <- as.factor(substring(as.character(testdata$subject), 1,
>> 3))
>>
>> > testdata
>>   subject group wk1 wk2 wk3 wk4
>> 1     001  boys   2   3   4   5
>> 2     002  boys   7   6   5   4
>> 3     003  boys   9   4   6   1
>> 4     004 girls   5   7   8   9
>> 5     005 girls   2   6   3   8
>> 6     006 girls   1   4   7   4
>>
>> > str(testdata)
>> 'data.frame': 6 obs. of 6 variables:
>> $ subject: Factor w/ 6 levels "001","002","003",..: 1 2 3 4 5 6
>> $ group : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
>> $ wk1 : int 2 7 9 5 2 1
>> $ wk2 : int 3 6 4 7 6 4
>> $ wk3 : int 4 5 6 8 3 7
>> $ wk4 : int 5 4 1 9 8 4
>>
>> Sarah
>>
>> On Fri, Mar 4, 2016 at 1:00 PM, KMNanus <kmnanus at gmail.com> wrote:
>>>
>>>
>>> Here’s the dataset I’m working with, called test -
>>>
>>> subject  group  wk1  wk2  wk3  wk4  place
>>> 001-002  boys   2    3    4    5
>>> 002-003  boys   7    6    5    4
>>> 003-004  boys   9    4    6    1
>>> 004-005  girls  5    7    8    9
>>> 005-006  girls  2    6    3    8
>>> 006-007  girls  1    4    7    4
>>>
>>>
>>> If I call mutate(test, place = substr(subject,1,3)), “001” is the first
>>> observation in the place column.
>>>
>>> But it’s a character and “subject” is a factor. I need place to be a
>>> factor, too, but I need the observations to be ONLY the first three numbers
>>> of “subject.”
>>>
>>> Does that make my request more understandable?
>>