[R] Extracting part of a factor

KMNanus kmnanus at gmail.com
Fri Mar 4 22:13:54 CET 2016


Here’s where I’m stumped - 

When I call mutate(test, place = substr(test$subject, 1, 3)) to create a place variable, I get this, with place as a character variable:

 subject  group   wk1   wk2   wk3   wk4 place
   (fctr) (fctr) (int) (int) (int) (int) (chr)
1 001-002   boys     2     3     4     5   001
2 002-003   boys     7     6     5     4   002
3 003-004   boys     9     4     6     1   003
4 004-005  girls     5     7     8     9   004
5 005-006  girls     2     6     3     8   005
6 006-007  girls     1     4     7     4   006

When I call test$place <- factor(test$place), I receive this message:

Error in `$<-.data.frame`(`*tmp*`, "place", value = integer(0)) : 
  replacement has 0 rows, data has 6

If I call mutate this way - mutate(test, place = factor(substr(test$subject, 1, 3))) - I get the same output as above, but when I call class(test$place), I get NULL and the variable disappears.

I can’t figure out why.

Ken
kmnanus at gmail.com
914-450-0816 (tel)
347-730-4813 (fax)
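
A likely explanation, offered as a sketch rather than a confirmed diagnosis: mutate() returns a new data frame and leaves test untouched unless the result is assigned. If that is what happened, test$place is NULL after the bare call, factor(NULL) has length 0, and that would account for the "replacement has 0 rows" error. The data frame below is a cut-down stand-in for test (first two columns only), and the code assumes dplyr is installed.

```r
library(dplyr)

# Cut-down stand-in for the `test` data frame (an assumption for illustration).
test <- data.frame(
  subject = factor(c("001-002", "002-003", "003-004",
                     "004-005", "005-006", "006-007")),
  group   = factor(c("boys", "boys", "boys", "girls", "girls", "girls"))
)

# mutate() returns a modified copy; without assignment, `test` is unchanged.
mutate(test, place = substr(as.character(subject), 1, 3))
is.null(test$place)   # TRUE: no `place` column was stored in `test`

# Assign the result back so the column persists, converting to factor
# in the same step.
test <- mutate(test, place = factor(substr(as.character(subject), 1, 3)))
class(test$place)     # "factor"
levels(test$place)
```

Inside mutate() the column can be referred to as subject rather than test$subject, since mutate() evaluates its arguments within the data frame.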



> On Mar 4, 2016, at 3:46 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> 
> I much prefer the factor function over the as.factor function for converting character to factor, since you can set the levels in the order you want them to be. 
> -- 
> Sent from my phone. Please excuse my brevity.
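
To illustrate the point about level order, here is a minimal sketch comparing the two functions. The vector place_chr is made up for the example and is not taken from the thread's data.

```r
# Hypothetical character vector, deliberately not in sorted order.
place_chr <- c("003", "001", "002", "001")

# as.factor() takes no `levels` argument; levels come out in sorted order.
f_default <- as.factor(place_chr)
levels(f_default)      # "001" "002" "003"

# factor() accepts an explicit `levels` argument, so the order is yours to choose.
f_custom <- factor(place_chr, levels = c("003", "002", "001"))
levels(f_custom)       # "003" "002" "001"
as.integer(f_custom)   # 1 3 2 3: codes follow the chosen level order
```

The level order matters for anything that depends on it, such as the order of groups in plots or the reference level in a model.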
> 
> On March 4, 2016 10:07:27 AM PST, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> As everyone has been telling you, as.factor().
> If you like the mutate approach, you can call as.factor(test$subject)
> to convert it.
> 
> Here's a one-liner with reproducible data.
> 
> 
> testdata <- structure(list(subject = structure(1:6, .Label = c("001-002",
> "002-003", "003-004", "004-005", "005-006", "006-007"), class = "factor"),
>     group = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("boys",
>     "girls"), class = "factor"), wk1 = c(2L, 7L, 9L, 5L, 2L,
>     1L), wk2 = c(3L, 6L, 4L, 7L, 6L, 4L), wk3 = c(4L, 5L, 6L,
>     8L, 3L, 7L), wk4 = c(5L, 4L, 1L, 9L, 8L, 4L)), .Names = c("subject",
> "group", "wk1", "wk2", "wk3", "wk4"), class = "data.frame", row.names = c(NA,
> -6L))
> 
> testdata$subject <- as.factor(substring(as.character(testdata$subject), 1, 3))
> 
> 
> testdata
>   subject group wk1 wk2 wk3 wk4
> 1     001  boys   2   3   4   5
> 2     002  boys   7   6   5   4
> 3     003  boys   9   4   6   1
> 4     004 girls   5   7   8   9
> 5     005 girls   2   6   3   8
> 6     006 girls   1   4   7   4
>  str(testdata)
> 'data.frame': 6 obs. of  6 variables:
>  $ subject: Factor w/ 6 levels "001","002","003",..: 1 2 3 4 5 6
>  $ group  : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
>  $ wk1    : int  2 7 9 5 2 1
>  $ wk2    : int  3 6 4 7 6 4
>  $ wk3    : int  4 5 6 8 3 7
>  $ wk4    : int  5 4 1 9 8 4
> 
> Sarah
> 
> On Fri, Mar 4, 2016 at 1:00 PM, KMNanus <kmnanus at gmail.com> wrote:
> 
>  Here’s the dataset I’m working with, called test -
> 
>  subject group wk1 wk2 wk3 wk4 place
>  001-002 boys 2 3 4 5
>  002-003 boys 7 6 5 4
>  003-004 boys 9 4 6 1
>  004-005 girls 5 7 8 9
>  005-006 girls 2 6 3 8
>  006-007 girls 1 4 7 4
> 
> 
>  If I call mutate(test, place = substr(subject, 1, 3)), “001” is the first observation in the place column.
> 
>  But it’s a character and “subject” is a factor.  I need place to be a factor, too, but I need the observations to be ONLY the first three numbers of “subject.”
> 
>  Does that make my request more understandable?
> 
> 