[R-SIG-Mac] Unable to change some members of a vector

Sun Feb 14 05:34:13 CET 2021

This question is not related to Macs, but here is the answer anyway.

First up, anyone who uses R should understand what a factor is. Find an introductory text on R and read the section on factors. Even better, read the whole book. R will not automatically add a new factor level when you assign a value that is not already one of the levels, which is probably a good idea. That is why assigning "MLH" worked (that level already existed) and assigning "MUH" gave NA.

read.table has a stringsAsFactors argument. Set it to FALSE to read the column as character strings rather than a factor. You can also convert an existing factor to character using as.character().
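
Here is a small sketch of both routes. The data frame below is an invented stand-in for msPdf (the values are made up, and "_bad" stands in for the stray \xca byte), and stringsAsFactors = TRUE is set explicitly to mimic the read.table default before R 4.0.0. The names fixed1 and fixed2 are just for illustration.

```r
# Toy stand-in for msPdf; the values are invented, and "_bad" stands in
# for the stray \xca byte in the real file.
msPdf <- data.frame(
  month = c("May", "May", "Jun"),
  site  = c("MLH", "MLH_bad", "MUH_bad"),
  conc  = c(0.004, 0.007, 0.011),
  stringsAsFactors = TRUE   # mimic the read.table default before R 4.0.0
)

levels(msPdf$site)          # there is no plain "MUH" level to assign to

# Option 1: convert the factor to character, then assign freely
fixed1 <- msPdf
fixed1$site <- as.character(fixed1$site)
fixed1$site[2] <- "MLH"
fixed1$site[3] <- "MUH"

# Option 2: keep the factor and rename the offending levels;
# renaming a level to an existing name merges the two levels
fixed2 <- msPdf
levels(fixed2$site)[levels(fixed2$site) == "MLH_bad"] <- "MLH"
levels(fixed2$site)[levels(fixed2$site) == "MUH_bad"] <- "MUH"
```

Option 2 fixes every row with the bad value at once, which may be handier than assigning row by row.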

If Excel decides to put strange characters into your data file then that is Excel’s problem. I suggest looking at the file with a text editor. The free version of BBEdit will work fine. You can read Excel files directly using the readxl package.
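
If the stray byte has already made it into R, one possible way to strip it is a byte-wise substitution. This is only a sketch, assuming the unwanted byte is exactly \xca and that the column is character rather than a factor; the vector site below is invented for illustration.

```r
# Invented example values; "\xca" is the stray byte from the original post
site <- c("MLH", "MLH\xca", "MUH\xca")

# useBytes = TRUE makes gsub() match raw bytes, so the strings do not
# need to be valid in the current locale's encoding
site_clean <- gsub("\xca", "", site, fixed = TRUE, useBytes = TRUE)
```

Fixing the file itself in a text editor is still the cleaner fix, since it removes the problem at the source.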


> On 14 Feb 2021, at 9:54 am, Parkhurst, David F. <parkhurs using indiana.edu> wrote:
> I have a problem I don’t know how to deal with.  I’ve used read.table to create the data frame called msPdf.  It contains three vectors:  month, site, conc.  Some of the site values look like this:  MLH\xca.  I was able to change that with this command: msPdf$site[13]="MLH".  That seems to have been allowed because some of the other sites are simply MLH.  But the system won’t let me make the same kind of change in another case:
> msPdf$site[29]="MUH".  When I ask for that, I get
> Warning message:
> In `[<-.factor`(`*tmp*`, 29, value = c(5L, 2L, 1L, 11L, 5L, 12L,  :
>   invalid factor level, NA generated
>> #13  May MLH\xca 0.007
> Then if I enter msPdf$site[29], I get this response:
> [1] <NA>
> In other words, although there were other sites that were just plain MLH, there were no other sites that were just plain MUH.
> How can I fix this problem?  I have no idea why excel added those \xca bits to a few of the site values.
