[R] inserting into data frame gives "invalid factor level, NAs generated"

Scott Sherrill-Mix shescott at mail.med.upenn.edu
Wed Aug 12 18:02:45 CEST 2009


You're running into the pretty common factor vs character problem in R.
By default data.frame turns character vectors into factor vectors (sort
of like ENUM in MySQL). Since you only have 1 factor level (the empty
string '') in your starting dataframe, when you go to insert new data
R sees a value that isn't an existing level, warns, and stores NA
instead. You'd probably be pretty safe using character columns instead
of factors for now (by adding stringsAsFactors=FALSE to data.frame()),
e.g.:

goframe <- data.frame(goA = character(10), goB = character(10),
                      value = numeric(10), stringsAsFactors = FALSE)
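
One more thing to watch, as a rough sketch rather than the only way to do it: c("AA", "BB", 0.4) is a single character vector, so assigning it to a row would most likely turn the value column into character as well. Filling the row with list() keeps each element's own type. The short vector f below is only there for illustration, to show the single empty-string level that causes the warning.

```r
# Illustration only: a factor built from empty strings has one level, ""
f <- factor(character(3))
levels(f)

# Preallocate with character columns instead of factors
goframe <- data.frame(goA = character(10), goB = character(10),
                      value = numeric(10), stringsAsFactors = FALSE)

# list() keeps "AA"/"BB" as character and 0.4 as numeric;
# c("AA", "BB", 0.4) would coerce all three to character first
goframe[1, ] <- list("AA", "BB", 0.4)

# Check the column types after the assignment
str(goframe)
```

If you fill the rows in a loop, the same goframe[i, ] <- list(...) form should work for each row i.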

Scott

Scott Sherrill-Mix
Department of Microbiology
University of Pennsylvania
402B Johnson Pavilion
3610 Hamilton Walk
Philadelphia, PA  19104-6076



On Wed, Aug 12, 2009 at 11:49 AM, <karinlag at ifi.uio.no> wrote:
> I am calculating some values that I am inserting into a data frame. From
> what I have read, creating the dataframe ahead of time is more efficient,
> since rbind (so far the only solution I have found to appending to a data
> frame) is not very fast.
>
> What I am doing is the following:
>
> # create data frame
>
> goframe = data.frame(goA = character(10), goB = character(10), value =
> numeric(10))
> goframe[1,] = c("AA", "BB", 0.4)
>
> Result is:
>
>> goframe[1,] = c("AA", "BB", 0.4)
> Warning messages:
> 1: In `[<-.factor`(`*tmp*`, iseq, value = "AA") :
>  invalid factor level, NAs generated
> 2: In `[<-.factor`(`*tmp*`, iseq, value = "BB") :
>  invalid factor level, NAs generated
>>
>
> Is there another/better/more recommended way of doing this? If not, how do
> I do this without getting all the warnings?
>
> Thanks!
>
> Best,
>
> Karin Lagesen
>



