[R] Variable Class "numeric" instead recognized by dplyr as a 'factor'

james.vordtriede at att.net james.vordtriede at att.net
Sun Sep 27 09:58:07 CEST 2015

Hi, I’m new to R.  For my dissertation I have panel data on 48 Sub-Saharan countries (cross-sectional index ‘i’) over the 55 years 1960-2014 (time-series index ‘t’).  The variables read into R from a text file are in levels.  Because of reverse causality I will estimate a 2SLS regression on changes in the levels, so I need to difference the data within each cross-sectional unit ‘i’.

There are nearly 50 variables in total, but the model essentially regresses the differenced Yit ~ X1it+X2it+X3it+X4it+X5it+X6it, with a dummy variable attached to each differenced X.

Because of missing data, R originally read each X and Y variable as a ‘factor’; I subsequently converted them to ‘numeric’ with the ‘as.numeric’ command.
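
A minimal sketch of the factor-to-numeric conversion mentioned above, using a made-up column (the vector `x` and the "n/a" missing-value marker are assumptions for illustration, not taken from the actual data). It shows that `as.numeric()` applied directly to a factor returns the internal level codes rather than the original values, whereas converting through character recovers the values.

```r
# Hypothetical column: numbers read in as text, with "n/a" marking a missing value.
x <- factor(c("10.5", "n/a", "3.2", "10.5"))

# as.numeric() on a factor returns the integer level codes, not the values.
codes <- as.numeric(x)

# Going through character first recovers the values; "n/a" becomes NA.
values <- suppressWarnings(as.numeric(as.character(x)))

print(codes)
print(values)
```

If the file uses a text marker for missing values, passing that marker to the `na.strings` argument of `read.delim()` may keep the columns from being read as factors in the first place.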

However, when I run the following dplyr command, which is meant only to difference Yit (dYit = Yit - Yi[t-1]) and store the result in a new variable dYit, I receive error messages to the effect that Yit and each of the X variables are ‘factors’.

> library(dplyr)

> dt = CSUdata2 %>% group_by(i) %>% mutate(dYit = Yit - lag(Yit))

‘CSUdata2’ is the object in which the tab-delimited text file dataset is stored.  
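
A minimal sketch of the grouped differencing described above, assuming a small made-up data frame `panel` in place of ‘CSUdata2’ (the values are invented for illustration; only the column names `i`, `t`, and `Yit` follow the post). It assumes the dplyr package is installed and calls `dplyr::lag()` explicitly, since `stats::lag()` has the same name but does not shift a plain vector.

```r
library(dplyr)

# Hypothetical stand-in for CSUdata2: two countries, three years each.
panel <- data.frame(
  i   = c(1, 1, 1, 2, 2, 2),
  t   = c(1960, 1961, 1962, 1960, 1961, 1962),
  Yit = c(10, 12, 15, 20, 19, 25)
)

dt <- panel %>%
  arrange(i, t) %>%                          # lag() follows row order, so sort by year within country
  group_by(i) %>%                            # difference within each country only
  mutate(dYit = Yit - dplyr::lag(Yit)) %>%   # first year of each country has no predecessor, so NA
  ungroup()

print(dt)
```

Sorting by `i` and `t` before grouping is a precaution: the lag is taken over whatever row order the data frame has, so unsorted years would give wrong differences.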


Any idea why dplyr reads the variables as ‘factors’?  Running class() on each variable shows that R treats each Y and X as ‘numeric’.

Is the command to difference Yit written correctly?  I plan to use the same command for each variable that needs differencing until I understand the commands better.

Thank you.
