[R] Separating columns, and sorting by rows

Mon Feb 15 08:09:55 CET 2010

On Feb 15, 2010, at 1:22 AM, milton ruser wrote:

> Hi Raging Jim
>
> Maybe this is a starting point.
>
> myDF<-read.table(stdin(),head=T,sep=",")

Those "yyyymm" entries will become factors, which can confuse
newcomers. It may be more straightforward to always pass
stringsAsFactors=FALSE in the read.table arguments. I see that the
yyyymm column gets thrown away later, so it may not matter here.
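
A quick way to see the difference, as a sketch only: it reads a
two-row excerpt of the posted data from an inline string via the
text= argument (my substitution for stdin(), so it can be pasted as
is), and sets stringsAsFactors explicitly both ways. The names txt,
asFactor and asChar are made up for the illustration.

```r
# Two rows of the posted data as an inline string (stand-in for stdin()).
txt <- "yyyymm,Rainfall
1977-02,17.4
1977-03,34.0"

# Same call twice, differing only in stringsAsFactors.
asFactor <- read.table(text = txt, header = TRUE, sep = ",",
                       stringsAsFactors = TRUE)
asChar   <- read.table(text = txt, header = TRUE, sep = ",",
                       stringsAsFactors = FALSE)

class(asFactor$yyyymm)  # factor
class(asChar$yyyymm)    # character
```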

> yyyymm,Rainfall
> 1977-02,17.4
> 1977-03,34.0
> 1977-04,26.2
> 1977-05,42.6
> 1977-06,58.6
> 1977-07,23.2
> 1977-08,26.8
> 1977-09,48.4
> 1977-10,47.0
> 1977-11,37.2
> 1977-12,15.0
> 1978-01,2.6
> 1978-02,6.8
> 1978-03,9.0
> 1978-04,46.6
>

When I did a very similar maneuver, I added an extra NA entry at the  
beginning:

myDF <- rbind(list(yyyymm="1977-01", Rainfall=NA), myDF)

... so the columns would start with January. (The warning is harmless.)

> myDF$yyyy<-substr(myDF$yyyymm,1,4)
> myDF$mm<-substr(myDF$yyyymm,6,7)
> myDF<-subset(myDF, select=c(yyyy,mm,Rainfall))
> myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy",
> timevar="mm",direction="wide")
> myDF.reshape
> best regards

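When the time comes to rename those columns, knowing that there is a
built-in constant called month.name may come in handy. This assumes
the reshaped columns run January through December, which they will if
the NA row for 1977-01 was added first. Perhaps (untested):

names(myDF.reshape) <- c("Year", month.name)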
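
Putting the pieces from this thread together, a minimal end-to-end
sketch might look like the following. Assumptions on my part: the
data are read from an inline string with text= rather than stdin(),
the padding row is built with data.frame() instead of list() so the
column types are explicit, and a stopifnot() check guards the rename.
Note that the built-in constant is spelled month.name.

```r
# The posted data as an inline string (stand-in for stdin()).
txt <- "yyyymm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6"
myDF <- read.table(text = txt, header = TRUE, sep = ",",
                   stringsAsFactors = FALSE)

# Pad with an NA for January 1977 so the wide columns start with January.
myDF <- rbind(data.frame(yyyymm = "1977-01", Rainfall = NA_real_,
                         stringsAsFactors = FALSE),
              myDF)

# Split "yyyy-mm" into year and month, then go from long to wide.
myDF$yyyy <- substr(myDF$yyyymm, 1, 4)
myDF$mm   <- substr(myDF$yyyymm, 6, 7)
myDF <- subset(myDF, select = c(yyyy, mm, Rainfall))
myDF.reshape <- reshape(myDF, v.names = "Rainfall", idvar = "yyyy",
                        timevar = "mm", direction = "wide")

# Rename only if the month columns really are in calendar order.
stopifnot(identical(names(myDF.reshape)[-1],
                    sprintf("Rainfall.%02d", 1:12)))
names(myDF.reshape) <- c("Year", month.name)
myDF.reshape
```

Months with no observation (January 1977, and May through December
1978) come out as NA in the wide table.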

>
> milton
--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT