[R] Help Preparing Data for Heatmap creation
Derek Dees
djdees at gmail.com
Wed May 19 21:12:13 CEST 2010
All -
Below are samples of the data, a description of my approach, and the
work I've done so far. My goal is twofold: first, to learn more about R
and the creation of heatmaps; second, to create a heatmap where the
vertical axis is a series of days and the horizontal axis is the hour of
the day, with the color of each cell determined by the number of
texts in that hour. While the project doesn't have much general
applicability to anything really significant for my job, I'm curious
about the distribution of texts sent by myself, my wife, and my
children. So, a project that can scratch two itches with one stick.
My general approach is to:
0: Read in the data from the file downloaded from my cell phone
provider into a 6 column variable.
1: Add a 7th column combining the Date and Time columns, formatted as
a POSIXct value.
2: Create a data frame with 5 columns - day, hour, number sent, number
received, total number
3: Create the heatmap using the first two columns plus whichever of
the remaining 3 I'm interested in at the moment.
The issues I'm currently having revolve around creating the data frame
to hold the information that I will use for the heatmap. I cannot seem
to get round or round.POSIXt to round a timestamp to the date-and-hour
level, and I'm not sure I understand how to work with POSIXct data.
I am also floundering on how to sum sent/received counts by hour. Not
having a statistics background and being an R newbie, I suspect I'm
asking the wrong questions of the documentation.
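One possible way to do the hour bucketing and the summing, offered as a
minimal sketch rather than a tested solution for the real file: instead of
rounding, format each timestamp into a day string and a 24-hour hour
number, then let aggregate() add up indicator columns. The toy values, the
names msgs, stamp and counts, and the fixed UTC time zone are all
assumptions made for illustration; parsing "AM"/"PM" with %p also assumes
an English or C locale.

```r
# Toy data in the same shape as the sanitized sample (values are made up).
msgs <- data.frame(
  Date      = c("05/03/2010", "05/03/2010", "05/03/2010", "05/04/2010"),
  Time      = c("9:49 AM", "9:46 AM", "1:05 PM", "11:30 PM"),
  Direction = c("Sent", "Received", "Sent", "Received"),
  stringsAsFactors = FALSE
)

# Parse date and time together; the fixed time zone is an assumption that
# keeps the toy example free of daylight-saving effects.
stamp <- as.POSIXct(paste(msgs$Date, msgs$Time),
                    format = "%m/%d/%Y %I:%M %p", tz = "UTC")

# Bucket by formatting instead of rounding. %H is the 24-hour clock,
# so 1 PM becomes 13 and cannot be confused with 1 AM (as %I alone would).
msgs$day  <- format(stamp, "%Y-%m-%d")
msgs$hour <- as.integer(format(stamp, "%H"))

# Indicator columns: summing them per group gives the counts.
msgs$sent     <- as.integer(msgs$Direction == "Sent")
msgs$received <- as.integer(msgs$Direction == "Received")

# One row per (day, hour) pair that actually occurs in the data.
counts <- aggregate(msgs[c("sent", "received")],
                    by = list(day = msgs$day, hour = msgs$hour),
                    FUN = sum)
counts$total <- counts$sent + counts$received
print(counts)
```

Hours with no texts do not appear in counts at all, so they would have to
be filled in with zeros before plotting a complete day-by-hour grid.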
Any suggestions regarding how to achieve my aims, or pointers to useful
material beyond "Learning R" and the R man pages, would be greatly
appreciated.
I'm using R 2.10.1 on Windows XP.
The code I'm using is:
#!/usr/bin/R
# rdf:
# dc:title textingHeatMap.R
# dc:date 2010.05.14
# dc:creator http://www.mm.com/user/djdees/knows/who#derek-dees
# dc:language R
# dc:rights Copyright ©
# dc:description Creates a heatmap showing # of texts per hour per day.
# doap:SVNRepository
# doap:browse
# doap:homepage
# doap:wiki
# doap:program-language
# doap:version 1.0
# cvs:date $Date$
# Libraries/Packages
# Functions
# Setup
file <- "\\tmp\\R\\UnbilledMessaging.action"
# Read Data
rawData <- read.csv(file, header=TRUE, sep="\t",quote="\"")
# Work Data
rawData$time.stamp <- paste(rawData$Date, rawData$Time)
rawData$time.stamp <- as.POSIXct(rawData$time.stamp, format="%m/%d/%Y %I:%M %p")
format(rawData$time.stamp, "%m/%d/%Y %I")
# Plot
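
For the empty plot step, one option is base R's image(), sketched here under
assumptions: the counts data frame below is hypothetical (made-up numbers,
with column names day, hour and total chosen for illustration), and the
color palette is arbitrary. The idea is to spread the counts into a matrix
with one row per day and one column per hour 0 to 23, then draw it.

```r
# Hypothetical per-day, per-hour totals (made-up numbers, not real data).
counts <- data.frame(
  day   = c("2010-05-03", "2010-05-03", "2010-05-04"),
  hour  = c(9L, 13L, 23L),
  total = c(2L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Start from an all-zero grid so hours without texts still get a cell.
days <- sort(unique(counts$day))
grid <- matrix(0, nrow = length(days), ncol = 24,
               dimnames = list(days, 0:23))

# Fill in the observed cells; hour 0 lives in column 1, hence the + 1.
grid[cbind(match(counts$day, days), counts$hour + 1)] <- counts$total

# image() maps the first matrix dimension to the x axis, so the grid is
# transposed to put hours on the horizontal axis and days on the vertical.
image(x = 0:23, y = seq_along(days), z = t(grid),
      xlab = "Hour of day", ylab = "", axes = FALSE,
      col = rev(heat.colors(12)))
axis(1, at = 0:23)
axis(2, at = seq_along(days), labels = days, las = 1)
```

The heatmap() function is another candidate, but by default it reorders rows
and columns by clustering, which would scramble the day and hour order
unless Rowv = NA and Colv = NA are passed.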
=================================================
Sanitized data looks like:
"Date" "Time" "To" "From" "Direction" "Message Type"
"05/03/2010" "9:49 AM" "0123456789" "0123456789" "Sent" "--"
"05/03/2010" "9:46 AM" "0123456789" "0123456789" "Received" "--"
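
A short sketch of how rows like these parse, assuming the real file is
tab-separated (as the sep="\t" in the script suggests) and that the sample
rows are representative. The variable name sample_text is made up for the
example. Two details show up: the header "Message Type" is turned into the
syntactic name Message.Type, and reading every column as character keeps
the leading zeros in the phone numbers.

```r
# The sample rows rebuilt with explicit tabs (assumed separator).
sample_text <- paste(
  '"Date"\t"Time"\t"To"\t"From"\t"Direction"\t"Message Type"',
  '"05/03/2010"\t"9:49 AM"\t"0123456789"\t"0123456789"\t"Sent"\t"--"',
  '"05/03/2010"\t"9:46 AM"\t"0123456789"\t"0123456789"\t"Received"\t"--"',
  sep = "\n"
)

# read.delim() defaults to header = TRUE, sep = "\t" and quote = "\"".
# colClasses = "character" stops the phone numbers from becoming numeric.
rawData <- read.delim(text = sample_text, colClasses = "character")

str(rawData)
```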
--
Derek
=========
djdees at gmail.com
The three-legged stool of understanding is held up by history,
languages, and mathematics. Equipped with these three you can learn
anything you want to learn. But if you lack any one of them you are
just another ignorant peasant with dung on your boots. — Robert A.
Heinlein