[R] factors and characters when attaching data...

Gary Collins gco at eortc.be
Wed Apr 4 10:07:43 CEST 2001

Can someone help me with the following problem...
I have a data frame with 62 columns. A number of these are character
(as.character) and a number are numeric (as.double). I read them into
R-1.2.0 as follows:

> Version3.Studies <- read.table("c:\\Version3.Studies.dat", sep="\t",
    as.is=TRUE, header=TRUE, strip.white=TRUE)

This is fine up to here. I've checked that the data have been read
correctly and everything is OK. I've set as.is=TRUE so that no factors are
created, and strip.white=TRUE to strip any leading white space from the
character strings, as I will use the character strings as identifiers for
analysis and programming later.
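
The check described above can be sketched as follows. This is only an
illustration: the two-column data and the name `studies` are hypothetical
stand-ins for the real 62-column file, and the `text=` argument of
read.table assumes a reasonably recent R rather than R-1.2.0.

```r
# Hypothetical stand-in for the tab-separated file; the Site values
# carry leading blanks, as in the problem described in this post.
txt <- "Site\tScore\n        Lung\t1.5\n      Breast\t2.0\n"

# as.is=TRUE keeps strings as character; strip.white=TRUE (which needs
# an explicit sep) trims white space around unquoted character fields.
studies <- read.table(text = txt, sep = "\t", header = TRUE,
                      as.is = TRUE, strip.white = TRUE)

# One class per column, to confirm that nothing was read as a factor.
col_classes <- sapply(studies, class)
print(col_classes)
print(studies$Site)
```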

To make programming slightly easier and slightly more readable I attach this
to the search path.

> attach(Version3.Studies)

From here, though, the character fields in the attached data are forced into
factors, which is not what I want.
Is this what should happen? Or is it a bug?
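
One way to narrow this down is to compare the column classes in the data
frame itself with the classes of the same columns as found on the search
path. The sketch below uses a small hypothetical data frame called
`studies`; it is a diagnostic only and assumes nothing about what R-1.2.0
actually does.

```r
# Hypothetical two-column data frame standing in for Version3.Studies.
studies <- data.frame(Site = c("Lung", "Breast"),
                      Score = c(1.5, 2.0),
                      stringsAsFactors = FALSE)

# Classes as stored in the data frame.
before <- sapply(studies, class)

# attach() returns (invisibly) the environment placed on the search path;
# looking columns up there avoids masking by objects in the workspace.
env <- attach(studies)
after <- sapply(names(studies), function(nm) class(get(nm, envir = env)))
detach(studies)

print(before)
print(after)
```

If the two results differ, the conversion happens at attach time; if they
agree, the factors were already present in the data frame.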

My problem is that I want to identify a number of rows with a common
"character" identifier. When the identifier is forced to become a factor, I
have to mess around counting white space and dealing with it appropriately.
So, for example, take one particular field. This is what I have to account
for when it is forced to a factor:

Site=="        Lung"

when what I ideally want...
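
A possible workaround for the padded comparison, sketched under
assumptions: the vector `Site` below is hypothetical, and `trimws()`
assumes a recent R (it is not available in R-1.2.0). The idea is to
convert the factor back to character and trim it once, so that no white
space needs to be counted by hand.

```r
# Hypothetical column with padded values stored as a factor.
Site <- factor(c("        Lung", "      Breast", "        Lung"))

# Confirm what kind of object is being compared.
print(is.factor(Site))

# Convert to character and strip surrounding white space once.
site_chr <- trimws(as.character(Site))

# Rows whose identifier matches, independent of the original padding.
lung_rows <- which(site_chr == "Lung")
print(lung_rows)
```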

Gary S. Collins, PhD
Statistics Research Fellow,
Quality of Life Unit, 
European Organisation for Research and Treatment of Cancer, 
EORTC Data Center, 
Avenue E. Mounier 83, bte. 11,
B-1200 Brussels, Belgium.

Tel: +32 2 774 1 606
Fax: +32 2 779 4 568
Email: gco at eortc.be

