[R] Different behaviour of data()
Jan_Svatos@eurotel.cz
Jan_Svatos at eurotel.cz
Thu Jan 3 12:16:20 CET 2002
Dear List,
I frequently use the data() function to load CSV files (with separator ";")
into an R session. Typically,
data(myfile)
loads myfile.csv from my working/data directory into R.
Now, in version 1.4.0, everything works as expected, with one difference:
values that older versions read in "num" mode are now read in "int" mode,
and values larger than 2147483647 (2^31 - 1) are converted to that value.
This has consequences when reading data such as the following:
<example>
The file alerts.csv looks like:
"IMSI";"DialedDigits";"Cnt";"Pri";"Dur"
"230020100010125";"+28491628975809";3;332;2391
"230020100010125";"+28491723744868";1;12;75
etc...
with the first row being the column names of the resulting data frame.
<R-1.3.1>
In 1.3.1 session:
>data(alerts); str(alerts$IMSI)
gives
num [1:2793] 2.3e+14 2.3e+14 2.3e+14 2.3e+14 2.3e+14 ...
>str(as.character(alerts$IMSI))
gives
chr [1:2793] "230020100010125" "230020100010125" "230020100010125" ...
and
>n<-length(unique(alerts$IMSI)); n
gives 125 (i.e. the data are read as they are)
</R-1.3.1>
<R-1.4.0>
while the same on 1.4.0 gives
int [1:2793] 2147483647 2147483647 2147483647 ...
and
>n<-length(unique(alerts$IMSI)); n
gives 1 (i.e. it reflects the conversion of the data to int mode, which
destroys the information in the IMSI numbers, which are always 15-digit
numbers)
</R-1.4.0>
</example>
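The integer limit behind the behaviour shown in the example can be checked directly. The following is a minimal sketch, written against a current R release rather than 1.4.0; the variable name imsi is only illustrative. It shows that a 15-digit IMSI exceeds the largest value of "int" mode but is still stored exactly in "num" (double) mode.

```r
# One of the IMSI values from the example, held as a double ("num" mode).
imsi <- 230020100010125

# Largest value representable in "int" mode: 2^31 - 1.
.Machine$integer.max              # 2147483647

# The IMSI does not fit into "int" mode ...
imsi > .Machine$integer.max       # TRUE

# ... but it is below 2^53, so a double stores it exactly.
imsi < 2^53                       # TRUE
(imsi + 1) - imsi                 # 1
```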
I was unable to find any comment on this new behaviour of data() in
http://cran.r-project.org/src/base/NEWS.
What I found was:
---
read.table() has new arguments `nrows' and `colClasses'. If the
latter is NA (the default), conversion is attempted to
logical, integer, numeric or complex, not just to numeric
---
Should I use read.table() with colClasses specified (instead of data())?
I could, but this involves many manual changes to my R scripts,
which is unpleasant and risks introducing typos and the like.
Is there a more systematic way to solve this problem?
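One possible way to keep such changes in one place, sketched under assumptions: a small wrapper around read.table() that forces chosen identifier columns to "character" via colClasses and leaves the other columns to the default conversion. The helper name read_semicolon_csv, its id_cols argument, and the third sample row are hypothetical. The sketch was written against a current R release; only the colClasses argument itself is confirmed by the NEWS entry quoted above.

```r
# Sample file in the alerts.csv layout. The third data row is made up
# so that two distinct IMSI values are present.
csv_lines <- c(
  '"IMSI";"DialedDigits";"Cnt";"Pri";"Dur"',
  '"230020100010125";"+28491628975809";3;332;2391',
  '"230020100010125";"+28491723744868";1;12;75',
  '"230020100010126";"+28491723744869";2;7;40'
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# Hypothetical helper (not part of R): read a ";"-separated file and keep
# the columns named in id_cols as character strings.
read_semicolon_csv <- function(file, id_cols = character(0)) {
  # Read the header plus one row just to learn the column names.
  header <- names(read.table(file, header = TRUE, sep = ";", nrows = 1))
  # NA means "let read.table() choose the type" for that column.
  classes <- rep(NA_character_, length(header))
  classes[header %in% id_cols] <- "character"
  read.table(file, header = TRUE, sep = ";", colClasses = classes)
}

alerts <- read_semicolon_csv(path, id_cols = c("IMSI", "DialedDigits"))
str(alerts$IMSI)
length(unique(alerts$IMSI))   # 2 for this sample
```

With such a wrapper, each script would need only its data() call replaced by one call to the helper, rather than a hand-written colClasses vector per file.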
>version
platform i386-pc-mingw32
arch x86
os Win32
system x86, Win32
status
major 1
minor 4.0
year 2001
month 12
day 19
language R
Thanks in advance,
Jan
-------------------------------------------------
designed for _monospaced_ font
-------------------------------------------------
/- Jan Svatos, PhD Sokolovska 855/225 -/
/- Data Analyst, Prague 9 -/
/- Eurotel Praha 190 00 -/
/- jan_svatos at eurotel.cz Czechia -/
-------------------------------------------------