[R] Different behaviour of data()

Jan_Svatos at eurotel.cz
Thu Jan 3 12:16:20 CET 2002


Dear List,

I frequently use the

data()

function to load csv files (with separator ";") into an R session;
typically

data(myfile)

loads myfile.csv from my working/data directory into R.
Now, in version 1.4.0, everything works as expected except for one
difference:
values that older versions read in "num" mode are now read in "int"
mode,
and values larger than 2147483647 (2^{31}-1) are converted to that value.
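
The limit itself can be checked directly in R. This is only a sketch; the 15-digit sample value is an assumption chosen for illustration, and the code does not reproduce what data() does internally.

```r
# Largest value an R integer can hold
int_max <- .Machine$integer.max
int_max == 2^31 - 1

# A 15-digit sample value (assumed for illustration) is far above that limit
x <- 230020100010125
x > int_max

# Stored as a double ("num") the value is still exact, because it is below 2^53
is.double(x)
x < 2^53
```

All four comparisons should evaluate to TRUE, which is why "num" mode could hold such values while "int" mode cannot.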

This matters when reading data such as the following:

<example>

File
alerts.csv
looks like:

"IMSI";"DialedDigits";"Cnt";"Pri";"Dur"
"230020100010125";"+28491628975809";3;332;2391
"230020100010125";"+28491723744868";1;12;75
etc...
with the first row giving the column names of the resulting data frame.

<R-1.3.1>
In 1.3.1 session:
>data(alerts); str(alerts$IMSI)
gives

num [1:2793] 2.3e+14 2.3e+14 2.3e+14 2.3e+14 2.3e+14 ...

>str(as.character(alerts$IMSI))
gives
chr [1:2793] "230020100010125" "230020100010125" "230020100010125" ...

and
>n<-length(unique(alerts$IMSI)); n
gives 125 (i.e. the data are read as they are)

</R-1.3.1>

<R-1.4.0>

while the same on 1.4.0 gives

int [1:2793] 2147483647  2147483647 2147483647 ...

and
>n<-length(unique(alerts$IMSI)); n
gives 1 (i.e. it reflects the conversion of the data to int mode, which
destroys the information in the
IMSI numbers, which are always 15-digit numbers)

</R-1.4.0>
</example>

I could not find any comment on this new behaviour of data()
in http://cran.r-project.org/src/base/NEWS.
What I found was:

---
read.table() has new arguments `nrows' and `colClasses'.  If the
           latter is NA (the default), conversion is attempted to
           logical, integer, numeric or complex, not just to numeric
---

Should I use read.table() with colClasses specified (instead of data())?

I could, but that means many manual changes to my R scripts,
which is unpleasant and risks introducing typos.

Is there a more "systematic" way to solve this problem?
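
For concreteness, the read.table() variant mentioned above might look roughly like the sketch below. The temporary file (a stand-in for alerts.csv) and the choice of column classes are assumptions for illustration; sep, header and colClasses are documented read.table() arguments.

```r
# Write the two sample rows to a temporary file (stand-in for alerts.csv)
csv_lines <- c(
  '"IMSI";"DialedDigits";"Cnt";"Pri";"Dur"',
  '"230020100010125";"+28491628975809";3;332;2391',
  '"230020100010125";"+28491723744868";1;12;75'
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Read the identifier columns as character so that no numeric
# conversion is attempted; the counters stay integer
alerts <- read.table(
  tmp, sep = ";", header = TRUE,
  colClasses = c("character", "character", "integer", "integer", "integer")
)

str(alerts$IMSI)
length(unique(alerts$IMSI))

# Remove the temporary file
unlink(tmp)
```

Reading IMSI as character keeps all 15 digits, at the cost of having to state the column classes in every script that reads such a file.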

>version

platform i386-pc-mingw32
arch     x86
os       Win32
system   x86, Win32
status
major    1
minor    4.0
year     2001
month    12
day      19
language R

Thanks In Advance,
Jan

-------------------------------------------------
designed for _monospaced_ font
-------------------------------------------------
/- Jan Svatos,  PhD         Sokolovska 855/225 -/
/- Data Analyst,            Prague 9           -/
/- Eurotel Praha            190 00             -/
/- jan_svatos at eurotel.cz    Czechia            -/
-------------------------------------------------



