[R] Re : Zoo and numeric data

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Aug 12 14:50:37 CEST 2009


On Wed, 2009-08-12 at 10:03 +0000, Inchallah Yarab wrote:
> why don't you use read.csv2 (save your file as file.csv) and write
> read.csv2("path file",sep=",")

No you don't!!! Please understand what read.csv2 is for. It is for
locales where the "," is used as the decimal point, e.g. 5,2323 ==
5.2323. As such, you can't use the comma as a separator otherwise you'd
be splitting on all the decimal points.

From ?read.csv2

     read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".",
              fill = TRUE, comment.char="", ...)

     read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",",
               fill = TRUE, comment.char="", ...)

So by setting sep = "," you are creating all sorts of trouble for
yourself. If you are in a locale that uses "," as the decimal point, then
read.csv2 with sep = "," will split every number at its decimal comma
and you will lose the decimal places.
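
As a minimal sketch of that problem (the inline data is made up for
illustration and is not the file attached to the original post; it has
no header, ";" between fields and "," as the decimal point):

```r
# Hypothetical data: two fields per row, ";" separator, "," decimal point
txt <- "5,2323;1,5\n6,1;2,25"

# Defaults of read.csv2 (sep = ";", dec = ","): two numeric columns
ok <- read.csv2(text = txt, header = FALSE)
str(ok)

# Overriding sep = ",": every decimal comma now splits a field, so the
# two values per row become three fields and the decimals are gone
bad <- read.csv2(text = txt, header = FALSE, sep = ",")
str(bad)
```

The 'text' argument is passed on to read.table() and is only used here
to keep the example self-contained; with a real file you would give the
file name instead.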

Use the correct function for the job:

      * Use read.csv() if in a locale where CSV files are separated by
        "," and decimal point represented by ".".
      * Use read.csv2() if in a locale where decimal point is "," and
        CSV files are separated by ";".
      * If you have special requirements, use read.table() and set 'sep'
        and 'dec' etc as suits your data.
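
A rough sketch of the three cases, assuming the data are supplied inline
via 'text' rather than read from a file. The first string holds the two
rows quoted in the original post; the ";" and tab-separated variants are
hypothetical and only there to illustrate the other two functions:

```r
# "," separator and "." decimal point: read.csv()
dot <- "2009-01-01, character1, 10, 20.1\n2009-01-02, character2, 11, 21.1"
d1 <- read.csv(text = dot, header = FALSE)
str(d1)   # V3 and V4 are read as numbers

# ";" separator and "," decimal point: read.csv2()
semi <- "2009-01-01;character1;10;20,1\n2009-01-02;character2;11;21,1"
d2 <- read.csv2(text = semi, header = FALSE)
str(d2)

# Anything else, e.g. tab separator with "," decimal point: read.table()
tab <- "2009-01-01\tcharacter1\t10\t20,1\n2009-01-02\tcharacter2\t11\t21,1"
d3 <- read.table(text = tab, sep = "\t", dec = ",")
str(d3)
```

In all three cases the last column should come back as 20.1 and 21.1.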

And anyway, read.zoo is just another wrapper around read.table to help
with loading time series data into zoo objects. There is nothing wrong
in using it and it has several benefits over reading data in and
converting to zoo separately.

G

> hope this helps
> 
> 
> ________________________________
> From: Mark Breman <breman.mark at gmail.com>
> To: r-help at stat.math.ethz.ch
> Sent: Wednesday, 12 August 2009, 10:46:43
> Subject: [R] Zoo and numeric data
> 
> Hi,
> I have a csv file with different datatypes:
> 
> 2009-01-01, character1, 10, 20.1
> 2009-01-02, character2, 11, 21.1
> 
> (I have attached the file to this post)
> 
> I read this file with read.zoo as I want a zoo/xts timeseries:
> > t = read.zoo("./data.txt", sep=",", dec = ".", header=FALSE)
> 
> If I look at the zoo data all integer/numeric columns are read as
> character:
> > str(t)
> ‘zoo’ series from 2009-01-01 to 2009-01-02
>   Data: chr [1:2, 1:3] " character1" " character2" "10" "11" "20.1" "21.1"
> - attr(*, "dimnames")=List of 2
>   ..$ : NULL
>   ..$ : chr [1:3] "V2" "V3" "V4"
>   Index: Class 'Date'  num [1:2] 14245 14246
> 
> So I try the colClasses parameter with read.zoo but it looks like this does
> not make any difference:
> > t1 = read.zoo("./data.txt", sep=",", dec = ".", header=FALSE,
> colClasses=c("Date", "character", "integer", "numeric"))
> > str(t1)
> ‘zoo’ series from 2009-01-01 to 2009-01-02
>   Data: chr [1:2, 1:3] " character1" " character2" "10" "11" "20.1" "21.1"
> - attr(*, "dimnames")=List of 2
>   ..$ : NULL
>   ..$ : chr [1:3] "V2" "V3" "V4"
>   Index: Class 'Date'  num [1:2] 14245 14246
> 
> Why does read.zoo ignore the colClasses parameter and how do I get
> integer/numeric data into my zoo series?
> 
> Regards,
> 
> -Mark-
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



