[R] About importing CSV file

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Apr 11 13:49:46 CEST 2005


Adaikalavan Ramasamy <ramasamy at cancer.org.uk> writes:

> So you expect R to do the following:
> 
>  convert  15,15,24    to  15.00, 15.24  AND
>  convert  16,5,16,88  to  16.50, 16.88
> 
> The second conversion should be fairly easy BUT how do you expect R to
> know that the first conversion should produce 15.00, 15.24 and not
> 15.15, 24.00 ?
> 
> Without knowing which is which, it would be dangerous and hard to write
> any sort of automated script for this.
> 
> The best thing would be for you to go back and change your inputs
> manually. You can either use 15,,15,24 for consistency or better yet
> simply use 15.0, 15.24.
> 
> Regards, Adai
> 

Also, if Excel is really generating this format, something is
seriously wrong with Excel or your locale settings. Normally, locales
with comma as decimal point also use semicolon as the field separator
in CSV files, which is the format that read.csv2() handles. Or, just
export as text (TAB delimited) and use read.delim2().
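
A minimal sketch of those two routes, assuming the sheet has been
re-exported with semicolons (or TABs) between fields. The inline sample
data and the object names are made up for illustration and are not taken
from the original file:

```r
# Hypothetical sample: semicolon-separated fields, comma as decimal mark
# (the layout read.csv2() expects by default: sep = ";", dec = ",").
csv2_text <- 'CODE;"A1";"NOTE4";"CONCLUSION"
991;"1";15,00;15,24
992;"1";16,5;16,88'
dat_csv2 <- read.csv2(text = csv2_text, header = TRUE)

# The same hypothetical data as TAB-delimited text
# (read.delim2() defaults: TAB as separator, dec = ",").
tab_text <- "CODE\tA1\tNOTE4\tCONCLUSION\n991\t1\t15,00\t15,24\n992\t1\t16,5\t16,88\n"
dat_tab <- read.delim2(text = tab_text, header = TRUE)

# Inspect the column types; NOTE4 and CONCLUSION should be numeric.
str(dat_csv2)
```

With a real file the call would be something like read.csv2("prova.csv"),
provided the export really uses ";" between fields.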
 
> 
> On Mon, 2005-04-11 at 12:23 +0200, Silvia Bachetti wrote:
> > I have a problem reading data from Excel (.csv) into R, because the
> > decimal part of the numeric (float) variables is separated by a comma (,)
> > and not by a point (.), and all the variables are also separated by
> > commas (,). The string variables are enclosed in double quotes (""). So
> > when importing into R, the numeric (float) variables are read as
> > integers, without the decimal part.
> > Is there a way to solve this problem?
> > 
> > The data set is like this:
> > 
> > CODE,"A1","A2","A3","A4","A5","A6","A7","A8","A9","A10","A11","A12","NOTE1","A13","A14","A15","NOTE2","A16","A17","A18","A19","A20","A21","NOTE3","A22","A23","A24","A25","A26","A27","A28","NOTE4","CONCLUSION" 
> > 
> > 991,"1","14","17","TM 
> > LUNG","19","CARCINOMA","17",,,"14","14","14",15,57,"2","17","17",17,"3","8","14","14","14","17",13,4,"4","14","17","14","14","14","17",15,15,24 
> > 
> > 992,"1","17","17","BPCO","17","TM 
> > LUNG","17",,,"17","17","17",17,"2","17","17",17,"3","17","17","17","17","17",17,"4","17","17","17","14","17","17",16,5,16,88 
> > 
> > 993,"1","17","17","TM 
> > LUNG","17","BPCO","17",,,"17","17","17",17,"2","17","17",17,"3","14","17","17","17","17",16,4,"4","17","17","17","17","14","17",16,5,16,73 
> > 
> > 994,"1","14","14","NN","17","NN","17","NN","17","14","17","17",15,88,"2","14","14",14,"3","14","17","14","17","14",15,2,"4","14","8","17","14","14","14",13,5,14,64 
> > 
> > 
> > I use this command in R: read.csv(file="prova.csv", header=TRUE, sep=",",
> > dec=",", fill=TRUE).
> > But, for example, in the first record (code 991) the last part is
> > read as three integer variables, 15 - 15 - 24, and not as two float
> > variables, 15 - 15.24.
> > In the second record (code 992) the last part is read as four
> > integer variables, 16 - 5 - 16 - 88, and not as two float variables,
> > 16.5 - 16.88.
> > In the third record (code 993) the middle part is read as two
> > integer variables, 16 - 4, and not as one float variable, 16.4. And the
> > last part is read as four integer variables, 16 - 5 - 16 - 73, and not as
> > two float variables, 16.5 - 16.73.
> > And so on.
> > 
> > Is there a solution?
> > 
> > Thanks.
> > 
> > Silvia
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >
> 
> 

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



