[R] iconv question: SQL Server 2005 to R

Milan Bouchet-Valat nalimilan at club.fr
Wed Oct 9 11:37:16 CEST 2013


On Tuesday, 8 October 2013 at 16:02 -0700, Ira Sharenow wrote:
> A colleague is sending me quite a few files that have been saved with MS 
> SQL Server 2005. I am using R 2.15.1 on Windows 7.
> 
> I am trying to read in the files using standard techniques. Although the 
> file has a csv extension when I go to Excel or WordPad and do SAVE AS I 
> see that it is Unicode Text. Notepad indicates that the encoding is 
> Unicode. Right now I have to do a few things from within Excel (such as 
> Text to Columns) and eventually save as a true csv file before I can 
> read it into R and then use it.
> 
> Is there an easy way to solve this from within R? I am also open to easy 
> SQL Server 2005 solutions.
>
> I tried the following from within R.
> 
> testDF = read.table("Info06.csv", header = TRUE, sep = ",")
> 
> > testDF2 =  iconv(x = testDF, from = "Unicode", to = "")
> 
> Error in iconv(x = testDF, from = "Unicode", to = "") :
> 
> unsupported conversion from 'Unicode' to '' in codepage 1252
> 
> # The next line did not produce an error message
> 
> > testDF3 =  iconv(x = testDF, from = "UTF-8" , to = "")
> 
> > testDF3[1:6,  1:3]
> 
> Error in testDF3[1:6, 1:3] : incorrect number of dimensions
> 
> # The next line did not produce an error message
> 
> > testDF4 =  iconv(x = testDF, from = "macroman" , to = "")
> 
> > testDF4[1:6,  1:3]
> 
> Error in testDF4[1:6, 1:3] : incorrect number of dimensions
> 
> >  Encoding(testDF3)
> 
> [1] "unknown"
> 
> >  Encoding(testDF4)
> 
> [1] "unknown"
> 
> This is the first few lines from WordPad
> 
> Date,StockID,Price,MktCap,ADV,SectorID,Days,A1,std1,std2
> 
> 2006-01-03 00:00:00.000, at Stock1,2.53,467108197.38,567381.144444444,4,133.14486997089,-0.0162107939626307,0.0346283580367959,0.0126471695454834
> 
> 2006-01-03 00:00:00.000, at Stock2,1.3275,829803070.531114,6134778.93292,5,124.632223896458,0.071513138376339,0.0410694546850102,0.0172091268025929

What is the actual problem? You did not describe one. Do you get accented
characters that are not printed correctly after importing the file? The
two data lines above do not appear to contain any non-ASCII characters,
so the encoding should not matter for them.


Regards


