[R] problem in reading file

Mon Nov 28 14:20:37 CET 2011

The basic problem is that read.table looks at only the first 5 lines of
input to determine the number of columns.  The widest of those lines
('five') has 6 fields, so when it reads 'six' it takes the fields up to
0.303 and then treats the trailing '0' of line 'six' as the row name of
the next row.  The same thing happens with the rest of the read.  You
can specify 'col.names' with the maximum number of columns that you
will be processing ('colClasses' sets the column types, but it is the
length of 'col.names' that read.table uses for the column count), or
make sure the longest line comes first.  With the longest line first
(added here as an extra row, 'niner'), you get:

> x <- read.table(text = "niner,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0
+ one,0
+ two,0.591,0
+ three,0.356,0.350,0
+ four,-0.098,0.072,0.380,0
+ five,0.573,0.408,0.382,0.062,0
+ six,0.156,0.232,0.517,0.424,0.303,0
+ seven,0.400,0.414,0.611,0.320,0.401,0.479,0
+ eight,0.282,0.375,0.512,0.346,0.308,0.463,0.605,0
+ nine,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0"
+     , row.names = 1
+     , nrows = 100
+     , sep = ','
+     , colClasses = c('character', rep('numeric', 9))
+     , fill = TRUE
+     , flush = TRUE
+     )
>
> x
          V2    V3    V4    V5    V6    V7    V8    V9 V10
niner  0.519 0.484 0.467 0.167 0.455 0.311 0.574 0.557   0
one    0.000    NA    NA    NA    NA    NA    NA    NA  NA
two    0.591 0.000    NA    NA    NA    NA    NA    NA  NA
three  0.356 0.350 0.000    NA    NA    NA    NA    NA  NA
four  -0.098 0.072 0.380 0.000    NA    NA    NA    NA  NA
five   0.573 0.408 0.382 0.062 0.000    NA    NA    NA  NA
six    0.156 0.232 0.517 0.424 0.303 0.000    NA    NA  NA
seven  0.400 0.414 0.611 0.320 0.401 0.479 0.000    NA  NA
eight  0.282 0.375 0.512 0.346 0.308 0.463 0.605 0.000  NA
nine   0.519 0.484 0.467 0.167 0.455 0.311 0.574 0.557   0
>
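
If you would rather not reorder the file, the 'col.names' route might look
something like the sketch below.  It is only an illustration: the data is
pasted into a string (the name 'txt' is mine) instead of being read from
'test.csv', and the V1..V10 column names are placeholders that mimic
read.table's defaults.  count.fields is used to find the widest line, so
the number of columns does not have to be known in advance.

```r
# Sample data from the original post, held in a string so the example is
# self-contained ('txt' is an illustrative name, not from the post).
txt <- "one,0
two,0.591,0
three,0.356,0.350,0
four,-0.098,0.072,0.380,0
five,0.573,0.408,0.382,0.062,0
six,0.156,0.232,0.517,0.424,0.303,0
seven,0.400,0.414,0.611,0.320,0.401,0.479,0
eight,0.282,0.375,0.512,0.346,0.308,0.463,0.605,0
nine,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0"

# Count the fields on every line and keep the maximum, so read.table is
# not limited to what it sees in the first 5 lines.
con <- textConnection(txt)
n_cols <- max(count.fields(con, sep = ","))
close(con)

# Supplying 'col.names' of that length fixes the column count; fill = TRUE
# pads the shorter rows with NA.  Column 1 becomes the row names.
x <- read.table(text = txt,
                sep = ",",
                row.names = 1,
                col.names = paste0("V", seq_len(n_cols)),
                fill = TRUE)

row.names(x)
```

For a real file, the same two calls should work with the file name in
place of the text connection, e.g. count.fields('test.csv', sep = ',').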

On Mon, Nov 28, 2011 at 7:02 AM, chakri <chakri2sai at yahoo.co.in> wrote:
> Hi,
>
> I have a file that looks like this :
>
> one,0
> two,0.591,0
> three,0.356,0.350,0
> four,-0.098,0.072,0.380,0
> five,0.573,0.408,0.382,0.062,0
> six,0.156,0.232,0.517,0.424,0.303,0
> seven,0.400,0.414,0.611,0.320,0.401,0.479,0
> eight,0.282,0.375,0.512,0.346,0.308,0.463,0.605,0
> nine,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0
>
> I want to create a data matrix out of it, so I tried this :
>
> x<-read.table('test.csv',fill=TRUE, sep=",",dec=".",row.names=1)
> row.names(x) # print row names
> output
>  [1] "one"   "two"   "three" "four"  "five"  "six"   "0"     "seven" "0.479"
> [10] "eight" "0.463" "nine"  "0.311"
>
> Even numeric values are being read in as row names. What am I missing here?
>
> Thanks in Advance
> Chakri

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.