[R] problem in reading file
jim holtman
jholtman at gmail.com
Mon Nov 28 14:20:37 CET 2011
The basic problem is that read.table looks at only the first 5 lines to
determine the number of columns, which here gives 6 (the row name plus
five values). So when it reads line 'six' it stops after 0.303, and on
the next read it treats the leftover '0' at the end of line 'six' as the
row name of a new row. The same thing happens with every later line.
You might try specifying 'colClasses' with the maximum number of columns
that you will be processing, or make sure the longest line comes first.
Doing both (a copy of the last line, labelled 'niner', is placed first),
you get:
> x <- read.table(text = "niner,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0
+ one,0
+ two,0.591,0
+ three,0.356,0.350,0
+ four,-0.098,0.072,0.380,0
+ five,0.573,0.408,0.382,0.062,0
+ six,0.156,0.232,0.517,0.424,0.303,0
+ seven,0.400,0.414,0.611,0.320,0.401,0.479,0
+ eight,0.282,0.375,0.512,0.346,0.308,0.463,0.605,0
+ nine,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0"
+ , row.names = 1
+ , nrows = 100
+ , sep = ','
+ , colClasses = c('character', rep('numeric', 9))
+ , fill = TRUE
+ , flush = TRUE
+ )
>
> x
          V2    V3    V4    V5    V6    V7    V8    V9 V10
niner  0.519 0.484 0.467 0.167 0.455 0.311 0.574 0.557   0
one    0.000    NA    NA    NA    NA    NA    NA    NA  NA
two    0.591 0.000    NA    NA    NA    NA    NA    NA  NA
three  0.356 0.350 0.000    NA    NA    NA    NA    NA  NA
four  -0.098 0.072 0.380 0.000    NA    NA    NA    NA  NA
five   0.573 0.408 0.382 0.062 0.000    NA    NA    NA  NA
six    0.156 0.232 0.517 0.424 0.303 0.000    NA    NA  NA
seven  0.400 0.414 0.611 0.320 0.401 0.479 0.000    NA  NA
eight  0.282 0.375 0.512 0.346 0.308 0.463 0.605 0.000  NA
nine   0.519 0.484 0.467 0.167 0.455 0.311 0.574 0.557   0
>
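If you would rather not reorder the file, another option is to supply
'col.names': the read.table help page says the column count is taken
from the first five lines or from the length of 'col.names' if that is
longer. Below is a minimal sketch of that idea, not something I have run
against your actual file. The inline string 'txt' stands in for
'test.csv', and the names "name" and "V2".."V10" are made up for
illustration.

```r
# Stand-in for test.csv (same nine lines as in the original question).
txt <- "one,0
two,0.591,0
three,0.356,0.350,0
four,-0.098,0.072,0.380,0
five,0.573,0.408,0.382,0.062,0
six,0.156,0.232,0.517,0.424,0.303,0
seven,0.400,0.414,0.611,0.320,0.401,0.479,0
eight,0.282,0.375,0.512,0.346,0.308,0.463,0.605,0
nine,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0"

# count.fields() returns the number of fields on each line;
# the maximum is how wide the table has to be.
tc <- textConnection(txt)
n <- max(count.fields(tc, sep = ","))
close(tc)

# Supplying n column names tells read.table the width up front, so the
# short first five lines no longer decide it. fill = TRUE pads short
# rows with NA.
x <- read.table(text = txt, sep = ",", fill = TRUE,
                col.names = c("name", paste0("V", 2:n)),
                row.names = 1)

rownames(x)   # the nine labels only, no stray "0" or "0.479"
dim(x)
```

With a real file, replace 'text = txt' by the file name in read.table
and pass the same file name to count.fields.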
On Mon, Nov 28, 2011 at 7:02 AM, chakri <chakri2sai at yahoo.co.in> wrote:
> Hi,
>
> I have a file that looks like this :
>
> one,0
> two,0.591,0
> three,0.356,0.350,0
> four,-0.098,0.072,0.380,0
> five,0.573,0.408,0.382,0.062,0
> six,0.156,0.232,0.517,0.424,0.303,0
> seven,0.400,0.414,0.611,0.320,0.401,0.479,0
> eight,0.282,0.375,0.512,0.346,0.308,0.463,0.605,0
> nine,0.519,0.484,0.467,0.167,0.455,0.311,0.574,0.557,0
>
> I want to create a data matrix out of it, so I tried this :
>
> x<-read.table('test.csv',fill=TRUE, sep=",",dec=".",row.names=1)
> row.names(x) # print row names
> output
>  [1] "one"   "two"   "three" "four"  "five"  "six"   "0"     "seven" "0.479"
> [10] "eight" "0.463" "nine"  "0.311"
>
> Even numerics are taken in as row names. What am I missing here ?
>
> Thanks in Advance
> Chakri
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.