[R] read a file of text with read.table

Göran Broström goran.brostrom at umu.se
Thu Jun 26 11:35:23 CEST 2014


Carol,

While sep="" is the default, it really means 'whitespace' (one or more
spaces, tabs, newlines or carriage returns), not 'no separator'; see the
documentation of 'sep' in ?read.table.
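
A minimal sketch of what that means in practice, assuming three
10-character strings like the ones in this thread (the object names
txt, one_col and spaced are only illustrative):

```r
# Three equal-length strings, one per line, no separator within a line.
txt <- "absfjdslfx\njfdldskjff\njfsldfjslk"

# The default sep = "" splits on whitespace, so each line is one field.
one_col <- read.table(text = txt, colClasses = "character")
dim(one_col)   # 3 rows, 1 column

# With whitespace between the strings, the same call gives several columns.
spaced <- read.table(text = "absfjdslfx jfdldskjff jfsldfjslk",
                     colClasses = "character")
dim(spaced)    # 1 row, 3 columns
```

So read.table never splits a run of letters into single characters; it
only splits where it finds the separator.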

Göran Broström

On 2014-06-26 11:21, carol white wrote:
> Hi,
>
> with read.fwf, it works.
>
>
> But I still don't understand why it doesn't work with read.table,
> since the sep by default is "", which is the case here. In one trial I
> used read.table("myfile", colClasses = "character",
> stringsAsFactors = FALSE), and it still didn't work, but it should have.
>
> Regards,
>
>
>
> On Thursday, June 26, 2014 9:59 AM, Ron Crump
> <R.E.Crump at warwick.ac.uk> wrote:
>
>
>
> Hi Carol,
>> It might be a primitive question, but I have a text file in which
>> there is no separator between the characters on each line, and the
>> strings on each line have the same length. The format is like the
>> following:
>>
>> absfjdslf
>> jfdldskjff
>> jfsldfjslk
>>
>> When I read the file with read.table("myfile", colClasses =
>> "character"), instead of putting the strings in a table of (number
>> of rows) x (length of string), read.table saves the file in a table
>> of (number of rows) x 1, and each element seems to be a factor. Why
>> does read.table not account for colClasses = "character"?
> read.table relies on a separator to differentiate between columns, so
> it is not appropriate for your file; read.fwf would do the job.
>
> Setting colClasses (in my understanding) tells read.table how to
> treat input as it comes in, so it skips some testing of data types
> and makes reading quicker. According to ?read.table, stringsAsFactors
> is overridden by as.is and colClasses, so colClasses = "character"
> should already give character columns; adding stringsAsFactors=FALSE
> is harmless and makes the intent explicit.
>
> So, for your example (and I have added a letter to the first row to
> make it the same length as the others):
>
> cf <- "absfjdslfx
> jfdldskjff
> jfsldfjslk"
>
> cdf <- read.fwf(textConnection(cf), widths = rep(1, 10),
>                 colClasses = "character", stringsAsFactors = FALSE)
>
> See ?read.fwf for more information. A width is required for each
> column (in this case 1 repeated 10 times).
>
> Hope this helps.
>
> Ron.
>
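
For reference, a small sketch that checks what the read.fwf call quoted
above returns and shows a base-R alternative with strsplit. It assumes
every line has exactly 10 characters; the names con, cdf and m are only
illustrative. ?read.table states that stringsAsFactors is overridden by
as.is and colClasses, so only colClasses is set here.

```r
cf <- "absfjdslfx\njfdldskjff\njfsldfjslk"

# One width per column: ten single-character columns.
con <- textConnection(cf)
cdf <- read.fwf(con, widths = rep(1, 10), colClasses = "character")
close(con)
dim(cdf)             # 3 rows, 10 columns
sapply(cdf, class)   # every column is "character"

# Alternative without read.fwf: split each line into single characters
# and bind the pieces into a character matrix.
lines <- strsplit(cf, "\n")[[1]]
m <- do.call(rbind, strsplit(lines, ""))
dim(m)               # 3 x 10
```

The matrix form is convenient when all columns are single characters
anyway; read.fwf is the better fit when the widths differ by column.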


