[R] How to read only specified columns from a data file
David Winsemius
dwinsemius at comcast.net
Wed Mar 16 14:03:19 CET 2011
On Mar 16, 2011, at 8:13 AM, Sarah Goslee wrote:
> read.table() looks at the first five rows when determining how many columns
> there are. If there are more columns in row 7 and you do not specify that in
> the read.table() command directly, they will be wrapped to the next row.
>
> This was discussed on the list within the last couple weeks.
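
The behaviour Sarah describes can be reproduced, and worked around, with a small self-contained sketch. The temporary file and its contents are made up for illustration and are not Luis's data; count.fields() and the col.names and fill arguments are part of base R. One possible workaround is to find the widest line first and then pass that many column names:

```r
# Hypothetical ragged file: row 7 is wider than any of the first five rows.
tmp <- tempfile(fileext = ".txt")
rows <- c("1 a 0.9 1.0",
          "2 a 0.8 0.9",
          "3 a 0.7 0.8",
          "4 a 0.6 0.7",
          "5 a 0.5 0.6",
          "6 a 0.4 0.5",
          "7 a 0.3 0.4 0.6 1.0")
writeLines(rows, tmp)

# Default: the number of columns is taken from the first five lines only,
# so this data frame has 4 columns and the extra fields of row 7 do not fit.
bad <- read.table(tmp, fill = TRUE)

# Workaround: count the fields on every line, then supply enough column names.
n_max <- max(count.fields(tmp))
good <- read.table(tmp, fill = TRUE,
                   col.names = paste0("V", seq_len(n_max)))

dim(bad)
dim(good)
```

With col.names supplied, read.table() uses its length as the column count, and fill = TRUE pads the shorter rows.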
In addition to Sarah's comments, I also note that you did not include
your code. I don't think it could have been identical to the code I
suggested, which was in turn based on the code you had proposed.
So ... what did you do to get that result?
--
David.
>
> Sarah
>
> On Wed, Mar 16, 2011 at 7:54 AM, Luis Ridao <luridao at gmail.com> wrote:
>> David,
>>
>> Thanks for your tip, but it seems I'm having problems with the number
>> of columns R manages to read in. Below is an example of the data
>> read in:
>>
>>> inp[1:20,]
>> V1 V2 V3 V4 V5 V6 V7
>> V8 V9
>> 1 1.0000 log_fy_coff -1.007600 0.119520 1.0000 NA
>> NA NA
>> 2 2.0000 log_fy_coff -0.935010 0.112840 0.8896 1.0000
>> NA NA
>> 3 3.0000 log_fy_coff -0.876260 0.107500 0.8219 0.8847 1.0000
>> NA NA
>> 4 4.0000 log_fy_coff -0.683090 0.103030 0.7656 0.8143 0.8747
>> 1.0000 NA
>> 5 5.0000 log_fy_coff -0.623500 0.100980 0.7206 0.7636 0.8086
>> 0.8764 1.0000
>> 6 6.0000 log_fy_coff -0.583330 0.098978 0.6819 0.7214 0.7615
>> 0.8150 0.8762
>> 7 1.0000 NA NA NA NA
>> NA NA
>> 8 7.0000 log_fy_coff -0.676790 0.096608 0.6521 0.6892 0.7254
>> 0.7719 0.8148
>> 9 0.8717 1.0000 NA NA NA NA
>> NA NA
>> 10 8.0000 log_fy_coff -0.696060 0.093761 0.6297 0.6654 0.6988
>> 0.7405 0.7750
>> 11 0.8116 0.8643 1.000000 NA NA NA
>> NA NA
>> 12 9.0000 log_fy_coff -0.527060 0.089949 0.6003 0.6347 0.6667
>> 0.7060 0.7367
>>
>> As you can see, there are only 9 columns in inp, and the remaining
>> values are read into the following row (see row 7).
>> I just don't understand why this is happening (using fill=T does not
>> help either).
>>
>> Best,
>> Luis
>>
>> On Tue, Mar 15, 2011 at 5:15 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>>
>>> On Mar 15, 2011, at 1:11 PM, <rex.dwyer at syngenta.com> wrote:
>>>
>>>> I think you need to read an introduction to R.
>>>> For starters, read.table returns its results as a value, which you are not
>>>> saving.
>>>> The probable answer to your question:
>>>> Read the whole file with read.table, and select the columns you need, e.g.:
>>>> tab <- read.table(myfile, skip=2)[,1:5]
>>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>>> On Behalf Of Luis Ridao
>>>> Sent: Tuesday, March 15, 2011 11:53 AM
>>>> To: r-help at r-project.org
>>>> Subject: [R] How to read only specified columns from a data file
>>>>
>>>> R-help,
>>>>
>>>> I'm trying to read a data file with plenty of columns.
>>>> I just need the first 5, but it does not work when I do something
>>>> like:
>>>>
>>>>> mycols <- rep(NULL, 430) ; mycols[c(1:4)] <- NA
>>>>> read.table(myfile, skip=2, colClasses=mycols)
>>>
>>> I would have suggested:
>>>
>>> mycols <- rep("NULL", 430) ; mycols[1:5] <- rep("numeric", 5)
>>> inp <- read.table(myfile, skip=2, colClasses=mycols)
>>> head(inp)
>>>
>>> --
>>> David.
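
The colClasses approach discussed above depends on the quoted string "NULL" marking a column to be skipped; an unquoted rep(NULL, n) just returns NULL. A minimal sketch follows, assuming a made-up 8-column file of which only the first 5 columns are wanted. The file contents, the column count, and the chosen classes are illustrative assumptions, not Luis's actual data:

```r
# Hypothetical whitespace-delimited file with 8 columns per row.
tmp <- tempfile(fileext = ".txt")
writeLines(c("1 log_fy_coff -1.0076 0.1195 1.0000 0.5 0.4 0.3",
             "2 log_fy_coff -0.9350 0.1128 0.8896 1.0 0.4 0.3"), tmp)

n_cols <- 8

# Start with every column marked "NULL" (a character string), i.e. skipped,
# then give real classes to the columns that should be kept.
mycols <- rep("NULL", n_cols)
mycols[1:5] <- c("numeric", "character", "numeric", "numeric", "numeric")

inp <- read.table(tmp, colClasses = mycols)
str(inp)
```

The second column is declared "character" here because it holds a label; declaring it "numeric" would make the read fail on that field.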
>>>
>>>>
>>>> Any suggestions?
>>>>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
David Winsemius, MD
West Hartford, CT