[R] help with read.csv() for files with different number of columns
Fix Ace
acefix at rocketmail.com
Tue Aug 29 18:22:12 CEST 2017
Thank you very much! Looks like I have to know the length of each record ahead of time.
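One way to avoid counting by hand might be count.fields(), which reports the number of fields on each line; its maximum can then feed col.names. This is only a sketch under assumptions: the temporary file and its three sample lines are made up for illustration, not taken from test.txt.

```r
# Hypothetical sample file (NOT the real test.txt): tab-separated, ragged rows.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc",
             "d\te",
             "f\tg\th\ti\tj\tk\tl"), tmp)

# count.fields() returns the number of fields found on each line.
n_fields <- count.fields(tmp, sep = "\t", quote = "")
max_cols <- max(n_fields)

# Use the widest line to size col.names, so no record is wrapped.
testdf <- read.table(tmp, header = FALSE, fill = TRUE, sep = "\t", quote = "",
                     col.names = paste0("V", seq_len(max_cols)),
                     stringsAsFactors = FALSE)
```

With the real file, replacing tmp by "test.txt" should give one row per record, with empty strings where a record is shorter than the widest one.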
Ace
On Monday, August 28, 2017 12:56 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
Hi Ace,
With tabs as separators:
testdf<-read.table("test.txt",header=FALSE,fill=TRUE,sep="\t",
col.names=paste("V",1:19,sep=""),stringsAsFactors=FALSE)
Also note that I got the number of columns wrong the first time.
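As for why the rows were split: according to ?read.table, unless col.names is supplied the number of columns is determined from the first five lines of input, and with fill=TRUE (the default for read.csv) a later, longer line is wrapped onto extra rows. A small sketch with made-up data (the file and its contents are hypothetical, chosen only to show the effect):

```r
# Hypothetical file: the first five lines have at most 3 fields, line 6 has 6.
tmp <- tempfile(fileext = ".txt")
writeLines(c("r1\ta", "r2\ta\tb", "r3\ta", "r4\ta", "r5\ta",
             "r6\ta\tb\tc\td\te"), tmp)

# No col.names: the width is taken from the first five lines (3 columns),
# so the six fields of line 6 spill over onto an additional row.
wrapped <- read.table(tmp, header = FALSE, sep = "\t", fill = TRUE,
                      stringsAsFactors = FALSE)
dim(wrapped)  # 7 rows, 3 columns

# col.names longer than anything in the first five lines sets the width,
# so every record stays on its own row.
fixed <- read.table(tmp, header = FALSE, sep = "\t", fill = TRUE,
                    col.names = paste0("V", 1:6), stringsAsFactors = FALSE)
dim(fixed)    # 6 rows, 6 columns
```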
Jim
On Mon, Aug 28, 2017 at 12:56 PM, Fix Ace <acefix at rocketmail.com> wrote:
> Hi, Jim,
>
> Thank you very much for pointing out the format issue. Here is the original
> text:
>
> ===
> I have a text file (test.txt) whose rows have different numbers of columns:
>
> 0610007P14Rik%%% Tcf19 Gtf2i
> 0610010O12Rik%%% Ivns1abp Etv6
> 1100001G20Rik%%% Nmi
> 1500015O10Rik%%% Foxi1 Ascl3 Sirt3
> 1700003E16Rik%%% Ascl2 Ifnar2
> 1700028J19Rik%%% Musk Nfe2l3
> 1810011O10Rik%%% Ppp1r13b Bpnt1 Cdkn2c Foxc1 Sox10 Smarca2
> 1810019D21Rik%%% Asb8
> 1810037I17Rik%%% Zfp612
> 1810055G02Rik%%% Nkx2-3 Maged1 Runx1 Ugp2 Elk4 Spdef Tcf19 Isl2 Gtf2i
> Ctnnbl1 Tcea3 Ank2 Zfp612 Creb3l1 Nupr1 3632451O06Rik Creb3l4 Lass6
>
> I would like to read it into R using
>
>> test=read.csv("test.txt",sep="\t",header=FALSE)
>
> However, when I check the R object "test", I find that all the rows have 5
> columns:
>
>> test
> V1 V2 V3 V4 V5
> 1 0610007P14Rik%%% Tcf19 Gtf2i
> 2 0610010O12Rik%%% Ivns1abp Etv6
> 3 1100001G20Rik%%% Nmi
> 4 1500015O10Rik%%% Foxi1 Ascl3 Sirt3
> 5 1700003E16Rik%%% Ascl2 Ifnar2
> 6 1700028J19Rik%%% Musk Nfe2l3
> 7 1810011O10Rik%%% Ppp1r13b Bpnt1 Cdkn2c Foxc1
> 8 Sox10 Smarca2
> 9 1810019D21Rik%%% Asb8
> 10 1810037I17Rik%%% Zfp612
> 11 1810055G02Rik%%% Nkx2-3 Maged1 Runx1 Ugp2
> 12 Elk4 Spdef Tcf19 Isl2 Gtf2i
> 13 Ctnnbl1 Tcea3 Ank2 Zfp612 Creb3l1
> 14 Nupr1 3632451O06Rik Creb3l4 Lass6
>
> Basically it breaks some rows into more than one row. For example, row 7 in
> the original file becomes two rows. It looks like "test" always has 5
> columns.
>
> How does this happen? How should I fix it so that one record becomes one row
> in the R object?
>
> ==
>
> Please let me know if it is readable now. Thank you very much for your time!
>
> Kind regards,
>
> Ace
>
>
> On Sunday, August 27, 2017 7:25 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
>
>
> Hi Ace,
> As your example seems to have spaces as separators,
>
> testdf<-read.table("test.txt",header=FALSE,fill=TRUE,
> col.names=paste("V",1:14,sep=""),stringsAsFactors=FALSE)
>
> By specifying the number of columns with "col.names" and using
> "fill=TRUE" you can get a data frame with zero length strings where
> values are missing in the input file.
>
> Jim
>
> On Mon, Aug 28, 2017 at 6:25 AM, Fix Ace via R-help
> <r-help at r-project.org> wrote:
>> Dear R community,
>> I have a text file (test.txt) whose rows have different numbers of columns:
>>
>> 0610007P14Rik%%% Tcf19 Gtf2i
>> 0610010O12Rik%%% Ivns1abp Etv6
>> 1100001G20Rik%%% Nmi
>> 1500015O10Rik%%% Foxi1 Ascl3 Sirt3
>> 1700003E16Rik%%% Ascl2 Ifnar2
>> 1700028J19Rik%%% Musk Nfe2l3
>> 1810011O10Rik%%% Ppp1r13b Bpnt1 Cdkn2c Foxc1 Sox10 Smarca2
>> 1810019D21Rik%%% Asb8
>> 1810037I17Rik%%% Zfp612
>> 1810055G02Rik%%% Nkx2-3 Maged1 Runx1 Ugp2 Elk4 Spdef Tcf19 Isl2 Gtf2i
>> Ctnnbl1 Tcea3 Ank2 Zfp612 Creb3l1 Nupr1 3632451O06Rik Creb3l4 Lass6
>>
>> I would like to read it into R using
>>
>> > test=read.csv("test.txt",sep="\t",header=FALSE)
>>
>> However, when I check the R object "test", I find that all the rows have
>> 5 columns:
>>
>> > test
>> V1 V2 V3 V4 V5
>> 1 0610007P14Rik%%% Tcf19 Gtf2i
>> 2 0610010O12Rik%%% Ivns1abp Etv6
>> 3 1100001G20Rik%%% Nmi
>> 4 1500015O10Rik%%% Foxi1 Ascl3 Sirt3
>> 5 1700003E16Rik%%% Ascl2 Ifnar2
>> 6 1700028J19Rik%%% Musk Nfe2l3
>> 7 1810011O10Rik%%% Ppp1r13b Bpnt1 Cdkn2c Foxc1
>> 8 Sox10 Smarca2
>> 9 1810019D21Rik%%% Asb8
>> 10 1810037I17Rik%%% Zfp612
>> 11 1810055G02Rik%%% Nkx2-3 Maged1 Runx1 Ugp2
>> 12 Elk4 Spdef Tcf19 Isl2 Gtf2i
>> 13 Ctnnbl1 Tcea3 Ank2 Zfp612 Creb3l1
>> 14 Nupr1 3632451O06Rik Creb3l4 Lass6
>>
>> Basically it breaks some rows into more than one row. For example, row 7
>> in the original file becomes two rows. It looks like "test" always has 5
>> columns.
>>
>> How does this happen? How should I fix it so that one record becomes one row
>> in the R object?
>>
>> Thank you very much!
>> Ace
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.