[R] Weird read.table error? (line `n' did not have `m' elements)

Gabor Grothendieck ggrothendieck at gmail.com
Tue Sep 22 05:18:10 CEST 2009


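Field 13 of that line has a # in it ("SH# domain"), as I previously
suggested. read.table treats # as the comment character by default, so
the rest of the line is dropped and fewer than 22 fields are seen.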
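
A minimal sketch of the difference, assuming a small made-up
four-column file (the file contents and column names below are
hypothetical, not the data from this thread). It compares field counts
with and without comment handling, then reads the file with
comment.char = "" so that # is kept as ordinary text:

```r
# Hypothetical tab-separated file; the last row has a '#' inside its third field.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tvalue\tdescription\tflag",
             "1\t8.1\tplain text\tTRUE",
             "2\t8.2\tSH# domain\tFALSE"),
           tmp)

# count.fields() defaults to comment.char = "#", like read.table(),
# so everything from '#' to the end of the line is discarded.
fields_default   <- count.fields(tmp, sep = "\t", quote = "")
fields_nocomment <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")

# With the default comment character the read is expected to fail,
# because the last row appears to have only 3 fields.
default_fails <- inherits(
  try(read.table(tmp, header = TRUE, sep = "\t", quote = ""), silent = TRUE),
  "try-error")

# Turning comment handling off keeps '#' as part of the field.
annotation <- read.table(tmp, header = TRUE, sep = "\t", quote = "",
                         comment.char = "", stringsAsFactors = FALSE)
```

Comparing fields_default with fields_nocomment is a quick way to tell
whether a short line is caused by a comment character rather than by a
missing separator.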

On Mon, Sep 21, 2009 at 11:08 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
> Here are the outputs.
>
>> strsplit(scanned_file[5205],'\t')[[1]]
>  [1] "6836237"
>  [2] "8.146431"
>  [3] "8.197432"
>  [4] "8.156005"
>  [5] "7.98905"
>  [6] "8.327593"
>  [7] "7.673796"
>  [8] "8.119687"
>  [9] "8.077252"
> [10] "Asap1 "
> [11] "NM_010026 "
> [12] "RefSeq "
> [13] "Mus musculus ArfGAP with SH# domain, ankyrin repeat and PH
> domain1 (Asap1), mRNA. "
> [14] "FALSE"
> [15] "GO:0032312 "
> [16] "regulation of ARF GTPase activity "
> [17] "GO:0005737  // GO:0016020 "
> [18] "cytoplasm  // membrane "
> [19] "GO:0005096  // GO:0005515  // GO:0008060  // GO:0008270  //
> GO:0046872 "
> [20] "GTPase activator activity  // protein binding  // ARF GTPase
> activator activity  // zinc ion binding  // metal ion binding "
> [21] "---"
> [22] "---"
>> scanned_file[5205]
> [1] "6836237\t8.146431\t8.197432\t8.156005\t7.98905\t8.327593\t7.673796\t8.119687\t8.077252\tAsap1
> \tNM_010026 \tRefSeq \tMus musculus ArfGAP with SH# domain, ankyrin
> repeat and PH domain1 (Asap1), mRNA. \tFALSE\tGO:0032312 \tregulation
> of ARF GTPase activity \tGO:0005737  // GO:0016020 \tcytoplasm  //
> membrane \tGO:0005096  // GO:0005515  // GO:0008060  // GO:0008270  //
> GO:0046872 \tGTPase activator activity  // protein binding  // ARF
> GTPase activator activity  // zinc ion binding  // metal ion binding
> \t---\t---"
>
>
> On Mon, Sep 21, 2009 at 9:34 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>> It's highly unusual to use xls as the extension for a text file.
>> Use something more suggestive.
>>
>> Print out the line in question.  For example, note that scan
>> and read.table have different defaults for the comment character,
>> namely, none for scan and # for read.table.
>>
>> On Mon, Sep 21, 2009 at 10:23 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
>>> On Mon, Sep 21, 2009 at 9:12 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I have the following commands. It says line 5205 does not have 22
>>>> elements. But I checked that line in the file with vim, and it has 22
>>>> fields. Can somebody let me know how to further debug this case?
>>>>
>>>> Regards,
>>>> Peng
>>>>
>>>>> annotation = read.table("../EC_results/Juan_15wks_gene_core.xls", header=T, sep='\t',quote='')
>>>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>>>  line 5204 did not have 22 elements
>>>>> annotation = count.fields("../EC_results/Juan_15wks_gene_core.xls", sep='\t',quote='')
>>>>> which(annotation!=22)
>>>> [1] 5205
>>>>
>>>
>>>
>>> I also ran the following command as a test, which confirms that line
>>> 5205 has 22 elements. Is it a bug in read.table?
>>>
>>>> scanned_file = scan("../EC_results/Juan_15wks_gene_core.xls", what=character(),sep='\n',quote='')
>>> Read 23333 items
>>>> length(strsplit(scanned_file[5205],'\t')[[1]])
>>> [1] 22
>>>
>>> Regards,
>>> Peng
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
