[R] read.table truncated data?
jim holtman
jholtman at gmail.com
Thu Aug 25 17:57:03 CEST 2011
But did you try the following:
x <- read.table(...., comment.char = '', quote = '')
In most cases the cause is an unbalanced quote somewhere in your data.
Use a text editor and search for single and double quotes.
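
To see the effect in isolation, here is a minimal sketch on a made-up file. The file contents, the names "5'ETS" and "3'ETS", and the variables tmp, bad, good and quote_lines are all hypothetical and are not taken from your data. It assumes two stray apostrophes a few rows apart. With the default quote = "\"'", everything between them, line breaks included, is read as one field, so rows disappear without a warning.

```r
# Build a hypothetical 20-row, tab-separated file (all names are made up).
gene <- paste0("G", 1:20)
gene[15] <- "5'ETS"   # first stray apostrophe
gene[18] <- "3'ETS"   # second stray apostrophe
rows <- paste(paste0("id", 1:20), gene, "OK", sep = "\t")
tmp <- tempfile(fileext = ".diff")
writeLines(c("test_id\tgene\tstatus", rows), tmp)

# Default quote = "\"'": the text between the two apostrophes is treated
# as a single quoted field, so the rows in between are swallowed.
bad <- read.table(tmp, header = TRUE, sep = "\t")

# Quote and comment handling switched off: every line becomes one row.
good <- read.table(tmp, header = TRUE, sep = "\t",
                   quote = "", comment.char = "")

# Same idea as searching in a text editor: which lines contain a quote?
raw <- readLines(tmp)
quote_lines <- grep("['\"]", raw)

# Compare data lines in the file with the rows each call returned.
c(data_lines = length(raw) - 1, default = nrow(bad), no_quote = nrow(good))
```

If the second call returns as many rows as the file has data lines while the first returns fewer, quoting is the culprit, and quote_lines tells you where to look.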
On Thu, Aug 25, 2011 at 11:49 AM, zhenjiang xu <zhenjiang.xu at gmail.com> wrote:
> Thanks for your replies. I looked at those lines and didn't spot anything
> unusual.
>
>> tail(a)
> test_id gene_id gene locus sample_1 sample_2 status
> 21418 tY(GUA)J1 - SUP7 chr10:354243-354332 air1rrp6 air2rrp6 OK
> 21419 tY(GUA)J2 - SUP4 chr10:542955-543044 air1rrp6 air2rrp6 OK
> 21420 tY(GUA)M1 - SUP5 chr13:168794-168883 air1rrp6 air2rrp6 OK
> 21421 tY(GUA)M2 - SUP8 chr13:837927-838016 air1rrp6 air2rrp6 OK
> 21422 tY(GUA)O - SUP3 chr15:288191-288280 air1rrp6 air2rrp6 OK
> 21423 tY(GUA)Q - - chrmt:70823-70907 air1rrp6 air2rrp6 OK
> value_1 value_2 ln.fold_change. test_stat p_value q_value
> significant
> 21418 0.00000 0.0000 0.000000 0.00000 1.000000 1.011650
> no
> 21419 0.00000 0.0000 0.000000 0.00000 1.000000 1.011480
> no
> 21420 0.00000 0.0000 0.000000 0.00000 1.000000 1.011500
> no
> 21421 0.00000 0.0000 0.000000 0.00000 1.000000 1.011520
> no
> 21422 0.00000 0.0000 0.000000 0.00000 1.000000 1.011550
> no
> 21423 6.68356 10.7397 0.474301 -1.08614 0.277417 0.455917
> no
>
>
> tY(GUA)J1 - SUP7 chr10:354243-354332 rrp6 air1rrp6
> OK 0 0 0 0 1 1.00404 no
> tY(GUA)J2 - SUP4 chr10:542955-543044 rrp6 air1rrp6
> OK 0 0 0 0 1 1.00497 no
> tY(GUA)M1 - SUP5 chr13:168794-168883 rrp6 air1rrp6
> OK 0 0 0 0 1 1.00492 no
> tY(GUA)M2 - SUP8 chr13:837927-838016 rrp6 air1rrp6
> OK 0 0 0 0 1 1.00488 no
> tY(GUA)O - SUP3 chr15:288191-288280 rrp6 air1rrp6
> OK 0 0 0 0 1 1.00485 no
> tY(GUA)Q - - chrmt:70823-70907 rrp6 air1rrp6
> OK 4.49644 6.68356 0.396365 -0.766052 0.443645
> 0.634724 no
> 15S_rRNA - 15S_RRNA chrmt:6545-8194 WT air2rrp6
> OK 2288.88 711.697 -1.16817 2.78772 0.00530801
> 0.0167772 yes
> 21S_rRNA - 21S_RRNA chrmt:58008-62447 WT
> air2rrp6 OK 4134.59 1927.04 -0.7634 1.58991 0.111855
> 0.22339 no
> ETS1-1 - ETS1-1 chr12:457732-458432 WT air2rrp6 OK
> 3258.97 1114.76 -1.07277 2.91211 0.00359 0.0121587 yes
> ETS1-2 - ETS1-2 chr12:466869-467569 WT air2rrp6 OK
> 3258.97 1114.76 -1.07277 2.91211 0.00359 0.0121597 yes
>
>
> On Wed, Aug 24, 2011 at 2:34 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
>
>> Hi,
>>
>> On Wed, Aug 24, 2011 at 2:18 PM, zhenjiang xu <zhenjiang.xu at gmail.com>
>> wrote:
>> > Hi R users,
>> >
>> > I was using read.table to read a file. The data.frame looked all right,
>> > but I found that not all rows were read by read.table. What's wrong with
>> > it? It didn't give me any warning or error messages. Why is the data
>> > truncated?
>> > Thanks.
>> >
>> > $ wc -l all/isoform_exp.diff
>> > 42847 all/isoform_exp.diff
>> >
>> >> a=read.table('all/isoform_exp.diff', header=T, sep='\t')
>> >> nrow(a)
>> > [1] 21423
>>
>> This is a common problem. You need to take a look at the last row that
>> was imported, and the rows around 21423 in the original file.
>>
>> Common causes include stray single or double quotation marks, and
>> other special characters in your file like the default comment.char #
>>
>> Sarah
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org
>>
>
>
>
> --
> Best,
> Zhenjiang
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?