[R] select portion of text file using R

Duncan Murdoch murdoch.duncan at gmail.com
Mon Apr 20 13:17:48 CEST 2015


On 20/04/2015 3:28 AM, Luigi Marongiu wrote:
> Dear all,
> I have a flat (tab-delimited) file derived from an Excel file. It is
> subdivided into several parts: the first part reports metadata, then
> comes the first spreadsheet, marked by a name in [ ] and followed by
> its data, and then the second spreadsheet in the same format: a name
> in [ ] followed by its data.
> How can I import such a file using, for instance, read.table()?

read.table() by itself can't recognize where the data starts, but its
"skip" and "nrows" arguments control which lines get read.  If you
don't know the values for those arguments in advance, you can use
readLines() to read the entire file, then use grep() to find the lines
that mark your tables (here, the section names in [ ]), and either
re-read the file with the resulting skip and nrows values, or just
extract those lines and read from them as a textConnection.
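
A minimal sketch of both routes, assuming that every table is introduced
by a line consisting only of a name in square brackets, that fields are
tab-separated, and that there are no blank lines inside a table.  The
sample lines are a shortened stand-in for the file in the question, and
the object names (sample_lines, read_section, sections, amp) are made up
for illustration:

```r
# Shortened stand-in for the file in the question, written to a temporary
# file so that the example is self-contained.
sample_lines <- c(
  "* Experiment Barcode =",
  "* Experiment Name = 2015-04-13 171216",
  "[Amplification Data]",
  "Well\tCycle\tTarget Name\tRn\tDelta Rn",
  "1\t1\tAdeno 1-Adeno 1\t0.820\t-0.051",
  "1\t2\tAdeno 1-Adeno 1\t0.827\t-0.042",
  "1\t3\tAdeno 1-Adeno 1\t0.843\t-0.025",
  "[Results]",
  "Well\tWell Position\tOmit\tSample Name",
  "1\tA1\tfalse\tP17",
  "2\tA2\tfalse\tP17"
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

lines <- readLines(path)

# Lines such as "[Amplification Data]" mark the start of each table.
header_idx <- grep("^\\[.*\\]$", lines)
section_names <- gsub("^\\[|\\]$", "", lines[header_idx])

# Each table runs from the line after its marker to the line before the
# next marker (or to the end of the file for the last table).
first <- header_idx + 1
last  <- c(header_idx[-1] - 1, length(lines))

# Route 1: extract the lines and read them through a textConnection.
read_section <- function(i) {
  con <- textConnection(lines[first[i]:last[i]])
  on.exit(close(con))
  read.table(con, header = TRUE, sep = "\t", quote = "",
             comment.char = "", check.names = FALSE,
             stringsAsFactors = FALSE)
}
sections <- setNames(lapply(seq_along(header_idx), read_section),
                     section_names)

# Route 2: re-read the file, using the positions found above as
# skip (lines before the column header) and nrows (data rows only).
amp <- read.table(path, header = TRUE, sep = "\t", quote = "",
                  comment.char = "", check.names = FALSE,
                  stringsAsFactors = FALSE,
                  skip = header_idx[1], nrows = last[1] - first[1])
```

With this layout, sections[["Amplification Data"]] and amp should hold
the same three rows.  If the real file has blank lines or other filler
between tables, the nrows arithmetic in the second route would need
adjusting; the first route is more forgiving because read.table() skips
blank lines by default.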

Duncan Murdoch

> Many thanks
> regards
> Luigi
> 
> Here is a sample of the file:
> * Experiment Barcode =
> * Experiment Comments =
> * Experiment File Name = F:\array 59
> * Experiment Name = 2015-04-13 171216
> * Experiment Run End Time = 2015-04-13 18:07:57 PM PDT
> ...
> [Amplification Data]
> Well    Cycle    Target Name    Rn    Delta Rn
> 1    1    Adeno 1-Adeno 1    0.820    -0.051
> 1    2    Adeno 1-Adeno 1    0.827    -0.042
> 1    3    Adeno 1-Adeno 1    0.843    -0.025
> 1    4    Adeno 1-Adeno 1    0.852    -0.015
> 1    5    Adeno 1-Adeno 1    0.858    -0.008
> 1    6    Adeno 1-Adeno 1    0.862    -0.002
> ...
> [Results]
> Well    Well Position    Omit    Sample Name    Target Name    Task
> Reporter    Quencher    RQ    RQ Min    RQ Max    CT    Ct Mean    Ct
> SD    Quantity    Delta Ct Mean    Delta Ct SD    Delta Delta Ct
> Automatic Ct Threshold    Ct Threshold    Automatic Baseline
> Baseline Start    Baseline End    Efficiency    Comments    Custom1
> Custom2    Custom3    Custom4    Custom5    Custom6    NOAMP
> EXPFAIL
> 1    A1    false    P17    Adeno 1-Adeno 1    UNKNOWN    FAM
> NFQ-MGB                Undetermined                            false
>  0.200    true    3    44    1.000    N/A                            N
>    Y
> 2    A2    false    P17    Adeno 40/41 EH-AIQJCT3    UNKNOWN    FAM
> NFQ-MGB                Undetermined
> 