[R] Building package - tab delimited example data issue

Johannes Graumann johannes_graumann at web.de
Thu Dec 6 15:37:19 CET 2007


Johannes Graumann wrote:

> On Thursday 06 December 2007 11:52:46 Peter Dalgaard wrote:
>> Johannes Graumann wrote:
>> > Hello,
>> >
>> > I'm trying to integrate example data in the shape of a tab delimited
>> > ASCII file into my package and therefore dropped it into the data
>> > subdirectory. The build works out just fine, but when I attempt to
>> > install I get:
>> >
>> > ** building package indices ...
>> > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>> >   line 1 did not have 500 elements
>> > Calls: <Anonymous> ... <Anonymous> -> switch -> assign -> read.table -> scan
>> > Execution halted
>> > ERROR: installing package indices failed
>> > ** Removing '/usr/local/lib/R/site-library/MaxQuantUtils'
>> > ** Restoring previous '/usr/local/lib/R/site-library/MaxQuantUtils'
>> >
>> > Accordingly the check delivers:
>> >
>> > ...
>> > * checking whether package 'MaxQuantUtils' can be installed ... ERROR
>> >
>> > Can anyone tell me what I'm doing wrong? Build/install without the ASCII
>> > file works just fine.
>> >
>> > Joh
>>
>> If you had looked at help(data), you would have found a list of which
>> file formats it supports and how they are read. Hint: TAB-delimited
>> files are not among them. *Whitespace* separated files work, using
>> read.table(filename, header=TRUE), but that is not a superset of
>> TAB-delimited data if there are empty fields.
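[Illustration of the point above. A minimal sketch, assuming a made-up three-column table whose second row has an empty middle field; the file and column names are hypothetical and not taken from MaxQuantUtils.]

```r
# Hypothetical tab-delimited table; row 2 has an empty "label" field.
lines <- c("id\tlabel\tvalue",
           "1\ta\t10",
           "2\t\t20")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Whitespace-separated parsing treats the two adjacent tabs as a single
# separator, so row 2 appears to have only two fields and the read fails.
ws <- try(read.table(tmp, header = TRUE), silent = TRUE)
inherits(ws, "try-error")

# Declaring the tab separator (read.delim uses sep = "\t") keeps the
# empty field as an empty string.
tab <- read.delim(tmp, stringsAsFactors = FALSE)
dim(tab)
```

The failure is the same kind of scan() error as in the install log quoted earlier: a line has fewer elements than the header implies.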
>>
>> A nice trick is to figure out how to read the data from the command line
>> and drop the relevant code into a mydata.R file (assuming that the
>> actual data file is mydata.txt). This gets executed when the data is
>> loaded (by data(mydata) or when building the lazyload database) because
>> .R files have priority over .txt.
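[The trick described above can be tried outside a package. A rough sketch, assuming the file names mydata.txt and mydata.R from the paragraph; the temporary directory and the table contents are invented for illustration, and sys.source(..., chdir = TRUE) only imitates how data() evaluates such a script with data/ as the working directory.]

```r
# Simulate a package's data/ directory in a temporary location.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("id\tlabel\tvalue", "1\ta\t10", "2\t\t20"),
           file.path(data_dir, "mydata.txt"))

# Contents of the hypothetical data/mydata.R: one explicit read call
# that names the separator instead of relying on the default reader.
writeLines('mydata <- read.delim("mydata.txt", stringsAsFactors = FALSE)',
           file.path(data_dir, "mydata.R"))

# Evaluate the script with its own directory as working directory,
# so the relative file name "mydata.txt" resolves.
env <- new.env()
sys.source(file.path(data_dir, "mydata.R"), envir = env, chdir = TRUE)
str(env$mydata)
```

The script refers to the raw file by a relative name only, which is why the working directory matters.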
>>
>> This is quite general and allows a nice way of incorporating data
>> management while retaining the original data source:
>>
>> >more ISwR/data/stroke.R
>>
>> stroke <-  read.csv2("stroke.csv", na.strings=".")
>> names(stroke) <- tolower(names(stroke))
>> stroke <-  within(stroke,{
>>     sex <- factor(sex,levels=0:1,labels=c("Female","Male"))
>>     dgn <- factor(dgn)
>>     coma <- factor(coma, levels=0:1, labels=c("No","Yes"))
>>     minf <- factor(minf, levels=0:1, labels=c("No","Yes"))
>>     diab <- factor(diab, levels=0:1, labels=c("No","Yes"))
>>     han <- factor(han, levels=0:1, labels=c("No","Yes"))
>>     died <- as.Date(died, format="%d.%m.%Y")
>>     dstr <- as.Date(dstr,format="%d.%m.%Y")
>>     dead <- !is.na(died) & died < as.Date("1996-01-01")
>>     died[!dead] <- NA
>> })
>>
>> >head ISwR/data/stroke.csv
>>
>> SEX;DIED;DSTR;AGE;DGN;COMA;DIAB;MINF;HAN
>> 1;7.01.1991;2.01.1991;76;INF;0;0;1;0
>> 1;.;3.01.1991;58;INF;0;0;0;0
>> 1;2.06.1991;8.01.1991;74;INF;0;0;1;1
>> 0;13.01.1991;11.01.1991;77;ICH;0;1;0;1
>> 0;23.01.1996;13.01.1991;76;INF;0;1;0;1
>> 1;13.01.1991;13.01.1991;48;ICH;1;0;0;1
>> 0;1.12.1993;14.01.1991;81;INF;0;0;0;1
>> 1;12.12.1991;14.01.1991;53;INF;0;0;1;1
>> 0;.;15.01.1991;73;ID;0;0;0;1
> 
> Thanks for your help. Very insightful, and your version of "RTFM" was not
> too harsh either ;0)
> Part of what I want to achieve by including the file is to be able to
> showcase a read-in function for this particular data type. Is there a
> slick way - sticking to your example - to reference 'stroke.csv'
> directly? In the examples section of some function.Rd I'd like to put
> something analogous to
>
>     # Use function to read in file:
>     result <- function(<link to 'stroke.csv' in installed ISwR package>)
>
> without having to mark the example as "No Run".

Answering my own question and staying with the same example:
        system.file("data/stroke.csv",package="ISwR")
returns the path of the example file in the installed package, which
allows direct access to it.
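
In an example section this might look roughly as follows. This is a sketch only: pkg_file() is a hypothetical helper name, and the DESCRIPTION file of the base package stands in for the data file so that the code runs without ISwR installed. system.file() returns an empty string when nothing is found, hence the explicit check.

```r
# Hypothetical helper: resolve a file inside an installed package and
# fail with a clear message if it is not there.
pkg_file <- function(..., package) {
  path <- system.file(..., package = package)
  if (!nzchar(path)) {
    stop("file not found in package '", package, "'")
  }
  path
}

# Stand-in for the example file: a file shipped with every R installation.
desc <- pkg_file("DESCRIPTION", package = "base")
read.dcf(desc, fields = "Package")

# The same pattern for the thread's example (needs ISwR installed, and
# the raw file must actually be present in the installed package):
# stroke_raw <- read.csv2(pkg_file("data", "stroke.csv", package = "ISwR"),
#                         na.strings = ".")
```

One caveat: if a package is installed with lazy-loaded data, the raw files from data/ may not be kept as individual files, so a file meant to be read by path is usually safer under inst/extdata.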

Joh


