[R] Novice question about getting data into R

Duncan Murdoch murdoch at stats.uwo.ca
Fri Sep 19 19:08:16 CEST 2008


On 9/19/2008 1:01 PM, Ted Byers wrote:
> I found it easy to use R when typing data manually into it.  Now I need to
> read data from a file, and I get the following errors:
> 
>> refdata = read.table("K:\\MerchantData\\RiskModel\\refund_distribution.csv",
>>                      header = TRUE)

If your file really is comma-separated, use read.csv, not read.table
(which by default splits fields on whitespace).

Duncan Murdoch
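
A minimal sketch of the suggestion above, not taken from the thread. The
temporary file and its contents are hypothetical stand-ins for
refund_distribution.csv, assuming quoted column names and later columns that
run short. count.fields() reports how many fields each line has for a given
separator, which can help explain a "did not have N elements" error.

```r
# Hypothetical stand-in for the real file: quoted column names that
# contain spaces, and a last column that is shorter than the others.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  '"Weeks since sample","Week 18","Week 19"',
  '1,0.10,0.12',
  '2,0.05,0.07',
  '3,0.02,'
), tmp)

# Diagnostic: number of fields per line when splitting on commas.
fields <- count.fields(tmp, sep = ",")
print(fields)

# read.csv defaults to sep = "," and header = TRUE; empty cells in
# numeric columns are read as NA.
refdata <- read.csv(tmp)
str(refdata)
```

With the default check.names = TRUE, a header such as "Week 18" becomes the
column name Week.18.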

> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 1 did not have 42 elements
>> refdata =
>> read.table("K:\\MerchantData\\RiskModel\\refund_distribution.csv")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 2 did not have 42 elements
>>
> 
> (I'd tried the first version above because the first record has column
> names.)
> 
> First, I don't know why R expects 42 elements in a record.  
> There is one column for a time variable (weeks since a given week of samples
> were taken) and one for each week of sampling in the data file (Week 18
> through Week 37 inclusive).  And there are only 19 rows.
> The samples represented by the columns are independent, and the numbers in
> the columns are the fraction of events sampled that result in an event of
> another kind in the week since the sample was taken.
> 
> The samples are not the same size, and starting with week 20, the number of
> values progressively gets smaller since there have been fewer than 37 weeks
> since the samples were taken.
> 
> I can show you the contents of the data file if you wish.  It is
> unremarkable, csv, with strings used for column names enclosed in double
> quotes.
> 
> I don't have to manually separate the samples into their own files do I?  I
> was hoping to write a function that estimates the density function that best
> fits each sample individually, and then iterate over the columns, applying
> that function to each in turn.
> 
> What is the best way to handle this?
> 
> Thanks
> 
> Ted
> 
>
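
On the question of applying one function to each column without splitting the
file: a rough sketch, not from the thread. The data frame below is made up,
fit_one is a hypothetical name, and a kernel density estimate stands in for
whatever per-sample fit is actually wanted.

```r
# Hypothetical data frame shaped like the one described: a time column
# followed by one column per sample, with NA where later weeks are missing.
refdata <- data.frame(
  weeks   = 1:6,
  Week.18 = c(0.30, 0.22, 0.15, 0.10, 0.08, 0.05),
  Week.19 = c(0.28, 0.20, 0.16, 0.11, 0.07, NA),
  Week.20 = c(0.31, 0.19, 0.14, 0.09, NA, NA)
)

# Placeholder per-sample function: drop the NAs, then compute a
# kernel density estimate. Swap in any other fitting routine here.
fit_one <- function(x) density(x[!is.na(x)])

# Apply it to every column except the first (the time variable).
fits <- lapply(refdata[-1], fit_one)
names(fits)
```

lapply returns a list with one element per sample column, named after the
columns, so the samples can stay in a single data frame.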


