[R] Importing multiple text files with lapply.
Simon Kiss
simonjkiss at yahoo.ca
Tue Jan 18 02:07:33 CET 2011
Hi Jim,
Ultimately, I want to count the frequency of dates by particular time periods (months, quarters, years) for each state and then plot the data. I know there are functions in ggplot2 that will do that, so I'm not too worried about that part, but I was stuck on getting 50 text files (one for each state) read into R. For the record, using read.table individually on a state file gets the data into a usable format, but it wasn't working in conjunction with lapply.
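For reference, the counting step described above might look roughly like the following in base R. This is only a sketch: it reuses the ten sample dates quoted further down in this thread, the variable names (x, d, by_year, by_month, by_quarter) are made up for illustration, and the "%B" format assumes the session locale uses English month names.

```r
# The ten sample dates quoted in this thread, as a character vector
x <- c("January 11, 2009", "January 11, 2009", "October 19, 2008",
       "October 13, 2008", "August 16, 2008", "June 19, 2008",
       "April 19, 2008", "April 16, 2008", "February 9, 2008",
       "September 2, 2007")

# Convert to Date; "%B" matches full month names and depends on the locale
# (assumed here to be English)
d <- as.Date(x, format = "%B %d, %Y")

# Frequency of dates per year, per month, and per quarter
by_year    <- table(format(d, "%Y"))
by_month   <- table(format(d, "%Y-%m"))
by_quarter <- table(paste(format(d, "%Y"), quarters(d)))

print(by_year)
print(by_quarter)
```

The same three table() calls could be applied to each state's dates in turn once all the files have been read into a list.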
To reiterate, the home folder has 50 .txt files, each with a column of dates in the format I sent you.
I will try readLines and see if I can get it to loop through.
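A minimal sketch of such a loop, under some assumptions: the folder demo_dir and the two small files written into it are hypothetical stand-ins for the 50 state files, and the date format again assumes English month names. The point it illustrates is that lapply needs the function itself (readLines), not a call to it; the original attempt, lapply(a, read.table(header=TRUE, sep="\n")), ran read.table straight away with no file argument, which is why it stopped with an error.

```r
# Hypothetical setup: write two small example files into a temporary folder
# so that the sketch runs on its own
demo_dir <- file.path(tempdir(), "states_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("January 11, 2009", "October 19, 2008"),
           file.path(demo_dir, "Alabama.txt"))
writeLines(c("June 19, 2008", "April 16, 2008", "September 2, 2007"),
           file.path(demo_dir, "Alaska.txt"))

# 'a' is a character vector of file paths, one per state file
a <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Pass the function itself; lapply supplies each path as its first argument
mylist <- lapply(a, readLines)

# Label each element with its file name, minus the extension
names(mylist) <- sub("\\.txt$", "", basename(a))

# Convert each character vector to Date (assumes English month names)
dates <- lapply(mylist, as.Date, format = "%B %d, %Y")

str(mylist)
```

Extra arguments placed after the function name in lapply, such as format here, are passed on to every call, so the same pattern works for read.table with header and sep.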
Yours, Simon Kiss
On 2011-01-17, at 7:44 PM, jim holtman wrote:
> It sounds like you want to use 'readLines' and not 'read.table'
>
>> x <- readLines(textConnection("January 11, 2009
> + January 11, 2009
> + October 19, 2008
> + October 13, 2008
> + August 16, 2008
> + June 19, 2008
> + April 19, 2008
> + April 16, 2008
> + February 9, 2008
> + September 2, 2007"))
>> closeAllConnections()
>> x
> [1] "January 11, 2009" "January 11, 2009" "October 19, 2008" "October 13, 2008" "August 16, 2008"
> [6] "June 19, 2008" "April 19, 2008" "April 16, 2008" "February 9, 2008" "September 2, 2007"
>>
>
> What exactly are you going to do with the data after you read it in?
>
> On Mon, Jan 17, 2011 at 6:22 PM, Simon Kiss <simonjkiss at yahoo.ca> wrote:
>> Dear jim,
>> Yes, it's true, the data are separated onto new lines as follows:
>> January 11, 2009
>> January 11, 2009
>> October 19, 2008
>> October 13, 2008
>> August 16, 2008
>> June 19, 2008
>> April 19, 2008
>> April 16, 2008
>> February 9, 2008
>> September 2, 2007
>>
>> I tried your attempt and it didn't work either; it returned the error message:
>> Error in FUN(X[[1L]], ...) :
>> 'file' must be a character string or connection
>>
>> On 2011-01-17, at 2:02 PM, jim holtman wrote:
>>
>>> try:
>>>
>>> mylist <- lapply(a, read.table, header = TRUE, sep = '\n')
>>>
>>> also is the separator really '\n' meaning a new-line? What exactly
>>> does the data look like?
>>>
>>> On Mon, Jan 17, 2011 at 11:47 AM, Simon Kiss <simonjkiss at yahoo.ca> wrote:
>>>> Hello,
>>>> I'm trying to read in 50 text files with dates as content to create a list of tables.
>>>>
>>>> a is the list of filenames that need to be read in.
>>>>
>>>> The following command returns the following error
>>>> mylist<-lapply(a, read.table(header=TRUE, sep="\n"))
>>>>
>>>> Error in read.table(header = TRUE, sep = "\n") :
>>>> element 1 is empty;
>>>> the part of the args list of 'is.character' being evaluated was:
>>>> (file)
>>>>
>>>> Does anyone have any suggestions?
>>>> Yours, Simon Kiss
>>>> *********************************
>>>> Simon J. Kiss, PhD
>>>> Assistant Professor, Wilfrid Laurier University
>>>> 73 George Street
>>>> Brantford, Ontario, Canada
>>>> N3T 2C9
>>>> Cell: +1 519 761 7606
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Data Munger Guru
>>>
>>> What is the problem that you are trying to solve?
>>
>> *********************************
>> Simon J. Kiss, PhD
>> Assistant Professor, Wilfrid Laurier University
>> 73 George Street
>> Brantford, Ontario, Canada
>> N3T 2C9
>> Cell: +1 519 761 7606
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606