[R] Value Lookup from File without Slurping
r at quantide.com
Fri Jan 16 11:52:47 CET 2009
I agree on the database solution.
Databases are the right tool for this kind of problem.
Just keep in mind the start-up cost of setting up the database: it can
be a very time-consuming task for someone who is not familiar with
database technology.
Calling file() does not actually read the whole file. The function
simply opens a connection to the file without reading it.
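To illustrate the point about connections, here is a minimal sketch in
base R. The sample file and its "key value" layout are assumptions made
up for the example. Lines are pulled from the open connection one at a
time, and the loop stops at end of file, so no line count is needed
beforehand.

```r
# Hypothetical sample data in a "key value" layout, written to a temp file.
path <- tempfile(fileext = ".txt")
writeLines(c("AAC 1.5", "GGT 2.0", "ATT 3.25", "CCA 4.0"), path)

qr <- c("AAC", "ATT")      # keys to look up
out <- numeric()

con <- file(path, "r")     # opens a connection; nothing is read yet
repeat {
  line <- readLines(con, n = 1)   # reads only the next line
  if (length(line) == 0) break    # zero-length result means end of file
  fields <- strsplit(line, split = " ")[[1]]
  if (fields[1] %in% qr) {
    out <- c(out, as.numeric(fields[2]))
  }
}
close(con)                 # release the connection when done

out
```

Growing out with c() is fine for a handful of matches; with many
matches it would probably be worth collecting them in a list instead.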
countLines should do something like "wc -l" from a bash shell.
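As a rough idea of what a "wc -l"-style count involves, here is a sketch
in base R. count_lines and its chunk argument are hypothetical names
invented for this example, and this is not necessarily how
R.utils::countLines is implemented. It reads the file in fixed-size
chunks and only keeps the running total, so the whole file is never held
in memory at once.

```r
# Hypothetical helper: count lines by reading the file chunk by chunk.
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, "r")
  on.exit(close(con))              # always close the connection on exit
  n <- 0L
  repeat {
    k <- length(readLines(con, n = chunk))  # number of lines in this chunk
    if (k == 0L) break                      # nothing left to read
    n <- n + k
  }
  n
}

# Small made-up file to try it on.
path <- tempfile(fileext = ".txt")
writeLines(c("AAC 1.5", "GGT 2.0", "ATT 3.25", "CCA 4.0"), path)
n_lines <- count_lines(path, chunk = 3L)
```

Note that this still walks through the entire file once, which is the
extra pass being discussed in this thread.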
I would say that if this is a one-time job, this solution should work
even though it is not the fastest. If the job is a repetitive one,
then a database solution is surely better.
A.
Wacek Kusnierczyk wrote:
> if the file is really large, reading it twice may add considerable penalty:
>
> r at quantide.com wrote:
>
>> Something like this should work
>>
>> library(R.utils)
>> out = numeric()
>> qr = c("AAC", "ATT")
>> n =countLines("test.txt")
>>
>
> # 1st pass
>
>
>> file = file("test.txt", "r")
>> for (i in 1:n){
>>
>
> # 2nd pass
>
>
>> line = readLines(file, n = 1)
>> A = strsplit (line, split = " ")[[1]][1]
>> if(is.element(A, qr)) {
>> value = as.numeric(strsplit (line, split = " ")[[1]][2])
>> out = c(out, value)
>> }
>> }
>>
>
> if this is a one-go task, counting the lines does not pay, and why
> bother. if this is a repetitive task, a database-based solution will
> probably be a better idea.
>
> vQ
>
>