[R] skip non-sequential lines using scan?

Matthew Keller mckellercran at gmail.com
Thu Nov 8 10:19:39 CET 2007


Hi all,

Is there a way to skip non-sequential lines using the "skip" argument
in the scan function?

E.g., I have a matrix with 100 rows and 1e7 columns. I open a
connection and want to read only lines 5, 7, 9, etc. [i.e.,
seq(5,99,2)].

It might seem that the syntax to do this would be something like this
(if only the "skip" argument allowed vectors in the same way that
colClasses does in read.table):

con <- file("bigfile",open="r")
rows.I.want <- seq(5,99,2)
new <- scan(con,what="character",skip=rows.I.want-1,nlines=rows.I.want)

The above doesn't work: it would read consecutive lines starting at
line 5 (5, 6, 7, ...) rather than lines 5, 7, 9, ..., 99. Yes, I know
I can accomplish this by looping, but with the huge datasets I'll be
working with, I'd like to try to save time by doing it all at once.
Any ideas?
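
For reference, here is a minimal sketch of the looping approach mentioned
above. It assumes the connection stays open between calls, so that each
scan() resumes where the previous one stopped. The temporary demo file
and the name "gaps" are illustrative assumptions, not part of the real
data:

```r
# Small demo file standing in for "bigfile" (hypothetical contents)
tmp <- tempfile()
writeLines(sprintf("r%d a b c", 1:100), tmp)

rows.I.want <- seq(5, 99, 2)

# Lines to skip before each wanted row, counted from the connection's
# current position: 4 before row 5, then 1 before every later row
gaps <- diff(c(0, rows.I.want)) - 1

con <- file(tmp, open = "r")
new <- lapply(gaps, function(g)
  scan(con, what = "character", skip = g, nlines = 1, quiet = TRUE))
close(con)
```

This still makes one scan() call per wanted row, so it is not the
single vectorised call asked about. It does avoid re-reading the file
from the top for each row, which may be where most of the time would go.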

Matt



-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com


