[R] extracting quoted text from character string
Corey Moffet
cmoffet at nwrc.ars.usda.gov
Mon Oct 13 21:02:02 CEST 2003
Hello all,
I am trying to solve a problem, and my solution is rather ugly and not very
general. The posts for "[R] help with gsub and grep functions" seemed relevant
and gave me hope for a more refined and more general solution.
The Problem:
line <- "'this text has spaces' 'thisNot' 3 4 5 6 7 8 9 10"
bad.line <- "'this text has spaces' thisNot 3 4 5 6 7 8 9 10"
The desired result of processing either "line" or "bad.line":
> parts <- some.function(line)
> parts
[1] "this text has spaces"
[2] "thisNot"
[3] "3"
[4] "4"
[5] "5"
[6] "6"
[7] "7"
[8] "8"
[9] "9"
[10] "10"
My current function, which gives the desired result for "line" but not for "bad.line":
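some.function <- function(line, quote.char = "'") {
  # Split on the quote character and drop empty pieces.
  quoted <- unlist(strsplit(line, quote.char))
  quoted <- quoted[quoted != ""]
  # Assumes exactly two quoted fields: pieces 1 and 3 hold their text,
  # piece 2 is the blank between them, and piece 4 is the unquoted remainder.
  first <- quoted[1]
  second <- quoted[3]
  last <- quoted[4]
  # Split the remainder on spaces and drop empty pieces.
  last.parts <- unlist(strsplit(last, " "))
  last.parts <- last.parts[last.parts != ""]
  out <- c(first, second, last.parts)
  return(out)
}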
This solution is not very good because the text parts of "line" are not
required to be enclosed in quotes unless they contain a space. All the files
I currently have to process have the first two pieces enclosed in "'". But
it is future files that I worry about. Is there an existing function that
I have overlooked that splits strings, ignoring the delimiter when it is
enclosed in quotes? I know that I can do some testing on the length of
"quoted" in function "some.function" but it seems there should be a more
elegant way of doing this type of thing. Any suggestions?
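
One candidate that may be worth trying, offered as a rough sketch rather than a confirmed answer, is base R's scan(), which has a "quote" argument and splits on whitespace by default. The wrapper name "split_quoted" below is hypothetical, and the sketch assumes a version of R in which scan() accepts a "text" argument:

```r
line <- "'this text has spaces' 'thisNot' 3 4 5 6 7 8 9 10"
bad.line <- "'this text has spaces' thisNot 3 4 5 6 7 8 9 10"

# Hypothetical helper: read whitespace-separated fields from a string,
# treating anything between quote.char characters as a single field.
split_quoted <- function(line, quote.char = "'") {
  scan(text = line, what = "character", quote = quote.char, quiet = TRUE)
}

parts <- split_quoted(line)
bad.parts <- split_quoted(bad.line)
```

Under those assumptions, "parts" and "bad.parts" should hold the same ten strings, since the quotes only matter for fields that contain a space.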
With best wishes and kind regards I am
Sincerely,
Corey A. Moffet, Ph.D.
Support Scientist
University of Idaho
Northwest Watershed Research Center
800 Park Blvd, Plaza IV, Suite 105
Boise, ID 83712-7716
Voice: (208) 422-0718
FAX: (208) 334-1502