[R] extracting values from txt with regular expression
Rui Barradas
ruipbarradas at sapo.pt
Fri Jun 8 10:18:27 CEST 2012
Hello,
Just put the entire regexp between parentheses, so that "\\1" in the
replacement has a capturing group to refer to.
extracted <-
strsplit(gsub("([+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?)","\\1%&",txt_line),"%&")
extracted
sapply(strsplit(unlist(extracted), "="), "[", 2)
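
To spell the steps out: in the quoted code below, the pattern contains only non-capturing (?:...) groups, so "\\1" is empty and each number is replaced by the marker alone. A minimal sketch of the corrected approach follows, using the sample txt_line from the thread written with single spaces (the original spacing is an assumption). The names num_re, pieces, values_chr and values are illustrative only, and the final trimws()/as.numeric() step is an addition, not part of the code above.

```r
txt_line <- " PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"

# The number pattern wrapped in one capturing group, so that \\1 in the
# replacement stands for the matched number itself.
num_re <- "([+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?)"

# Append the marker "%&" after each number, then split on that marker.
pieces <- unlist(strsplit(gsub(num_re, "\\1%&", txt_line), "%&"))

# Keep what follows "=" in each piece, then trim whitespace and convert
# (this conversion step is an addition for illustration).
values_chr <- sapply(strsplit(pieces, "="), "[", 2)
values <- as.numeric(trimws(values_chr))
values
```

Note that the decimal part of the pattern is not optional, so as written it matches 0.01 or .5 but not a bare integer such as 3.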
As for speed, I believe this might take longer: it has to match a
regular expression, then substitute, then split. A routine like the one
I sent is usually faster by an order of magnitude or more. I first wrote
one around 20 years ago; I can now write it with my eyes closed, and it
consistently beats the alternatives. Still, there's no harm in trying,
or in combining strategies.
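
One way to try this is to time the strategies side by side with system.time(). The sketch below compares the mark-and-split approach with extracting the matches directly via gregexpr() and regmatches(). That second approach is a swapped-in alternative for illustration, not the routine mentioned above, which is not shown in this message. The function names and the 1000-line workload are hypothetical, and timings will vary by machine.

```r
txt_line <- " PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"
num_re <- "[+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?"

# Strategy A: mark each number, split on the marker, then split on "=".
via_split <- function(x) {
  marked <- gsub(paste0("(", num_re, ")"), "\\1%&", x)
  pieces <- unlist(strsplit(marked, "%&"))
  as.numeric(sapply(strsplit(pieces, "="), "[", 2))
}

# Strategy B: pull the matched numbers out directly.
via_regmatches <- function(x) {
  as.numeric(unlist(regmatches(x, gregexpr(num_re, x))))
}

# Hypothetical workload: the same line repeated 1000 times.
lines <- rep(txt_line, 1000)
system.time(for (l in lines) via_split(l))
system.time(for (l in lines) via_regmatches(l))
```

Both functions should return the same values for a given line, so only the elapsed times need comparing.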
Good luck.
Rui Barradas
On 08-06-2012 04:52, emorway wrote:
> Hi Dan and Rui, Thank you for the suggestions, both were very helpful.
> Rui's code was quite fast...there is one more thing I want to explore for my
> own edification, but first I need some help fixing the code below, which is
> a slight modification to Dan's suggestion. It'll no doubt be tough to beat
> the time Rui's code finished the task in, but I'm willing to try. First, I
> need to fix the following, which 'peels' the wrong bit of text from
> "txt_line". Instead of extracting as it now does (shown below), can the
> code be modified to extract the values 0.01 and -0.05, and store them in the
> variable 'extracted'?
>
> txt_line<-" PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"
> extracted <-
> strsplit(gsub("[+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?","\\1%&",txt_line),"%&")
> extracted
> #[1] " PERCENT DISCREPANCY = " " PERCENT DISCREPANCY = "