[R] Find String Between Characters

Sparks, John James jspark4 at uic.edu
Sun May 15 04:14:14 CEST 2011


Hi Jim,

Thanks for your note.

Unfortunately, when I try your solution in my exact setting, I get a
different and unexpected result.

First, let me be clearer.  What I am attempting to do is pull the CIK
number out of the content of the web page itself after it has been loaded
into R (this may not be optimal, but I am new at this), not out of the URL
of the web page (as you have done).

So, when I execute the following as per your suggestion:

require(scrapeR)
mmm<-scrape(url="http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40")

num <- sub("^.*CIK=([0-9]+).*", "\\1", mmm)

I get
[1] "<pointer: 0x00000000001265c0>"

Is this just a hex representation of the same number, or is something else
going on here?
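
On the pointer output above: this does not look like a hex form of the CIK
(0x1265c0 is 1205696 in decimal, not 320193). A likely explanation, assuming
scrape() returns parsed document objects rather than plain text, is that
sub() coerced the object to character and produced the printed form of an
internal pointer. The sketch below runs the extraction on plain text instead.
The page_text vector is a made-up stand-in for the page source, not the real
SEC page, and how the text is obtained (for example with base R's readLines()
on the URL) is an assumption.

```r
# Hypothetical stand-in for the page source as lines of text.
# Assumption: the real text would come from something like readLines(<url>).
page_text <- c(
  "<html><body>",
  "<a href=\"/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0000320193&amp;owner=exclude\">filings</a>",
  "</body></html>"
)

# Collapse the lines into one string so a single search covers the whole page.
page_string <- paste(page_text, collapse = "\n")

# regexpr() locates the first "CIK=" followed by digits;
# regmatches() returns the matched text itself rather than its position.
hit <- regmatches(page_string, regexpr("CIK=[0-9]+", page_string))

# Drop the "CIK=" prefix, leaving only the digits as a character string.
cik <- sub("^CIK=", "", hit)
print(cik)
```

Keeping the result as a character string preserves the leading zeros, which
as.numeric() would drop.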

Comments from any and all would be much appreciated.

--John J. Sparks, Ph.D.

On Sat, May 14, 2011 7:57 pm, jim holtman wrote:
> Is this what you want:
>
>> mmm<-"http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40"
>> num <- sub("^.*CIK=([0-9]+).*", "\\1", mmm)
>> num
> [1] "0000320193"
>>
>
>
> On Sat, May 14, 2011 at 8:20 PM, Sparks, John James <jspark4 at uic.edu>
> wrote:
>> Dear R Helpers,
>>
>> I am trying to isolate a set of characters between two other characters
>> in a long string file.  I tried some of the examples on the R help pages
>> and elsewhere, but I am not able to get it.  Your help would be much
>> appreciated.
>>
>> require(scrapeR)
>> mmm<-scrape(url="http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40")
>> str(mmm)
>>
>> I want to get the number 0000320193 that is between the CIK= and the &.
>> I have tried
>>
>> g <- grep( "CIK=|&", mmm )
>> and
>> temp<-grep(mmm,\CIK=\&)
>>
>> and variations on these themes, but they either won't run or come back
>> as an empty object.  How can I grab this number?
>>
>> Best wishes,
>> --John J. Sparks, Ph.D.
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
>
>
