[R-SIG-Finance] Extract option IDs from option chain

Jeff Ryan jeff.a.ryan at gmail.com
Tue Sep 7 05:45:29 CEST 2010


Yahoo hasn't done a stellar job with the OSI initiative as far as I
can tell. They seem to be getting better, but as Marc points out, it
is far from perfect.

You could parse from right to left to avoid the symbol-width issue (the
symbol *should* be 6 wide).
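
A rough sketch of the right-to-left idea, assuming the trailing 15
characters are always a 6-digit yymmdd date, one C/P character, and an
8-digit strike in thousandths (which matches the IDs quoted below).
parse_from_right is a made-up helper name, not part of quantmod:

```r
# Hypothetical helper: read the fixed-width tail and treat whatever
# precedes it as the root, however wide the root happens to be.
parse_from_right <- function(id) {
  n <- nchar(id)
  data.frame(
    root   = substring(id, 1, n - 15),
    expiry = as.Date(substring(id, n - 14, n - 9), format = "%y%m%d"),
    type   = substring(id, n - 8, n - 8),
    strike = as.numeric(substring(id, n - 7, n)) / 1000,
    stringsAsFactors = FALSE
  )
}

ids <- c("AAPL100918P00155000", "BHI101016P00020000", "BHI1101016P00020000")
parsed <- parse_from_right(ids)
print(parsed)
```

With this approach the stray digit in the long BHI IDs ends up in the root
("BHI1") rather than corrupting the date.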

One other comment, though: if you are just looking to get days to
expiry, you already specify the expiration in the call that retrieves
the chains, so all the parsing isn't really needed for that part.
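
For instance, something along these lines would do, assuming the
expiration you asked for is at hand as a date or a "yyyy-mm-dd" string.
days_to_expiry, opt_expire and as_of are names made up for illustration:

```r
# opt_expire: the expiration date used when requesting the chain (assumed
# to be known already); as_of: the valuation date, today by default.
days_to_expiry <- function(opt_expire, as_of = Sys.Date()) {
  as.integer(as.Date(opt_expire) - as.Date(as_of))
}

dte <- days_to_expiry("2010-09-18", as_of = "2010-09-07")
print(dte)
```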

Best,
Jeff

On Mon, Sep 6, 2010 at 10:29 PM, Marc Delvaux <mdelvaux at gmail.com> wrote:
> Just a final word of caution if you automate this procedure across a large
> number of stocks.  From time to time, you will find option symbols that
> deviate from the standard format, i.e. where there are more than 6
> characters between the stock ticker and the option type symbol.  Code
> similar to that presented by Gabor was failing for me for some stocks
> because of this.  Currently I use a brute-force approach: I remove all
> of these, as they would typically also pollute other calculations such as
> the implied volatility.  A current example of this type of problem is the
> October expiration for BHI, see below.  In my approach, I remove all rows
> where the number of characters is strictly more than the minimum for that
> expiration.
>
>> Options <- getOptionChain("BHI",Exp=NULL)
>> rownames(Options[[2]]$puts)
>  [1] "BHI1101016P00017000" "BHI1101016P00018000" "BHI1101016P00019000" "BHI101016P00020000" 
>  [5] "BHI1101016P00020000" "BHI1101016P00021000" "BHI1101016P00022000" "BHI101016P00022500" 
>  [9] "BHI1101016P00023000" "BHI1101016P00024000" "BHI101016P00025000"  "BHI1101016P00025000"
> [13] "BHI1101016P00026000" "BHI101016P00030000"  "BHI101016P00034000"  "BHI101016P00035000" 
> [17] "BHI101016P00036000"  "BHI101016P00037000"  "BHI101016P00038000"  "BHI101016P00039000" 
> [21] "BHI101016P00040000"  "BHI101016P00041000"  "BHI101016P00042000"  "BHI101016P00043000" 
> [25] "BHI101016P00044000"  "BHI101016P00045000"  "BHI101016P00046000"  "BHI101016P00047000" 
> [29] "BHI101016P00048000"  "BHI101016P00049000"  "BHI101016P00050000"  "BHI101016P00055000" 
> [33] "BHI101016P00060000"  "BHI101016P00065000" 
>> nchar(rownames(Options[[2]]$puts))
>  [1] 19 19 19 18 19 19 19 18 19 19 18 19 19 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
> [30] 18 18 18 18 18
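
Marc's length filter described above might be sketched as follows. The IDs
are a subset copied from his BHI listing; keep_shortest is a hypothetical
helper name, and the function assumes it is given the IDs of a single
expiration:

```r
# Keep only the IDs whose length equals the shortest length seen,
# i.e. drop every ID that is strictly longer than the minimum.
keep_shortest <- function(ids) {
  ids[nchar(ids) == min(nchar(ids))]
}

puts <- c("BHI1101016P00017000", "BHI101016P00020000",
          "BHI1101016P00020000", "BHI101016P00022500")
kept <- keep_shortest(puts)
print(kept)
```

Note that this also discards strikes that are listed only under the longer
ID, such as 17 in the listing above.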
>
>
> On Mon, Sep 6, 2010 at 6:45 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>
>> On Mon, Sep 6, 2010 at 8:37 PM, rex <rex at nosyntax.net> wrote:
>> > rex <rex at nosyntax.net> [2010-09-06 16:11]:
>> >>
>> >> Format is:
>> >>
>> >>> allOpts <- getOptionChain("AAPL", Exp=optExpire)
>> >>> allOpts
>> >>
>> >> $calls
>> >>                     Strike   Last   Chg    Bid    Ask   Vol    OI
>> >> AAPL100918C00150000    150 108.50 16.50 106.80 108.85     3    13
>> >> AAPL100918C00155000    155  96.77  0.00 102.40 103.85     4    10
>> >> AAPL100918C00160000    160  95.75  4.50  97.40  98.85    10    30
>> >> [...]
>> >>
>> >> $puts
>> >>                     Strike  Last   Chg   Bid   Ask   Vol    OI
>> >> AAPL100918P00150000    150  0.01  0.00    NA  0.01     6   876
>> >> AAPL100918P00155000    155  0.02  0.00    NA  0.01    30   666
>> >> AAPL100918P00160000    160  0.02  0.00    NA  0.01    79  1535
>> >> [...]
>> >>
>> >> $symbol
>> >> [1] "AAPL"
>> >>
>> >> The obvious thing fails to produce the desired result:
>> >>>
>> >>> index(allOpts$puts)
>> >>
>> >> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
>> >>
>> >> What I need are the indexes of both $calls and $puts as a set of
>> >> strings that can be split, etc. (I need the expiration date of the
>> >> options as a Date to be used to calculate the days to expiration.)
>> >
>> > A solution (fugly, for sure):
>> >
>> >> dimnames(allOpts$puts)
>> >
>> > [[1]]
>> >  [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
>> >  [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
>> >  [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
>> > [10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
>> > [13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
>> > [16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
>> > [19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
>> > [22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
>> >
>> > [[2]]
>> > [1] "Strike" "Last"   "Chg"    "Bid"    "Ask"    "Vol"    "OI"
>> >>
>> >> id <- dimnames(allOpts$puts)[1]
>> >> id
>> >
>> > [[1]]
>> >  [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
>> >  [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
>> >  [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
>> > [10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
>> > [13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
>> > [16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
>> > [19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
>> > [22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
>> >>
>> >> typeof(id)
>> >
>> > [1] "list"
>> >>
>> >> id2 <- id[[1]]
>> >> id2[2]
>> >
>> > [1] "AAPL100918P00155000"
>> >>
>> >> substr(id2[2], 5, 10)
>> >
>> > [1] "100918"
>> >>
>> >> dat <- paste("20", substr(id2[2], 5, 10), sep="")
>> >
>> >> dat
>> > [1] "20100918"
>> >>
>> >> expDate <- as.Date(paste(substr(dat,1,4), "-", substr(dat,5,6), "-",
>> >> substr(dat,7,8), sep=""))
>> >> expDate
>> >
>> > [1] "2010-09-18"
>> >
>> > The above has got to be about the most convoluted and arcane
>> > method to get the expiration date one can imagine.
>>
>> Here are a few approaches and variations:
>>
>> > x <- rep( "AAPL100918P00155000", 3)
>> >
>> > # 1 - gsub
>> >
>> > as.Date(gsub("^\\D+|P.*", "", x), "%y%m%d")
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> > # 2 - rbind and strsplit
>> >
>> > as.Date(do.call(rbind, strsplit(x, "\\D+"))[,2], "%y%m%d")
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> > # 3 - sapply and strsplit
>> >
>> > as.Date(sapply(strsplit(x, "\\D+"), "[[", 2), "%y%m%d")
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> > # 4 - strapply
>> >
>> > library(gsubfn)
>> > strapply(x, "(\\d+)P", ~ as.Date(x, "%y%m%d"), simplify = c)
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
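
A possible variant of the approaches above, not from Gabor's message: match
the type character as [CP] and anchor on the fixed-width tail, so the same
call covers calls and puts and does not depend on the ticker width. It
assumes every ID ends in a 6-digit date, C or P, and an 8-digit strike:

```r
x <- c("AAPL100918C00155000", "AAPL100918P00155000", "BHI1101016P00020000")

# Capture the six digits immediately before the C/P type character;
# anything to the left of them is treated as the root and discarded.
exp_dates <- as.Date(sub("^.*([0-9]{6})[CP][0-9]{8}$", "\\1", x),
                     format = "%y%m%d")
print(exp_dates)
```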
>>
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>>
>> _______________________________________________
>> R-SIG-Finance at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R questions
>> should go.
>>
>



-- 
Jeffrey Ryan
jeff.a.ryan at gmail.com


