[R-SIG-Finance] Extract option IDs from option chain
Marc Delvaux
mdelvaux at gmail.com
Tue Sep 7 17:08:48 CEST 2010
For BHI the problem is not Yahoo: this is the arguably strange method that
the OCC specified for contract adjustments, and the OCC should/could have
selected something that makes a little more sense. The relevant text from
http://www.optionsclearing.com/components/docs/initiatives/symbology/contract_adjustments_and_the_osi.pdf
reads:

"Under OSI, the option symbol in most cases will be the same as the stock
symbol. For example, “MSFT” will be the option symbol for the underlying
security “MSFT”. If an option symbol change is necessary in a contract
adjustment (e.g., 3 for 2 split), option symbol MSFT will change to MSFT1,
with the numeric suffix identifying this option as an adjusted,
“non-standard” contract."

This mostly guarantees that you need extra information about the exact
adjustment to interpret the non-standard quotes, so IMO it is simpler to
drop all of these non-standard symbols when using automated scripts. The
OCC's choice was poor because the extra "1" can be difficult to spot,
especially for years 201x, where it sits right next to the "1" of the
two-digit year. It literally took me 10+ minutes to spot the difference
while trying to understand why my script was failing to correctly calculate
the expiration date :-(
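
A minimal sketch of how the symbols could be parsed from the right, as Jeff suggests in the quoted message, so that the width of the root no longer matters. It assumes the OSI tail is always 15 characters (6-digit yymmdd date, C or P, 8-digit strike in thousandths), and it treats a root ending in a digit as an adjusted contract, which is a heuristic based on the OCC text above rather than an official rule. The names parse_osi and adjusted are made up for illustration.

```r
# Parse OSI-style option symbols by anchoring on the fixed-width tail:
# yymmdd (6 digits) + C/P (1 char) + strike * 1000 (8 digits).
# Everything before the tail is the option root, e.g. "BHI" or "BHI1".
parse_osi <- function(ids) {
  pattern <- "^(.+)([0-9]{6})([CP])([0-9]{8})$"
  ok <- grepl(pattern, ids)
  # Pull out one capture group, returning NA where the symbol does not match
  pick <- function(group) ifelse(ok, sub(pattern, group, ids), NA_character_)
  root <- pick("\\1")
  data.frame(
    id       = ids,
    root     = root,
    expiry   = as.Date(pick("\\2"), format = "%y%m%d"),
    type     = pick("\\3"),
    strike   = as.numeric(pick("\\4")) / 1000,
    # Assumption: a numeric suffix on the root marks a non-standard contract
    adjusted = grepl("[0-9]$", root),
    stringsAsFactors = FALSE
  )
}

ids <- c("BHI1101016P00017000", "BHI101016P00020000", "AAPL100918C00150000")
parsed <- parse_osi(ids)

# Keep only the standard contracts for downstream calculations
standard <- parsed[!parsed$adjusted, ]
```

With the row names of the chain passed in as ids, the expiry column can be used directly for days to expiration, and the adjusted rows can be dropped or inspected separately instead of silently producing a wrong date.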
On Mon, Sep 6, 2010 at 8:45 PM, Jeff Ryan <jeff.a.ryan at gmail.com> wrote:
> Yahoo hasn't done a stellar job with the OSI initiative as far as I
> can tell. They seem to be getting better, but as Marc points out, it
> is far from perfect.
>
> You could possibly go from right to left to avoid the symbol width
> issue (which *should* be 6 wide).
>
> One other comment though --- if you are just looking to get days to
> expiry, you are specifically requesting that in the call to retrieve
> the chains - so all the parsing isn't really needed for that part.
>
> Best,
> Jeff
>
> On Mon, Sep 6, 2010 at 10:29 PM, Marc Delvaux <mdelvaux at gmail.com> wrote:
>> Just a final word of caution if you automate this procedure across a large
>> number of stocks. From time to time, you will find option symbols that
>> deviate from the standard format, i.e. where there are more than 6
>> characters between the stock ticker and the option type symbol. Code
>> similar to that presented by Gabor was failing for me for some stocks
>> because of that. Currently I use a brute-force approach: I remove all of
>> these symbols, as they would typically also pollute other calculations
>> such as the implied volatility. A current example of this type of problem
>> is the October expiration for BHI, see below. In my approach, I remove all
>> rows where the number of characters is strictly more than the minimum for
>> that expiration.
>>
>>> Options <- getOptionChain("BHI",Exp=NULL)
>>> rownames(Options[[2]]$puts)
>>  [1] "BHI1101016P00017000" "BHI1101016P00018000" "BHI1101016P00019000" "BHI101016P00020000"
>>  [5] "BHI1101016P00020000" "BHI1101016P00021000" "BHI1101016P00022000" "BHI101016P00022500"
>>  [9] "BHI1101016P00023000" "BHI1101016P00024000" "BHI101016P00025000" "BHI1101016P00025000"
>> [13] "BHI1101016P00026000" "BHI101016P00030000" "BHI101016P00034000" "BHI101016P00035000"
>> [17] "BHI101016P00036000" "BHI101016P00037000" "BHI101016P00038000" "BHI101016P00039000"
>> [21] "BHI101016P00040000" "BHI101016P00041000" "BHI101016P00042000" "BHI101016P00043000"
>> [25] "BHI101016P00044000" "BHI101016P00045000" "BHI101016P00046000" "BHI101016P00047000"
>> [29] "BHI101016P00048000" "BHI101016P00049000" "BHI101016P00050000" "BHI101016P00055000"
>> [33] "BHI101016P00060000" "BHI101016P00065000"
>>> nchar(rownames(Options[[2]]$puts))
>>  [1] 19 19 19 18 19 19 19 18 19 19 18 19 19 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
>> [30] 18 18 18 18 18
>>
>>
>>
>> On Mon, Sep 6, 2010 at 6:45 PM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
>>
>>> On Mon, Sep 6, 2010 at 8:37 PM, rex <rex at nosyntax.net> wrote:
>>> > rex <rex at nosyntax.net> [2010-09-06 16:11]:
>>> >>
>>> >> Format is:
>>> >>
>>> >>> allOpts <- getOptionChain("AAPL", Exp=optExpire)
>>> >>> allOpts
>>> >>
>>> >> $calls
>>> >> Strike Last Chg Bid Ask Vol OI
>>> >> AAPL100918C00150000 150 108.50 16.50 106.80 108.85 3 13
>>> >> AAPL100918C00155000 155 96.77 0.00 102.40 103.85 4 10
>>> >> AAPL100918C00160000 160 95.75 4.50 97.40 98.85 10 30
>>> >> [...]
>>> >>
>>> >> $puts
>>> >> Strike Last Chg Bid Ask Vol OI
>>> >> AAPL100918P00150000 150 0.01 0.00 NA 0.01 6 876
>>> >> AAPL100918P00155000 155 0.02 0.00 NA 0.01 30 666
>>> >> AAPL100918P00160000 160 0.02 0.00 NA 0.01 79 1535
>>> >> [...]
>>> >>
>>> >> $symbol
>>> >> [1] "AAPL"
>>> >>
>>> >> The obvious thing fails to produce the desired result:
>>> >>>
>>> >>> index(allOpts$puts)
>>> >>
>>> >> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
>>> >> 24
>>> >>
>>> >> What I need are the indexes of both $calls and $puts as a set of
>>> >> strings that can be split, etc. (I need the expiration date of the
>>> >> options as a Date to be used to calculate the days to expiration.)
>>> >
>>> > A solution (fugly, for sure):
>>> >
>>> >> dimnames(allOpts$puts)
>>> >
>>> > [[1]]
>>> > [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
>>> > [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
>>> > [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
>>> > [10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
>>> > [13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
>>> > [16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
>>> > [19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
>>> > [22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
>>> >
>>> > [[2]]
>>> > [1] "Strike" "Last" "Chg" "Bid" "Ask" "Vol" "OI"
>>> >>
>>> >> id <- dimnames(allOpts$puts)[1]
>>> >> id
>>> >
>>> > [[1]]
>>> > [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
>>> > [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
>>> > [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
>>> > [10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
>>> > [13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
>>> > [16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
>>> > [19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
>>> > [22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
>>> >>
>>> >> typeof(id)
>>> >
>>> > [1] "list"
>>> >>
>>> >> id2 <- id[[1]]
>>> >> id2[2]
>>> >
>>> > [1] "AAPL100918P00155000"
>>> >>
>>> >> substr(id2[2], 5, 10)
>>> >
>>> > [1] "100918"
>>> >>
>>> >> dat <- paste("20", substr(id2[2], 5, 10), sep="")
>>> >
>>> >> dat
>>> > [1] "20100918"
>>> >>
>>> >> expDate <- as.Date(paste(substr(dat,1,4), "-", substr(dat,5,6), "-",
>>> >> substr(dat,7,8), sep=""))
>>> >> expDate
>>> >
>>> > [1] "2010-09-18"
>>> >
>>> > The above has got to be about the most convoluted and arcane
>>> > method to get the expiration date one can imagine.
>>>
>>> Here are a few approaches and variations:
>>>
>>> > x <- rep( "AAPL100918P00155000", 3)
>>> >
>>> > # 1 - gsub
>>> >
>>> > as.Date(gsub("^\\D+|P.*", "", x), "%y%m%d")
>>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>>
>>> > # 2 - rbind and strsplit
>>> >
>>> > as.Date(do.call(rbind, strsplit(x, "\\D+"))[,2], "%y%m%d")
>>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>>
>>> > # 3 - sapply and strsplit
>>> >
>>> > as.Date(sapply(strsplit(x, "\\D+"), "[[", 2), "%y%m%d")
>>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>>
>>> > # 4 - strapply
>>> >
>>> > library(gsubfn)
>>> > strapply(x, "(\\d+)P", ~ as.Date(x, "%y%m%d"), simplify = c)
>>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>>
>>> --
>>> Statistics & Software Consulting
>>> GKX Group, GKX Associates Inc.
>>> tel: 1-877-GKX-GROUP
>>> email: ggrothendieck at gmail.com
>>>
>>> _______________________________________________
>>> R-SIG-Finance at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>>> -- Subscriber-posting only. If you want to post, subscribe first.
>>> -- Also note that this is not the r-help list where general R questions
>>> should go.
>>>
>>
>
>
>
> --
> Jeffrey Ryan
> jeff.a.ryan at gmail.com
>