[R-SIG-Finance] Extract option IDs from option chain
Jeff Ryan
jeff.a.ryan at gmail.com
Tue Sep 7 05:45:29 CEST 2010
Yahoo hasn't done a stellar job with the OSI initiative as far as I
can tell. They seem to be getting better, but as Marc points out, it
is far from perfect.
You could parse the symbol from right to left to avoid the root symbol
width issue (under OSI the root *should* be padded to 6 characters).
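
A minimal sketch of the right-to-left idea, assuming the tail of every
symbol is always 6 date digits (yymmdd), one type letter (C or P) and
8 strike digits in thousandths, as in the symbols quoted below. The
helper name parse_osi_tail is hypothetical, not part of quantmod.

```r
# Hypothetical helper: parse an option symbol from the right, so the
# width of the root (ticker) part does not matter.
# Assumption: the last 15 characters are yymmdd + C/P + 8-digit strike (x1000).
parse_osi_tail <- function(sym) {
  n <- nchar(sym)
  data.frame(
    root   = substr(sym, 1, n - 15),
    expiry = as.Date(substr(sym, n - 14, n - 9), format = "%y%m%d"),
    type   = substr(sym, n - 8, n - 8),
    strike = as.numeric(substr(sym, n - 7, n)) / 1000,
    stringsAsFactors = FALSE
  )
}

# Symbols taken from the examples in this thread (18 and 19 characters wide).
syms <- c("AAPL100918P00155000", "BHI101016P00020000", "BHI1101016P00017000")
parsed <- parse_osi_tail(syms)
```

Because everything is counted from the end of the string, the odd
"BHI1" root parses the same way as "BHI"; whether you then keep or drop
those rows is a separate decision.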
One other comment though --- if you are just looking to get days to
expiry, you are specifically requesting that in the call to retrieve
the chains - so all the parsing isn't really needed for that part.
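
As a rough sketch of that shortcut, assuming the expiry you passed as
Exp is (or can be converted to) a Date. The values below are stand-ins
for illustration; the exact form getOptionChain expects for Exp is not
shown here.

```r
# Stand-in values (assumptions): the expiry requested via Exp and a valuation date.
optExpire <- as.Date("2010-09-18")
asOf      <- as.Date("2010-09-07")

# Subtracting two Dates gives a difftime; convert it to a plain number of days.
daysToExpiry <- as.numeric(optExpire - asOf, units = "days")
```

In live use you would presumably replace asOf with Sys.Date().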
Best,
Jeff
On Mon, Sep 6, 2010 at 10:29 PM, Marc Delvaux <mdelvaux at gmail.com> wrote:
> Just a final word of caution if you automate this procedure across a large
> number of stocks. From time to time, you will find option symbols that
> deviate from the standard format, i.e. where there are more than 6
> characters between the stock ticker and the option type symbol. Code
> similar to the one presented by Gabor was failing for me for some stocks
> because of that. Currently I use a brute-force approach: I remove all
> of these, as they would typically also pollute other calculations such as
> implied volatility. A current example of this type of problem is the
> October expiration for BHI; see below. In my approach, I remove all rows
> whose symbol has strictly more characters than the minimum for that
> expiration.
>
>> Options <- getOptionChain("BHI",Exp=NULL)
>> rownames(Options[[2]]$puts)
> [1] "BHI1101016P00017000" "BHI1101016P00018000" "BHI1101016P00019000" "BHI101016P00020000"
> [5] "BHI1101016P00020000" "BHI1101016P00021000" "BHI1101016P00022000" "BHI101016P00022500"
> [9] "BHI1101016P00023000" "BHI1101016P00024000" "BHI101016P00025000" "BHI1101016P00025000"
> [13] "BHI1101016P00026000" "BHI101016P00030000" "BHI101016P00034000" "BHI101016P00035000"
> [17] "BHI101016P00036000" "BHI101016P00037000" "BHI101016P00038000" "BHI101016P00039000"
> [21] "BHI101016P00040000" "BHI101016P00041000" "BHI101016P00042000" "BHI101016P00043000"
> [25] "BHI101016P00044000" "BHI101016P00045000" "BHI101016P00046000" "BHI101016P00047000"
> [29] "BHI101016P00048000" "BHI101016P00049000" "BHI101016P00050000" "BHI101016P00055000"
> [33] "BHI101016P00060000" "BHI101016P00065000"
>> nchar(rownames(Options[[2]]$puts))
> [1] 19 19 19 18 19 19 19 18 19 19 18 19 19 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
> [30] 18 18 18 18 18
>
>
> On Mon, Sep 6, 2010 at 6:45 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>
>> On Mon, Sep 6, 2010 at 8:37 PM, rex <rex at nosyntax.net> wrote:
>> > rex <rex at nosyntax.net> [2010-09-06 16:11]:
>> >>
>> >> Format is:
>> >>
>> >>> allOpts <- getOptionChain("AAPL", Exp=optExpire)
>> >>> allOpts
>> >>
>> >> $calls
>> >> Strike Last Chg Bid Ask Vol OI
>> >> AAPL100918C00150000 150 108.50 16.50 106.80 108.85 3 13
>> >> AAPL100918C00155000 155 96.77 0.00 102.40 103.85 4 10
>> >> AAPL100918C00160000 160 95.75 4.50 97.40 98.85 10 30
>> >> [...]
>> >>
>> >> $puts
>> >> Strike Last Chg Bid Ask Vol OI
>> >> AAPL100918P00150000 150 0.01 0.00 NA 0.01 6 876
>> >> AAPL100918P00155000 155 0.02 0.00 NA 0.01 30 666
>> >> AAPL100918P00160000 160 0.02 0.00 NA 0.01 79 1535
>> >> [...]
>> >>
>> >> $symbol
>> >> [1] "AAPL"
>> >>
>> >> The obvious thing fails to produce the desired result:
>> >>>
>> >>> index(allOpts$puts)
>> >>
>> >> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
>> >>
>> >> What I need are the indexes of both $calls and $puts as a set of
>> >> strings that can be split, etc. (I need the expiration date of the
>> >> options as a Date to be used to calculate the days to expiration.)
>> >
>> > A solution (fugly, for sure):
>> >
>> >> dimnames(allOpts$puts)
>> >
>> > [[1]]
>> > [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
>> > [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
>> > [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
>> > [10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
>> > [13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
>> > [16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
>> > [19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
>> > [22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
>> >
>> > [[2]]
>> > [1] "Strike" "Last" "Chg" "Bid" "Ask" "Vol" "OI"
>> >>
>> >> id <- dimnames(allOpts$puts)[1]
>> >> id
>> >
>> > [[1]]
>> > [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
>> > [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
>> > [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
>> > [10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
>> > [13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
>> > [16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
>> > [19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
>> > [22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
>> >>
>> >> typeof(id)
>> >
>> > [1] "list"
>> >>
>> >> id2 <- id[[1]]
>> >> id2[2]
>> >
>> > [1] "AAPL100918P00155000"
>> >>
>> >> substr(id2[2], 5, 10)
>> >
>> > [1] "100918"
>> >>
>> >> dat <- paste("20", substr(id2[2], 5, 10), sep="")
>> >
>> >> dat
>> > [1] "20100918"
>> >>
>> >> expDate <- as.Date(paste(substr(dat,1,4), "-", substr(dat,5,6), "-",
>> >> substr(dat,7,8), sep=""))
>> >> expDate
>> >
>> > [1] "2010-09-18"
>> >
>> > The above has got to be about the most convoluted and arcane
>> > method to get the expiration date one can imagine.
>>
>> Here are a few approaches and variations:
>>
>> > x <- rep( "AAPL100918P00155000", 3)
>> >
>> > # 1 - gsub
>> >
>> > as.Date(gsub("^\\D+|P.*", "", x), "%y%m%d")
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> > # 2 - rbind and strsplit
>> >
>> > as.Date(do.call(rbind, strsplit(x, "\\D+"))[,2], "%y%m%d")
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> > # 3 - sapply and strsplit
>> >
>> > as.Date(sapply(strsplit(x, "\\D+"), "[[", 2), "%y%m%d")
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> > # 4 - strapply
>> >
>> > library(gsubfn)
>> > strapply(x, "(\\d+)P", ~ as.Date(x, "%y%m%d"), simplify = c)
>> [1] "2010-09-18" "2010-09-18" "2010-09-18"
>>
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>>
>> _______________________________________________
>> R-SIG-Finance at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R questions
>> should go.
>>
--
Jeffrey Ryan
jeff.a.ryan at gmail.com