[R-SIG-Finance] Web Scraping of SPY Stocks

AIE ATUMA gttga2000 using yahoo.com
Sat Sep 26 23:11:58 CEST 2020


Dear Matt,

This worked, but the Date column now appears as the last column instead of the second column.
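
For reference, one possible way to reposition the Date column is sketched below. It assumes dplyr 1.0.0 or later (for relocate()), and it uses a small made-up tibble named prices as a stand-in for the function's output; the values are placeholders, not real quotes.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the tibble returned by get_stock_prices();
# the numbers and dates are placeholders, not real market data
prices <- tibble(
  Open  = c(1, 2),
  Close = c(3, 4),
  Date  = as.Date(c("2020-01-02", "2020-01-03"))
)

# With no .before/.after argument, relocate() moves Date to the front
date_first <- prices %>% relocate(Date)

# .after places Date immediately after a named column instead
date_second <- prices %>% relocate(Date, .after = Open)

# Alternative that also works with older dplyr versions
date_first_alt <- prices %>% select(Date, everything())
```

Adding one of these calls after the mutate() step inside the function should have the same effect on the real data.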

Thank You and Best Regards, 
Emeka I. Atuma
Integrity - Walk Your Talk Don't Talk Your Work






On Saturday, 26 September 2020, 19:19:28 GMT+1, Matt Cleary <mmcleary6 using gmail.com> wrote: 





Hi Emeka,

When you convert the dataset to a tibble, you lose the date values stored in the row names. I'd recommend saving the dates to a separate variable inside the function before making the conversion, then adding them back with the mutate function:

library(quantmod)
library(dplyr)
library(tibble)
library(rvest)
library(lubridate)
library(stringr)  # provides str_to_lower(), used below

# Web-scrape SP500 stock list
sp_500 <- read_html("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
  html_node("table.wikitable") %>%
  html_table() %>%
  select(`Symbol`, Security, `GICS Sector`, `GICS Sub Industry`) %>%
  as_tibble()

# Format names
names(sp_500) <- sp_500 %>%
  names() %>%
  str_to_lower() %>%
  make.names()

# Show results
sp_500

get_stock_prices <- function(ticker, return_format = "tibble", ...){
  
  # Get stock prices
  stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
  
  dates <- as.Date(rownames(as.matrix(stock_prices_xts)))
  
  # Rename
  names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
  
  # Return in xts format if tibble is not specified
  if (return_format == "tibble") {
    stock_prices <- stock_prices_xts %>%
      as_tibble() %>%
      mutate(Date = dates)
  } else {
    stock_prices <- stock_prices_xts
  }
  stock_prices
}

"MA" %>%
  get_stock_prices(return_format = 'tibble') %>%
  head()
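
As a possible alternative to reading the dates from the row names, the sketch below takes them straight from the xts index and puts the Date column first. It assumes the xts and tibble packages are installed. To stay runnable offline, it uses a small hand-made object named toy_xts in place of real getSymbols() output; the values are made up.

```r
library(xts)     # also attaches zoo, which provides index() and coredata()
library(tibble)

# Hand-made stand-in for getSymbols() output; values are placeholders
toy_xts <- xts(
  matrix(c(1, 2, 3, 4), ncol = 2, dimnames = list(NULL, c("Open", "Close"))),
  order.by = as.Date(c("2020-01-02", "2020-01-03"))
)

# index() returns the dates; coredata() returns the plain numeric matrix
dates <- index(toy_xts)
values <- as_tibble(coredata(toy_xts))

# Insert the dates as the first column of the tibble
toy_tbl <- add_column(values, Date = dates, .before = 1)
toy_tbl
```

Because the dates never pass through character row names, no date parsing step such as mdy() is needed.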

Best, 

Matt

On Sat, Sep 26, 2020 at 1:34 PM AIE ATUMA via R-SIG-Finance <r-sig-finance using r-project.org> wrote:
> Update:
> 
> The second function and its error are below:
> 
> get_stock_prices <- function(ticker, return_format = "tibble", ...) {
>   # Get stock prices
>   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
>   # Rename
>   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
>   # Return in xts format if tibble is not specified
>   if (return_format == "tibble") {
>     stock_prices <- stock_prices_xts %>%
>       as_tibble() %>%
>       rownames_to_column(var = "Date") %>%
>       mutate(Date = mdy(Date))
>   } else {
>     stock_prices <- stock_prices_xts
>   }
>   stock_prices
> }
> 
> "MA" %>%
>   get_stock_prices(return_format = 'tibble')%>%head()
> 
> ERROR MESSAGE:
> 
> # A tibble: 6 x 7
>   Date        Open  High   Low Close   Volume Adjusted
>   <date>     <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
> 1 NA          9.96  9.97  9.56  9.64 26289000     8.98
> 2 NA          9.69 10.2   9.53 10.1  27024000     9.42
> 3 NA         10.1  10.2   9.9  10.1  29632000     9.41
> 4 NA          9.90 10.2   9.9  10.1  16006000     9.41
> 5 NA         10.1  10.6  10.1  10.6  36952000     9.85
> 6 NA         10.6  10.6  10.3  10.5  35099000     9.76
> Warning message:
> All formats failed to parse. No formats found.
> 
> 
> 
> Thank You and Best Regards, 
> Emeka I. Atuma
> Integrity - Walk Your Talk Don't Talk Your Work
> 
> 
> 
> 
> 
> 
> On Saturday, 26 September 2020, 18:09:51 GMT+1, AIE ATUMA via R-SIG-Finance <r-sig-finance using r-project.org> wrote: 
> 
> 
> 
> 
> 
> Dear All,
> 
> I need help, please. I ran the code below and got the error message shown after it. How can I correct it?
> 
> library(rvest)
> library(pbapply)
> library(TTR)
> library(dygraphs)
> library(lubridate)
> library(tidyquant)
> library(timetk)
> pacman::p_load(dygraphs,DT,quantmod)
> 
> 
> # Web-scrape SP500 stock list
> sp_500 <- read_html("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
>   html_node("table.wikitable") %>%
>   html_table() %>%
>   select(`Ticker symbol`, Security, `GICS Sector`, `GICS Sub Industry`) %>%
>   as_tibble()
> # Format names
> names(sp_500) <- sp_500 %>%
>   names() %>%
>   str_to_lower() %>%
>   make.names()
> # Show results
> sp_500
> 
> Error Message:
> 
> Error: Can't subset columns that don't exist.
> x Column `Ticker symbol` doesn't exist.
> Run `rlang::last_error()` to see where the error occurred.
> 
> 
> The second function and its error are below:
> 
> get_stock_prices <- function(ticker, return_format = "tibble", ...) {
>   # Get stock prices
>   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
>   # Rename
>   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
>   # Return in xts format if tibble is not specified
>   if (return_format == "tibble") {
>     stock_prices <- stock_prices_xts %>%
>       as_tibble() %>%
>       rownames_to_column(var = "Date") %>%
>       mutate(Date = mdy(Date))
>   } else {
>     stock_prices <- stock_prices_xts
>   }
>   stock_prices
> }
> 
> "MA" %>%
>   get_stock_prices(return_format = 'tibble')%>%head()
> 
> ERROR MESSAGE:
> 
> Warning message:
> All formats failed to parse. No formats found. 
> 
> 
> 
> Thank You and Best Regards, 
> Emeka I. Atuma
> Integrity - Walk Your Talk Don't Talk Your Work
> 
> 
> 
> 
> 
> 
> On Friday, 25 September 2020, 21:12:24 GMT+1, AIE ATUMA <gttga2000 using yahoo.com> wrote: 
> 
> 
> 
> 
> 
> Dear All,
> 
> I need help, please. I ran the code below and got the error message shown after it. How can I correct it?
> 
> library(rvest)
> # Web-scrape SP500 stock list
> sp_500 <- read_html("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
>   html_node("table.wikitable") %>%
>   html_table() %>%
>   select(`Ticker symbol`, Security, `GICS Sector`, `GICS Sub Industry`) %>%
>   as_tibble()
> # Format names
> names(sp_500) <- sp_500 %>%
>   names() %>%
>   str_to_lower() %>%
>   make.names()
> # Show results
> sp_500
> 
> Error Message:
> 
> Error: Can't subset columns that don't exist.
> x Column `Ticker symbol` doesn't exist.
> Run `rlang::last_error()` to see where the error occurred.
> 
> 
> The second function and its error are below:
> 
> get_stock_prices <- function(ticker, return_format = "tibble", ...) {
>   # Get stock prices
>   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
>   # Rename
>   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
>   # Return in xts format if tibble is not specified
>   if (return_format == "tibble") {
>     stock_prices <- stock_prices_xts %>%
>       as_tibble() %>%
>       rownames_to_column(var = "Date") %>%
>       mutate(Date = mdy(Date))
>   } else {
>     stock_prices <- stock_prices_xts
>   }
>   stock_prices
> }
> 
> "MA" %>%
>   get_stock_prices(return_format = 'tibble')%>%head()
> 
> ERROR MESSAGE:
> 
> Warning message:
> All formats failed to parse. No formats found. 
> 
> 
> 
> Thank You and Best Regards, 
> Emeka I. Atuma
> Integrity - Walk Your Talk Don't Talk Your Work
> 
> 
> _______________________________________________
> R-SIG-Finance using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only. If you want to post, subscribe first.
> -- Also note that this is not the r-help list where general R questions should go.
> 


