[R-SIG-Finance] Web Scraping of SPY Stocks
Matt Cleary
mmcleary6 using gmail.com
Sat Sep 26 23:16:07 CEST 2020
You can use a select() function after the mutate function to rearrange the
columns if need be.
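
As a minimal sketch of that reordering step: the small tibble below uses made-up prices and dates as a stand-in for the result of get_stock_prices(), so the values and the names stock_prices and reordered are assumptions for illustration only.

```r
library(dplyr)
library(tibble)

# Made-up values standing in for the tibble returned by get_stock_prices()
stock_prices <- tibble(
  Open  = c(9.96, 9.69),
  Close = c(9.64, 10.1)
) %>%
  mutate(Date = as.Date(c("2020-09-24", "2020-09-25")))  # Date lands last

# Put Date first and keep the remaining columns in their existing order
reordered <- stock_prices %>%
  select(Date, everything())

print(names(reordered))
```

With dplyr 1.0.0 or later, relocate(Date) should give the same column order without listing the other columns.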
On Sat, Sep 26, 2020, 5:11 PM AIE ATUMA <gttga2000 using yahoo.com> wrote:
> Dear Matt,
>
> This worked, but the Date column now appears as the last column rather than
> the second column.
>
> Thank You and Best Regards,
> Emeka I. Atuma
> Integrity - Walk Your Talk Don't Talk Your Work
>
> On Saturday, 26 September 2020, 19:19:28 GMT+1, Matt Cleary <
> mmcleary6 using gmail.com> wrote:
>
> Hi Emeka,
>
> When you convert the dataset to a tibble, you lose the date values stored
> in the rownames. I'd recommend saving the dates in a separate variable
> inside the function before making the conversion, then adding them back
> with the mutate function:
>
> library(quantmod)
> library(dplyr)
> library(tibble)
> library(rvest)
> library(lubridate)
> library(stringr)  # str_to_lower() is used below
>
> # Web-scrape SP500 stock list
> sp_500 <- read_html("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
>   html_node("table.wikitable") %>%
>   html_table() %>%
>   select(`Symbol`, Security, `GICS Sector`, `GICS Sub Industry`) %>%
>   as_tibble()
>
> # Format names
> names(sp_500) <- sp_500 %>%
>   names() %>%
>   str_to_lower() %>%
>   make.names()
>
> # Show results
> sp_500
>
> get_stock_prices <- function(ticker, return_format = "tibble", ...) {
>
>   # Get stock prices
>   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
>
>   # Save the dates now; the as_tibble() call below does not keep them
>   dates <- as.Date(rownames(as.matrix(stock_prices_xts)))
>
>   # Rename
>   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume",
>                                "Adjusted")
>
>   # Return in xts format if tibble is not specified
>   if (return_format == "tibble") {
>     stock_prices <- stock_prices_xts %>%
>       as_tibble() %>%
>       mutate(Date = dates)
>   } else {
>     stock_prices <- stock_prices_xts
>   }
>   stock_prices
> }
>
> "MA" %>%
> get_stock_prices(return_format = 'tibble') %>%
> head()
>
> Best,
>
> Matt
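
A sketch of why the earlier version returned NA dates, offered as a likely explanation rather than a definitive diagnosis: as_tibble() drops row names by default, so rownames_to_column() only sees the row numbers, and mdy() cannot parse those. The toy data frame below uses made-up dates and prices as a stand-in for the converted price data; the names df, lost, lost_dates and kept are assumptions for illustration, and nothing here comes from getSymbols().

```r
library(tibble)
library(lubridate)

# Made-up values; the row names play the role of the dates
df <- data.frame(
  Close = c(9.64, 10.1),
  row.names = c("2020-09-24", "2020-09-25")
)

# as_tibble() discards the row names, so only the row numbers come back
lost <- rownames_to_column(as_tibble(df), var = "Date")
lost_dates <- suppressWarnings(mdy(lost$Date))  # row numbers are not dates

# Keeping the row names during the conversion preserves the dates; they are
# year-month-day strings, so ymd() is the matching parser rather than mdy()
kept <- as_tibble(df, rownames = "Date")
kept$Date <- ymd(kept$Date)

print(lost$Date)
print(lost_dates)
print(kept)
```

Saving the dates in a variable before the conversion, as in the function above, avoids the problem in the same way.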
>
> On Sat, Sep 26, 2020 at 1:34 PM AIE ATUMA via R-SIG-Finance <
> r-sig-finance using r-project.org> wrote:
> > Update:
> >
> > The second function and the error are below:
> >
> > get_stock_prices <- function(ticker, return_format = "tibble", ...) {
> >   # Get stock prices
> >   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
> >   # Rename
> >   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume",
> >                                "Adjusted")
> >   # Return in xts format if tibble is not specified
> >   if (return_format == "tibble") {
> >     stock_prices <- stock_prices_xts %>%
> >       as_tibble() %>%
> >       rownames_to_column(var = "Date") %>%
> >       mutate(Date = mdy(Date))
> >   } else {
> >     stock_prices <- stock_prices_xts
> >   }
> >   stock_prices
> > }
> >
> > "MA" %>%
> > get_stock_prices(return_format = 'tibble')%>%head()
> >
> > ERROR MESSAGE:
> >
> > # A tibble: 6 x 7
> >   Date        Open  High   Low Close   Volume Adjusted
> >   <date>     <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
> > 1 NA          9.96  9.97  9.56  9.64 26289000     8.98
> > 2 NA          9.69 10.2   9.53 10.1  27024000     9.42
> > 3 NA         10.1  10.2   9.9  10.1  29632000     9.41
> > 4 NA          9.90 10.2   9.9  10.1  16006000     9.41
> > 5 NA         10.1  10.6  10.1  10.6  36952000     9.85
> > 6 NA         10.6  10.6  10.3  10.5  35099000     9.76
> > Warning message:
> > All formats failed to parse. No formats found.
> >
> >
> >
> > Thank You and Best Regards,
> > Emeka I. Atuma
> > Integrity - Walk Your Talk Don't Talk Your Work
> >
> > On Saturday, 26 September 2020, 18:09:51 GMT+1, AIE ATUMA via
> > R-SIG-Finance <r-sig-finance using r-project.org> wrote:
> >
> > Dear All,
> >
> > Please, I need help. I ran the function below and got the error message
> > shown. How can I correct it?
> >
> > library(rvest)
> > library(pbapply)
> > library(TTR)
> > library(dygraphs)
> > library(lubridate)
> > library(tidyquant)
> > library(timetk)
> > pacman::p_load(dygraphs,DT,quantmod)
> >
> >
> > # Web-scrape SP500 stock list
> > sp_500 <- read_html("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
> >   html_node("table.wikitable") %>%
> >   html_table() %>%
> >   select(`Ticker symbol`, Security, `GICS Sector`, `GICS Sub Industry`) %>%
> >   as_tibble()
> > # Format names
> > names(sp_500) <- sp_500 %>%
> > names() %>%
> > str_to_lower() %>%
> > make.names()
> > # Show results
> > sp_500
> >
> > Error Message:
> >
> > Error: Can't subset columns that don't exist.
> > x Column `Ticker symbol` doesn't exist.
> > Run `rlang::last_error()` to see where the error occurred.
> >
> >
> > The second function and the error are below:
> >
> > get_stock_prices <- function(ticker, return_format = "tibble", ...) {
> >   # Get stock prices
> >   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
> >   # Rename
> >   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume",
> >                                "Adjusted")
> >   # Return in xts format if tibble is not specified
> >   if (return_format == "tibble") {
> >     stock_prices <- stock_prices_xts %>%
> >       as_tibble() %>%
> >       rownames_to_column(var = "Date") %>%
> >       mutate(Date = mdy(Date))
> >   } else {
> >     stock_prices <- stock_prices_xts
> >   }
> >   stock_prices
> > }
> >
> > "MA" %>%
> > get_stock_prices(return_format = 'tibble')%>%head()
> >
> > ERROR MESSAGE:
> >
> > Warning message:
> > All formats failed to parse. No formats found.
> >
> >
> >
> > Thank You and Best Regards,
> > Emeka I. Atuma
> > Integrity - Walk Your Talk Don't Talk Your Work
> >
> > On Friday, 25 September 2020, 21:12:24 GMT+1, AIE ATUMA
> > <gttga2000 using yahoo.com> wrote:
> >
> > Dear All,
> >
> > Please, I need help. I ran the function below and got the error message
> > shown. How can I correct it?
> >
> > library(rvest)
> > # Web-scrape SP500 stock list
> > sp_500 <- read_html("https://en.wikipedia.org/wiki/List_of_S%26P_500_companies") %>%
> >   html_node("table.wikitable") %>%
> >   html_table() %>%
> >   select(`Ticker symbol`, Security, `GICS Sector`, `GICS Sub Industry`) %>%
> >   as_tibble()
> > # Format names
> > names(sp_500) <- sp_500 %>%
> > names() %>%
> > str_to_lower() %>%
> > make.names()
> > # Show results
> > sp_500
> >
> > Error Message:
> >
> > Error: Can't subset columns that don't exist.
> > x Column `Ticker symbol` doesn't exist.
> > Run `rlang::last_error()` to see where the error occurred.
> >
> >
> > The second function and the error are below:
> >
> > get_stock_prices <- function(ticker, return_format = "tibble", ...) {
> >   # Get stock prices
> >   stock_prices_xts <- getSymbols(Symbols = ticker, auto.assign = FALSE, ...)
> >   # Rename
> >   names(stock_prices_xts) <- c("Open", "High", "Low", "Close", "Volume",
> >                                "Adjusted")
> >   # Return in xts format if tibble is not specified
> >   if (return_format == "tibble") {
> >     stock_prices <- stock_prices_xts %>%
> >       as_tibble() %>%
> >       rownames_to_column(var = "Date") %>%
> >       mutate(Date = mdy(Date))
> >   } else {
> >     stock_prices <- stock_prices_xts
> >   }
> >   stock_prices
> > }
> >
> > "MA" %>%
> > get_stock_prices(return_format = 'tibble')%>%head()
> >
> > ERROR MESSAGE:
> >
> > Warning message:
> > All formats failed to parse. No formats found.
> >
> >
> >
> > Thank You and Best Regards,
> > Emeka I. Atuma
> > Integrity - Walk Your Talk Don't Talk Your Work
> >
> >
> > _______________________________________________
> > R-SIG-Finance using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> > -- Subscriber-posting only. If you want to post, subscribe first.
> > -- Also note that this is not the r-help list where general R questions
> > should go.
> >
>