[R-SIG-Finance] Asking for assistance with the 3 little problems I'm facing in my R web scraping function

Koffi Sessie
Thu Feb 10 11:08:49 CET 2022


Hi everybody,
I'm trying to scrape data by giving a list of stocks as input.
If I run the code step by step I get the expected result, but once I
put the code inside a function no result comes back.
NB: The function should return a data frame for every stock I list as input.
So I'm facing three problems:

 1. How can I retry the gsheet2tbl connection when the network is very
    weak? The main function depends on the result of the gsheet2tbl
    call, and it currently fails with: `Error in open.connection(5L, "rb") :
    Timeout was reached: [docs.google.com] Connection timed out after 10000
    milliseconds In addition: Warning message: In for (i in seq_len(n))
    { : closing unused connection 5
    (https://docs.google.com/spreadsheets/export?id=1rdjGjlQg7cUzWAEJFikrxOnisk-yQQx-n652sJUL-qc&format=csv&gid=0)
    Called from: open.connection(5L, "rb")`.

 2. When I apply the separate
    function I get warnings. They in no way stop the code from
    running, so I would like to hide or ignore them: with 10 listed
    companies I get 10 identical warnings because of the `for` loop.

 3. How can I return, as the result of the function, the data frames
    of all the stocks listed on the BRVM?

Below is the code:
```
library(gsheet)
library(httr)
library(stringr)
library(dplyr)
library(tidyr)
library(rvest)
library(formattable)
library(data.table)
library(kableExtra)
## Some example stock symbols
#### Some of them are invalid on purpose, just to test my function
symbol<-c("BiCc","NTLc", "XOM", "SlbC", "PUR", 25, "boas", "ontbf", "xom",
"pUr")
returns<-as.data.frame(matrix(NA, ncol = 7, nrow = 0))
names(returns)<-c("Date", "Open", "High", "Low", "Close", "Volume",
"Ticker")

BRVM_get<-function(symbol){
  quotes = gsheet::gsheet2tbl("https://docs.google.com/spreadsheets/d/1rdjGjlQg7cUzWAEJFikrxOnisk-yQQx-n652sJUL-qc/edit#gid=0")
  colnames(quotes)<-c("Symbole","Nom","Volume","Cours veille (FCFA)",
                      "Cours clôture (FCFA)","Cours Clôture(FCFA)","Variation(%)")
  quotes.df<-as.data.frame(quotes)
  #### Create a definitive symbol vector
  def.symbol<-NULL
  ##Change to upper each symbol
  ##And keep only unique symbols
  ## Filter symbol in quote symbol list
  for (Symb in unique(toupper(symbol))) {
    if (Symb %in% unlist(quotes$Symbole)) {
      def.symbol<-append(def.symbol, Symb)
    }
  }
  for(Tick in def.symbol) {
    #url<-paste0("https://www.richbourse.com/common/mouvements/technique/", Tick)
    #page<-GET(url)
    page <-httr::RETRY("GET", paste0(
      "https://www.richbourse.com/common/mouvements/technique/", Tick,
      "/status/200"))
    Sys.sleep(5)
    page <- content(page,as="text",encoding = 'UTF-8')
    page=unlist(strsplit(page,split = '\n'))
    data1=unlist(strsplit(page[[641]],split = ':'))
    data1<-data1[2] #Show table 1 ##First 5 columns (Date, Open, High, Low, Close)
    data1<-gsub(" ", "", data1)
    data1<-strsplit(data1, split = "],")
    data1<- as.data.frame(data1)
    i= 1
    for (i in 1 : nrow(data1)){
      data1[i,1]<- gsub("\\[|\\]", "", data1[i,1])
    }
###Now transform one column to 5 columns
    ##And change numbers in integer
    colnames(data1)<-c("unique")
    data1<-separate(data1, col = unique, into = c("Date", "Open", "High",
"Low", "Close"), sep = ",")
    data1$Open<-as.numeric(data1$Open)
    data1$High<-as.numeric(data1$High)
    data1$Low<-as.numeric(data1$Low)
    data1$Close<-as.numeric(data1$Close)
    #Transform date from character to numeric
    data1$Date<-as.numeric(data1$Date)
    ##Turn date in format "%Y-%m-%d"
    data1$Date<-as.Date(as.POSIXct((data1$Date+0.1)/1000, origin =
"1970-01-01"))

    ##Volume data Case (2 columns : Date and Volume)
    data2=unlist(strsplit(page[[651]],split = ':'))
    data2<-data2[2]
    #Data Cleaning
    data2<-gsub(" ", "", data2) #Volume
    ###Split at every closing square bracket that is followed by a comma
    data2<-strsplit(data2, split = "],") #Volume
    data2<- as.data.frame(data2)#Volume
    ###Use a loop to remove the remaining square brackets in each row
    j= 1
    for (j in 1 : nrow(data2)){
      data2[j,1]<- gsub("\\[|\\]", "", data2[j,1])
    }
    colnames(data2)<-c("unique")
    ## I'm receiving a warning message after separating the data frame in two.
    ### I tried suppressWarnings() but the warning still appears.
    # Warning message:
    # Expected 3 pieces. Missing pieces filled with `NA` in 2766 rows [1, 2,
    # 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

    data2<-separate(data2, col = unique, into = c("Date", "Volume", sep = ","))[, 1:2]
    data2$Volume<-as.numeric(data2$Volume)
    data2$Date<-as.numeric(data2$Date)
    ##Turn date in format "%Y-%m-%d"
    data2$Date<-as.Date(as.POSIXct((data2$Date+0.1)/1000, origin =
"1970-01-01"))
    ##Join data by date
    final.data<-left_join(data1,data2, by ="Date")
    final.data$Ticker<- Tick ##Add ticker identifier
    assign(Tick, as_tibble(final.data) )
  }
}
```
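
A possible cause of the warning in problem 2, offered as a hedged reading of the code rather than a verified diagnosis: in the `separate()` call on `data2`, `sep = ","` sits inside `c()`, so `into` becomes a three-element vector and `separate()` expects three pieces per row. The sketch below uses a made-up two-row data frame (`toy`, hypothetical values) to show the same call with `sep` moved outside `c()`.

```r
library(tidyr)

# With `sep` inside c(), the vector passed to `into` has three elements:
# "Date", "Volume" and ",".
into_buggy <- c("Date", "Volume", sep = ",")
length(into_buggy)

# Toy stand-in for data2: one column of "timestamp,volume" strings
# (values are invented for illustration only).
toy <- data.frame(unique = c("1644451200000,120", "1644537600000,85"))

# `sep` placed outside c(): `into` now has exactly two names, so
# separate() expects two pieces per row.
fixed <- separate(toy, col = unique, into = c("Date", "Volume"), sep = ",")

# Same conversions as in the original code: milliseconds to Date, text to number.
fixed$Date   <- as.Date(as.POSIXct(as.numeric(fixed$Date) / 1000,
                                   origin = "1970-01-01", tz = "UTC"))
fixed$Volume <- as.numeric(fixed$Volume)
print(fixed)
```

With `into` reduced to two names, the column subsetting after the call should no longer be needed.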
Please, I need your help.
Thanks in advance
Koffi Frederic SESSIE
Student at University Cheikh Anta Diop of Dakar.
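
For problems 1 and 3, one possible shape is sketched below under stated assumptions. `with_retry()`, `fetch_one()` and `get_all()` are hypothetical names that do not appear in the code above, and `fetch_one()` is a stand-in that returns fake data in place of the real scraping step. The idea is to wrap a fragile call in `tryCatch()` inside a loop, and to store each ticker's data frame in a named list that the function returns, since objects created with `assign()` inside a function stay local to that call.

```r
# Hypothetical helper: call f() up to `max_tries` times, pausing between
# failed attempts, and return the first successful result.
with_retry <- function(f, max_tries = 5, wait_seconds = 2) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Hypothetical stand-in for the per-ticker scraping step (fake data).
fetch_one <- function(tick) {
  data.frame(Date = as.Date("2022-02-10"), Close = 100, Ticker = tick)
}

# Collect one data frame per ticker in a named list and return the list.
get_all <- function(symbols) {
  results <- list()
  for (tick in unique(toupper(symbols))) {
    results[[tick]] <- with_retry(function() fetch_one(tick), wait_seconds = 0)
  }
  results
}

stocks <- get_all(c("BiCc", "ntlc", "BICC"))
names(stocks)                          # one entry per distinct ticker
all_in_one <- do.call(rbind, stocks)   # optional: stack into one data frame
```

In the real function, the same wrapper could presumably be applied as `with_retry(function() gsheet::gsheet2tbl(url))`, where `url` is the sheet address; this has not been tested against the live service.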
