---
title: "Introduction to sched package"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to sched package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

*sched* helps you send SOAP or regular requests to web servers while respecting
the maximum request frequency that web sites state for the use of their web
services.

*sched* uses the [fscache](https://CRAN.R-project.org/package=fscache) package
to store the returned contents of requests, reusing them automatically when the
same request is run again.

Requests are sent through an instance of the `Scheduler` class.

## Initializing the scheduler

To get an instance of a scheduler, we use the `Scheduler` class as follows:
```{r}
scheduler <- sched::Scheduler$new(cache_dir = NULL,
                                  user_agent = "sched ; pierrick.roger@cea.fr")
```
Be sure to set a user agent, since this is what identifies your application
to the web site. Some web sites may reject requests that have an empty user
agent.

For this vignette we disable the cache folder by setting `cache_dir` to `NULL`.
By default it is set to the `sched` folder inside the default user cache folder
on the system. It is however strongly recommended to set it to a folder named
after your application. Example:
`sched::Scheduler$new(cache_dir=tools::R_user_dir("my.app", which = "cache"))`.
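To see where such an application-specific folder would be located on your
system, the path can be computed on its own with base R. This is only a small
sketch: `my.app` is a placeholder application name, and the resulting path
depends on the platform and on the `R_USER_CACHE_DIR` environment variable.

```{r}
# "my.app" is a placeholder; replace it with the name of your application.
cache_dir <- tools::R_user_dir("my.app", which = "cache")

# The path is only computed here; no folder is created by this call.
cache_dir

# The last path component is the application name.
basename(cache_dir)
```

The resulting `cache_dir` value is what would be passed to the `cache_dir`
parameter when creating the scheduler.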

## Sending a request to a web service

To send a request to a web service and retrieve the content of the response, we
use the `sendRequest()` method.

Inside `sendRequest()`, the scheduler automatically limits the access
frequency to the domain name. This means that a call to `sendRequest()` may
block for some time, doing nothing while it waits. This is perfectly normal.

Before sending a request we must build a `Request` object that we will pass to
`sendRequest()`.
Using classes like `Request` and `URL` may be cumbersome for basic requests,
but is very handy for more complex ones, like POST requests.

Let us build a `URL` object and a simple `Request` object that takes only a
URL:
```{r}
my_url <- sched::URL$new(
  url = "https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity",
  params = c(chebiId = 15440)
)

my_request <- sched::Request$new(my_url)
```
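As an illustration of what the `params` argument conventionally corresponds to,
here is a base R sketch that appends named parameters to a URL as a query
string. The helper `build_query_url()` is hypothetical and is not part of
*sched*; the `URL` class may serialize parameters differently.

```{r}
# Illustration only: a hypothetical helper, not part of sched.
build_query_url <- function(url, params) {
  # Percent-encode each element, one at a time.
  encode <- function(x) {
    vapply(as.character(x), utils::URLencode, character(1),
           reserved = TRUE, USE.NAMES = FALSE)
  }
  # Build "name=value" pairs and join them with "&".
  pairs <- paste(encode(names(params)), encode(params), sep = "=")
  paste0(url, "?", paste(pairs, collapse = "&"))
}

query_url <- build_query_url(
  "https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity",
  c(chebiId = 15440)
)
query_url
```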

To send the request, pass the `Request` object to the `sendRequest()` method:
```{r}
content <- scheduler$sendRequest(my_request)
```

Here is the XML content returned by the ChEBI web service:
```{r}
content
```

For building a POST request, see the documentation of the `Request` class.

## Using a custom rule

If no scheduling rule exists for a host name, *sched* uses a default rule of
three requests per second (this default frequency may be changed when creating
the `Scheduler` instance).

To define a custom rule for a host name, use the `setRule()` method:
```{r}
scheduler$setRule("www.ebi.ac.uk", n = 7, lap = 2)
```
This call defines a new rule for the domain *www.ebi.ac.uk* that limits the
number of requests to 7 every 2 seconds.
Note that the time lap is a sliding window: *sched* records the time at which
each request is sent.
So if 7 requests have already been sent during the last 2 seconds, the 8th
request is blocked, but only until the first one becomes 2 seconds old.
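The sliding-window behaviour can be illustrated with a small simulation in base
R. This is a sketch under stated assumptions, not the implementation used by
*sched*: the function `simulate_sliding_window()` is hypothetical, requests are
assumed to be sent one after the other, and a request is assumed to be allowed
as soon as the oldest request in the window is exactly `lap` seconds old.

```{r}
# Illustration only: a hypothetical simulation, not part of sched.
# arrivals: times (in seconds, increasing) at which requests are submitted.
# n, lap:   at most n requests may be sent during any window of lap seconds.
simulate_sliding_window <- function(arrivals, n, lap) {
  sent <- numeric(0)
  for (t in arrivals) {
    k <- length(sent)
    # Requests are sequential: never send before the previous request.
    if (k > 0) t <- max(t, sent[k])
    # If n requests were already sent, wait until the n-th most recent
    # one is lap seconds old.
    if (k >= n) t <- max(t, sent[k - n + 1] + lap)
    sent <- c(sent, t)
  }
  sent
}

# Eight requests submitted every 0.1 second, with the rule n = 7, lap = 2.
sent <- simulate_sliding_window(seq(0, 0.7, by = 0.1), n = 7, lap = 2)
sent
```

In this simulation the first seven requests are sent immediately, while the
eighth is delayed until the first request is 2 seconds old.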

To delete all defined rules, even the ones created automatically by *sched*,
run:
```{r}
scheduler$deleteRules()
```

## Downloading a file from a URL

With *sched* it is also possible to download files directly from URLs and write
them to disk.

For this demonstration, we will use a destination folder:
```{r}
my_temp_dir <- file.path(tempdir(), "my_temp_folder_for_sched_vignette")
```

To download a file from a URL and write it directly to disk, use the
`downloadFile()` method:
```{r}
my_url <- sched::URL$new(
  "https://gitlab.com/cnrgh/databases/r-sched/-/raw/main/README.md"
)
dst <- file.path(my_temp_dir, "readme.md")
scheduler$downloadFile(my_url, dest_file = dst)
```
As with the `sendRequest()` method, the scheduler will use rules to limit
access frequency to the domain name.

Finally, we remove the temporary folder:
```{r}
unlink(my_temp_dir, recursive = TRUE)
```