[R] html into R
Rui Barradas
ruipbarradas using sapo.pt
Fri Aug 26 15:26:10 CEST 2022
Hello,
You are right, I haven't assigned the return value.
Start the pipe with something like

RiverTweed <- page |>
  rest_of_pipe

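To make the assignment concrete, here is a minimal sketch that runs offline. The name promote_header is only a label I am giving to the anonymous function from my earlier message, and the small data frame is made up to mimic the layout of the table (three rows to discard, with the third holding the column names); it is not real SEPA output.

```r
# Hypothetical helper: the same steps as the anonymous function in the pipe.
promote_header <- function(x) {
  hdr <- unlist(x[3, ])   # row 3 holds the real column names
  y <- x[-(1:3), ]        # drop the three rows above the data
  names(y) <- hdr
  y
}

# Made-up stand-in for what html_table(header = TRUE) returns
raw <- data.frame(
  X1 = c("meta", "meta", "Timestamp",
         "2020-01-01T00:00:00", "2020-01-01T00:15:00"),
  X2 = c("a", "b", "Value", "1.5", "1.6"),
  stringsAsFactors = FALSE
)

# Assigning the result of the pipe is what gives the table a name
RiverTweed <- raw |> promote_header()
print(RiverTweed)

# With the real page the pipe would be (needs rvest and network access):
# RiverTweed <- page |>
#   html_element("table") |>
#   html_table(header = TRUE) |>
#   promote_header()
```

The "y" inside the function only exists while the function runs, which is why R reports it as not found afterwards.
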
If you have more files to download and process, post an example of 2 or
3 links and I'll see if it can be automated.
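
In the meantime, here is a rough sketch of how the links could be built with paste0 and then processed in a loop. The function names make_link and read_flow are my own invention, and I am assuming that only ts_path, from and to change between downloads; the only path shown is the one from this thread, so the others would have to be filled in.

```r
# Hypothetical helper: builds a query URL like the one in this thread.
# Only ts_path, from and to vary; the rest is copied from the link as-is.
make_link <- function(ts_path, from, to) {
  paste0(
    "https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters",
    "&type=queryServices&datasource=0&request=getTimeseriesValues",
    "&ts_path=", ts_path,
    "&from=", from, "&to=", to,
    "&returnfields=Timestamp,Value,Quality%20Code"
  )
}

# Hypothetical helper: download one table and fix its header (needs rvest).
read_flow <- function(link) {
  x <- rvest::read_html(link) |>
    rvest::html_element("table") |>
    rvest::html_table(header = TRUE)
  hdr <- unlist(x[3, ])   # row 3 holds the real column names
  y <- x[-(1:3), ]        # drop the three rows above the data
  names(y) <- hdr
  y
}

# The path below is the one from this thread; add further paths to the vector
ts_paths <- c("1/14972/Q/15m.Cmd")
links <- make_link(ts_paths, from = "2020-01-01", to = "2020-01-07")
names(links) <- ts_paths
print(links)

# Not run here because it needs network access:
# flows <- lapply(links, read_flow)
```

Since paste0 is vectorised, passing a vector of paths gives one link per path, and lapply then returns a named list with one tibble per link.
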
Also posting to R-help.
Hope this helps,
Rui Barradas
At 14:18 on 26/08/2022, Nick Wray wrote:
> Hi Rui, sorry to have to ask again, but although your code prints out a
> tibble, I can't work out how to refer to it, i.e. find its name. I assumed
> that it's "y", but outside of your code R tells me that y is not found.
> I've tried various things but nothing gives me the tibble as an object with
> a name that I can use. Thanks, Nick
>
> On Fri, 26 Aug 2022 at 13:37, Nick Wray <nickmwray using gmail.com> wrote:
>
>> Hi Rui, that is brilliant, thanks very much. What is even better is that I
>> have loads of data from different years, rivers and stations to download,
>> each of which entails a different set of numerical inputs. I was thinking
>> about how I could loop through the URL with different inputs, but
>> by using paste I can create all the links I need. Thanks again, Nick
>>
>> On Fri, 26 Aug 2022 at 11:57, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>>> Sorry, there's simpler code. I used html_elements (plural) and the
>>> result is a list. Use html_element (singular) and the output is a tibble.
>>>
>>>
>>> page |>
>>>   html_element("table") |>
>>>   html_table(header = TRUE) |>
>>>   (\(x) {
>>>     hdr <- unlist(x[3, ])
>>>     y <- x[-(1:3), ]
>>>     names(y) <- hdr
>>>     y
>>>   })()
>>>
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> At 11:53 on 26/08/2022, Rui Barradas wrote:
>>>> Hello,
>>>>
>>>> You can try the following. It worked for me.
>>>> Read from the link and post-process the HTML data, extracting the
>>>> element "table" and then the table itself.
>>>>
>>>> This table has 3 rows before the actual table, so the lapply below will
>>>> get the table and its header.
>>>>
>>>>
>>>> library(httr)
>>>> library(rvest)
>>>>
>>>> # the link must be a single string with no line breaks
>>>> link <- "https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters&type=queryServices&datasource=0&request=getTimeseriesValues&ts_path=1/14972/Q/15m.Cmd&from=2020-01-01&to=2020-01-07&returnfields=Timestamp,Value,Quality%20Code"
>>>>
>>>> page <- read_html(link)
>>>> page |>
>>>>   html_elements("table") |>
>>>>   html_table(header = TRUE) |>
>>>>   lapply(\(x) {
>>>>     hdr <- unlist(x[3, ])   # row 3 holds the column names
>>>>     y <- x[-(1:3), ]        # drop the rows above the data
>>>>     names(y) <- hdr
>>>>     y
>>>>   })
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>> At 09:43 on 26/08/2022, Nick Wray wrote:
>>>>> Hello, I need to download flow data for Scottish river catchments. The
>>>>> data is available from the Scottish Environment Protection Agency
>>>>> and that doesn't present a problem. For example, the API call below will
>>>>> access the 96 flow recordings on the River Tweed on Jan 1st 2020 at one
>>>>> station:
>>>>>
>>>>>
>>>>> https://timeseries.sepa.org.uk/KiWIS/KiWIS?service=kisters&type=queryServices&datasource=0&request=getTimeseriesValues&ts_path=1/14972/Q/15m.Cmd&from=2020-01-01&to=2020-01-07&returnfields=Timestamp,Value,Quality%20Code
>>>>>
>>>>>
>>>>>
>>>>> But this data comes as HTML. I can copy and paste it into a text doc
>>>>> which can then be read into R, but that's slow and time-consuming. I have
>>>>> tried using the package "rvest" to import the HTML into R but I have got
>>>>> nowhere.
>>>>>
>>>>> Can anyone give me any pointers as to how to do this?
>>>>>
>>>>>
>>>>> Thanks Nick Wray
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
>