[R] help with json data from the web into data frame in R

David Winsemius dwinsemius at comcast.net
Tue May 8 22:20:01 CEST 2018


> On May 8, 2018, at 10:08 AM, Evans, Richard K. (GRC-H000) <richard.k.evans at nasa.gov> wrote:
> 
> Hi David,  .. I think I've got it :-) 
> Please let me know if you see anything glaringly wrong with this:
> 
> library(RCurl)
> zWebObj <- postForm("https://www.semantic-mediawiki.org/w/api.php",
>   "action" = "ask",
>   "query" = "[[Category:City]]|?Capital%20of|?Has%20area",
>   "format" = "json",
>   .opts = list(ssl.verifypeer = FALSE)
> )

You might need to hack fromJSON to get your verification issues fixed:

> fromJSON
function (txt, simplifyVector = TRUE, simplifyDataFrame = simplifyVector, 
    simplifyMatrix = simplifyVector, flatten = FALSE, ...) 
{
    if (!is.character(txt) && !inherits(txt, "connection")) {
        stop("Argument 'txt' must be a JSON string, URL or file.")
    }
    if (is.character(txt) && length(txt) == 1 && nchar(txt, type = "bytes") < 
        1000 && !validate(txt)) {
        if (grepl("^https?://", txt, useBytes = TRUE)) {
            loadpkg("curl")
            h <- curl::new_handle(useragent = paste("jsonlite /", 
                R.version.string))
            curl::handle_setheaders(h, Accept = "application/json, text/*, */*")
            txt <- curl::curl(txt, handle = h)
        }
        else if (file.exists(txt)) {
            txt <- file(txt)
        }
    }
    fromJSON_string(txt = txt, simplifyVector = simplifyVector, 
        simplifyDataFrame = simplifyDataFrame, simplifyMatrix = simplifyMatrix, 
        flatten = flatten, ...)
}
<environment: namespace:jsonlite>
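
As the listing shows, txt may also be a plain JSON string, so patching fromJSON may not be necessary. A minimal sketch of one alternative, assuming the curl and jsonlite packages are installed: fetch the body yourself with a handle that skips certificate checks (roughly what `curl -k` does), then pass the text to fromJSON. The helper names insecure_handle and fetch_json_insecure are made up for illustration, and turning verification off is only reasonable for a trusted internal host.

```r
library(curl)
library(jsonlite)

# Build a handle that skips certificate checks (similar to `curl -k`).
# Only sensible for a trusted internal host with a known-bad certificate.
insecure_handle <- function() {
  h <- new_handle()
  handle_setopt(h, ssl_verifypeer = 0L, ssl_verifyhost = 0L)
  h
}

# Hypothetical helper: download the response body as text, then parse it.
# Extra arguments (e.g. simplifyVector = FALSE) are passed on to fromJSON.
fetch_json_insecure <- function(url, ...) {
  res <- curl_fetch_memory(url, handle = insecure_handle())
  if (res$status_code >= 400) {
    stop("Request failed with HTTP status ", res$status_code)
  }
  fromJSON(rawToChar(res$content), ...)
}

# Usage (needs network access; URL taken from earlier in this thread):
# myJSON <- fetch_json_insecure("https://www.semantic-mediawiki.org/w/api.php?action=ask&query=%5B%5BCategory:City%5D%5D&format=json")
```

Keeping the download and the parsing as separate steps also makes it easy to inspect the raw text when the server returns something unexpected.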
> 
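
On the "not tabular" point: the str() output quoted further down shows each entry of query$results carrying a fulltext and a fullurl field, so one row per result can be built by hand. The following is only a sketch under that assumption. The JSON string is a made-up miniature of the response (the example.org URLs and city names are placeholders), and results_to_df is a hypothetical helper that would fail on entries lacking either field.

```r
library(jsonlite)

# Hypothetical miniature of the response; only the fields visible in the
# str() output (printouts, fulltext, fullurl) are assumed to exist.
sample_json <- '{
  "query": {
    "results": {
      "Berlin": {"printouts": [], "fulltext": "Berlin",
                 "fullurl": "https://example.org/wiki/Berlin"},
      "Paris":  {"printouts": [], "fulltext": "Paris",
                 "fullurl": "https://example.org/wiki/Paris"}
    }
  }
}'

# Pull one scalar per result out of the nested list and bind the
# resulting vectors into a data frame with one row per result.
results_to_df <- function(parsed) {
  res <- parsed$query$results
  data.frame(
    name     = names(res),
    fulltext = vapply(res, function(x) x$fulltext, character(1),
                      USE.NAMES = FALSE),
    fullurl  = vapply(res, function(x) x$fullurl, character(1),
                      USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# simplifyVector = FALSE keeps the parsed result as plain nested lists.
parsed  <- fromJSON(sample_json, simplifyVector = FALSE)
city_df <- results_to_df(parsed)
print(city_df)
```

The printouts element is left out here because its contents depend on the query; anything requested with ?Capital of or ?Has area would need its own extraction step.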

> 
> Thank you!
> -Rich
> 
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Evans, Richard K. (GRC-H000)
> Sent: Tuesday, May 08, 2018 12:51 PM
> To: David Winsemius
> Cc: r-help at r-project.org
> Subject: Re: [R] help with json data from the web into data frame in R
> 
> [non-tabular json data] -- ok.. so I think I need to figure out how to make it tabular. Thanks!
> 
> [curl] -- I was hoping there was a cleaner way to do it. Using R to invoke cURL to get the data as text and then passing it into fromJSON seems to be what I need to do.
> 
> Do you by chance have a simple example of using RCurl to get a response while ignoring cert errors?
> 
> ty
> -Rich
> 
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net] 
> Sent: Tuesday, May 08, 2018 12:25 PM
> To: Evans, Richard K. (GRC-H000)
> Cc: r-help at r-project.org
> Subject: Re: [R] help with json data from the web into data frame in R
> 
> 
>> On May 8, 2018, at 9:03 AM, Evans, Richard K. (GRC-H000) <richard.k.evans at nasa.gov> wrote:
>> 
>> That said, I have two issues to ask for help with:
>> 
>> 1) how to ignore cert errors with a fromJSON call
> 
> If you can do it with curl, then why aren't you doing one of a) a system call, b) installing and loading RCurl, c) installing and loading curl (the R package with that name)?
> 
>> 
>> And 
>> 
>> 2) why the json data from the example link doesn't convert to a data frame.
> 
> That was already answered in my earlier response. It's not a tabular result, so it doesn't "fit" into a tabular structure.
> 
> -- 
> David.
> 
> 
>> As seen in the following example
>> 
>> library("rjson")
>> result <- fromJSON(file = "https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]|?Capital%20of|?Has%20area&format=json")
>> json_data_frame <- as.data.frame(result)
>> print(json_data_frame)
>> 
>> which results in:
>> 
>>> library("rjson")
>> 
>> Warning message:
>> package ‘rjson’ was built under R version 3.4.4 
>> 
>>> result <- fromJSON(file = "https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]|?Capital%20of|?Has%20area&format=json")
>>> json_data_frame <- as.data.frame(result)
>> 
>> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
>> arguments imply differing number of rows: 0, 1
>> 
>>> print(json_data_frame)
>> 
>> Error in print(json_data_frame) : object 'json_data_frame' not found
>> 
>>> 
>> 
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Evans, Richard K. (GRC-H000)
>> Sent: Tuesday, May 08, 2018 11:52 AM
>> To: David Winsemius
>> Cc: r-help at r-project.org
>> Subject: Re: [R] help with json data from the web into data frame in R
>> 
>> Right. I'm trying to access a server within my organization which has a cert error that I cannot fix. 
>> 
>> The example link I provided was to a site on the web that does not have the cert error.
>> 
>> From the linux shell I use the "-k" switch with cURL to ignore cert errors.. is there an equivalent in the R world?
>> 
>> -Rich
>> 
>> 
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius at comcast.net] 
>> Sent: Tuesday, May 08, 2018 11:48 AM
>> To: Evans, Richard K. (GRC-H000)
>> Cc: Eric Berger; r-help at r-project.org
>> Subject: Re: [R] help with json data from the web into data frame in R
>> 
>> 
>>> On May 8, 2018, at 8:36 AM, Evans, Richard K. (GRC-H000) <richard.k.evans at nasa.gov> wrote:
>>> 
>>> I’ve been tinkering and discovered that the link I need to read json data from is ‘https’ and there is a certificate warning that I have to click through from a browser. That might be my issue. Is there any way in the json package to tell it to ignore self-signed cert errors in a url?
>> 
>> I didn't have that issue when using the link you offered:
>> 
>> library(jsonlite)
>> myJSON <- fromJSON( url("https://www.semantic-mediawiki.org/w/api.php?action=ask&query=%5B%5BCategory:City%5D%5D&format=json") )
>> 
>> # results in a complex list (not trivially reducible to a data frame):
>> 
>> str(myJSON)
>> List of 1
>> $ query:List of 5
>> ..$ printrequests:'data.frame':	1 obs. of  5 variables:
>> .. ..$ label : chr ""
>> .. ..$ key   : chr ""
>> .. ..$ redi  : chr ""
>> .. ..$ typeid: chr "_wpg"
>> .. ..$ mode  : int 2
>> ..$ results      :List of 39
>> .. ..$ File:2166320938 5cfc9ec72a z.jpg                      :List of 6
>> .. .. ..$ printouts   : list()
>> .. .. ..$ fulltext    : chr "File:2166320938 5cfc9ec72a z.jpg"
>> .. .. ..$ fullurl     : chr "https://www.semantic-mediawiki.org/wiki/File:2166320938_5cfc9ec72a_z.jpg"
>> #-----trimmed-----------
>> 
>> David
>> 
>>> 
>>> -Rich
>>> 
>>> From: Eric Berger [mailto:ericjberger at gmail.com]
>>> Sent: Tuesday, May 08, 2018 11:31 AM
>>> To: Evans, Richard K. (GRC-H000)
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] help with json data from the web into data frame in R
>>> 
>>> Hi Rich,
>>> Take a look at the function fromJSON found in the rjson package.
>>> Note that the Usage in the help page: ?fromJSON names the second 
>>> argument 'file' but if you look at the description the argument can be a URL.
>>> 
>>> HTH,
>>> Eric
>>> 
>>> 
>>> On Tue, May 8, 2018 at 6:16 PM, Evans, Richard K. (GRC-H000) <richard.k.evans at nasa.gov> wrote:
>>> Hello
>>> 
>>> I am able to construct a url that points to some data online in the JSON format.  See an example at [0].
>>> 
>>> I would like to work with this data as a dataframe in R.
>>> 
>>> I know that there is a package for handling json data [1], but it assumes the data is in a local file. It is not clear to me how to request the data from the web in an R script and get the json data converted into a data frame in R.
>>> 
>>> Can anyone provide a basic example or some guidance please?
>>> 
>>> -Rich (revansx)
>>> 
>>> [0] 
>>> https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]&format=json
>>> <https://www.semantic-mediawiki.org/w/api.php?action=ask&query=%5b%5bCategory:City%5d%5d&format=json>
>>> [1] https://www.tutorialspoint.com/r/r_json_files.htm
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To 
>>> UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
>> 'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law
> 
> David Winsemius
> Alameda, CA, USA
> 
> 'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law


