[R] help with json data from the web into data frame in R
David Winsemius
dwinsemius using comcast.net
Tue May 8 22:20:01 CEST 2018
> On May 8, 2018, at 10:08 AM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>
> Hi David, .. I think I've got it :-)
> Please let me know if you see anything glaringly wrong with this:
>
> library(RCurl)
> zWebObj <- postForm("https://www.semantic-mediawiki.org/w/api.php",
>     "action" = "ask",
>     "query" = "[[Category:City]]|?Capital%20of|?Has%20area",
>     "format" = "json",
>     .opts = list(ssl.verifypeer = FALSE)
> )
You might need to hack fromJSON to get your verification issues fixed:
> fromJSON
function (txt, simplifyVector = TRUE, simplifyDataFrame = simplifyVector,
    simplifyMatrix = simplifyVector, flatten = FALSE, ...)
{
    if (!is.character(txt) && !inherits(txt, "connection")) {
        stop("Argument 'txt' must be a JSON string, URL or file.")
    }
    if (is.character(txt) && length(txt) == 1 && nchar(txt, type = "bytes") <
        1000 && !validate(txt)) {
        if (grepl("^https?://", txt, useBytes = TRUE)) {
            loadpkg("curl")
            h <- curl::new_handle(useragent = paste("jsonlite /",
                R.version.string))
            curl::handle_setheaders(h, Accept = "application/json, text/*, */*")
            txt <- curl::curl(txt, handle = h)
        }
        else if (file.exists(txt)) {
            txt <- file(txt)
        }
    }
    fromJSON_string(txt = txt, simplifyVector = simplifyVector,
        simplifyDataFrame = simplifyDataFrame, simplifyMatrix = simplifyMatrix,
        flatten = flatten, ...)
}
<environment: namespace:jsonlite>
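Rather than editing the function body, one alternative is to download the text yourself with verification switched off and hand the resulting string to fromJSON. A minimal sketch, assuming the curl and jsonlite packages are installed; `parse_json_text` and `fetch_json_insecure` are names made up for this illustration. Turning off peer and host verification removes the protection against impostor servers, so this only makes sense for a host you already trust:

```r
library(curl)
library(jsonlite)

# Parse a JSON string that has already been downloaded.
parse_json_text <- function(txt) {
  fromJSON(txt)
}

# Hypothetical helper: download `url` without certificate checks,
# then parse the response body as JSON.
fetch_json_insecure <- function(url) {
  h <- new_handle(ssl_verifypeer = 0L, ssl_verifyhost = 0L)
  res <- curl_fetch_memory(url, handle = h)
  parse_json_text(rawToChar(res$content))
}

# Offline check of the parsing step (no network needed):
str(parse_json_text('{"query": {"results": {"Berlin": {"fulltext": "Berlin"}}}}'))
```

Calling `fetch_json_insecure()` with the URL of the internal server would then return the same kind of nested list that fromJSON gives for a URL.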
>
>
> Thank you!
> -Rich
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces using r-project.org] On Behalf Of Evans, Richard K. (GRC-H000)
> Sent: Tuesday, May 08, 2018 12:51 PM
> To: David Winsemius
> Cc: r-help using r-project.org
> Subject: Re: [R] help with json data from the web into data frame in R
>
> [non-tabular json data] -- ok.. so I think I need to figure out how to make it tabular. Thanks!
>
> [curl] -- I was hoping there was a cleaner way to do it. Using R to invoke cURL to get the data as text and then passing it into fromJSON seems to be what I need to do.
>
> Do you by chance have a simple example of using RCurl to get a response while ignoring cert errors?
>
> ty
> -Rich
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius using comcast.net]
> Sent: Tuesday, May 08, 2018 12:25 PM
> To: Evans, Richard K. (GRC-H000)
> Cc: r-help using r-project.org
> Subject: Re: [R] help with json data from the web into data frame in R
>
>
>> On May 8, 2018, at 9:03 AM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>>
>> That said, I have two issues to ask for help with:
>>
>> 1) how to ignore cert errors with a fromJSON call
>
> If you can do it with curl, then why aren't you doing one of a) a system call, b) installing and loading RCurl, c) installing and loading curl (the R package with that name)?
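Options (a) and (b) can be sketched roughly as follows. This assumes a curl binary on the PATH of a Unix-like system plus the RCurl and jsonlite packages; the two helper names are invented for the illustration. Both skip certificate checks, so point them only at a server you trust:

```r
# Hypothetical helper for option (a): shell out to the curl binary.
# "-k" skips certificate verification, "-s" silences the progress meter.
fetch_via_system_curl <- function(url) {
  out <- system2("curl", c("-k", "-s", shQuote(url)), stdout = TRUE)
  jsonlite::fromJSON(paste(out, collapse = "\n"))
}

# Hypothetical helper for option (b): RCurl with peer verification off.
fetch_via_rcurl <- function(url) {
  txt <- RCurl::getURL(url, ssl.verifypeer = FALSE)
  jsonlite::fromJSON(txt)
}
```

Either helper takes the full request URL as a single string and returns the parsed list.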
>
>>
>> And
>>
>> 2) why the json data from the example link doesn't convert to a data frame.
>
> That was already answered in my earlier response. It's not a tabular result, so it doesn't "fit" into a tabular structure.
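To get something tabular you have to choose which fields become columns and pull them out of each entry yourself. A small sketch in base R on a mock list shaped like the `results` component shown by str(); the page names and example.org URLs are stand-ins, not real output from the wiki:

```r
# Mock of the nested structure: one list per page, each with an
# empty `printouts` list plus two character fields.
results <- list(
  "Berlin" = list(printouts = list(), fulltext = "Berlin",
                  fullurl = "https://example.org/wiki/Berlin"),
  "Amsterdam" = list(printouts = list(), fulltext = "Amsterdam",
                     fullurl = "https://example.org/wiki/Amsterdam")
)

# Build a one-row data frame per entry from the scalar fields only,
# then stack the rows.
rows <- lapply(results, function(x) {
  data.frame(fulltext = x$fulltext, fullurl = x$fullurl,
             stringsAsFactors = FALSE)
})
city_df <- do.call(rbind, rows)
rownames(city_df) <- NULL
print(city_df)
```

With the real data, `results` would be `myJSON$query$results`, and any values inside `printouts` would need their own extraction step.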
>
> --
> David.
>
>
>> As seen in the following example
>>
>> library("rjson")
>> result <- fromJSON(file = "https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]|?Capital%20of|?Has%20area&format=json")
>> json_data_frame <- as.data.frame(result)
>> print(json_data_frame)
>>
>> which results in:
>>
>>> library("rjson")
>>
>> Warning message:
>> package ‘rjson’ was built under R version 3.4.4
>>
>>> result <- fromJSON(file = "https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]|?Capital%20of|?Has%20area&format=json")
>>> json_data_frame <- as.data.frame(result)
>>
>> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
>> arguments imply differing number of rows: 0, 1
>>
>>> print(json_data_frame)
>>
>> Error in print(json_data_frame) : object 'json_data_frame' not found
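The "differing number of rows" message is what as.data.frame gives when one component of a list has length zero and another has length one. A minimal sketch that should trigger the same kind of failure with a made-up entry; the field names mirror the str() output elsewhere in this thread, the values are invented:

```r
# One entry with an empty list next to a length-one character field.
entry <- list(printouts = list(), fulltext = "Berlin")

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    as.data.frame(entry)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The second error in the transcript is only a consequence of the first: the assignment failed, so `json_data_frame` was never created.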
>>
>>>
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces using r-project.org] On Behalf Of Evans, Richard K. (GRC-H000)
>> Sent: Tuesday, May 08, 2018 11:52 AM
>> To: David Winsemius
>> Cc: r-help using r-project.org
>> Subject: Re: [R] help with json data from the web into data frame in R
>>
>> Right. I'm trying to access a server within my organization which has a cert error that I cannot fix.
>>
>> The example link I provided was to a site on the web that does not have the cert error.
>>
>> From the linux shell I use the "-k" switch with cURL to ignore cert errors.. is there an equivalent in the R world?
>>
>> -Rich
>>
>>
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius using comcast.net]
>> Sent: Tuesday, May 08, 2018 11:48 AM
>> To: Evans, Richard K. (GRC-H000)
>> Cc: Eric Berger; r-help using r-project.org
>> Subject: Re: [R] help with json data from the web into data frame in R
>>
>>
>>> On May 8, 2018, at 8:36 AM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>>>
>>> I’ve been tinkering and discovered that the link I need to read json data from is ‘https’ and there is a certificate warning that I have to click through from a browser. That might be my issue. Is there any way in the json package to tell it to ignore self-signed cert errors in a url?
>>
>> I didn't have that issue when using the link you offered:
>>
>> library(jsonlite)
>> myJSON <- fromJSON( url("https://www.semantic-mediawiki.org/w/api.php?action=ask&query=%5B%5BCategory:City%5D%5D&format=json") )
>>
>> # results in a complex list (not trivially reducible to a data frame):
>>
>> str(myJSON)
>> List of 1
>> $ query:List of 5
>> ..$ printrequests:'data.frame': 1 obs. of 5 variables:
>> .. ..$ label : chr ""
>> .. ..$ key : chr ""
>> .. ..$ redi : chr ""
>> .. ..$ typeid: chr "_wpg"
>> .. ..$ mode : int 2
>> ..$ results :List of 39
>> .. ..$ File:2166320938 5cfc9ec72a z.jpg :List of 6
>> .. .. ..$ printouts : list()
>> .. .. ..$ fulltext : chr "File:2166320938 5cfc9ec72a z.jpg"
>> .. .. ..$ fullurl : chr "https://www.semantic-mediawiki.org/wiki/File:2166320938_5cfc9ec72a_z.jpg"
>> #-----trimmed-----------
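The URL in that call has the square brackets percent-encoded by hand. If you build the query in R, utils::URLencode can do the encoding for you. A small sketch, using the longer query string from earlier in this thread as the example input; the variable names are only for illustration:

```r
# Raw ask-query, as one would type it on the wiki.
query <- "[[Category:City]]|?Capital of|?Has area"

# reserved = TRUE also encodes characters such as "[", "|", "?" and ":".
encoded <- utils::URLencode(query, reserved = TRUE)

api_url <- paste0("https://www.semantic-mediawiki.org/w/api.php",
                  "?action=ask&query=", encoded, "&format=json")
print(api_url)
```

The resulting string can be passed to `url()` or directly to fromJSON.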
>>
>> David
>>
>>>
>>> -Rich
>>>
>>> From: Eric Berger [mailto:ericjberger using gmail.com]
>>> Sent: Tuesday, May 08, 2018 11:31 AM
>>> To: Evans, Richard K. (GRC-H000)
>>> Cc: r-help using r-project.org
>>> Subject: Re: [R] help with json data from the web into data frame in R
>>>
>>> Hi Rich,
>>> Take a look at the function fromJSON found in the rjson package.
>>> Note that the Usage in the help page: ?fromJSON names the second
>>> argument 'file' but if you look at the description the argument can be a URL.
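A short sketch of passing the source through the named `file` argument, assuming the rjson package is installed. A temporary local file with made-up content stands in for the URL here so the example runs without network access:

```r
library(rjson)

# Write a small JSON document to a temporary file (stand-in for a URL).
tmp <- tempfile(fileext = ".json")
writeLines('{"query": {"results": {"Berlin": {"fulltext": "Berlin"}}}}', tmp)

# The source must be passed by name, since the first argument is json_str.
parsed <- fromJSON(file = tmp)
str(parsed)

unlink(tmp)
```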
>>>
>>> HTH,
>>> Eric
>>>
>>>
>>> On Tue, May 8, 2018 at 6:16 PM, Evans, Richard K. (GRC-H000) <richard.k.evans using nasa.gov> wrote:
>>> Hello
>>>
>>> I am able to construct a url that points to some data online in the JSON format. See an example at [0].
>>>
>>> I would like to work with this data as a dataframe in R.
>>>
>>> I know that there is a package for handling json data [1], but it assumes the data is in a local file. It is not clear to me how to request the data from the web in an R script and get the json data converted into a data frame in R.
>>>
>>> Can anyone provide a basic example or some guidance please?
>>>
>>> -Rich (revansx)
>>>
>>> [0] https://www.semantic-mediawiki.org/w/api.php?action=ask&query=[[Category:City]]&format=json
>>> [1] https://www.tutorialspoint.com/r/r_json_files.htm
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To
>>> UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>> 'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law
>
David Winsemius
Alameda, CA, USA
'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law