[R] How to loop over two files ...

Ana Marija @okov|c@@n@m@r|j@ @end|ng |rom gm@||@com
Fri Jun 19 23:07:04 CEST 2020


Hi Rasmus,

I tried it:

library(base)

files <- c("1g.txt", "1n.txt")
files <- lapply(files, readLines)
server <- "http://rest.ensembl.org"
population.name <- "1000GENOMES:phase_3:KHV"
ext <- apply(expand.grid(files), 1, function(x) {
  return(paste0(server, "/ld/human/pairwise/",
    x[1], "/", x[2],
    "?population_name=", population.name))
})

r <- readRDS(paste0(population.name, ".rds"))
lapply(r[1:4], function(x) {
  jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
})

and I got this error:
> r <- readRDS(paste0(population.name, ".rds"))
Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '1000GENOMES:phase_3:KHV.rds', probable
reason 'No such file or directory'
>         lapply(r[1:4], function(x) {
+           jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
+         })
Error in lapply(r[1:4], function(x) { : object 'r' not found

Am I doing something wrong here?
Do I need any other libraries loaded?
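
One possible explanation is that the .rds file was never created, because the lines that fetch the responses and call saveRDS() are commented out in the code quoted below. A minimal sketch of a guard that reads the cache if it exists and otherwise fetches and saves it; load_or_fetch() and cache_name are hypothetical names, not part of httr or base R, and replacing the colons in the file name is an extra precaution because colons are not allowed in file names on Windows:

```r
# Hypothetical helper (not part of any package): return cached responses if
# the cache file exists, otherwise fetch every URL and write the cache.
# `fetch` is any function taking one URL; in this thread it would wrap
# httr::GET().
load_or_fetch <- function(urls, cache_file, fetch) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  r <- lapply(urls, fetch)
  names(r) <- urls
  saveRDS(object = r, file = cache_file, compress = "xz")
  r
}

population.name <- "1000GENOMES:phase_3:KHV"
# Assumption: swap ":" for "_" so the file name is also valid on Windows.
cache_name <- paste0(gsub(":", "_", population.name, fixed = TRUE), ".rds")

# Offline demonstration with a stand-in fetch function (no network needed).
demo_urls <- c("u1", "u2")
demo_file <- tempfile(fileext = ".rds")
first  <- load_or_fetch(demo_urls, demo_file, fetch = toupper)
second <- load_or_fetch(demo_urls, demo_file,
                        fetch = function(u) stop("should not be called"))

# With the real URLs in `ext`, the call might look like this:
# r <- load_or_fetch(ext, cache_name, function(u) {
#   httr::GET(u, httr::content_type("application/json"))
# })
```

The second call never touches the fetch function, since the cache file written by the first call already exists.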

Thanks
Ana

On Fri, Jun 19, 2020 at 3:49 PM Rasmus Liland <jral using posteo.no> wrote:
>
> On 2020-06-19 14:34 -0500, Ana Marija wrote:
> >
> > server <- "http://rest.ensembl.org"
> > ext <- "/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
> >
> > r <- GET(paste(server, ext, sep = ""), content_type("application/json"))
> >
> > stop_for_status(r)
> > head(fromJSON(toJSON(content(r))))
> >    d_prime       r2 variation1 variation2         population_name
> > 1 0.975513 0.951626  rs6792369  rs1042779 1000GENOMES:phase_3:KHV
> >
> > What I would like to do is to run this command for every SNP
> > in one list (1g.txt) against each SNP in another list (1n.txt), where SNP#
> > is rs#, and output every line of the result to list.txt
>
> Dear Ana,
>
> I tried, but for some reason I get only a
> response for the first URL you supplied.
>
> I wrote this:
>
>         files <- c("1g.txt", "1n.txt")
>         files <- lapply(files, readLines)
>         server <- "http://rest.ensembl.org"
>         population.name <- "1000GENOMES:phase_3:KHV"
>         ext <- apply(expand.grid(files), 1, function(x) {
>           return(paste0(server, "/ld/human/pairwise/",
>             x[1], "/", x[2],
>             "?population_name=", population.name))
>         })
>
>         # r <- lapply(ext, function(x) {
>         #   httr::GET(x, httr::content_type("application/json"))
>         # })
>         # names(r) <- ext
>         # file <- paste0(population.name, ".rds")
>         # saveRDS(object=r, compress="xz", file=file)
>
>         r <- readRDS(paste0(population.name, ".rds"))
>         lapply(r[1:4], function(x) {
>           jsonlite::fromJSON(jsonlite::toJSON(httr::content(x)))
>         })
>
>
> Which if you are able to run it (saving the
> output in that rds file), yields this:
>
>         $`http://rest.ensembl.org/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>           variation2         population_name  d_prime       r2 variation1
>         1  rs1042779 1000GENOMES:phase_3:KHV 0.975513 0.951626  rs6792369
>
>         $`http://rest.ensembl.org/ld/human/pairwise/rs1414517/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>         list()
>
>         $`http://rest.ensembl.org/ld/human/pairwise/rs16857712/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>         list()
>
>         $`http://rest.ensembl.org/ld/human/pairwise/rs16857703/rs1042779?population_name=1000GENOMES:phase_3:KHV`
>         list()
>
> For some reason, only the first url works ...
>
> I am a bit unfamiliar with REST APIs, or
> with web scraping in general.  Daniel
> Cegiełka seemed to know something about this
> in a thread [1] some days ago; it might be
> similar to the API of borsaitaliana.it, where
> you can supply headers with curl like he
> quickly did [2].
>
> You might be able to supply the list of SNPs
> in a header to Ensembl in httr::GET somehow
> if you read some docs on their API?
>
> Best,
> Rasmus
>
> [1] https://marc.info/?t=159249246100002&r=1&w=2
> [2] https://marc.info/?l=r-sig-finance&m=159249894208684&w=2
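
The URL construction in the quoted code could also be written without apply(), and the HTTP status of each response could be inspected to see why some requests come back as an empty list(). A sketch under stated assumptions: the SNP IDs below are the ones visible in this thread and only stand in for the contents of 1g.txt and 1n.txt (the real files are not shown), and fetch_ld() is a hypothetical helper, not an httr or Ensembl function.

```r
# Assumed stand-ins for readLines("1g.txt") and readLines("1n.txt");
# the IDs are taken from the output shown in the thread.
g <- c("rs6792369", "rs1414517", "rs16857712", "rs16857703")
n <- c("rs1042779")

server <- "http://rest.ensembl.org"
population.name <- "1000GENOMES:phase_3:KHV"

# Every combination of one SNP from g with one SNP from n.
pairs <- expand.grid(g = g, n = n, stringsAsFactors = FALSE)
ext <- paste0(server, "/ld/human/pairwise/", pairs$g, "/", pairs$n,
              "?population_name=", population.name)

# Hypothetical helper: keep the status code and raw body of one request,
# so an empty result can be told apart from an HTTP error.
fetch_ld <- function(url) {
  resp <- httr::GET(url, httr::content_type("application/json"))
  list(status = httr::status_code(resp),
       body   = httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Uncomment to query the server (needs httr and network access):
# results <- lapply(ext, fetch_ld)
# names(results) <- ext
# sapply(results, function(x) x$status)
```

Looking at the status and raw body per URL may show whether the server rejected a request or simply returned no rows for that pair.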


