[R-SIG-Mac] Download data from Internet contained in a Zip file

Christofer Bogaso bogaso.christofer at gmail.com
Thu Dec 29 20:34:59 CET 2016


Hi David,

The problem on my system appears to have been caused by the older R
version. I borrowed a newer Mac from a friend and reran your suggestions
with R version 3.3.2 (2016-10-31).

Your instructions work perfectly there. Thanks,
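
For the archive, here is a rough sketch of the sequence discussed below:
download the zip, extract it, and read the extracted file with
read.csv(header = FALSE). It assumes an R build whose download.file()
handles https URLs; on builds where it does not, download() from the
downloader package, as used below in this thread, could take its place.
The helper name read_zipped_csv is hypothetical (it is not from any
package), and reading only the first extracted file is an assumption
based on this archive containing a single file.

```r
# Hypothetical helper (not from any package): extract a zip archive that is
# already on disk and read its first extracted file as a header-less CSV.
read_zipped_csv <- function(zip_path, exdir = tempdir()) {
  zip_path <- path.expand(zip_path)  # resolve a leading "~" explicitly
  if (!file.exists(zip_path)) {
    # Fail early with the working directory in the message, since a wrong
    # path is an easy mistake to make here.
    stop("zip file not found: ", zip_path,
         " (working directory is ", getwd(), ")")
  }
  extracted <- unzip(zip_path, exdir = exdir)
  if (length(extracted) == 0L) {
    stop("nothing could be extracted from ", zip_path)
  }
  # Assumption: the archive holds one comma-separated file without a header.
  read.csv(extracted[1], header = FALSE)
}

# Usage (needs network access, so it is left commented out):
# url <- "https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip"
# zip_path <- file.path(tempdir(), "temp.zip")
# download.file(url, zip_path, mode = "wb")  # "wb" writes the zip as binary
# dat_in <- read_zipped_csv(zip_path)
# str(dat_in)
```

Building the destination with file.path() and passing that same value to
the helper keeps the download and the unzip step pointed at one location,
whatever the working directory happens to be.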

On Tue, Dec 27, 2016 at 8:54 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
>> On Dec 26, 2016, at 10:37 AM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>>
>> Thanks, David, for the details. However, it is still not working for me:
>>
>>> library(downloader)
>>> download("https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip","temp.zip")
>> 100   996  100   996    0     0    917      0  0:00:01  0:00:01
>> --:--:--  2028 0      0      0 --:--:-- --:--:-- --:--:--     0
>>> dat <- unzip("~/temp.zip")
>> Warning message:
>> In unzip("~/temp.zip") : error 1 in extracting from zip file
>>> str(dat)
>> chr(0)
>>
>> Do I need to install something to get it working?
>
> I don't know. You haven't provided enough information (for me, anyway). I knew that my working directory was set to my user HOME, which was where I was asking `unzip` to look. I don't know where your zip file is because you didn't say where your working directory was. "Error 1" is what I get for a non-existent file.
>
> --
> David.
>
>>
>> However,
>>
>> as a workaround, could you please tell me whether the data below can be
>> downloaded directly into R? Since all I need is to get the data
>> automatically, I would be happy if the link below works:
>>
>> http://www.utimf.com/UTI-MF-Microsites/retirement/pdf/Scheme_1_NAV_since_inception.pdf
>>
>> Thanks again
>>
>> On Mon, Dec 26, 2016 at 11:10 PM, David Winsemius
>> <dwinsemius at comcast.net> wrote:
>>>
>>>> On Dec 26, 2016, at 3:19 AM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>>>>
>>>> Hi David et al,
>>>>
>>>> Thanks for the pointers. With your approach, I see the
>>>> "temp.zip" file in my working folder.
>>>>
>>>> However, I still could not extract the data within it. I tried the
>>>> unzip() function, but it fails:
>>>>
>>>>> unzip("temp.zip")
>>>> Warning message:
>>>> In unzip("temp.zip") : error 1 in extracting from zip file
>>>
>>> I didn't try to use R to unzip it. Just using my system facilities worked fine.
>>>
>>> I'm not able to reproduce:
>>>
>>>> ?unzip
>>>> dat <- unzip("~/temp.zip")
>>>> str(dat)
>>> chr "./NAV_File_23122016.out"
>>>> dat_in <- read.table(dat)
>>> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
>>>  line 1 did not have 17 elements
>>>> dat_in <- read.csv(dat, header=FALSE)
>>>> str(dat_in)
>>> 'data.frame':   75 obs. of  6 variables:
>>> $ V1: Factor w/ 1 level "12/23/2016": 1 1 1 1 1 1 1 1 1 1 ...
>>> $ V2: Factor w/ 7 levels "PFM001","PFM002",..: 1 1 1 1 1 1 1 1 1 1 ...
>>> $ V3: Factor w/ 7 levels "HDFC PENSION MANAGEMENT COMPANY LIMITED",..: 6 6 6 6 6 6 6 6 6 6 ...
>>> $ V4: Factor w/ 75 levels "SM001001","SM001002",..: 5 7 8 11 12 13 1 2 3 4 ...
>>> $ V5: Factor w/ 75 levels "HDFC PENSION MANAGEMENT COMPANY LIMITED SCHEME A - TIER I",..: 62 59 63 37 56 57 54 55 60 58 ...
>>> $ V6: num  21.7 21.1 20.8 11.7 10.1 ...
>>>
>>>
>>> --
>>> David.
>>>>
>>>> When I access the link
>>>> "https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip"
>>>> manually, download the zip file, and unzip it, I get a file
>>>> called "NAV_File_23122016.out", which I then open in Excel to get all
>>>> the data.
>>>>
>>>> I was just trying to perform the same task through R, so that
>>>> I can load the data automatically, directly from the web.
>>>>
>>>> Any ideas, please? I am using the version of R below (I know this is
>>>> quite an old version; however, I am not currently in a position to
>>>> upgrade my MacBook)
>>>>
>>>>> R.Version()
>>>> $platform
>>>> [1] "x86_64-apple-darwin10.8.0"
>>>>
>>>> $arch
>>>> [1] "x86_64"
>>>>
>>>> $os
>>>> [1] "darwin10.8.0"
>>>>
>>>> $system
>>>> [1] "x86_64, darwin10.8.0"
>>>>
>>>> $status
>>>> [1] ""
>>>>
>>>> $major
>>>> [1] "3"
>>>>
>>>> $minor
>>>> [1] "2.1"
>>>>
>>>> $year
>>>> [1] "2015"
>>>>
>>>> $month
>>>> [1] "06"
>>>>
>>>> $day
>>>> [1] "18"
>>>>
>>>> $`svn rev`
>>>> [1] "68531"
>>>>
>>>> $language
>>>> [1] "R"
>>>>
>>>> $version.string
>>>> [1] "R version 3.2.1 (2015-06-18)"
>>>>
>>>> $nickname
>>>> [1] "World-Famous Astronaut"
>>>>
>>>> On Mon, Dec 26, 2016 at 7:18 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>>>>>
>>>>>> On Dec 25, 2016, at 3:46 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
>>>>>>
>>>>>> Your R build does not support HTTPS.
>>>>>>
>>>>>> I suggest that you use the curl package if you can. HTTP support in
>>>>>> base R is very limited currently.
>>>>>
>>>>> I generally use the downloader package. It sets up the call to download.file so that it succeeds with https URLs.
>>>>>
>>>>>
>>>>> install.packages("downloader", dependencies=TRUE)
>>>>> trying URL 'http://cran.cnr.Berkeley.edu/bin/macosx/mavericks/contrib/3.3/downloader_0.4.tgz'
>>>>> Content type 'application/x-gzip' length 19459 bytes (19 KB)
>>>>> ==================================================
>>>>> downloaded 19 KB
>>>>>
>>>>>
>>>>> The downloaded binary packages are in
>>>>>       /var/folders/68/vh2f8kzn09j8954r6q9100yh0000gn/T//Rtmpq8DVG4/downloaded_packages
>>>>>> library(downloader)
>>>>>> help(pac=downloader)
>>>>> starting httpd help server ... done
>>>>>> download("https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip","temp.zip")
>>>>>
>>>>> # Requires both a source and destination file name.
>>>>>
>>>>> trying URL 'https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip'
>>>>> Content type 'application/octet-stream' length 1228 bytes
>>>>> ==================================================
>>>>> downloaded 1228 bytes
>>>>>
>>>>> --
>>>>> David.
>>>>>>
>>>>>> Gabor
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Dec 25, 2016 at 10:37 PM, Christofer Bogaso
>>>>>> <bogaso.christofer at gmail.com> wrote:
>>>>>>> Hi again,
>>>>>>>
>>>>>>> I posted this on the general R list, but this group was suggested
>>>>>>> since I am using Mac OS 10.7.5.
>>>>>>>
>>>>>>> I was following the instructions available at
>>>>>>> "http://stackoverflow.com/questions/3053833/using-r-to-download-zipped-data-file-extract-and-import-data"
>>>>>>> to download data contained in a zip file from the
>>>>>>> address:
>>>>>>>
>>>>>>> https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip
>>>>>>>
>>>>>>> However, when I tried to follow the instructions, I got the error below:
>>>>>>>
>>>>>>>> temp <- tempfile()
>>>>>>>> download.file("https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip",temp)
>>>>>>> Error in download.file("https://npscra.nsdl.co.in/download.php?path=download/&filename=NAV_File_23122016.zip",
>>>>>>> :
>>>>>>> unsupported URL scheme
>>>>>>>
>>>>>>> Can someone here please tell me what went wrong above?
>>>>>>>
>>>>>>> I would highly appreciate your feedback.
>>>>>>>
>>>>>>> Thanks for your time.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-SIG-Mac mailing list
>>>>>>> R-SIG-Mac at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>>>>>>
>>>>>
>>>
>


