[R] Problems with source() function

Al aldeluis at usal.es
Fri Oct 28 14:31:14 CEST 2005


Thank you for your answer :)

I tried your suggestion, but without success. The remote load is still
silently truncated when I use

	source(textConnection(readLines(url("http://..."))))

When I look at the contents, there is no fixed break point: it is
different each time I execute the command, so the dropped lines differ
every time. The only constant seems to be the time of the interruption
(1 min 55 secs on my system).

Loading the file in a browser (where it always loads completely) and
examining the text, I see no apparent malformation at the break points.

The longest line is 669 characters and it loads correctly in both remote
and local mode:

	> lineas <- readLines("transcripts_moe430a.R")
	> length(lineas)
	[1] 20347
	> max(nchar(lineas))
	[1] 669
	> which(nchar(lineas)==669)
	[1] 3241
	> lineas <- readLines(url("http://10.10.10.3:83/probefinder/scripts/probegrouper.php?chip=moe430a&mode=transcript"))
	> length(lineas)
	[1] 7471
	> max(nchar(lineas))
	[1] 669
	> which(nchar(lineas)==669)
	[1] 3242
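
Note that the 669-character line sits at index 3241 locally and 3242
remotely, so the two copies may already differ before that point. A minimal
sketch for locating the first mismatch between two readLines() results
follows. The helper name first_difference is hypothetical and not part of
base R.

```r
# Hypothetical helper: index of the first position where two character
# vectors differ, or NA if they are identical.
first_difference <- function(a, b) {
  n <- min(length(a), length(b))
  if (n > 0) {
    # Compare only the part both vectors have in common.
    differs <- which(a[seq_len(n)] != b[seq_len(n)])
    if (length(differs) > 0) return(differs[1])
  }
  # Common part is identical: any difference is a missing or extra tail.
  if (length(a) != length(b)) return(n + 1L)
  NA_integer_
}
```

With the two vectors from the session above stored under separate names
(say local_lines and remote_lines, both assumed names),
first_difference(local_lines, remote_lines) would give the line to inspect
first.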

Apparently there is a timeout in url() or in some function it calls.
I will try the RCurl package but, for teaching purposes, I would prefer
the load to be handled in a simple way (with a source() call, for
example) so as not to burden students with tricky methods.
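
One simple alternative, sketched here under assumptions, is to download the
generated script to a temporary file, check its size, and only then call
source() on the local copy. The helper name source_from_url and its
parameters are hypothetical. The sketch sets the timeout through options(),
because ?options documents that assigning to .Options only changes a local
copy. Whether this avoids the truncation seen here is not known; it mainly
makes a short download fail loudly instead of silently.

```r
# Hypothetical helper: fetch an R script to a temp file, sanity-check it,
# then source() the local copy into the global environment.
source_from_url <- function(script_url, timeout_secs = 300, min_lines = 1) {
  # options() is the supported way to change the timeout; restore on exit.
  old <- options(timeout = timeout_secs)
  on.exit(options(old), add = TRUE)

  tmp <- tempfile(fileext = ".R")
  on.exit(unlink(tmp), add = TRUE)

  # download.file() returns 0 on success.
  status <- download.file(script_url, destfile = tmp, quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)

  # Refuse to source a file that is shorter than expected.
  n <- length(readLines(tmp))
  if (n < min_lines) stop("download looks truncated: only ", n, " lines")

  source(tmp, local = FALSE)
  invisible(n)
}
```

Students would then need a single call, for example
source_from_url("http://...", min_lines = 20000), where the threshold is an
assumed value based on the 20347 lines of the static file.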

Thank you again.

.....................
Alberto de Luis
Bioinformatics and Functional Genomics Lab
Cancer Research Center
Salamanca (Spain)
.....................

On Thu, 2005-10-27 at 12:35 -0700, Duncan Temple Lang wrote:
> 
> 
> Does
>   source(textConnection(readLines(url("http://..."))))
> 
> give the correct answer? If not, what is being dropped
> when you just use readLines() and look at the contents
> of the download?
> 
> And how long is the longest line?
> 
> 
> The RCurl package (http://www.omegahat.org/RCurl) gives you a lot of
> control in performing and processing HTTP requests, allowing
> you to control the request, and read the body and the header of the
> response.  It may be worth a try if things are getting frustrating.
> 
>  D.
> 
> 
> Al wrote:
> > Hello list members!
> > 
> > I'm trying to load some data into an R session using the source()
> > function with a URL as argument. The data source is a PHP script on an
> > Apache web server, and the data is a long list generated on the fly.
> > These are the initial lines:
> > 
> > groups<-list()
> > groups[['ENSMUST00000000001']]=c(52611,483683,147952,132170,297514,469248,291525,364037,469915,55472,280220,314688,415650,486875,440898,6781,497785)
> > groups[['ENSMUST00000000003']]=c(416911,327120,425495,72272,297529,101933,371418,139034,318872,367204,237702)
> > groups[['ENSMUST00000000028']]=c(199311,325400,184761,241988,376845,75052,67724,404240,439543,391057,393816)
> > groups[['ENSMUST00000000031']]=c(402587,352900,139030,186068,463553,328881,74942,277085,301431,256149,410846)
> > groups[['ENSMUST00000000033']]=c(12700,23908,11140,122358,389908,390084,383903,354007,457965,106395,131876)
> > groups[['ENSMUST00000000049']]=c(59336,203239,101077,382882,327374,281549,212042,275594,361523,490934,240275)
> > groups[['ENSMUST00000000056']]=c(409571,304584,394332,379699,13785,4260,288889,42538,304075,47734,485512,52501,328509,504846,334607,82566,250088,150240,16422,446551,314484,91878,124752,341638,379512,379890,319764,8019,59221,156508,362524,74001,149400)
> > groups[['ENSMUST00000000058']]=c(26511,455190,466368,358528,268486,315461,149260,422804,137641,163718,352555)
> > 
> > The problem:
> > When I execute the command it apparently finishes OK, with no errors
> > printed, but when I check the consistency of the loaded data using
> > length() I get a different figure every time.
> > 
> > More facts:
> > When I source the data from a static file instead of a URL, the data is
> > fully loaded and the length is always the same (20346 list elements).
> > It takes 30 secs to load.
> > 
> > When I source the data dynamically, from a URL, it takes 2 min. and the
> > data is always truncated.
> > 
> > Tried and miserably failed:
> > - Changed .Options$timeout from 60 to 300
> > - Using R --verbose is of no help, the data is silently truncated. 
> > - Changed the expression in which data is entered:
> > groups<-list(
> > 'ENSMUST00000000001'=c(52611,483683,147952,132170,297514,469248,291525,364037,469915,55472,280220,314688,415650,486875,440898,6781,497785),
> > 'ENSMUST00000000003'=c(416911,327120,425495,72272,297529,101933,371418,139034,318872,367204,237702)
> > ...
> > )
> > 
> > Kind list members, is there some timeout I am missing? Some way to debug
> > the process? Some suggestion?
> > 
> > Sincerely, thank you!
> > 
> > Alberto de Luis
> > www.cicancer.org
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 
> --
> Duncan Temple Lang                duncan at wald.ucdavis.edu
> Department of Statistics          work:  (530) 752-4782
> 371 Kerr Hall                     fax:   (530) 752-7099
> One Shields Ave.
> University of California at Davis
> Davis, CA 95616, USA
