[R] Is there any way to know when a field is blank

Chuck Cleland ccleland at optonline.net
Tue Nov 21 11:06:17 CET 2006


Leeds, Mark (IED) wrote:
> I have many text files in the format below. In rare instances, such as
> the one below, one of the fields is empty, so a double comma is written,
> but I won't know this because I am reading in many, many files
> sequentially.
> 
> # TEXT FILE
> 
> 2004-02-10 00:01:31.00000,,105.60000000
> 2004-02-10 00:01:32.00001,,105.60000000
> 2004-02-10 00:01:45.00000,,105.60000000
> 2004-02-10 00:01:49.00000,,105.61000000
> 2004-02-10 00:02:08.00000,,105.60000000
> 2004-02-10 00:02:15.00000,,105.60000000
> 2004-02-10 00:02:23.00000,,105.60000000
> 2004-02-10 00:02:41.00000,,105.60000000
> 2004-02-10 00:03:09.00000,,105.59000000
> 2004-02-10 00:03:16.00000,,105.60000000
> 2004-02-10 00:03:19.00000,,105.59000000
> 2004-02-10 00:03:25.00000,,105.60000000
> 2004-02-10 00:03:39.00000,,105.59000000
> 2004-02-10 00:03:52.00000,,105.60000000
> 2004-02-10 00:03:54.00000,,105.60000000
> 
> # LINES OF CODE
> 
> fxdata<-read.zoo(file=fxfile,FUN=as.POSIXct,sep=",",
>                  col.names=c("date","bid","ask"))
> fxdata<-fxdata[( fxdata[,"bid"] > 0.0 ) & ( fxdata[,"ask"]  > 0.0 ),]
> aggfxdata<-as.zoo(aggregatebyminutes(zooobj=fxdata,
>                                      aggtimeframe=aggtimeframe))
> 
> #======================================================================
> 
> Even with the double comma there, the fxdata<-read.zoo line and
> the fxdata<-fxdata line still work, but on the aggfxdata<-as.zoo line
> I get the error:
> 
> "Error in rep.int(seq(1:d[i]), prod(d[seq(length = i - 1)]) * rep.int(1,  :
>         invalid number of copies in rep()"
> 
> This error is reasonable because the routine, aggregatebyminutes,
> probably has a problem with nothing being in the bid field. My question
> is whether there is some way that I can know that nothing is in the bid
> field, so that I can skip this file altogether and go on to the next one.
> I'm not showing the details of the function because I'm not interested
> in the error. I am only interested in knowing
> that the "bid" field does not exist.
> 
> I ask only because I am unsure how often this double comma/missing field
> scenario can happen so it would
> be better to automate the skipping of the file.

  You could skip the as.zoo() step whenever dim(fxdata)[1] is zero, that
is, when no rows survive the bid/ask filter, with something like this:

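library(zoo)

# fxfile holds the path of the current file, as in the original post
fxdata <- read.zoo(file=fxfile, FUN=as.POSIXct, sep=",",
                   col.names=c("date","bid","ask"))

# keep only rows with a positive bid and ask
fxdata <- fxdata[(fxdata[,"bid"] > 0.0) &
                 (fxdata[,"ask"] > 0.0),]

# skip the aggregation when no rows are left
if(dim(fxdata)[1] == 0) {
  cat("\n All missing! \n")
} else {
  aggfxdata <- as.zoo(aggregatebyminutes(zooobj=fxdata,
                                         aggtimeframe=aggtimeframe))
}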
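
  As a rough illustration of why the filter can leave nothing behind: a
blank field is normally read as NA, and comparing NA with zero gives NA
rather than TRUE. The sketch below uses plain base R vectors only. The
variable names are made up for the example, and it assumes (as the
behaviour described in this thread suggests) that the NA rows end up being
dropped by the subsetting step.

```r
# A "bid" column in which every field was blank, as it would look after reading
bid <- c(NA_real_, NA_real_, NA_real_)

# The comparison used in the filter is NA for every row, never TRUE
keep <- bid > 0.0

# which() returns only the positions that are TRUE, so nothing is selected
kept_rows <- which(keep)
length(kept_rows)
```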

  I'm not sure how to tell whether a file has this problem without
reading it, but once you have read it and found that dim(fxdata)[1] ==
0, you can remove fxdata with rm().
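
  If you want to automate the skipping across many files, one possible
sketch is below. It still reads each file, but only with base R's
read.csv(), and checks whether every entry in the "bid" column is NA
before any zoo processing is attempted. The helper name bid_is_blank, the
temporary files and the bid values in the second file are all invented for
the example and are not from the original data.

```r
# Hypothetical helper: TRUE when every "bid" field in the file is blank (NA)
bid_is_blank <- function(path) {
  raw <- read.csv(path, header = FALSE,
                  col.names = c("date", "bid", "ask"),
                  stringsAsFactors = FALSE)
  all(is.na(raw$bid))
}

# Two small temporary files standing in for the real ones (made-up bid values)
blank_file <- tempfile(fileext = ".txt")
writeLines(c("2004-02-10 00:01:31.00000,,105.60000000",
             "2004-02-10 00:01:32.00001,,105.60000000"), blank_file)
full_file <- tempfile(fileext = ".txt")
writeLines(c("2004-02-10 00:01:31.00000,105.58,105.60000000",
             "2004-02-10 00:01:32.00001,105.59,105.60000000"), full_file)

fxfiles <- c(blank_file, full_file)
skipped <- character(0)

for (fxfile in fxfiles) {
  if (bid_is_blank(fxfile)) {
    skipped <- c(skipped, fxfile)   # remember the file and move on
    next
  }
  # the read.zoo() and aggregatebyminutes() steps would go here
}
```

  Afterwards, skipped holds the paths of the files that were passed over,
which also gives you a way to see how often the double comma case occurs.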

> Thanks.
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


