[Rd] problem using "by" with custom function?
thalarctos
kmiddel at gmail.com
Mon Dec 10 19:57:23 CET 2007
Hi,
I'm relatively new to R and R development, so please forgive me for any
obvious errors.
What I am trying to do is use the function dpik from the KernSmooth package
to estimate bandwidth parameters for GPS telemetry data. I have been able
to get this to work on a case-by-case basis without any problem, but would
like to extend it so that I can batch process many different animals for
predetermined time periods (months of the year). I have written a function
that first standardises the X and Y values by dividing each by its standard
deviation, then uses dpik to calculate the bandwidth for each variable. The
result is the average of the X and Y estimates. I then use "by" to run the
function on a data frame which has X and Y in columns 5 and 6, with a
grouping variable "animonth" that identifies each animal and month. When I
run the by command on a small data table (only a few different levels of
animonth), it works perfectly.
The problem arises when I try to run it on all the data (or more than a few
levels) at once: I get the error posted below. However, if I run a simple
built-in function such as summary within by, there is no error. Can anyone
help me interpret this error? Any suggestions for alternative commands
would be appreciated as well, as I'm not committed to using by; it was just
the one that seemed to work.
Data table example:
   uniqid animal month animonth       x        y
1   11748   W079    12  W079_12 1494206 12134126
2   11749   W079    12  W079_12 1494123 12134051
3   11750   W079    12  W079_12 1493639 12133705
4   11751   W079    12  W079_12 1493353 12135892
5   11752   W079    12  W079_12 1495157 12137797
6   11753   W079    12  W079_12 1498039 12132112
7   11754   W079    12  W079_12 1497991 12131842
8   11755   W079    12  W079_12 1497918 12131631
9   11756   W079    12  W079_12 1498019 12131638
10  11757   W079    12  W079_12 1498017 12131633
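For anyone who wants to reproduce the example, the ten rows above can be read back into a data frame. This is only a minimal sketch: the object name w079.sample is an assumption for illustration and is not the w079.all object used in the post. Because the header line has one field fewer than the data lines, read.table treats the first column as row names, which leaves x and y in columns 5 and 6 as described.

```r
# Rebuild the ten sample rows as a data frame.
# 'w079.sample' is a hypothetical name, not the poster's 'w079.all'.
txt <- "uniqid animal month animonth x y
1 11748 W079 12 W079_12 1494206 12134126
2 11749 W079 12 W079_12 1494123 12134051
3 11750 W079 12 W079_12 1493639 12133705
4 11751 W079 12 W079_12 1493353 12135892
5 11752 W079 12 W079_12 1495157 12137797
6 11753 W079 12 W079_12 1498039 12132112
7 11754 W079 12 W079_12 1497991 12131842
8 11755 W079 12 W079_12 1497918 12131631
9 11756 W079 12 W079_12 1498019 12131638
10 11757 W079 12 W079_12 1498017 12131633"

# The header has six names and each row has seven fields, so the
# leading 1..10 column becomes the row names.
w079.sample <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

str(w079.sample)
```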
Function for calculating bandwidth:
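> kern.est
function(data) {
  # standardise each coordinate by its standard deviation
  x.var <- data$x / sd(data$x)
  y.var <- data$y / sd(data$y)
  # dpik() comes from KernSmooth; gridsize is based on the raw coordinate range
  dpik.x <- dpik(x.var, gridsize = round((max(data$x) - min(data$x))/100))
  dpik.y <- dpik(y.var, gridsize = round((max(data$y) - min(data$y))/100))
  bw.avg <- (dpik.x + dpik.y)/2   # average of the x and y bandwidths
  bw.avg
}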
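Note that the gridsize passed to dpik in kern.est is computed from the raw (unstandardised) coordinate range, so it changes from one animonth level to the next. The sketch below only shows how those values could be inspected; grid.sizes is a hypothetical helper, the "tight" data frame is made up for illustration, and whether a small gridsize is related to the error reported in this post is not established here.

```r
# Hypothetical helper: the gridsize values kern.est would pass to dpik
grid.sizes <- function(data) {
  c(x = round((max(data$x) - min(data$x)) / 100),
    y = round((max(data$y) - min(data$y)) / 100))
}

# x and y for W079_12, copied from the sample table in the post
w079.12 <- data.frame(
  x = c(1494206, 1494123, 1493639, 1493353, 1495157,
        1498039, 1497991, 1497918, 1498019, 1498017),
  y = c(12134126, 12134051, 12133705, 12135892, 12137797,
        12132112, 12131842, 12131631, 12131638, 12131633)
)
gs <- grid.sizes(w079.12)
print(gs)

# Made-up group whose locations all lie within 40 units of each other
tight <- data.frame(x = c(1494000, 1494020, 1494040),
                    y = c(12134000, 12134010, 12134030))
gs.tight <- grid.sizes(tight)
print(gs.tight)
```

For the sample rows this gives a gridsize of 47 for x and 62 for y, while the made-up tight group gives 0 for both. The same helper could be passed to by in place of kern.est to list the values for every level of animonth.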
by command used:
junk3 <- by(w079.all[,5:6], w079.all$animonth, kern.est)
output from small files (only a few levels of animonth):
w079.all$animonth: W079_1
[1] 0.2117635
-----------------------------------------------------------------------------------------------------------------
w079.all$animonth: W079_12
[1] 0.2837849
Error on larger files:
Error in rep(0, P - 2 * L - 1) : invalid 'times' argument
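Since by stops at the first failure, it does not show which level of animonth triggers the error. One possible way to find out is to split the data by group and wrap each call in tryCatch, so that the error message is recorded per group instead of aborting the whole run. This is a minimal sketch under stated assumptions: try.by.group is a hypothetical helper, the toy data are made up, and toy.fun is only a stand-in for kern.est that fails on purpose for groups with fewer than two rows.

```r
# Hypothetical helper: apply FUN to each group and keep the error
# message (as a character string) for any group where FUN fails.
try.by.group <- function(data, groups, FUN) {
  pieces <- split(data, groups, drop = TRUE)
  lapply(pieces, function(d) {
    tryCatch(FUN(d),
             error = function(e) paste("ERROR:", conditionMessage(e)))
  })
}

# Made-up toy data: group A_1 has three rows, group A_2 has one
toy <- data.frame(animonth = c("A_1", "A_1", "A_1", "A_2"),
                  x = c(1, 2, 4, 5),
                  y = c(3, 5, 4, 9))

# Stand-in for kern.est: fails deliberately on very small groups
toy.fun <- function(d) {
  if (nrow(d) < 2) stop("too few rows")
  mean(c(sd(d$x), sd(d$y)))
}

res <- try.by.group(toy[, c("x", "y")], toy$animonth, toy.fun)

# Names of the groups for which FUN raised an error
failed <- names(res)[vapply(res, is.character, logical(1))]
print(failed)
```

With the real data, the call would take the form try.by.group(w079.all[,5:6], w079.all$animonth, kern.est), and the groups listed in failed could then be examined individually.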
Thanks in advance for any help,
Kevin