[R] Cannot grasp how to apply "by" here...
Jonas Malmros
jonas.malmros at gmail.com
Mon Dec 17 19:47:46 CET 2007
I have a data frame named "database" with panel data, a little piece
of which looks like this:
  Symbol Name Trial Factor1 Factor2 External
1 548140    A     1   -3.87   -0.32     0.01
2 547400    B     1   12.11   -0.68     0.40
3 547173    C     1    4.50    0.71    -1.36
4 546832    D     1    2.59    0.00     0.09
5 548140    A     2    2.41    0.50    -1.04
6 547400    B     2    1.87    0.32     0.39
What I want to do is calculate the correlation between each factor and
External for each Symbol, and record the correlation estimate, the
p-value, the name and the number of observations in a vector named
"vector", then rbind these vectors together in "results". When there
are fewer than 5 observations for a particular Symbol I want to put NAs
in each column of "vector".
I tried the following code, on the assumption that by() splits
database into smaller data frames, one for each Symbol (that's the
"x"):
factor.names <- c("Factor1", "Factor2")
factor.pvalue <- c("SigF1", "SigF2")
results <- numeric()
vector <- matrix(0, ncol=(length(factor.names)*2+2), nrow=1)
colnames(vector) <- c("No.obs", factor.names, factor.pvalue)
application <- function(x){
    rownames(vector) <- x$Name
    for(i in 1:length(factor.names)){
        if(dim(x)[1]>=5){
            vector[1] <- dim(x)[1]
            vector[i+1] <- cor.test(x$External, x[,factor.names[i]],
                                    method="kendall")$estimate
            vector[i+3] <- cor.test(x$External, x[,factor.names[i]],
                                    method="kendall")$p.value
        } else {
            vector <- rep(NA, length(vector))
        }
    }
    results <- rbind(results, vector)
}
by(database, database$Symbol, application)
This did not work. I get:
"Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent"
I used browser() and I see that Name is not assigned to the row
name of vector, and then dim(x)[1] does not work.
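
The error message quoted above can be reproduced in isolation, which may
help narrow down where it comes from. The sketch below assumes a 1-row
matrix like "vector" and a group with two observations, so that x$Name
would have length 2. The names m and msg are illustrative only and do
not appear in the code above.

```r
# A 1-row matrix, like "vector" in the post (m is a hypothetical name).
m <- matrix(0, nrow = 1, ncol = 5)

# Supply two row names for a single row, as happens when x$Name comes
# from a group with two observations. tryCatch() captures the error
# message instead of stopping the script.
msg <- tryCatch({
  rownames(m) <- c("A", "A")
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

A single value such as x$Name[1] would match the one row and avoid this
particular mismatch. Note also that length(factor.names)*2+2 evaluates
to 6 while the colnames() call supplies 5 names, which is the same kind
of mismatch in the column dimension.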
What am I doing wrong? I do not understand. :-(
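
One possible way to structure the calculation described above is
sketched here. It is only a sketch under stated assumptions: the toy
data are made up to stand in for "database", the helper name one.symbol
and its min.obs argument are hypothetical, and split() with lapply() is
swapped in for by() so that each group returns its own row rather than
modifying "results" from inside the function (an assignment inside a
function only changes a local copy).

```r
# Toy data standing in for "database"; the values are invented.
set.seed(1)
database <- data.frame(
  Symbol   = rep(c(548140, 547400, 547173), times = c(6, 6, 3)),
  Name     = rep(c("A", "B", "C"), times = c(6, 6, 3)),
  Factor1  = rnorm(15),
  Factor2  = rnorm(15),
  External = rnorm(15)
)

factor.names  <- c("Factor1", "Factor2")
factor.pvalue <- c("SigF1", "SigF2")

# Hypothetical helper: returns one named numeric vector per Symbol,
# all NA when the group has fewer than min.obs observations.
one.symbol <- function(x, min.obs = 5) {
  out <- rep(NA_real_, 1 + 2 * length(factor.names))
  names(out) <- c("No.obs", factor.names, factor.pvalue)
  if (nrow(x) >= min.obs) {
    out["No.obs"] <- nrow(x)
    for (i in seq_along(factor.names)) {
      # Run the test once and reuse it for estimate and p-value.
      ct <- cor.test(x$External, x[[factor.names[i]]], method = "kendall")
      out[i + 1] <- ct$estimate
      out[i + 1 + length(factor.names)] <- ct$p.value
    }
  }
  out
}

# Split by Symbol, compute one row per piece, then bind the rows.
pieces  <- split(database, database$Symbol)
results <- do.call(rbind, lapply(pieces, one.symbol))

# Label each row with the (single) Name belonging to that Symbol.
rownames(results) <- unname(vapply(
  pieces, function(x) as.character(x$Name[1]), character(1)
))
results
```

With the toy data, the group with only 3 observations ends up as a row
of NAs, and the other two groups each get a count, two Kendall
estimates and two p-values.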
Thank you in advance for your help.
Regards,
JM
--
Jonas Malmros
Stockholm University
Stockholm, Sweden