[R] Novice Questions

Bruce Moore bwmoore22 at yahoo.com
Mon Jun 30 17:32:23 CEST 2003


I'm writing a program that uses linear regressions to
estimate the hourly number of bank teller transactions
of various types based on day of week, time of day,
week of month, and several prices.  I've got about
25,000 records in my dataset: 85 columns of
transaction counts (used one at a time), about 50
columns of binary indicators (day, week, pay period,
hour, branch), and half a dozen real-valued prices.

My program hangs on some regressions as I add
interactions, probably due to logic problems in my
code or collinearity problems in the data.

1) I'm running my program via the source() command. 
It appears that source() does not print any messages
until it completes.  

---->Is there a way to get diagnostic messages to
print immediately rather than when the source()
command has completed?

2) I'm fairly certain that I've got some collinearity
in the data set and the interactions.  I've found a
post (Ott Toomet, 5/30/2003) that describes a
procedure to find collinearity problems using
model.matrix() to generate the dataset with
interactions and kappa() to determine the condition
number of the matrix.  

---->Is there a more automated way to find collinear
variables?
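
For reference, the model.matrix()/kappa() procedure
described above might look roughly like the sketch
below.  The data frame and its column names (monday,
payday, price, price2, count) are made up for
illustration and are not from my real dataset; price2
is deliberately built to be nearly collinear with
price.

```r
# Hypothetical toy data standing in for the real teller dataset.
set.seed(1)
n <- 200
d <- data.frame(
  monday = rbinom(n, 1, 0.2),   # binary day indicator
  payday = rbinom(n, 1, 0.1),   # binary pay-period indicator
  price  = runif(n, 1, 5)       # real-valued price
)
d$price2 <- 2 * d$price + rnorm(n, sd = 1e-6)  # nearly collinear with price
d$count  <- rpois(n, lambda = 20)              # transaction count (response)

# Build the design matrix, interactions included, without fitting a model.
X <- model.matrix(count ~ (monday + payday) * price + price2, data = d)

# Condition number of the design matrix; a very large value
# suggests near-collinearity among the columns.
cond <- kappa(X, exact = TRUE)
print(colnames(X))
print(cond)
```

This only reports that a problem exists somewhere in
the matrix; it does not say which columns are involved.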

3) Is there a way to get lm() and/or step(), or a
function in some other package, to return a model
containing only coefficients that are significant at a
particular level?

4) Is there a way to suppress display of a password
when using the RODBC odbcConnect() function, or to get
the function to prompt for a password?

5) What is the practical size limit on the number of
terms in a model?  I know that I won't be able to
consider all interactions, but would like to have some
idea when to give up and go with what I've got.



=====
Bruce Moore



