[R-SIG-Mac] Data editor

John Walker john.s.walker at uchsc.edu
Tue Mar 11 17:44:48 CET 2008


Thank you for the responses to my email. I'd like to further the
discussion a little because I think it may be productive.

I'll take the major points from the responses and deal with them one by
one, although not necessarily in order.

With regard to my colleague's willingness to learn a command-line
program: she is an extremely intelligent person and has learned LaTeX to
typeset her documents, so there is no question she is able to learn. But
she requires some evidence that the program is worth learning. To her
mind and to mine, any software should make simple things simple to do
and complex things easier. My point here is that entering data via the
data editor failed. Whether or not that is the best way or the right way
is irrelevant. R offers an option to enter small data sets via an
editor, the edit function. In the X11 and Windows interfaces to the edit
function, the functionality is present. In the Mac editor it is not. I
agree that the GUI front end to the command line is nice and functional,
and the programmers are to be applauded. The fact remains that the data
editor in the Mac interface is inadequate. When someone tries a program
for the first time and a simple function fails, there is no incentive
to go further.


As regards the best way to enter data, I think statisticians are used
to investigators coming to them with very precious, post-processed data.
Consequently, data entry is handled using programs that emphasize data
integrity, security and organisation. Hence the use of Excel (sic!!!) to
handle data for export to R.

In reality, in lab work, data abound. The scientist performs ad hoc
experiments daily and tries to see what is going on. Sometimes, as in
this case, he/she wants to find out if a rough *preliminary* experiment
is showing a difference. For the scientist, the ability to enter data
quickly and see if the difference they think they are seeing is real
*is* important. When the experiments have settled into a routine and
data collection becomes part of a protocol, the data should and do go
into a data entry program (a database with a data entry front end, or a
spreadsheet). Those are the data the statistician sees, and they have
been collected with great care and expense, but they are not the only
kind of data the scientist deals with.

Yes, the t-test could be done on a calculator, but few scientists in the
biomedical sciences actually use a calculator for a t-test. They all use
a computer-based stats program. To my mind, if they do it in R, two
things are accomplished: a broader adoption of R, and the preliminary
data are already in R, to be added to and, if difficulties arise, handed
to an analyst who also uses R. (Please don't go off topic and tell me
they should have seen a statistician before starting: (a) I know the
reasons and (b) it isn't always necessary.)

Mathematical statisticians regard data as holy. Scientists who collect
the data know them to be dirty, unkempt and often scribbled on pieces of
paper or in the margins of notebooks, especially when the experiments
are just getting started. Not all data are important. Many experiments
simply confirm that there is no difference due to a treatment. The
ability to quickly enter small data sets and check to see if there
really is a difference is important to investigators. Fisher knew that,
and so did Student/Gosset, hence the development of small-sample
statistics. I'm suggesting that if R wants to address the needs of
scientists, a method for entering small data sets is important. The
command line is fine. I use it and I am happy with it. But R *offers* a
data editor. Those who want to use it should be able to. This does not
mean a full-blown spreadsheet interface. I agree that would be stupid.
All I am suggesting is that the Mac data editor be functional. The Unix
and Windows ones already are.




-- 
John Walker
Assistant Professor of Cardiology
Department of Medicine
University of Colorado Health Sciences Center
4200 E. Ninth Ave B130
BRB Rm 351
Denver CO 80262

ph 303 315 0103


