huber at ebi.ac.uk
Tue Jan 9 21:16:50 CET 2007
Just to add to Florian's comment about user interface: I think the
AnnotatedDataFrame and new eSet classes are beautiful and elegant, and
much better than what we had.
Yet I now find it quite complex and unintuitive to construct an
AnnotatedDataFrame or an ExpressionSet from scratch. IMHO anything that
makes it simple to convert a plain data.frame or Excel table into a
valid AnnotatedDataFrame will make many users happy.
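
For illustration, here is a minimal sketch of what "from scratch" involves at the moment, assuming Biobase is installed. The sample table, the variable descriptions and the toy expression matrix are all invented for the example; only the class names and the new() calls come from Biobase.

```r
library(Biobase)

# Hypothetical sample table, e.g. as exported from a spreadsheet.
# Row names are the sample identifiers.
pd <- data.frame(
  treatment = c("control", "control", "drug", "drug"),
  batch     = c(1, 2, 1, 2),
  row.names = c("s1", "s2", "s3", "s4")
)

# Variable metadata: one row per column of pd, with a labelDescription column.
meta <- data.frame(
  labelDescription = c("Treatment group", "Hybridisation batch"),
  row.names = colnames(pd)
)

adf <- new("AnnotatedDataFrame", data = pd, varMetadata = meta)

# Toy expression matrix; its column names must match the sample names in adf.
set.seed(1)
x <- matrix(rnorm(20), nrow = 5,
            dimnames = list(paste0("gene", 1:5), rownames(pd)))

eset <- new("ExpressionSet", exprs = x, phenoData = adf)
```

The user has to know about varMetadata, labelDescription and the row name matching before getting a valid object, which is the part a simple converter could hide.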
Florian Hahne wrote:
> Hi Seth,
> internal representation is one part of the story and I agree that row
> names are the way to go here. Another point however is how the user gets
> the information into R. At some point we need to match sample names and
> the sample meta data and IMO this should already be at the level of the
> text file. The closest to the row names idea is probably to take the
> first column in the file as the sample identifier, but this imposes a
> pretty strict layout on the files (maybe for some users the first column
> is already the row numbering...). As far as I understand the current
> implementation, the default is to take the first column, and you can
> pass row.names=x to read.AnnotatedDataFrame, but there is also this
> additional sampleNames parameter, and I find this pretty confusing. So
> currently you can do almost everything with the function, which is good
> in one sense but on the other hand might cause mix-ups and confusion for
> the user. If the mapping is already clear at the level of the text file,
> we can sit back and tell people to check their files if something isn't
> showing up as they expect it to, but currently you can do pretty stupid
> stuff just by setting a wrong argument without even realizing it.
> I had the impression at the Bressanone courses that for the average user
> the biggest obstacle is to get all the necessary data from files
> somewhere on the hard disk into R and that it is important to provide a
> straightforward default way of doing that.
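
A rough sketch of the "first column is the sample identifier" convention Florian describes, using only base R's read.table rather than read.AnnotatedDataFrame. The file contents and the check_samples helper are hypothetical and not part of Biobase; the point is only that a mismatch between the file and the expression data fails loudly instead of going unnoticed.

```r
# Write a hypothetical tab-delimited sample sheet to a temporary file.
# The first column holds the sample identifiers.
path <- tempfile(fileext = ".txt")
writeLines(c("sample\ttreatment\tbatch",
             "s1\tcontrol\t1",
             "s2\tdrug\t1",
             "s3\tdrug\t2"), path)

# row.names = 1 turns the first column into the row names.
pd <- read.table(path, header = TRUE, sep = "\t",
                 row.names = 1, stringsAsFactors = FALSE)

# Hypothetical helper: stop if the sample sheet and the expression data
# disagree, otherwise return the sheet in the order of the expression data.
check_samples <- function(pd, expr_samples) {
  missing_in_pd <- setdiff(expr_samples, rownames(pd))
  extra_in_pd   <- setdiff(rownames(pd), expr_samples)
  if (length(missing_in_pd) > 0 || length(extra_in_pd) > 0) {
    stop("sample mismatch; not in sample sheet: ",
         paste(missing_in_pd, collapse = ", "),
         "; not in expression data: ",
         paste(extra_in_pd, collapse = ", "))
  }
  pd[expr_samples, , drop = FALSE]
}

# Column names of a hypothetical expression matrix, in a different order.
ordered <- check_samples(pd, c("s3", "s1", "s2"))
```

With the mapping fixed in the file, the only thing left to tell users is to check the first column when the call stops with an error.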