[BioC] regarding package ArrayExpress
audrey at ebi.ac.uk
Thu Sep 10 15:37:55 CEST 2009
Hi Tim,
This seems like a good alternative. I will have a look into this.
Thank you,
Audrey
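
For reference, the mean/median alternative described in the quoted message below might look roughly like this in base R. It is only a sketch: the toy matrix, the reporter identifiers and the helper name collapse_duplicates are assumptions made for illustration, and none of it is part of the ArrayExpress package.

```r
# Toy expression matrix standing in for the processed data.
# Reporter identifiers and values are made up for illustration.
ids <- c("R:1", "R:2", "R:1", "R:3", "R:2")
expr <- matrix(c(1, 2, 3, 4, 5,
                 10, 20, 30, 40, 50),
               ncol = 2,
               dimnames = list(NULL, c("sample1", "sample2")))

# Hypothetical helper: collapse rows that share a reporter identifier
# by applying 'fun' (e.g. mean or median) to each column.
collapse_duplicates <- function(x, ids, fun = mean) {
  # Row indices grouped by identifier, in order of first appearance.
  groups <- split(seq_along(ids), factor(ids, levels = unique(ids)))
  # One summarised row per identifier; list names become row names.
  do.call(rbind, lapply(groups, function(i) {
    apply(x[i, , drop = FALSE], 2, fun)
  }))
}

collapsed_mean <- collapse_duplicates(expr, ids, mean)
collapsed_median <- collapse_duplicates(expr, ids, median)
```

The result has one row per unique identifier, so it can be given row names without error. Any featureData would have to be reduced to the same unique identifiers, in the same order, to stay synchronised with the collapsed values.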
> Hi,
>
> This would work assuming the featureData is kept synchronised with the
> assayData. I guess the alternative would be to take mean or median for
> the duplicated reporters, which might be more useful in some cases.
> Perhaps that could be added as an option? I know quite a few
> custom-printed arrays had duplicated reporter identifiers such as
> these; it should be less of a problem for the commercial arrays.
>
> Cheers,
>
> Tim
>
>
> 2009/9/10 Misha Kapushesky <ostolop at ebi.ac.uk>:
>> Hi,
>>
>> Without tweaking read.table, you'd have to read row names as one of the data
>> columns, then make.names on that set of names and set the row names to the
>> modified ones. So, something like
>>
>> d <- read.table("foo.tab") ## if read.table("foo.tab", row.names=1) fails
>>
>> rownames(d) <- make.names(d[,1], unique=TRUE)
>>
>> d <- d[,-1] ## to remove the column used
>>
>> Whether these newly made "unique" row names are what you need is a good
>> question... :)
>>
>> --Misha
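
A self-contained version of the steps above, as a sketch only: the file contents and identifiers are invented, and sep = "\t" assumes a tab-delimited file.

```r
# Write a small tab-delimited file with a repeated identifier in column 1.
# The contents are invented for illustration.
tab_file <- tempfile(fileext = ".tab")
writeLines(c("R:A\t1.0\t2.0",
             "R:B\t3.0\t4.0",
             "R:A\t5.0\t6.0"), tab_file)

# Using column 1 as row names fails because "R:A" appears twice.
res <- try(read.table(tab_file, sep = "\t", row.names = 1), silent = TRUE)
failed <- inherits(res, "try-error")

# Workaround: read the identifiers as an ordinary column, make them
# unique, assign them as row names, then drop the identifier column.
d <- read.table(tab_file, sep = "\t", stringsAsFactors = FALSE)
rownames(d) <- make.names(d[, 1], unique = TRUE)
d <- d[, -1]
rownames(d)
```

Note that make.names also replaces characters that are not valid in R names, so "R:A" becomes "R.A" and the second copy becomes "R.A.1". If the original spelling of the identifiers matters, base R's make.unique() only appends the numeric suffix and leaves the other characters alone.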
>>
>> On Thu, 10 Sep 2009, audrey at ebi.ac.uk wrote:
>>
>>> Dear Amit,
>>>
>>> You are not making any mistakes. This is the proper way of calling the
>>> functions to create an object from a processed dataset. However, the
>>> problem comes from the dataset itself. It contains duplicate probe
>>> identifiers as row names, which read.table, the function used inside
>>> procset, does not allow.
>>> Unfortunately I do not know how to prevent this. Does someone
>>> know how I could allow duplicate row names in my function?
>>>
>>> Best regards,
>>> Audrey
>>>
>>> --
>>> Audrey Kauffmann
>>> EMBL - EBI
>>> Cambridge UK
>>> +44 (0) 1223 492 631
>>> http://www.ebi.ac.uk/~audrey
>>>
>>>> Hello! List,
>>>>
>>>> I am trying to build an object from ArrayExpress processed data using
>>>> the Bioconductor package ArrayExpress. I did the following:
>>>>
>>>> CAGE99d = getAE("E-GAGE-99",type="processed")
>>>> colname = getcolproc(CAGE99d)
>>>> CAGE99p = procset(CAGE99d, colname[3])
>>>>
>>>> and I got the following error:
>>>> Error in `row.names<-.data.frame`(`*tmp*`, value = c(6995L, 7017L, 7006L, :
>>>>   duplicate 'row.names' are not allowed
>>>> In addition: Warning message:
>>>> non-unique values when setting 'row.names': ‘R:A-MEXP-58:210099’,
>>>> ‘R:A-MEXP-58:210100’, ‘R:A-MEXP-58:210111’,
>>>> ‘R:A-MEXP-58:210123’,‘R:A-MEXP-
>>>> [... truncated]
>>>>
>>>> I am not able to figure out what mistake I am making. Please help!
>>>> Amit
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor