[BioC] breastCancerNKI patients question

Markus Schroeder mschroed at jimmy.harvard.edu
Tue Oct 25 13:59:52 CEST 2011


On 10/25/2011 12:47 PM, Sean Davis wrote:
> On Tue, Oct 25, 2011 at 7:38 AM, Xavier Robin<Xavier.Robin at unige.ch>  wrote:
>> Hello everyone!
>>
>>
>> According to the documentation of package breastCancerNKI, nki is a
>> merge of two datasets by van’t Veer et al. and van de Vijver et al. We
>> should have 117 patients for the former, and 295 for the latter, summing
>> up to 412 patients.
>>
>> However the nki dataset contains only 337 patients:
>>
>>> dim(pData(nki))
>> [1] 337  21
>>> dim(exprs(nki))
>> [1] 24481   337
>>
>> Apparently the "series" annotation specifies which dataset the patients
>> belong to:
>>
>>> table(pData(nki)$series)
>>   NKI NKI2
>>   117  220
>>
>> So we would have all 117 patients from van’t Veer et al., but only 220 of
>> the 295 patients of van de Vijver et al.
>>
>> Is this correct?
>> * If yes, where are the 75 remaining patients?
>> * If no, what did I misunderstand?
> There was overlap between the van't Veer and van de Vijver datasets,
> so the numbers are for unique patients only, I believe.

Yes, patients that appear in both studies were removed as duplicates,
which leaves 337 unique patients across the two studies.

Regards,
Markus


