[BioC] Tissue specific genes with limma
Ron Ophir
ron.ophir at weizmann.ac.il
Thu Jan 13 09:49:20 CET 2005
Dear Limma users,
In our study we would like to identify tissue-specific genes, i.e.,
genes that are differentially expressed in a specific tissue. For
practical reasons each RNA extract is a mixture of tissues. These RNA
samples were hybridized to Affymetrix chips. I thought that a linear
model would be a good way to extract the relative contribution of each
tissue to each gene's expression (correct me if I am wrong up to here).
Therefore I prepared a design matrix as follows:
Sample    LP  ML ADL ABL   T   S   B
LER        1   1   1   1   1   1   1
LER        1   1   1   1   1   1   1
M7         0   1   1   1   1   0   0
M5         1   0   1   1   1   1   1
M7         0   1   1   1   1   0   0
M5         1   0   1   1   1   1   1
AD         1   0   1   0   1   0   0
M2         1   0   1   1   1   1   1
Trichom    1   0   1   1   0   1   1
Stipuls    1   0   1   1   1   0   1
Stipuls    1   0   1   1   1   0   1
AB         1   0   0   1   0   1   0
AB         1   0   0   1   0   1   0
AD         1   0   1   0   1   0   0
LER        1   1   1   1   1   1   1
M2         1   0   1   1   1   1   1
Here LER, for example, is the RNA sample that contains a mixture of all
tissues (LER = LP+ML+ADL+ABL+T+S), and each of the other rows is an RNA
mixture of the set of tissues marked with 1. We also assume no
interaction and that the tissues are present in equal amounts, so we
expect the linear model to give the relative contribution of each tissue
to the gene expression.
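
Whether the per-tissue contributions can be separated at all depends on
the design matrix having full column rank. The sketch below is only an
illustration in plain Python, not part of limma: the names MIXTURES and
matrix_rank are made up for this example. It computes the rank of the
0/1 design by exact Gaussian elimination, using one row per distinct
mixture since replicate arrays do not change the rank.

```python
from fractions import Fraction

# Columns: LP ML ADL ABL T S B, one row per distinct mixture in the post.
# MIXTURES and matrix_rank are illustrative names, not limma objects.
MIXTURES = {
    "LER":     [1, 1, 1, 1, 1, 1, 1],
    "M7":      [0, 1, 1, 1, 1, 0, 0],
    "M5":      [1, 0, 1, 1, 1, 1, 1],
    "AD":      [1, 0, 1, 0, 1, 0, 0],
    "M2":      [1, 0, 1, 1, 1, 1, 1],
    "Trichom": [1, 0, 1, 1, 0, 1, 1],
    "Stipuls": [1, 0, 1, 1, 1, 0, 1],
    "AB":      [1, 0, 0, 1, 0, 1, 0],
}


def matrix_rank(rows):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


design_rank = matrix_rank(list(MIXTURES.values()))
print(design_rank)  # 7, i.e. equal to the number of tissue columns
```

A rank equal to the number of columns means that, under the additive
assumption stated above, each tissue coefficient is determined
separately by the mixtures.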
First, is the above matrix the right one, or should I replace the
entries with proportions, so as not to violate the assumption that the
tissues are present in equal amounts in all mixtures, like this:
Sample     LP   ML  ADL  ABL    T    S    B
LER       0.3  0.3  0.3  0.3  0.3  0.3  0.3
LER       0.3  0.3  0.3  0.3  0.3  0.3  0.3
M7          0  0.5  0.5  0.5  0.5    0    0
M5        0.5    0  0.5  0.5  0.5  0.5  0.5
M7          0  0.5  0.5  0.5  0.5    0    0
M5        0.5    0  0.5  0.5  0.5  0.5  0.5
AD        0.5    0  0.5    0  0.5    0    0
M2        0.5    0  0.5  0.5  0.5  0.5  0.5
Trichom     1    0    1    1    0    1    1
Stipuls   0.5    0  0.5  0.5  0.5    0  0.5
Stipuls   0.5    0  0.5  0.5  0.5    0  0.5
AB        0.5    0    0  0.5    0  0.5    0
AB        0.5    0    0  0.5    0  0.5    0
AD        0.5    0  0.5    0  0.5    0    0
LER       0.3  0.3  0.3  0.3  0.3  0.3  0.3
M2        0.5    0  0.5  0.5  0.5  0.5  0.5
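
One possible reading of "proportion" is that each entry should be the
fraction of the mixture contributed by that tissue, so that every row
sums to 1. The sketch below shows that scaling only; it is an assumption
about what a proportional design could look like, not necessarily the
scaling used in the table above, and the helper name to_proportions is
hypothetical.

```python
from fractions import Fraction

# Column order as in the post.
TISSUES = ["LP", "ML", "ADL", "ABL", "T", "S", "B"]


def to_proportions(indicator_row):
    """Divide a 0/1 row by the number of tissues present so it sums to 1.

    Assumes equal amounts of every tissue that is present in the mixture.
    """
    present = sum(indicator_row)
    if present == 0:
        raise ValueError("a mixture must contain at least one tissue")
    return [Fraction(x, present) for x in indicator_row]


ad_row = to_proportions([1, 0, 1, 0, 1, 0, 0])   # AD: LP, ADL and T
ler_row = to_proportions([1, 1, 1, 1, 1, 1, 1])  # LER: all seven columns

for name, share in zip(TISSUES, ad_row):
    print(name, share)
```

With this scaling a mixture of three tissues gets 1/3 in each of its
columns and a mixture of all seven columns gets 1/7, so the entries
differ between mixtures of different sizes.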
Second, to identify tissue-specific genes we would like to have the
summation for a specific tissue over all mixtures. In detail, as a
result of the linear model fit we expect to get, for each gene, a matrix
of expression values in which, as in the design matrix, rows are RNA
samples and columns are tissues. The observed value of the LER mixture,
for example, equals the sum of the relative contributions of each
tissue: LER = 0.5 (from LP) + 4 (from ML) + 3 (from ADL) + 1.2 (from
ABL) + 0.3 (from T) + 1 (from S) = 10, where 10 is the observed
expression value for a given mixture and a given gene, and 0.5, 4, 3,
1.2, 0.3, 1 are the expression values deduced from the linear fit for
each tissue. What we are interested in is the summation for each gene
over the columns, i.e., LP = 0.5 (relative LP contribution in LER) +
0.6 (M2) + 1.2 (M5) + 0 (M7) + 1 (Trichom) + 3 (AB) + 2 (AD), and
likewise for each tissue. In limma, if we set one of the tissues in the
design as a reference (a tissue that exists in all mixtures), we will
get the differential expression of all other tissues relative to it;
however, we are looking for the absolute expression. In other words, I
am looking for the absolute expression of each gene in each tissue
rather than the differential expression that is usually the final
result in limma.
Is it possible to do that?
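
To make the question concrete, here is a minimal sketch in plain Python
(ordinary least squares via the normal equations, not limma itself) of
what fitting the 0/1 design without a reference column gives. The
per-tissue values are the example numbers from the text; the value 0
for B is an assumption, since no value for B is given, and the function
names are illustrative. With noise-free data and a full-rank design the
fit returns one value per tissue rather than differences from a
reference.

```python
from fractions import Fraction

TISSUES = ["LP", "ML", "ADL", "ABL", "T", "S", "B"]

# The 16 arrays, in the order of the first design matrix in the post.
DESIGN = [
    [1, 1, 1, 1, 1, 1, 1],  # LER
    [1, 1, 1, 1, 1, 1, 1],  # LER
    [0, 1, 1, 1, 1, 0, 0],  # M7
    [1, 0, 1, 1, 1, 1, 1],  # M5
    [0, 1, 1, 1, 1, 0, 0],  # M7
    [1, 0, 1, 1, 1, 1, 1],  # M5
    [1, 0, 1, 0, 1, 0, 0],  # AD
    [1, 0, 1, 1, 1, 1, 1],  # M2
    [1, 0, 1, 1, 0, 1, 1],  # Trichom
    [1, 0, 1, 1, 1, 0, 1],  # Stipuls
    [1, 0, 1, 1, 1, 0, 1],  # Stipuls
    [1, 0, 0, 1, 0, 1, 0],  # AB
    [1, 0, 0, 1, 0, 1, 0],  # AB
    [1, 0, 1, 0, 1, 0, 0],  # AD
    [1, 1, 1, 1, 1, 1, 1],  # LER
    [1, 0, 1, 1, 1, 1, 1],  # M2
]

# Per-tissue values for one gene, taken from the example in the text.
# The last value (B = 0) is an assumption made for this illustration.
TRUE = [Fraction(v) for v in ("0.5", "4", "3", "1.2", "0.3", "1", "0")]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (a must be invertible)."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def least_squares(design, y):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    cols = list(zip(*design))
    xtx = [[Fraction(sum(p * q for p, q in zip(ci, cj))) for cj in cols]
           for ci in cols]
    xty = [sum(p * v for p, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


# Noise-free observed expression of each array: sum of the tissues present.
observed = [sum(x * t for x, t in zip(row, TRUE)) for row in DESIGN]
estimates = least_squares(DESIGN, observed)

for name, est in zip(TISSUES, estimates):
    print(name, float(est))
```

Here the first observed value is 10, matching the LER example, and the
estimates equal the per-tissue values that generated the data. Real
data would add noise, so the estimates would only approximate them.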
Ron