[R] selecting the COLUMNS in a dataframe as a function of the numerical values in a ROW
William Michels
wjm1 using caa.columbia.edu
Fri Nov 2 05:59:13 CET 2018
Perhaps one of the following two methods:
> zgene = data.frame( TTT=c(0,1,0,0),
+                     TTA=c(0,1,1,0),
+                     ATA=c(1,0,0,0),
+                     ATT=c(0,0,0,0),
+                     row.names=c("gene1", "gene2", "gene3", "gene4"))
> zgene
      TTT TTA ATA ATT
gene1   0   0   1   0
gene2   1   1   0   0
gene3   0   1   0   0
gene4   0   0   0   0
>
> zgene[ , zgene[2,1:4] > 0]
      TTT TTA
gene1   0   0
gene2   1   1
gene3   0   1
gene4   0   0
>
> zgene[ , zgene[rownames(zgene) == "gene2",1:4] > 0]
      TTT TTA
gene1   0   0
gene2   1   1
gene3   0   1
gene4   0   0
>
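The two calls above can also be wrapped in a small helper. This is only a sketch under stated assumptions: the name select_nonzero_cols is made up for illustration and is not from the thread, the row is assumed to be given by its row name, and all columns are assumed to be numeric. It tests for != 0 rather than > 0, so negative values also count as non-zero, and it uses drop = FALSE so that a single matching column still comes back as a data frame.

```r
# Hypothetical helper (not from the original thread): keep only the columns
# of `df` whose value in the row named `row` is non-zero.
# Assumes `row` is a row name and every column of `df` is numeric.
select_nonzero_cols <- function(df, row) {
  stopifnot(is.data.frame(df), row %in% rownames(df))
  # One logical entry per column: TRUE where the chosen row is non-zero
  keep <- unlist(df[row, , drop = FALSE]) != 0
  # Treat missing values as "do not keep"
  keep[is.na(keep)] <- FALSE
  # drop = FALSE keeps the result a data frame even with one column
  df[, keep, drop = FALSE]
}

zgene <- data.frame(TTT = c(0, 1, 0, 0),
                    TTA = c(0, 1, 1, 0),
                    ATA = c(1, 0, 0, 0),
                    ATT = c(0, 0, 0, 0),
                    row.names = c("gene1", "gene2", "gene3", "gene4"))

res2 <- select_nonzero_cols(zgene, "gene2")  # columns TTT and TTA
res1 <- select_nonzero_cols(zgene, "gene1")  # column ATA only, still a data frame
```

For a row such as "gene4", where every value is zero, the helper returns a data frame with four rows and no columns.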
Best Regards,
Bill.
William Michels, Ph.D.
On Thu, Nov 1, 2018 at 9:07 PM, Bogdan Tanasa <tanasa using gmail.com> wrote:
> Dear Bill, and Bill,
>
> many thanks for taking the time to advise, and for your suggestions. I
> believe that I should rephrase my question a bit, with a better example;
> thank you again in advance for your help.
>
> Let's assume that we start from a data frame :
>
> x = data.frame( TTT=c(0,1,0,0),
>                 TTA=c(0,1,1,0),
>                 ATA=c(1,0,0,0),
>                 ATT=c(0,0,0,0),
>                 row.names=c("gene1", "gene2", "gene3", "gene4"))
>
> If we select "gene2", we would like to end up with ONLY the COLUMNS
> where "gene2" is NON-ZERO. In other words, the output contains only the
> first 2 columns :
>
> output = data.frame( TTT=c(0,1,0,0),
>                      TTA=c(0,1,1,0),
>                      row.names=c("gene1", "gene2", "gene3", "gene4"))
>
> with much appreciation,
>
> -- bogdan
>
> On Thu, Nov 1, 2018 at 6:34 PM William Michels <wjm1 using caa.columbia.edu>
> wrote:
>>
>> Hi Bogdan,
>>
>> Are you saying you want to drop columns that sum to zero? If so, I'm
>> not sure you've given us a good example dataframe, since all your
>> numeric columns give non-zero sums.
>>
>> Otherwise, what you're asking for is trivial. Below is an example
>> dataframe ("ygene") with an example "AGA" column that gets dropped:
>>
>> > xgene <- data.frame(TTT=c(0,1,0,0),
>> +                     TTA=c(0,1,1,0),
>> +                     ATA=c(1,0,0,0),
>> +                     gene=c("gene1", "gene2", "gene3", "gene4"))
>> >
>> > xgene[ , colSums(xgene[,1:3]) > 0 ]
>>   TTT TTA ATA  gene
>> 1   0   0   1 gene1
>> 2   1   1   0 gene2
>> 3   0   1   0 gene3
>> 4   0   0   0 gene4
>> >
>> > ygene <- data.frame(TTT=c(0,1,0,0),
>> +                     TTA=c(0,1,1,0),
>> +                     AGA=c(0,0,0,0),
>> +                     gene=c("gene1", "gene2", "gene3", "gene4"))
>> >
>> > ygene[ , colSums(ygene[,1:3]) > 0 ]
>>   TTT TTA  gene
>> 1   0   0 gene1
>> 2   1   1 gene2
>> 3   0   1 gene3
>> 4   0   0 gene4
>>
>>
>> HTH,
>>
>> Bill.
>>
>> William Michels, Ph.D.
>>
>>
>> On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa <tanasa using gmail.com> wrote:
>> > Dear all, please may I ask for a suggestion :
>> >
>> > considering a dataframe that contains the numerical values for gene
>> > expression, for example :
>> >
>> > x = data.frame(TTT=c(0,1,0,0),
>> >                TTA=c(0,1,1,0),
>> >                ATA=c(1,0,0,0),
>> >                gene=c("gene1", "gene2", "gene3", "gene4"))
>> >
>> > how could I select only the COLUMNS where the value of a GENE (a ROW) is
>> > non-zero ?
>> >
>> > thank you !
>> >
>> > -- bogdan