[R] Odp: log2() and -min() very quick question
Petr PIKAL
petr.pikal at precheza.cz
Mon Jun 13 18:14:56 CEST 2011
Hi
r-help-bounces at r-project.org wrote on 13.06.2011 17:59:03:
> From: Ben Ganzfried <ben.ganzfried at gmail.com>
> Sent by: r-help-bounces at r-project.org
> Date: 13.06.2011 17:59
> To: r-help at r-project.org
> Subject: [R] log2() and -min() very quick question
>
> I'm looking over good-code a post-doc in my lab wrote and trying to learn
> how it works. I came across the following:
> rel.abundance <- as.matrix(read.delim("rel.abundance.csv", row.names=1, as.is=TRUE))
> rel.abundance <- log2(rel.abundance-min(rel.abundance)+1)
>
> I'm not sure what the second line is doing. I ran each line in R and
> couldn't see a noticeable difference in the output. I assume log2() takes
> the log base 2 of the values? I'm not clear what -min(rel.abundance) is
> doing either...my hunch would be that it would take the smallest value in
> each row?
No. If rel.abundance is a matrix, min(rel.abundance) is the overall minimum of all its values, not a per-row minimum:
> mat<-matrix(1:12, 3,4)
> min(mat)
[1] 1
so
log2(rel.abundance-min(rel.abundance)+1)
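subtracts the overall minimum from every value, then adds 1 to every
value, takes the log base 2 of each result, and returns a matrix with the
same dimensions as the input matrix. After the subtraction the smallest
value is 0, and log2(0) is -Inf, so adding 1 keeps every result finite and
maps the smallest value to 0.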
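The steps above can be sketched on a small made-up matrix (the name m and its values are only for illustration, they are not the poster's data). If the overall minimum of the real data happens to be 0, which the quoted row suggests but does not prove, the transform leaves 0 as 0 and 1 as 1, turns 3 into 2 and 23 into about 4.58, which may be why the output looks almost unchanged.

```r
# Toy matrix standing in for rel.abundance (values are made up)
m <- matrix(c(5, 1, 3, 8, 23, 2), nrow = 2)

shifted  <- m - min(m)     # the single overall minimum is subtracted from every cell
shifted1 <- shifted + 1    # the smallest cell is now exactly 1
res      <- log2(shifted1) # element-wise log base 2; the smallest cell becomes 0

dim(res)                               # same dimensions as m
identical(res, log2(m - min(m) + 1))   # the one-line version gives the same matrix

# min() on a matrix returns one number; per-row minima need apply()
min(m)
apply(m, 1, min)
```

If per-row minima were what the code intended, it would have to use something like apply(m, 1, min) instead; the original line does not do that.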
> I'd really like to figure out:
> 1) What's actually going on?
> 2) Is there a good way to run a command over a large dataset in R and better
> be able to tell what is going on? More specifically, when I run each line
> in R it looks something like this (w/ dif. values per row):
> Archaea|Euryarchaeota|Methanobacteria|Methanobacteriales|
>
Methanobacteriaceae|Methanobrevibacter,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>
0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,3,0,0,0,0,0,
>
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,
> 0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,23,0,3,0,0,0
>
>
> There are a lot of cells w/ values per row, which is one reason why I think
> it is difficult to detect a pattern....
There are summary and structure commands, for example
summary(data) or str(data),
which give you an overview of your data without printing every value.
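A few ways to get such an overview might look like this. This is only a sketch on simulated counts (the matrix m, its size and the Poisson rate are assumptions made up for the example), but the same calls work on any numeric matrix.

```r
# Made-up count matrix standing in for the real data: mostly zeros
set.seed(1)
m <- matrix(rpois(2000, lambda = 0.1), nrow = 20)

dim(m)                  # number of rows and columns
str(m)                  # storage type and a glimpse of the first values
range(m)                # smallest and largest value overall
summary(as.vector(m))   # quartiles and mean over all cells
mean(m == 0)            # fraction of cells that are zero
m[1:3, 1:5]             # look at a small corner instead of printing everything
```

Running the same calls before and after a transformation makes it easier to see what changed than reading whole rows.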
Regards
Petr
>
> Thanks in advance!
>
> Ben
>