[R] File normalization
Phil Spector
spector at stat.berkeley.edu
Tue May 25 19:57:03 CEST 2010
The scale function can use whatever vector you choose for
subtraction and division. (It's basically a wrapper for
the sweep function.) For example, to subtract the
median and divide by the median absolute deviation, use
scale(x,center=apply(x,2,median),scale=apply(x,2,mad))
If you only want to subtract or only want to divide, set the
other argument to FALSE (scale=FALSE or center=FALSE); an
argument that is simply omitted defaults to TRUE.
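
A minimal sketch of the above, assuming a small random matrix x in
place of the real 671 x 57 data; the names med, dev, z, z_sweep, d and
d_apply are made up for illustration. It checks that scale() with
explicit vectors matches two nested sweep() calls, and shows a
divide-only variant using the median absolute value from the quoted
messages. Note that mad() multiplies by 1.4826 by default, and that
center= and scale= both default to TRUE, so the step you do not want
has to be switched off explicitly.

```r
# Hypothetical toy data standing in for the poster's file
set.seed(1)
x <- matrix(rnorm(20), nrow = 5, ncol = 4)

med <- apply(x, 2, median)  # column medians
dev <- apply(x, 2, mad)     # column MADs (rescaled by 1.4826 by default)

# Subtract the median, then divide by the MAD, column by column
z <- scale(x, center = med, scale = dev)

# The same computation written as two explicit sweep() calls
z_sweep <- sweep(sweep(x, 2, med, "-"), 2, dev, "/")
all.equal(z, z_sweep, check.attributes = FALSE)

# Divide only, by the median absolute value; centering is turned off
mav <- apply(x, 2, function(col) median(abs(col)))
d <- scale(x, center = FALSE, scale = mav)

# Equivalent apply() form, as in the quoted correction
d_apply <- apply(x, 2, function(col) col / median(abs(col)))
all.equal(d, d_apply, check.attributes = FALSE)
```

The result of scale() carries the vectors it used in the
"scaled:center" and "scaled:scale" attributes, which is why the
comparisons above ignore attributes.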
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
On Tue, 25 May 2010, Joris Meys wrote:
> Scale is written to do that IF you want to normalize according to the mean
> and the sd. For any other form of normalization, apply or sweep constructs
> will have to be used.
>
> I couldn't really see a way of using the absolute median value in a
> sweep-statement.
>
> On Tue, May 25, 2010 at 7:11 PM, Bert Gunter <gunter.berton at gene.com> wrote:
>
>> ?scale
>>
>> is specifically written for this. See also ?sweep
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>>
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Joris Meys
>> Sent: Tuesday, May 25, 2010 9:54 AM
>> To: cobbler_squad
>> Cc: r-help at r-project.org
>> Subject: Re: [R] File normalization
>>
>> My code subtracts the median absolute value. If you want to divide by it,
>> the code must be:
>> apply(some_dataset,2,function(x){
>>   x/median(abs(x))
>> })
>>
>>
>> Thanks to Peter Langfelder for pointing out my mistake.
>>
>> On Tue, May 25, 2010 at 6:24 PM, Joris Meys <jorismeys at gmail.com> wrote:
>>
>>> What kind of normalization do you want to do?
>>> If you want to divide all columns by the median absolute value, try :
>>>
>>> apply(some_dataset,2,function(x){
>>> x-median(abs(x))
>>> })
>>>
>>> also look at ?scale for normalization using the average and the sd.
>>> Cheers
>>> Joris
>>>
>>>
>>> On Tue, May 25, 2010 at 6:01 PM, cobbler_squad <la.foma at gmail.com> wrote:
>>>
>>>>
>>>> Dear all,
>>>>
>>>> I have a file with 57 columns (671 time points in each column)
>>>>
>>>> File looks like this:
>>>> 1  0.279191 -1.203200e-02 -0.166772  6.12080e-02  0.196379  4.591900e-02  0.293689
>>>> 2  0.267017 -1.150700e-02 -0.159463  5.85400e-02  0.187775  4.392200e-02  0.280854
>>>> 3  0.053778 -2.322000e-03 -0.032103  1.18490e-02  0.037921  8.867000e-03  0.056571
>>>> 4  0.035469 -1.531000e-03 -0.021166  7.79200e-03  0.024937  5.843000e-03  0.037273
>>>> 5  0.040774 -1.761000e-03 -0.024342  8.96000e-03  0.028674  6.726000e-03  0.042910
>>>> 6 -0.359709  1.547400e-02  0.214844 -7.87320e-02 -0.253034 -5.905100e-02 -0.378322
>>>>
>>>> I need to normalize it -- is it possible?
>>>>
>>>> I looked into normalizing the columns of a matrix by the median absolute
>>>> value in R, but I am not sure how to apply it in this case. I would very
>>>> much appreciate any input you could give me.
>>>>
>>>> Thank you all in advance,
>>>>
>>>> Cobbler
>>>> --
>>>> View this message in context:
>>>> http://r.789695.n4.nabble.com/File-normalization-tp2230251p2230251.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Joris Meys
>>> Statistical Consultant
>>>
>>> Ghent University
>>> Faculty of Bioscience Engineering
>>> Department of Applied mathematics, biometrics and process control
>>>
>>> Coupure Links 653
>>> B-9000 Gent
>>>
>>> tel : +32 9 264 59 87
>>> Joris.Meys at Ugent.be
>>> -------------------------------
>>> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>>>
>>
>
>