[R] [BioC] problem with function

Iain Gallagher iaingallagher at btopenworld.com
Sat Dec 18 16:55:19 CET 2010


Hi Christian, Chuck (and lists)

It seems that the problem may be the strange behaviour of 'unstack' inside a function. 

See this thread in the R mailing list:

http://tolstoy.newcastle.edu.au/R/help/04/03/1160.html

Anyway, I got round the problem by using 'aggregate' instead of converting to a list and then using tapply to sum the values of metric. Probably more efficient as well.
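
As a small illustration of that switch, here is a sketch on made-up data (not the real tables). The names toy, byList and byAggregate are hypothetical, and 'split' stands in for 'unstack' as the list-building step. Summing a metric per gene symbol with 'aggregate' should give the same totals as building a list and summing each element:

```r
# Toy data: made-up symbols and scores, for illustration only
toy <- data.frame(sym    = c("AAK1", "ABCA1", "AAK1", "ABCA2"),
                  metric = c(-0.5, -0.25, -1.5, -1),
                  stringsAsFactors = FALSE)

# List-based route: split metric by symbol, then sum each list element
byList <- sapply(split(toy$metric, toy$sym), sum)

# aggregate route: a single call that returns a data frame
byAggregate <- aggregate(toy$metric, list(toy$sym), sum)
colnames(byAggregate) <- c("symbol", "cumulMetric")

print(byList)
print(byAggregate)
```

Both routes group on the sorted symbols, so the totals line up row for row; 'aggregate' simply saves the intermediate list.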

Thanks for the help offered.

My function now looks like this (for the record!) and behaves as it should.

makeMetric <- function(deMirPresGenes, deMirs){

    # match each miR in deMirPresGenes with its FC to form a vector of FC in the correct order
    fcVector <- as.numeric(with(deMirs, FC[match(deMirPresGenes[,4], Probe)]))

    # multiply FC by context score for each interaction
    metric <- fcVector * as.numeric(deMirPresGenes[,11])

    # cbind with the symbols gives a character matrix, so metric is converted back to numeric below
    geneMetric <- cbind(deMirPresGenes[,2], metric)
    colnames(geneMetric) <- c('sym', 'metric')

    # make cumulative metric per gene symbol with aggregate
    listMetric <- aggregate(as.numeric(geneMetric[,2]), list(geneMetric[,1]), sum) # returns a dataframe
    colnames(listMetric) <- c('symbol', 'cumulMetric')

    # return the whole dataframe
    return(listMetric)
}
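
For the record, the matching and summing steps can be checked on the six example rows posted earlier in the thread. This is only a sketch: genes and up are hypothetical cut-down stand-ins that hold just the columns that matter, addressed by name rather than by position.

```r
# Cut-down stand-in for deMirPresGenes: symbol, miRNA and context_score only
genes <- data.frame(
    Gene.Symbol   = c("AAK1", "ABCA1", "ABCA2", "ABCB10", "ABCE1", "ABCF3"),
    miRNA         = c("hsa-miR-183", "hsa-miR-183", "hsa-miR-495",
                      "hsa-miR-183", "hsa-miR-495", "hsa-miR-1275"),
    context_score = c(-0.135, -0.133, -0.33, -0.466, -0.196, -0.316),
    stringsAsFactors = FALSE)

# Stand-in for deMirs$up (only the three miRNAs that occur above)
up <- data.frame(Probe = c("hsa-miR-183", "hsa-miR-1275", "hsa-miR-495"),
                 FC    = c(2.63, 2.74, 3.13),
                 stringsAsFactors = FALSE)

# match() gives, for each interaction, the row of 'up' holding its miRNA
fcVector <- with(up, FC[match(genes$miRNA, Probe)])
print(fcVector)   # 2.63 2.63 3.13 2.63 3.13 2.74

# weighted score per interaction, then summed per gene symbol
metric <- fcVector * genes$context_score
cumul <- aggregate(metric, list(genes$Gene.Symbol), sum)
colnames(cumul) <- c("symbol", "cumulMetric")
print(cumul)
```

Each symbol occurs only once in these six rows, so cumulMetric here equals the per-interaction metric printed in the earlier debugging output.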

Cheers

i

--- On Sat, 18/12/10, cstrato <cstrato at aon.at> wrote:

> From: cstrato <cstrato at aon.at>
> Subject: Re: [BioC] problem with function
> To: "Iain Gallagher" <iaingallagher at btopenworld.com>
> Cc: "bioconductor" <bioconductor at stat.math.ethz.ch>
> Date: Saturday, 18 December, 2010, 14:40
> You need to do:
> 
> cumulMetric <- function(deMirPresGenes, deMirs){
>     fc <- deMirs
>     fcVector <- as.numeric(with (fc, FC[match(deMirPresGenes[,4], Probe)] ) )
> 
>     metric <- fcVector * as.numeric(deMirPresGenes[,11])
>     geneMetric <- as.data.frame(cbind(deMirPresGenes[,2], as.numeric(metric)))
>     colnames(geneMetric) <- c('y', 'x')
> 
>     listMetric <- unstack(geneMetric, x ~ y)
>     listMetric <- as.data.frame(sapply(listMetric,sum)) #returns a dataframe
>     colnames(listMetric) <- c('cumulMetric')
> 
>     return(listMetric)
> }
> 
> Regards
> Christian
> 
> On 12/17/10 11:52 PM, Iain Gallagher wrote:
> > ok... done. Not really any further forward here.
> >
> > print statements after creating fcVector, metric and geneMetric (see output below). They all look ok in terms of structure and length. But the error persists and listMetric is not made?!?! Odd.
> >
> > I have added some comments to the output below.
> >
> >> tf2<-cumulMetric(tf1, deMirs$up)#deMirs$up is a dataframe (see prev posts)
> > [1] 2.63 2.63 3.13 2.63 3.13 2.74 # print fcVector - looks ok
> > [1] -0.35505 -0.34979 -1.03290 -1.22558 -0.61348 -0.86584 # print metric - looks ok
> > [1] 1045 # length of metric - is correct
> >       sym      metric    # print geneMetric - looks ok
> > [1,] "AAK1"   "-0.35505"
> > [2,] "ABCA1"  "-0.34979"
> > [3,] "ABCA2"  "-1.0329"
> > [4,] "ABCB10" "-1.22558"
> > [5,] "ABCE1"  "-0.61348"
> > [6,] "ABCF3"  "-0.86584"
> > [1] 1045 # nrow of geneMetric - is correct
> > Error in eval(expr, envir, enclos) : object 'geneMetric' not found
> >>
> >
> > cheers
> >
> > i
> > --- On Fri, 17/12/10, cstrato<cstrato at aon.at> wrote:
> >
> >> From: cstrato<cstrato at aon.at>
> >> Subject: Re: [BioC] problem with function
> >> To: "Iain Gallagher"<iaingallagher at btopenworld.com>
> >> Cc: "bioconductor"<bioconductor at stat.math.ethz.ch>
> >> Date: Friday, 17 December, 2010, 22:38
> >> At the moment I have no idea, but what I would do in this case is to put print() statements after each line to see where it fails.
> >>
> >> Christian
> >>
> >> On 12/17/10 10:59 PM, Iain Gallagher wrote:
> >>> Hi
> >>>
> >>> FC is the second column of the deMirs variable. deMirs is a dataframe with 2 columns - Probe (e.g. hsa-miR-145) and FC (e.g. 1.45). Using 'with' allows me to use deMirs as an 'environment'. I thus don't have to pass FC explicitly.
> >>>
> >>> Cheers
> >>>
> >>> i
> >>>
> >>> --- On Fri, 17/12/10, cstrato<cstrato at aon.at> wrote:
> >>>
> >>>> From: cstrato<cstrato at aon.at>
> >>>> Subject: Re: [BioC] problem with function
> >>>> To: "Iain Gallagher"<iaingallagher at btopenworld.com>
> >>>> Cc: "bioconductor"<bioconductor at stat.math.ethz.ch>
> >>>> Date: Friday, 17 December, 2010, 20:39
> >>>> What is FC[]? It is not passed to the function.
> >>>>
> >>>> Christian
> >>>>
> >>>> On 12/17/10 8:11 PM, Iain Gallagher wrote:
> >>>>> Sorry.
> >>>>>
> >>>>> That was a typo. In my script deMirPresGenes1[,4] is deMirPresGenes[,4].
> >>>>>
> >>>>> Just to be sure I'm going about this the right way though, I should say that at the moment I assign the output of another function to a variable called 'tf1' - this object is the same as the deMirPresGenes in my previous email.
> >>>>>
> >>>>> This is then fed to my problem function using positional matching.
> >>>>>
> >>>>> e.g. tf2<- cumulMetric(tf1, deMirs)
> >>>>>
> >>>>> Which leads to:
> >>>>>
> >>>>> Error in eval(expr, envir, enclos) : object 'geneMetric' not found
> >>>>>
> >>>>> Hey ho!
> >>>>>
> >>>>> i
> >>>>>
> >>>>> --- On Fri, 17/12/10, cstrato<cstrato at aon.at> wrote:
> >>>>>
> >>>>>> From: cstrato<cstrato at aon.at>
> >>>>>> Subject: Re: [BioC] problem with function
> >>>>>> To: "Iain Gallagher"<iaingallagher at btopenworld.com>
> >>>>>> Cc: "bioconductor"<bioconductor at stat.math.ethz.ch>
> >>>>>> Date: Friday, 17 December, 2010, 18:40
> >>>>>> I am not sure but I would say that deMirPresGenes1 does not exist.
> >>>>>>
> >>>>>> Regards
> >>>>>> Christian
> >>>>>>
> >>>>>>
> >>>>>> On 12/17/10 6:42 PM, Iain Gallagher wrote:
> >>>>>>> Hello List
> >>>>>>>
> >>>>>>> I wonder if someone would help me with the following function.
> >>>>>>>
> >>>>>>> cumulMetric<- function(deMirPresGenes, deMirs){
> >>>>>>>
> >>>>>>>     #need to match position of each miR in deMirPresGenes with its FC to form a vector of FC in correct order
> >>>>>>>     fc<- deMirs
> >>>>>>>     fcVector<- as.numeric(with (fc, FC[match(deMirPresGenes1[,4], Probe)] ) )
> >>>>>>>
> >>>>>>>     #multiply fc by context score for each interaction
> >>>>>>>     metric<- fcVector * as.numeric(deMirPresGenes[,11])
> >>>>>>>     geneMetric<- cbind(deMirPresGenes[,2], as.numeric(metric))
> >>>>>>>
> >>>>>>>     #make cumul weighted score
> >>>>>>>     listMetric<- unstack(geneMetric, as.numeric(geneMetric[,2])~geneMetric[,1])
> >>>>>>>     listMetric<- as.data.frame(sapply(listMetric,sum)) #returns a dataframe
> >>>>>>>     colnames(listMetric)<- c('cumulMetric')
> >>>>>>>
> >>>>>>>     #return whole list
> >>>>>>>     return(listMetric)
> >>>>>>> }
> >>>>>>>
> >>>>>>> deMirPresGenes looks like this:
> >>>>>>>
> >>>>>>> Gene.ID  Gene.Symbol  Species.ID  miRNA         Site.type  UTR_start  UTR_end  X3pairing_contr  local_AU_contr  position_contr  context_score  context_percentile
> >>>>>>> 22848    AAK1         9606        hsa-miR-183   2          1546       1552     -0.026           -0.047          0.099           -0.135         47
> >>>>>>> 19       ABCA1        9606        hsa-miR-183   2          1366       1372     -0.011           -0.048          0.087           -0.133         46
> >>>>>>> 20       ABCA2        9606        hsa-miR-495   2          666        672      -0.042           -0.092          -0.035          -0.33          93
> >>>>>>> 23456    ABCB10       9606        hsa-miR-183   3          1475       1481     0.003            -0.109          -0.05           -0.466         98
> >>>>>>> 6059     ABCE1        9606        hsa-miR-495   2          1474       1480     0.005            -0.046          0.006           -0.196         58
> >>>>>>> 55324    ABCF3        9606        hsa-miR-1275  3          90         96       0.007            0.042           -0.055          -0.316         94
> >>>>>>>
> >>>>>>>
> >>>>>>> The aim of the function is to extract a dataframe of gene symbols along with a weighted score from the above data. The weighted score is the FC column of deMirs * the context_score column of deMirPresGenes. This is easy peasy!
> >>>>>>>
> >>>>>>> Where I'm falling down is that if I run this function it complains that 'geneMetric' can't be found. Hmm - I've run it all line by line (i.e. not as a function) and it works but wrapped up like this it fails!
> >>>>>>>
> >>>>>>> e.g.
> >>>>>>>
> >>>>>>>> testF2<- cumulMetric(testF1, deMirs$up)
> >>>>>>> Error in eval(expr, envir, enclos) : object 'geneMetric' not found
> >>>>>>>
> >>>>>>> deMirs$up looks like this:
> >>>>>>>
> >>>>>>> Probe           FC
> >>>>>>> hsa-miR-183     2.63
> >>>>>>> hsa-miR-1275    2.74
> >>>>>>> hsa-miR-495     3.13
> >>>>>>> hsa-miR-886-3p  3.73
> >>>>>>> hsa-miR-886-5p  3.97
> >>>>>>> hsa-miR-144*    6.62
> >>>>>>> hsa-miR-451     7.94
> >>>>>>>
> >>>>>>> Could someone possibly point out where I am falling down?
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> i
> >>>>>>>
> >>>>>>>> sessionInfo()
> >>>>>>> R version 2.12.0 (2010-10-15)
> >>>>>>> Platform: x86_64-pc-linux-gnu (64-bit)
> >>>>>>>
> >>>>>>> locale:
> >>>>>>>  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
> >>>>>>>  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
> >>>>>>>  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
> >>>>>>>  [7] LC_PAPER=en_GB.utf8       LC_NAME=C
> >>>>>>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> >>>>>>> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
> >>>>>>>
> >>>>>>> attached base packages:
> >>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>>>>>>
> >>>>>>> loaded via a namespace (and not attached):
> >>>>>>> [1] tools_2.12.0
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Bioconductor mailing list
> >>>>>>> Bioconductor at r-project.org
> >>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>>>>>




