[R] Distribution

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Tue Feb 22 01:52:48 CET 2005


You can read in the data using read.delim() or read.table(). For
illustration, let us generate some artificial data and suppose that you
are interested in equal-sized bins of width 5 (you can define your own
break points instead).

   x   <- rchisq(500000, df=10, ncp=5)
   brk <- seq(0, 5*ceiling(max(x)/5), by=5) # increments of size 5
   h   <- hist(x, breaks=brk, plot=FALSE)

h$breaks and h$counts give you the break points and the counts, but I
always have trouble matching the counts to the intervals they belong to.
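
One way to make the matching explicit is to put the lower and upper
break points next to each count. This is only a sketch: the data frame
'tab' and its column names are my own choice, and set.seed() is there
only to make the example reproducible. With the hist() defaults
(right=TRUE, include.lowest=TRUE) the intervals are (lower, upper],
except that the first one also includes its lower end.

```r
set.seed(1)                                   # reproducibility only
x   <- rchisq(1000, df = 10, ncp = 5)         # smaller sample than above
brk <- seq(0, 5 * ceiling(max(x) / 5), by = 5)
h   <- hist(x, breaks = brk, plot = FALSE)

# counts[i] belongs to the interval from breaks[i] to breaks[i + 1]
tab <- data.frame(lower = head(h$breaks, -1),  # all breaks but the last
                  upper = h$breaks[-1],        # all breaks but the first
                  count = h$counts)
tab
```

There is always one more break point than there are counts, which is
why the first and last break points are dropped in turn.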


An easier way is to use cut() followed by table(), since the labels
produced by cut() show each interval explicitly.

   table( cut( x, breaks=brk ) )
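
The same idea works with your own break points. As a small sketch, here
are a few made-up intensity values (not real data) binned into ranges
like those in the original question. Note that cut() uses intervals of
the form (a, b] by default, and values outside all intervals become NA,
which table() drops unless asked otherwise.

```r
intensity <- c(60, 149.5, 150, 151, 250, 300)   # made-up values
rng <- cut(intensity, breaks = c(50, 150, 250)) # (50,150] and (150,250]

table(rng)                    # 300 falls outside the breaks and is dropped
table(rng, useNA = "ifany")   # also reports the out-of-range value as NA
```

Here 150 is counted in (50,150] and 151 in (150,250], because the upper
end of each interval is included.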

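As a bonus, you can simplify specifying the break points by including
Inf as the last break point in cut(). seq() stops at the last multiple
of 5 that does not exceed max(x), and the final interval ending in Inf
catches the values above it.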

   brk2 <- seq(0, max(x), by=5) # increments of size 5
   table( cut( x, breaks=c(brk2, Inf) ) )
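
Putting it together for data shaped like the sample in the original
question, something along these lines should work. This is a sketch
under assumptions: the first six rows of the posted sample are pasted
in through the text= argument so that the example is self-contained,
the file is assumed to be whitespace-separated with a header line, and
the bin width of 5000 is an arbitrary choice of mine. For a real file
you would give read.table() the file name instead of text=.

```r
# First six rows of the sample posted in the question
dat <- read.table(text = "Probe Intensity
0:0 501.0
1:0 17760.5
2:0 511.0
3:0 18468.3
4:0 199.8
5:0 508.0", header = TRUE)

# Bins of width 5000 (arbitrary), with Inf closing the last interval
brk2 <- seq(0, max(dat$Intensity), by = 5000)
cnt  <- table(cut(dat$Intensity, breaks = c(brk2, Inf)))
cnt
```

Every intensity lands in some interval because the last one is open
ended, so the counts add up to the number of rows read.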


Regards, Adai


On Mon, 2005-02-21 at 18:44 -0500, Sean Davis wrote:
> Srini
> 
> You should probably look at ?hist.  If you look at the "value" section, you 
> will see that you can get the information you want from the values returned 
> from hist.  If these are microarray probes and intensities, there may be 
> specific methods for visualizing the data available from the bioconductor 
> project (www.bioconductor.org).
> 
> Hope this helps,
> Sean
> 
> ----- Original Message ----- 
> From: "Srinivas Iyyer" <srini_iyyer_bio at yahoo.com>
> To: "Rhelp" <r-help at stat.math.ethz.ch>
> Sent: Monday, February 21, 2005 6:21 PM
> Subject: [R] Distribution
> 
> 
> > Dear group,
> > apologies for asking a simple question. I have a file
> > where the data looks like this:
> > Probe    Intensity
> > 0:0 501.0
> > 1:0 17760.5
> > 2:0 511.0
> > 3:0 18468.3
> > 4:0 199.8
> > 5:0 508.0
> > 6:0 17241.8
> > 7:0 507.5
> > 8:0 17910.0
> > 9:0 482.5
> > 10:0 17480.3
> > 11:0 434.0
> > 12:0 17631.3
> > 13:0 444.8
> > 14:0 17423.0
> > 15:0 505.3
> > 16:0 16693.0
> > 17:0 438.5
> > 18:0 16920.0
> > 19:0 491.3
> > 20:0 16878.0
> > 21:0 486.3
> > 22:0 16582.0
> > 23:0 483.8
> > 24:0 16694.8
> > 25:0 452.3
> > 26:0 16221.5
> > 27:0 438.3
> > 28:0 17119.8
> > 29:0 455.5
> > 30:0 16579.0
> > 31:0 424.5
> > 32:0 16691.3
> > 33:0 472.0
> >
> >
> > My question is how do I know the distribution of the
> > intensities. My aim is to find out the number of
> > intensities or probes that fall in a certain range.
> >
> > For example, 500 probes have intensities ranging from 50
> > to 150.
> >
> > 300 probes have intensities ranging from 151 to 250.
> >
> > I have no clue how to do it for 500,000 probes. Can
> > anyone please help me do it in R?
> >
> > thanks and apologies again
> >
> > srini
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >



