[BioC] Bioconductor training course, Boston March 5-6-7 2008

Vincent Carey 525-2265 stvjc at channing.harvard.edu
Fri Feb 1 20:26:43 CET 2008


Tentative announcement:  This course will be withdrawn if
there is insufficient interest.

Three day course on Bioconductor (intermediate level)
Instructor: Vincent Carey, Ph.D.

March 5, 6, 7, 9am - 5pm each day
Inn at Longwood, Boston, Massachusetts
342 Longwood Ave, Boston, MA 02115

Tuition: $600 academic, $1200 commercial

Registration form: http://www.biostat.harvard.edu/~carey/form08.pdf

Questions: stvjc at channing.harvard.edu -- please do not post questions
on this course to the list

A block of sleeping rooms will be available at Inn at Longwood
at approximately $189/night; contact 617 731 4700 after
Feb 5 and mention "Bioconductor conference".


This course provides a hands-on survey of Bioconductor tools
for working with genome scale data.  The material targets students
with reasonable facility with R at the command line who wish
to get acquainted with data analysis for various experimental
paradigms.  We will cover, among other things:
  - the MAQC experimental design and platforms
  - the oligo package and new facilities for dealing with
       Affymetrix chips (expression and DNA)
  - Illumina expression and SNP chip data
  - SQLite facilities for biological metadata and platform
       annotation
  - the MLInterfaces package for supervised learning
  - the GGtools package for genetics of gene expression

Students who successfully complete the course will be able
  - to transform raw outputs from Affymetrix and Illumina platforms
       into analyzable ExpressionSets or allied containers,
  - to apply various forms of statistical analysis to answer
       questions about differential expression and genotype effects
       in genome-scale data, and
  - to use various annotation resources such as GO and KEGG to
       help interpret patterns in genome-scale data,
using only transparent and fully open source software.

Requirements:

  * prerequisites: There will be very little background material provided
on either R or the assays to be studied.  We are focusing on working
with digital artifacts of experiments (possibly retrieved from GEO, or
from a core, to which we may apply some QA, or which we accept as
valid numerical data).  If you have no prior experience with R but are
interested in the course, be sure to have read Dalgaard, "Introductory
Statistics with R" (Springer) and/or the introductory material on
www.r-project.org.

  * equipment: Every student must bring a reasonably modern laptop
computer with a DVD drive or a USB port to allow installation of
several GB of software and data.  All software and data are supplied
for Windows machines so that all students have identical working
environments.  Mac or Linux laptops may be used, but students using
these will be expected to have good mastery of their operating
system so that the majority of students, who use Windows, will not
be distracted by idiosyncratic support requests.

Format:  Each major topic is addressed in a brief lecture.
A handout is provided with specific exercises and hints/partial
solutions.  Students work independently or in teams to solve
exercises; the module concludes with discussion of the solution.

Tentative curriculum

Day 1:
  * morning: four technologies in 'cooked' form
    - transcript profiling: Affymetrix, Illumina
    - ChIP-chip (yeast)
    - SNP-chips + expression
    - aCGH + expression

  * mid-day: containers: structure, population, methods
    - arrays
    - gene sets
    - browser tracks

  * afternoon: workflow components I
    - capture
    - QA
    - preprocessing

Day 2:
  * morning: workflow components II: annotation resources
    - SQLite representations of array and general metadata annotations
    - web services

  * mid-day: statistical analysis concepts
    - categorical methods
    - limma and other regularized methods
    - multiple comparisons

  * afternoon: exploratory tools: visualization, PCA, clustering

Day 3:
  * morning: exercises: MAQC, spike ins, genetics of gene expression

  * mid-day: category and enrichment analyses; supervised learning (MLInterfaces)

  * afternoon: reports and audits; reproducible research
    - Sweave/odfWeave
