Boston Bioconductor Course, Oct 24th, 25th 2011

Main course website: Boston Bioconductor


Exploratory Data Analysis

  • Importance of EDA
  • Clustering Data using hierarchical cluster analysis
  • Dimension reduction using principal components analysis
  • Interpreting Results of EDA

Exploratory Data Analysis (slides)
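
The two techniques listed above can be sketched in base R as follows. This is only an illustration on simulated data, not the course exercise: the matrix `expr`, its dimensions, the group effect and the choice of correlation distance with average linkage are all assumptions made for the example.

```r
# Simulated stand-in for an expression matrix: 100 genes (rows) x 6 samples (columns).
set.seed(1)
geneEffect <- rnorm(100, sd = 2)                 # shared gene-level signal
expr <- matrix(geneEffect + rnorm(600), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))
# Shift the first 30 genes in samples 4-6 so the data contain two sample groups.
expr[1:30, 4:6] <- expr[1:30, 4:6] + 4

# Hierarchical clustering of samples, using 1 - Pearson correlation as the distance.
sampleDist <- as.dist(1 - cor(expr))
hc <- hclust(sampleDist, method = "average")
plot(hc, main = "Samples clustered by correlation distance")
groups <- cutree(hc, k = 2)                      # cut the tree into two clusters

# PCA treats rows as observations, so transpose to put samples in rows.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
varExplained <- pca$sdev^2 / sum(pca$sdev^2)     # proportion of variance per component
plot(pca$x[, 1:2], col = groups, pch = 19,
     xlab = sprintf("PC1 (%.0f%%)", 100 * varExplained[1]),
     ylab = sprintf("PC2 (%.0f%%)", 100 * varExplained[2]),
     main = "Samples on the first two principal components")
```

Comparing the dendrogram with the PCA plot is a quick way to see whether both methods suggest the same grouping of samples.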


Exercise/tutorial

## Install R packages for this tutorial

install.packages("RCurl")
install.packages("gplots")
install.packages("scatterplot3d")

## Install Bioconductor Packages
source("http://www.bioconductor.org/biocLite.R")
#biocLite()
biocLite("made4")
biocLite("hgu95av2.db")
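
On recent R installations the biocLite.R script is no longer the supported installer; later Bioconductor releases use the BiocManager package from CRAN instead. The sketch below first checks which of the packages named above are missing and only installs when the switch `doInstall` is set. That switch is a hypothetical name introduced here so that the snippet does nothing unexpected when pasted.

```r
# Packages used in this tutorial
cranPkgs <- c("RCurl", "gplots", "scatterplot3d")
biocPkgs <- c("made4", "hgu95av2.db")

# Work out which ones are not installed yet
installed <- rownames(installed.packages())
missingCran <- setdiff(cranPkgs, installed)
missingBioc <- setdiff(biocPkgs, installed)
print(missingCran)
print(missingBioc)

doInstall <- FALSE  # assumption: hypothetical switch, set to TRUE to install
if (doInstall) {
  if (length(missingCran) > 0) install.packages(missingCran)
  # BiocManager replaces biocLite() in later Bioconductor releases
  if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
  if (length(missingBioc) > 0) BiocManager::install(missingBioc)
}
```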

Exercise Data Files

  • Normalised (vsn) fibroblast data: data.vsn.csv. To load it into R from a local copy, or directly from the course site, use
      read.csv("data.vsn.csv", row.names=1, as.is=TRUE)
      read.csv("http://bcb.dfci.harvard.edu/~aedin/courses/BostonBioc/EDA/data.vsn.csv", row.names=1, as.is=TRUE)
  • Sample annotation: annt.txt. If your browser opens this file when you click on it, right-click and select "Save as". To read it into R from a local copy, or directly from the course site, use
    read.table("annt.txt", header=TRUE)
    read.table("http://bcb.dfci.harvard.edu/~aedin/courses/BostonBioc/EDA/annt.txt", header=TRUE)
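
After reading both files it is worth checking that the samples in the annotation line up with the columns of the data. The sketch below uses tiny inline stand-ins instead of the real files, so the values and the column names `Sample` and `Group` are made up for illustration. The real annt.txt may use different column names.

```r
# Made-up stand-ins for data.vsn.csv and annt.txt
csvText <- "ID,s1,s2,s3
probeA,7.1,7.3,9.0
probeB,5.2,5.0,5.1"
anntText <- "Sample Group
s1 control
s2 control
s3 treated"

# Same arguments as in the commands above; text= replaces the file name
data.vsn <- read.csv(text = csvText, row.names = 1, as.is = TRUE)
annt <- read.table(text = anntText, header = TRUE)

dim(data.vsn)   # number of probes and samples
str(annt)       # columns of the sample annotation

# Samples in the annotation should match the data columns, in the same order
samplesMatch <- all(annt$Sample == colnames(data.vsn))
print(samplesMatch)
```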

Reproducible Research

  • Importance of Reproducible Research
  • Why should we perform reproducible research?
  • A survey of reproducibility, case studies
  • Using Sweave

Reproducible Research (slides)

Manual on producing documents using Sweave and creating Bioconductor packages (pdf)

Exercise Data Files

Sweave style file: Sweave.sty

Load the rnw file, then use the commands Stangle and Sweave to extract the R code and the tex file respectively

  • Example Sweave File (rnw file; edit this one): exampleSweave.rnw
  • Results of Sweave("exampleSweave.rnw") on Example Sweave File (tex file; do not edit): exampleSweave.tex
  • Results of tools::texi2dvi("exampleSweave.tex", pdf=TRUE) on tex file: exampleSweave.pdf
  • Results of Stangle("exampleSweave.rnw") on Example Sweave File: exampleSweave.R
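
The steps above can be tried end to end on a throwaway file. The sketch below writes a minimal rnw file into a temporary directory and runs Stangle and Sweave on it. The file name demo.rnw and its contents are made up for this example and are not the course's exampleSweave.rnw.

```r
# Work in a temporary directory so nothing is left behind
workDir <- tempfile("sweaveDemo")
dir.create(workDir)
oldDir <- setwd(workDir)

# A minimal Sweave document: one code chunk and one inline expression
rnwLines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<sum, echo=TRUE>>=",
  "x <- 1 + 2",
  "x",
  "@",
  "The value of x is \\Sexpr{x}.",
  "\\end{document}"
)
writeLines(rnwLines, "demo.rnw")

Stangle("demo.rnw")   # extracts the R code chunks into demo.R
Sweave("demo.rnw")    # runs the chunks and writes demo.tex

texLines <- readLines("demo.tex")
print(list.files())

# Compiling to pdf needs a LaTeX installation, so it is left commented out
# tools::texi2dvi("demo.tex", pdf = TRUE)

setwd(oldDir)
```

Only the rnw file should be edited by hand; the R and tex files are regenerated each time.
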
Useful within text

Add R code within text using \Sexpr, for example

\Sexpr{1+2}

Sweave is not restricted to tex/pdf

  • Sweave can also produce HTML output. Edit a basic HTML template document and then use the HTML Sweave driver RweaveHTML, which is provided by the R2HTML package:
    library(R2HTML)
    Sweave("filename.rnw", driver=RweaveHTML)
    
  • You can also use odfWeave to embed R code and its output in OpenOffice (ODF) documents.


Experimental Design

Experimental design (slides)


Instructor information

Aedin Culhane contact: aedin@jimmy.harvard.edu

Additional Resources and Manuals

(these will not be covered in the course, but may be helpful if you are new to R)

Lecture notes from Bio503 Programming and Statistical Modeling in R (Jan 2011)


biomaRt

biomaRt is an excellent resource for gene annotation. It has an extensive manual. I will provide one simple example here. Remember, if you can't remember what is in biomart, or need help getting started, use its web interface to find your attributes and filters.
 
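library(biomaRt)
# Connect to the Ensembl BioMart and select the human gene dataset
mart <- useMart("ensembl")
mart <- useDataset("hsapiens_gene_ensembl", mart)
# Attribute and filter names depend on the Ensembl release; check listAttributes(mart) and listFilters(mart)
geneAnnt <- getBM(attributes = c("affy_hg_u95av2", "hgnc_symbol", "chromosome_name", "band", "entrezgene"),
                  filters = "affy_hg_u95av2",
                  values = c("1939_at", "1503_at", "1454_at"), mart = mart)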
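
getBM returns a data.frame, and a probe set may come back on more than one row if it maps to several genes. The sketch below shows one way to spot and collapse such rows. It runs on a hand-made stand-in, `geneAnntDemo`, whose gene symbols are placeholders rather than real annotation, so it needs no network connection.

```r
# Hypothetical stand-in for a getBM() result (placeholder symbols, not real annotation)
geneAnntDemo <- data.frame(
  affy_hg_u95av2 = c("1939_at", "1939_at", "1503_at", "1454_at"),
  hgnc_symbol = c("SYMBOL_A", "SYMBOL_B", "SYMBOL_C", "SYMBOL_D"),
  stringsAsFactors = FALSE
)

# Probe sets that appear on more than one row
probeIds <- geneAnntDemo$affy_hg_u95av2
dupProbes <- unique(probeIds[duplicated(probeIds)])
print(dupProbes)

# Collapse to one row per probe set, joining the symbols with ";"
collapsed <- aggregate(hgnc_symbol ~ affy_hg_u95av2, data = geneAnntDemo,
                       FUN = function(s) paste(unique(s), collapse = ";"))
print(collapsed)
```

Whether to collapse, keep the first match, or drop ambiguous probe sets depends on the downstream analysis.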

Make a basic eSet from text files of data and annotation


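library(Biobase)

# Build an ExpressionSet from expression data (matrix, data.frame or ExpressionSet)
# and a sample annotation data.frame whose row names match the data column names
makeEset <- function(eSet, annt) {
    metadata <- data.frame(labelDescription = colnames(annt), row.names = colnames(annt))
    phenoData <- new("AnnotatedDataFrame", data = annt, varMetadata = metadata)
    # pData(eSet) = pData(phenoData)
    # Convert the input to a plain expression matrix
    if (inherits(eSet, "data.frame")) eSet <- as.matrix(eSet)
    if (inherits(eSet, "ExpressionSet")) eSet <- exprs(eSet)
    data.eSet <- new("ExpressionSet", exprs = eSet, phenoData = phenoData)
    print(varLabels(data.eSet))
    return(data.eSet)
}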
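
An ExpressionSet is only valid when the row names of the annotation match the column names of the expression matrix, so it helps to check this before building one. The sketch below does that on made-up data (`exprsMat`, the sample names and the `Group` column are assumptions). It builds the object with the Biobase constructor functions rather than makeEset, so that it stands alone, and skips that step if Biobase is not installed.

```r
# Made-up expression matrix: 2 probes x 3 samples
exprsMat <- matrix(c(7.1, 5.2, 7.3, 5.0, 9.0, 5.1), nrow = 2,
                   dimnames = list(c("probeA", "probeB"), c("s1", "s2", "s3")))
# Made-up sample annotation, one row per sample, row names = sample names
annt <- data.frame(Group = c("control", "control", "treated"),
                   row.names = c("s1", "s2", "s3"))

# Sample names must agree, in the same order
namesMatch <- identical(rownames(annt), colnames(exprsMat))
stopifnot(namesMatch)

if (requireNamespace("Biobase", quietly = TRUE)) {
  pheno <- Biobase::AnnotatedDataFrame(data = annt)
  eset <- Biobase::ExpressionSet(assayData = exprsMat, phenoData = pheno)
  print(dim(Biobase::exprs(eset)))   # probes x samples
  print(Biobase::pData(eset))        # the sample annotation
}
```

With the real course files, the same check applies to the data read from data.vsn.csv and annt.txt before passing them to makeEset.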

Bioconductor Resources


Updated Oct 2011. Aedin Culhane