Main Course Website Boston Bioconductor
Exploratory Data Analysis
- Importance of EDA
- Clustering Data using hierarchical cluster analysis
- Dimension reduction using principal components analysis
- Interpreting Results of EDA
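The two techniques listed above can be sketched in base R as follows. This is a minimal illustration on simulated data, not the course exercise itself: the matrix `expr`, the group labels A and B, and the choice of correlation distance with average linkage are all assumptions made for the example.

```r
set.seed(1)

# Simulated stand-in for an expression matrix: 200 genes (rows) x 6 samples (columns).
# Sample names A1..A3 and B1..B3 are hypothetical group labels.
n_genes <- 200
expr <- matrix(rnorm(n_genes * 6), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0(rep(c("A", "B"), each = 3), 1:3)))

# Build in a group effect: genes 1-50 are higher in group B,
# genes 51-100 are higher in group A
expr[1:50, 4:6]   <- expr[1:50, 4:6] + 3
expr[51:100, 1:3] <- expr[51:100, 1:3] + 3

# Hierarchical clustering of samples, using 1 - Pearson correlation as distance
sample_dist <- as.dist(1 - cor(expr))
hc <- hclust(sample_dist, method = "average")
clusters <- cutree(hc, k = 2)

# Principal components analysis with samples as observations
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Dendrogram of samples, and samples projected on the first two components
plot(hc, main = "Samples clustered by correlation distance")
plot(pca$x[, 1:2], col = clusters, pch = 19,
     main = "Samples on PC1 and PC2")
text(pca$x[, 1:2], labels = rownames(pca$x), pos = 3)
```

When interpreting the results, it is worth comparing the dendrogram and the PCA plot against the known sample annotation: agreement between the two views and the experimental groups suggests the main source of variation is biological rather than technical.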
Exploratory Data Analysis (slides)
Exercise/tutorial
## Install R packages for this tutorial
install.packages("RCurl")
install.packages("gplots")
install.packages("scatterplot3d")
## Install Bioconductor Packages
source("http://www.bioconductor.org/biocLite.R")
#biocLite()
biocLite("made4")
biocLite("hgu95av2.db")
Exercise Data Files
- Normalised (vsn) fibroblast data: data.vsn.csv. To load it directly into R, use either the local file or the URL:
read.csv("data.vsn.csv", row.names=1, as.is=TRUE)
read.csv("http://bcb.dfci.harvard.edu/~aedin/courses/BostonBioc/EDA/data.vsn.csv", row.names=1, as.is=TRUE)
- Sample annotation: annt.txt. If your browser opens the file when you click it, right-click the link and select "Save as".
To read it directly into R, use either the local file or the URL:
read.table("annt.txt", header=TRUE)
read.table("http://bcb.dfci.harvard.edu/~aedin/courses/BostonBioc/EDA/annt.txt", header=TRUE)
- Exploratory Data Analysis Exercise (pdf)
- This code is available in the following R script
- Sweave (rnw) file
- Results of html annotation from annaffy
example.html
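Before starting the exercise it can help to confirm that the two files were read as intended. The sketch below mimics the `read.csv` and `read.table` calls above on tiny temporary files so that it runs without a download. The file contents, the sample names S1 to S3, and the annotation column names `Sample` and `Treatment` are hypothetical; the real annt.txt may be laid out differently.

```r
# Hypothetical miniature versions of data.vsn.csv and annt.txt,
# written to temporary files
csv_file  <- tempfile(fileext = ".csv")
annt_file <- tempfile(fileext = ".txt")

writeLines(c(
  ",S1,S2,S3",
  "1939_at,7.1,7.4,6.9",
  "1503_at,8.2,8.0,8.3"
), csv_file)

writeLines(c(
  "Sample Treatment",
  "S1 control",
  "S2 treated",
  "S3 treated"
), annt_file)

# Same arguments as in the exercise: the first column becomes the row names
data.vsn <- read.csv(csv_file, row.names = 1, as.is = TRUE)
annt     <- read.table(annt_file, header = TRUE)

# Quick checks: dimensions, probe IDs as row names, annotation structure
print(dim(data.vsn))
print(rownames(data.vsn))
str(annt)

# Each data column should have a matching annotation row, in the same order
samples_match <- identical(colnames(data.vsn), as.character(annt$Sample))
print(samples_match)
```

If the sample names do not match, reorder or subset the annotation before combining it with the expression data.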
Reproducible Research
- Importance of Reproducible Research
- Why should we perform reproducible research
- A survey of reproducibility, case studies
- Using Sweave
Reproducible Research (slides)
Manual on producing documents using Sweave and creating Bioconductor packages (pdf)
Exercise Data Files
Sweave style file... Sweave.sty
Load the rnw file, then use the commands Stangle and Sweave to produce the R and tex files, respectively.
- Example Sweave File (rnw file; edit this one): exampleSweave.rnw
- Results of Sweave("exampleSweave.rnw") on Example Sweave File (tex file; do not edit): exampleSweave.tex
- Results of tools::texi2dvi("exampleSweave.tex", pdf=TRUE) on tex file exampleSweave.pdf
- Results of Stangle("exampleSweave.rnw") on Example Sweave File exampleSweave.R
Add R code within text
\Sexpr{1+2}
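The Stangle and Sweave steps described above can be tried end to end with a throwaway file. This is a minimal sketch: the file name `demo.Rnw`, its contents, and the chunk label are made up for illustration and are not the course's exampleSweave.rnw.

```r
# Work in a temporary directory so no existing files are touched
workdir <- tempfile("sweave_demo")
dir.create(workdir)
oldwd <- setwd(workdir)

# A tiny, hypothetical Sweave source file: one code chunk and one
# inline expression
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "A code chunk:",
  "<<sum-chunk, echo=TRUE>>=",
  "x <- 1 + 2",
  "x",
  "@",
  "Inline result: \\Sexpr{x}.",
  "\\end{document}"
)
writeLines(rnw, "demo.Rnw")

Stangle("demo.Rnw")  # extracts the R code into demo.R
Sweave("demo.Rnw")   # runs the code and writes the LaTeX file demo.tex

r_code <- readLines("demo.R")
tex    <- readLines("demo.tex")

# The inline expression has been replaced by its value in the tex file
print(grep("Inline result", tex, value = TRUE))

setwd(oldwd)
```

The resulting tex file can then be compiled with tools::texi2dvi("demo.tex", pdf=TRUE), which requires a working LaTeX installation.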
Sweave is not restricted to tex/pdf
- Sweave can produce html output. Just edit a basic html template document
and then use the html Sweave driver from the R2HTML package:
library(R2HTML)
Sweave("filename.rnw", driver=RweaveHTML)
- You can also use odfWeave to embed R code and its output in OpenOffice documents.
Experimental Design
Instructor information
Aedin Culhane contact: aedin@jimmy.harvard.edu
Additional Resources and Manuals
(these will not be covered in the course, but may be helpful if you are new to R)
Lecture notes from Bio503 Programming and Statistical Modeling in R (Jan 2011)
biomaRt
biomaRt is an excellent resource for gene annotation and has an extensive manual. I will provide one simple example here. Remember, if you can't remember how to use biomaRt or need help getting started, use its web interface to find your attributes and filters.
library(biomaRt)
# Connect to Ensembl and select the human gene dataset
mart <- useMart("ensembl")
mart <- useDataset("hsapiens_gene_ensembl", mart)
# Look up symbol, location and Entrez ID for three hgu95av2 probe sets
geneAnnt <- getBM(attributes=c("affy_hg_u95av2", "hgnc_symbol", "chromosome_name", "band", "entrezgene"),
                  filters="affy_hg_u95av2",
                  values=c("1939_at", "1503_at", "1454_at"),
                  mart=mart)
Make a basic eSet from text files of data and annotation
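library(Biobase)  # provides ExpressionSet, AnnotatedDataFrame, exprs, varLabels
makeEset <- function(eSet, annt) {
  # Describe each annotation column (here simply by its own name)
  metadata <- data.frame(labelDescription = colnames(annt), row.names = colnames(annt))
  phenoData <- new("AnnotatedDataFrame", data = annt, varMetadata = metadata)
  # pData(eSet) = pData(phenoData)
  # Accept a data.frame or an existing ExpressionSet and reduce it to a matrix
  if (inherits(eSet, "data.frame")) eSet <- as.matrix(eSet)
  if (inherits(eSet, "ExpressionSet")) eSet <- exprs(eSet)
  data.eSet <- new("ExpressionSet", exprs = eSet, phenoData = phenoData)
  print(varLabels(data.eSet))
  return(data.eSet)
}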
Bioconductor Resources
- An Introduction to R and Bioconductor. Includes information about installation and getting help. Basic Introduction to R and Bioconductor
- Bioconductor Courses
- An excellent starter for Affymetrix data analysis: Jean Wu’s lab on Affymetrix data analysis
- Guide to importing GEO soft data files into Bioconductor
- Thomas Girke’s (UC Riverside) introduction to R and Bioconductor
Updated Oct 2011. Aedin Culhane