Some Simple Reports in R
=======================================================

We will look at some of the summary methods in R. This document will be available as a markdown document, so you can use it to create MS Office, PDF or HTML report files from your own data.

# Define datasets

```{r defineData}
data(mtcars)
df <- mtcars
dim(df)
```

```{r loadLibraries}
library(gmodels)
library(Hmisc)
library(ade4)
library(markdown)
library(knitr)
```

View data

```{r viewData}
# The spreadsheet-style viewer is only useful in an interactive session
if (interactive()) View(df)
head(df)
tail(df)
str(df)
```

Basic Summary

```{r summary}
summary(df)
```

Using the describe function

```{r describe}
library(Hmisc)
describe(df)
```

1, 2 and 3-way Cross Tabulations
==================

```{r basicTable}
table(df$cyl)
table(df$cyl, df$gear)
# Number of cylinders, number of gears, transmission type
table(df$cyl, df$gear, df$am)
```

Cross tabulation using formula format

```{r CrossTabs2}
# A one-sided formula counts observations;
# a variable on the left-hand side would be summed instead
xtabs(~ cyl + gear, df)
xtabs(~ cyl + gear + am + vs, df)
```

Create Contingency Table

```{r contingency}
# Run ?ftable in an interactive session to open the help page
ftable(df$cyl, df$vs, df$am, df$gear, row.vars = c(2, 4), dnn = c("Cylinders", "V/S", "Transmission", "Gears"))
ftable(df$cyl, df$vs, df$am, df$gear, row.vars = c(2, 3), dnn = c("Cylinders", "V/S", "Transmission", "Gears"))
```

2-way cross tabulation in SAS format

```{r CrossTabSAS}
library(gmodels)
CrossTable(df$cyl, df$gear, format = "SAS")
CrossTable(df$cyl, df$gear, expected = TRUE, format = "SAS")
```

2-way cross tabulation in SPSS format

```{r CrossTabSPSS}
library(gmodels)
CrossTable(df$cyl, df$gear, format = "SPSS")
CrossTable(df$cyl, df$gear, expected = TRUE, format = "SPSS")
```

Categorical Data
=================

The *vcd* library is very useful for visualizing categorical data.

Some Plots for Exploring Data
=================================

- scatterplot

```{r scatterplot}
# attach() makes the columns of df available by name in the chunks below
attach(df)
plot(qsec, mpg, col = cyl, pch = 19, main = "Miles per gallon by 1/4 mile time (by cylinder)")
legend("topleft", legend = unique(cyl), fill = unique(cyl))
```

- boxplot

```{r boxplot}
# Boxes are drawn in factor-level order (4, 6, 8), so sort the colours to match
plot(qsec ~ factor(cyl), col = sort(unique(cyl)))
```

- boxplot all of the columns

```{r boxplotALL}
boxplot(df)
```

- Correlation across all pairs of columns

```{r pairs}
plot(df)
```

Or calculate the correlations and view them as a heatmap

```{r heatmap}
heatmap(cor(df))
```

Basic principal component analysis

```{r prcomp}
res <- prcomp(df)
screeplot(res)
biplot(res)
```

Or using fast.prcomp from gmodels (optimized for big, wide datasets)

```{r fastprcomp}
res <- fast.prcomp(df)
# A prcomp object stores the scores in $x and the loadings in $rotation
s.class(res$x[, 1:2], factor(cyl), col = unique(cyl))
s.arrow(res$rotation[, 1:2])
```

```{r dudi.pca}
library(ade4)
res <- dudi.pca(df, scannf = FALSE)
par(mfrow = c(2, 2))
barplot(res$eig)
s.class(res$li, factor(cyl))
s.label(res$co)
s.label(res$li, clabel = 0.5)
```

Missing Data
===============

```{r missing}
# Set two randomly chosen columns to NA in two randomly chosen rows
df[sample(1:nrow(df), 2), sample(1:ncol(df), 2)] <- NA
summary(df)
```

Analyzing >1 Dataset
=========================

Often we have 2 or more tables, reflecting either different time points for the same sample population or different measurements on the same population.

*Merge Data*

There are several functions for manipulating data; see the plyr library for examples. Also see the functions reshape and stack, which make it easier to convert a "wide" table into a narrow one.

```{r merge}
x1 <- data.frame(Case = sample(letters, 10), A1 = rnorm(10), B1 = 1:10, C1 = rep(1:5, 2))
x1
x2 <- data.frame(A1 = seq(1, 10, 2), Case = sample(letters, 10), D1 = rnorm(10, 4), E1 = rep(1:5, 2), B1 = c(rep(c("Non-Smoker", "Smoker"), each = 4), NA, NA))
x2
# Keep only the cases that appear in both tables
merge(x1, x2, "Case")
```

Multivariate methods for exploring covariance across studies
=============================================================

Let's look at the doubs data in the ade4 package.
This data set gives environmental variables, fish species and spatial coordinates for 30 sites.

```{r doubs}
library(ade4)
data(doubs)
lapply(doubs, head)
```

```{r coinertia}
dudi1 <- dudi.pca(doubs$env, scale = TRUE, scannf = FALSE, nf = 3)
dudi2 <- dudi.pca(doubs$fish, scale = FALSE, scannf = FALSE, nf = 2)
coin1 <- coinertia(dudi1, dudi2, scannf = FALSE, nf = 2)
plot(coin1)
#s.arrow(coin1$l1, clab = 0.7)
```

How to Process this document
=================================

```{}
library(knitr)
dir(pattern = "Rmd")
knit("Reports.Rmd")
knit2html("Reports.Rmd")
# knit2pdf() is designed for Rnw/Rrst input; for an Rmd file,
# converting Reports.md with pandoc (below) is the usual route to a PDF
knit2pdf("Reports.Rmd")
purl("Reports.Rmd")
```

Or use pandoc to convert the markdown file

```{}
system("pandoc -s Reports.md -o Reports.pdf")
system("pandoc -s Reports.md -o Reports.docx")
system("pandoc -s Reports.md -o Reports.html")
dir()
```
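
The pandoc calls above assume that pandoc is installed and that `Reports.md` has already been produced by `knit()`. As a minimal sketch, the three commands could be built in one place and run only when both conditions hold. The helper name `pandoc_commands` is hypothetical and is not part of knitr or pandoc; it uses only base R and the `tools` package.

```r
# Hypothetical helper (an assumption, not a knitr or pandoc function):
# build one pandoc command string per requested output format.
pandoc_commands <- function(input, formats = c("pdf", "docx", "html")) {
  base <- tools::file_path_sans_ext(input)   # "Reports.md" -> "Reports"
  outputs <- paste0(base, ".", formats)      # "Reports.pdf", "Reports.docx", ...
  sprintf("pandoc -s %s -o %s", shQuote(input), shQuote(outputs))
}

cmds <- pandoc_commands("Reports.md")
print(cmds)

# Run the commands only if pandoc is on the PATH and the input file exists
if (nzchar(Sys.which("pandoc")) && file.exists("Reports.md")) {
  status <- vapply(cmds, system, integer(1))  # exit status, 0 means success
  print(status)
} else {
  message("pandoc or Reports.md not found; commands were not run")
}
```

Checking `Sys.which("pandoc")` first avoids a confusing shell error on machines where pandoc is missing, and quoting the file names with `shQuote()` keeps the commands valid if the report name contains spaces.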