Wednesday, May 7, 2014

S

John Chambers * Bell Labs * 1976

R

Robert Gentleman & Ross Ihaka * University of Auckland, New Zealand * 1993

ETL

http://www.rcasts.com/2012/11/big-data-etl-and-big-data-analysis.html

Modeling

Linear Models

Poisson Regression

Time Series

Graphics

Machine Learning

  • Ridge and Lasso Regression
  • K-means Clustering
  • K-medoids Clustering
  • Hierarchical Clustering
  • Decision Trees
  • Random Forests
  • Splines
  • Generalized Additive Models

Penalized Regression

K-means Clustering

Plot of wine data scaled into two dimensions and color coded by results of K-means clustering

K-means Clustering

Gap curves for wine data. The blue curve is the observed within-cluster dissimilarity, and the green curve is the expected within-cluster dissimilarity. The red curve represents the Gap statistic (expected minus observed), and the error bars are the standard deviation of the gap.
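
The gap computation described in the caption can be sketched in a few lines. This is a minimal illustration in plain Python, not the code behind the slides. The function names (`kmeans`, `log_within`, `gap_statistic`) are hypothetical, and the data is synthetic, not the wine data. The sketch makes several assumptions: dissimilarity is the log of the within-cluster sum of squares, the expected value comes from reference data drawn uniformly over the bounding box of the observed data, centers are initialised by farthest-first traversal, and the error bar is the plain standard deviation of the reference values.

```python
import math
import random


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Assumption: deterministic farthest-first initialisation of the centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def log_within(points, k):
    """Log of the total within-cluster sum of squares after k-means."""
    labels = kmeans(points, k)
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            center = tuple(sum(c) / len(members) for c in zip(*members))
            total += sum(math.dist(p, center) ** 2 for p in members)
    return math.log(total)


def gap_statistic(points, k, n_ref=10, seed=0):
    """Return (gap, sd): expected minus observed log dispersion, and the
    standard deviation of the reference values."""
    rng = random.Random(seed)
    observed = log_within(points, k)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    ref = []
    for _ in range(n_ref):
        # Reference data: uniform over the bounding box of the observed data.
        fake = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                for _ in points]
        ref.append(log_within(fake, k))
    expected = sum(ref) / n_ref
    sd = math.sqrt(sum((r - expected) ** 2 for r in ref) / n_ref)
    return expected - observed, sd


# Synthetic example: two well-separated blobs of 20 points each.
data_rng = random.Random(1)
blobs = [(cx + data_rng.uniform(-0.5, 0.5), data_rng.uniform(-0.5, 0.5))
         for cx in (0.0, 100.0) for _ in range(20)]
gap1, sd1 = gap_statistic(blobs, k=1)
gap2, sd2 = gap_statistic(blobs, k=2)
print(gap1, gap2)
```

On these two blobs the gap at k=2 should come out larger than the gap at k=1, which is the pattern the red curve is used to spot.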

Hierarchical Clustering

Hierarchical clustering of wine data

Hierarchical Clustering

Hierarchical clustering of wine data split into three groups (red) and 13 groups (blue)

Decision Trees

Splines

Reporting and Presenting

This whole presentation

R code and all

R for Everyone

Based on...

What's Inside

Encouraging Girls in STEM