Some of the colleges will host pre-DataFest workshops, and more workshops will run during the event itself. If there is a topic you would like to see presented (or would like to present yourself!), please let us know.

Pre-DataFest workshops

Data Wrangling in R, with Paul and Jonathan (Amherst College)
Wednesday, February 15th – 7:30-8:30 pm in SM 204


Regression in R, with Silvia, Leonard, and Connor (Amherst College)
Wednesday, February 22nd – 7:00-8:00 pm in SM 204


Intro to ggplot2 in R, with Pei and Sarah (Amherst College)
Wednesday, March 1st – 7:00-8:00 pm in SM 204


Intro to Machine Learning in R, with Azka, Caleb, and Muling (Amherst College)
Monday, March 20th – 7:00-8:00 pm in SM 204


Intro to R/RStudio, with GRiD (UMass Amherst)
Wednesday, March 22nd – 4:30 pm in ISB 145


Data Pre-Processing in R, with GRiD (UMass Amherst)
Wednesday, March 29th – 4:30 pm in ISB 145


Workshops during DataFest

Introduction to R, with Martha Miller and Eddie Pantridge (MassMutual)
Friday, March 31 – 8:15-9:00 pm
Prerequisites: none
An accessible introduction to the statistical programming language R, intended for complete beginners. R makes it easier and faster to explore data sets that are too large for Excel to handle.


Thinking fast with dplyr, with Freddie Sanchez (MassMutual)
Friday, March 31 – 9:15-10:00 pm
Prerequisites: basic R
In this workshop, we’ll practice posing and answering questions about data quickly with dplyr, an R package that provides easy-to-use functions for working with large data sets.


Modeling with R, with Dana Udwin (MassMutual)
Saturday, April 1 – 11:00-11:45 am
Prerequisites: basic R
We will learn how to fit and evaluate linear regression for continuous predictions and logistic regression for binary predictions, and how to use cluster analysis to identify subgroups in a population.


Visualizing data with ggplot2, with Em Beauchamp (MassMutual)
Saturday, April 1 – 1:00-1:45 pm
Prerequisites: basic R, dplyr
This introduction will focus on using the R package ggplot2 to create neat visualizations. Near the end, we’ll work with spatial data and make some cool maps.


Handling dates with lubridate, with John Karlen (MassMutual)
Saturday, April 1 – 2:00-2:45 pm
Prerequisites: basic R, dplyr
In this session, we’ll learn how to reshape and summarize data containing dates and times, using packages like dplyr and lubridate. We’ll also touch on creating time series.