Smith College Applied Statistics Lecture series (2012-2013)

All lectures are free and open to the public. No prior exposure to statistics is assumed.

  1. Unlocking the code to personalized medicine: Fact, fiction and statistics in genetic studies
    Andrea Foulkes, Associate Professor of Biostatistics
    Department of Epidemiology and Biostatistics
    University of Massachusetts Amherst
    Thursday September 27, 2012, noon, Burton Forum (3rd floor) (Clark Science Center), lunch provided, please bring your own drink.

    In this talk I will discuss the state of the science in relating genetic information to clinical outcomes. First, I will discuss the need to apply analytic tools that are designed for the specific hypotheses we aim to test. Just as we would not use a stethoscope to look in someone's ear, or an otoscope to listen to someone's heart, we can only find what our analytic tools are designed to uncover. Second, I will speak about the challenge of translating findings about associations into clinically relevant predictive models. Here, there are two important things to keep in mind. The first is that the more people we recruit to a study, the easier it is to declare a small effect 'significant'. In genome-wide association studies, it is now common to involve as many as 10,000 people, leading to the "discovery" of genetic polymorphisms that have only very small effects on disease. The second is that a high relative risk does not imply a high absolute risk. Individuals may be twice as likely to get a disease if they have a particular genetic risk factor compared to people who do not, but the actual likelihood of getting the disease may still be quite small. Finally, I will conclude by emphasizing the importance of training in statistical and computational sciences for making significant strides in translational medicine.
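
    The two cautions in this abstract (large samples make small effects "significant", and a high relative risk need not mean a high absolute risk) can be illustrated with a minimal sketch. All numbers below are hypothetical and are not taken from the talk; the two-proportion z-test is simply one common test chosen for illustration.

```python
import math


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value of a pooled two-proportion z-test.

    p1 and p2 are treated as the observed disease proportions in two
    equally sized groups of n_per_group people each.
    """
    pooled = (p1 + p2) / 2
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = abs(p1 - p2) / se
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


# Hypothetical small effect: 10% risk without the variant, 11% with it.
# The effect size is fixed; only the sample size changes.
p_values = {n: two_proportion_p_value(0.10, 0.11, n) for n in (500, 5000, 50000)}
for n, p in p_values.items():
    print(f"n per group = {n:>6}: p-value = {p:.3g}")

# Hypothetical rare disease: doubling a small risk still leaves a small risk.
baseline_risk = 0.001            # 1 in 1,000 without the risk factor
relative_risk = 2.0              # "twice as likely" with the risk factor
carrier_risk = baseline_risk * relative_risk
absolute_increase = carrier_risk - baseline_risk
print(f"carrier risk = {carrier_risk:.3%}, absolute increase = {absolute_increase:.3%}")
```

    The same one-percentage-point difference moves from unremarkable to highly "significant" purely because the groups get larger, while the doubled risk in the second part still amounts to only 2 cases per 1,000 people.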

  2. Adjusting for non-response in the Occupational Employment Statistics survey
    Nicholas Horton, Professor of Statistics
    Department of Mathematics and Statistics
    Smith College
    Thursday October 4, 2012, noon, Burton Forum (3rd floor) (Clark Science Center), lunch provided, please bring your own drink.

    Missing data is a common occurrence in real-world investigations. Past research indicates that employment size, industry sector, multi-establishment status, and metropolitan area size, along with important interactions, have a significant impact on an establishment's propensity to respond to the Bureau of Labor Statistics Occupational Employment Statistics survey (OES). Using administrative wage data linked to the sample, we find that these establishment characteristics are also related to wages, and wage estimates are a major OES outcome variable. In this talk, I investigate the use of the administrative data for imputing data that are missing due to nonresponse. The multiple imputation method focuses on adjusting the OES wage estimates with this auxiliary data to reduce potential bias.
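
    The general idea of multiple imputation with an auxiliary variable can be sketched as follows. This is an illustration on simulated data, not the procedure used for the OES: the variable names, the simple linear model, the bootstrap step, and the response probabilities are all assumptions made for the example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, residual SD)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b0 = my - b1 * mx
    resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sigma = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return b0, b1, sigma


def impute_and_pool(admin, survey, m=20, seed=0):
    """Estimate the mean survey wage with m imputations, pooled by Rubin's rules.

    Each imputation refits the model on a bootstrap sample of respondents so
    that uncertainty in the fitted line is carried into the imputed values.
    """
    rng = random.Random(seed)
    observed = [(a, s) for a, s in zip(admin, survey) if s is not None]
    estimates, variances = [], []
    for _ in range(m):
        boot = [rng.choice(observed) for _ in observed]
        b0, b1, sigma = fit_line([a for a, _ in boot], [s for _, s in boot])
        # Keep reported wages; fill gaps with a prediction plus random noise.
        completed = [
            s if s is not None else b0 + b1 * a + rng.gauss(0, sigma)
            for a, s in zip(admin, survey)
        ]
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    return pooled, within + (1 + 1 / m) * between


# Simulated establishments: an administrative wage is known for everyone,
# but higher-wage establishments are assumed less likely to respond.
rng = random.Random(1)
admin = [rng.gauss(20, 5) for _ in range(2000)]
full = [1 + a + rng.gauss(0, 2) for a in admin]
survey = [
    y if rng.random() < (0.9 if a < 20 else 0.4) else None
    for a, y in zip(admin, full)
]

true_mean = statistics.fmean(full)
naive_mean = statistics.fmean([y for y in survey if y is not None])
mi_mean, mi_var = impute_and_pool(admin, survey)
print(f"true {true_mean:.2f}  respondents only {naive_mean:.2f}  imputed {mi_mean:.2f}")
```

    Because response depends on the administrative wage, the respondents-only mean is pulled toward low-wage establishments, while the imputed estimate uses the auxiliary variable to correct for that.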

  3. Cluster pruning: finding a better cluster center
    Amy Wagaman, Assistant Professor of Mathematics (Statistics)
    Department of Mathematics
    Amherst College
    Thursday October 11, 2012, noon, Burton Forum (3rd floor) (Clark Science Center), lunch provided, please bring your own drink.

    How do you look for groups of similar objects in data? Once you have groups, how do you pick a good representative object out of a group? Clustering refers to a collection of methods that group observations together into clusters of objects according to a given similarity measure. The end result for most clustering methods is that each observation is assigned to a cluster. (Advanced methods called fuzzy methods exist where observations may be assigned to multiple clusters, and other methods exist where observations could be left out as too unique to be in a cluster at all.) What are the possibilities that exist when you want to extract a cluster center for each cluster or a representative object from each cluster? This talk will provide an introduction to clustering methods and dimension reduction methods, and discuss the use of dimension reduction methods to improve the extraction of representative cluster objects. The research work is motivated by a real problem faced during the CASP protein folding competition, where scientists had to submit representative protein structures that were predictions for protein folded/native structures from a large collection of possible structures.
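
    One standard way to pick a representative object from a cluster is the medoid: the member with the smallest total distance to all other members. The sketch below shows that baseline only; it is not the cluster pruning method of the talk, and the points are made up for illustration.

```python
import math


def medoid(points):
    """Return the member of points with the smallest summed distance to the rest."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# A hypothetical cluster with one outlying member at (6, 6).
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.4), (6.0, 6.0)]

# The centroid is the coordinate-wise average; it need not be a real object.
centroid = tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
representative = medoid(cluster)

print("centroid:", centroid)
print("medoid:  ", representative)
```

    The centroid here is an averaged point that is pulled toward the outlier and matches no actual object, whereas the medoid is always one of the original observations. That matters when, as with protein structures, an averaged object may not be meaningful.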

  4. Information Performance
    Mark Hansen, Director, David and Helen Gurley Brown Institute for Media Innovation & Professor of Journalism
    Columbia University
    March 12, 2013, 7:00pm, Ford 240 (Clark Science Center).

    Far from virtual, inert quantities, data exert real forces in the physical world. They are incendiaries wielding the power of once-secret diplomatic cables; mores initiated with the invention of a privacy setting; and physical laws shaping the built environment with the quiet, persistent action of zoning regulations. Data rarely act in isolation, gaining power through combination, joining forces and moving into new terrain. Their presence is thought to guarantee transparency, their absence is seen as suspicious, and restrictions on their movement appear to be temporary, at best. Mark will take a broad view of data (and companion ideas like "algorithm," "model" and "visualization") and explore their use in creative practices. He will present a selection of work from his artistic collaborations over the last decade, ranging from a permanent display in the lobby of the New York Times building and a new work for the 9/11 Memorial Museum in New York City to a performance designed as part of the New York Public Library’s centennial celebration last June. He will tie these artworks to a larger movement in which data and data processing are seeping into almost every academic discipline on campus. Mark's talk will culminate with a proposal to aggregate the data practices from science, the humanities, and even art and design under a single umbrella: data science.

Thanks to the Department of Mathematics and Statistics and the Center for Women in Mathematics for support of the series.

Applied Statistics Lecture series (2011-2012)

Other Five College seminars of interest:
University of Massachusetts Statistics and Probability Seminar Series

University of Massachusetts Biostatistics and Epidemiology Seminar Series

Organized by Nicholas Horton.
Last updated August 10, 2013