Letters from a young statistician

Occasionally, I have thoughts that I can't fit into a 140-character tweet, and I need somewhere to put them. This is that place! I'm learning Jekyll as I go, so please bear with any bugs, and email me if you see something wrong.

What are the chances my name is Amanda?

I get called Amanda a lot. This tends to drive me crazy, because I think my real name is much more interesting. But, I realized recently that given the prior probabilities, it's actually a very reasonable thing to call me.

[Note: this blog post has some interactive elements, so it is probably more fun to read on my shiny server. And of course, if you want to see what I did, the code is on GitHub.]

Census data: A rant

As usual toward the end of the semester, my students begin working in earnest on their final projects. And as usual, this results in my selective amnesia about census.gov being ruptured yet again. Looking for a way out, I turned to twitter and got some good recommendations. In case it's useful to other teachers/people, I'll share my thoughts.

(The tl;dr is that you probably want students to be using Social Explorer, but read on for the full rant.)

Statistics graduate school advice

It's that time of year again, the time when I find myself meeting with students thinking about graduate school in statistics. Since I often end up sending people the same things, I figured I'd pull them together into a blog post. You probably already know about the grad cafe, the professor is in, and PhD comics. These other links might not be as common.

Worth adding to your inbox

For the most part, I get my data news from the web (blogs like flowingdata, simply statistics, stats chat) and twitter (check out the people I follow). However, there are a few email mailing lists I've joined that are worth the additional lines in my inbox. Here they are:

OpenVisConf talk transcript

During OpenVisConf, I hurriedly posted some links from my talk so people could follow up with resources I had mentioned. I kept meaning to write up a fuller blog post, but of course life came along (in the interim, Smith has finished finals, and I've graded a ton of projects, gone out of the state twice, and attended commencement).

So, when the official transcripts from the conference were posted, I knew I'd struck gold. Amanda Lundberg was the transcriptionist, and she did a fantastic job. The transcripts are provided with a creative commons license, so I grabbed the text from my talk, edited it a bit, and stuck in some images.

It was quite interesting to read the transcripts as-is. I know Amanda edited out a lot of "um"s and "uh"s as everyone was speaking, but a person's verbal tics still get through. In my case, I say "that", "so", and "really" a LOT. Some of those have been removed here for ease of reading.

On a more interesting note, the organizers used TF-IDF to identify unique words and bi-grams from each talk. Mine make it pretty clear I'm a statistician. Words like "distribution", "statistic", "observed difference" and "bootstrap" all appeared more in my talk than others.
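
For anyone unfamiliar with TF-IDF, here is a minimal Python sketch of the idea. It is not the organizers' actual pipeline, which isn't described here. It assumes the textbook weighting (term frequency times log of the number of talks over the number of talks containing the term), and the three tiny "talks" are hypothetical stand-ins made up for illustration.

```python
import math
from collections import Counter


def terms(text):
    """Split text into lowercase words plus adjacent-word bi-grams."""
    words = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams


def tf_idf(docs):
    """Return {doc_name: {term: score}} with score = tf * log(N / df).

    This is one common TF-IDF variant (an assumption); real tools
    often add smoothing or normalization on top of it.
    """
    counts = {name: Counter(terms(text)) for name, text in docs.items()}
    n_docs = len(docs)

    # Document frequency: in how many docs does each term appear?
    df = Counter()
    for c in counts.values():
        df.update(c.keys())

    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            t: (k / total) * math.log(n_docs / df[t]) for t, k in c.items()
        }
    return scores


# Hypothetical toy corpus, NOT real transcript text.
talks = {
    "stats": "the bootstrap distribution shows the observed difference",
    "maps": "the map projection shows the coastline",
    "color": "the color palette shows the contrast",
}

scores = tf_idf(talks)

# Terms ranked from most to least distinctive for the "stats" talk.
top = sorted(scores["stats"], key=scores["stats"].get, reverse=True)[:5]
print(top)
```

Words that show up in every talk (like "the") score zero, while words and bi-grams that appear in only one talk float to the top, which is why a talk's list ends up looking like its speaker.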

Do you know Nothing when you see it?

I'm currently at OpenVisConf and in awe of my fellow presenters. I did my presentation, "Do you know Nothing when you see it?" before lunch and have been slowly unwinding since then. A couple people asked for links to my slides and resources, so here they are!

I saved my Keynote slides as HTML and posted them on my website. They seem even lower quality than in Keynote (how is that possible?!) but you can access them here.

What's wrong with being data-collecting pigeons?

As you might know from my blog or twitter, I've been a Fitbit person since December 2013. I'm also very interested in participatory sensing and citizen science, and I've been thinking about the ways that data can be used for good, both on the personal and societal level.

Contextual notes

Because I tend to select books by how thick they are (super-fast-reader problems) and I am a glutton for punishment, I have been slowly working my way through two of David Foster Wallace's books concurrently. I am reading a paper copy of Infinite Jest, but the Kindle edition of The Pale King. These contrasting experiences got me thinking about contextual notes, particularly in electronic media.

20th New England Isolated Statisticians Meeting (NEISM)

Now that I am at Smith, I am technically an "isolated" statistician. It's a funny term, because I don't feel isolated. I have great colleagues in the Statistical and Data Sciences program (Ben Baumer, Jordan Crouser and Katherine Halverson), although I am the only person with a PhD in statistics. However, there is a great mailing list for isostats and a few local meetings, so I am happy to accept the designation if it means I get to participate in these great discussions, many of which are teaching-oriented.

Earlier this month, I got to go to my first NEISM (New England Isolated Statisticians Meeting). It was great fun, and as always I got more out of the experience by livetweeting it.

Fitbit colors

I have been a devoted Fitbit user for over a year now, and I think that as a device it is the sort of thing that a statistician would naturally enjoy. It produces rich data about my day-to-day life, can be used to identify notable dates in my year, and does a great job of making me more physically active "on the margin." But, the Fitbit app designers have made very strange color choices.