What are the chances my name is Amanda?

13 Jan 2017

I get called Amanda a lot. This tends to drive me crazy, because I think my real name is much more interesting. But, I realized recently that given the prior probabilities, it’s actually a very reasonable thing to call me.

[Note: this blog post has some interactive elements, so it is probably more fun to read on my shiny server. And of course, if you want to see what I did, the code is on GitHub.]

To investigate this using data, I am using the babynames R package, which has data from the Social Security Administration. It includes all names that had at least 5 uses for a particular gender in a given year. Obviously, that leaves some people out, but actually this data includes most (documented) people in the United States. For more about this, see my data appendix.

The basic idea is to take my (approximate) age and see how likely it is that my name is Amanda. I look incredibly young, but given that I have a PhD and am a statistics professor, there’s a lower bound on how young I could really be. Let’s assume that people talking to me believe I was born between 1980 and 1989, inclusive. So the question is, given that I was born then, what are the chances my name is Amanda?

In a blog post at the beginning of last semester, Mine linked to a 538 article from 2014 that approaches the problem from the other side: How to tell someone’s age when all you know is her name. Interestingly, although using this method would help confirm people’s suspicions that I’m a child (the Age|Name method would estimate I’m about 13 years old), that’s not how people tend to think about names and ages.

[Figure: Amelias]

Instead, people seem to be taking an approximate age and then just grabbing a name out of the hat. In other words, they are using the Name|Age method. With that in mind, let’s see how likely Amanda really is. I’m focusing just on girls’ names for this analysis, mostly because Amelia isn’t a common boy’s name (see my data appendix).

In the eighties, numbers for Amanda and Amelia were as follows:

name     number   proportion
Amanda   369690   0.0215329
Amelia     9734   0.0005670
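As a quick sanity check on these numbers, here is a minimal Python sketch. The analysis itself uses R; this only redoes the arithmetic from the two rows of the table above. It backs out the implied total number of recorded girls from Amanda's row and checks that Amelia's proportion is consistent with it. The variable names are illustrative, and the implied total is approximate because the proportions are rounded.

```python
# Counts and proportions copied from the table above (girls, 1980s).
amanda_n, amanda_p = 369690, 0.0215329
amelia_n, amelia_p = 9734, 0.0005670

# Implied denominator: all recorded girls in the decade.
# Backed out from a rounded proportion, so it is only approximate.
implied_total = amanda_n / amanda_p

# Recompute Amelia's proportion from that denominator.
amelia_check = amelia_n / implied_total

# How many Amandas there are for every Amelia.
ratio = amanda_n / amelia_n

print(f"implied total girls: {implied_total:,.0f}")
print(f"P(Amelia | 1980s), recomputed: {amelia_check:.7f}")
print(f"Amandas per Amelia: {ratio:.1f}")
```

The recomputed Amelia proportion should match the table to within rounding, which is a reassuring sign that both rows share the same denominator.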

Armed only with the information that I was born between 1980 and 1989, there’s about a 2% chance my name is Amanda. That’s actually pretty incredible! (In contrast, there’s just a 0.06% chance my name is Amelia.) And we can figure most people remember at least the first letter of a name. So, what if we add the fact that my name starts with an A?

In the eighties, numbers for Amanda and Amelia (out of all A names) were as follows:

name     proportion
Amanda   0.1575295
Amelia   0.0041478

Now, there’s a 15.8% chance my name is Amanda (and a 0.4% chance my name is Amelia). The only A name that was more popular than Amanda in the eighties was Ashley, which made up 19.5% of the female names starting with A. Of course, Ashley doesn’t sound as similar to Amelia as Amanda does. I didn’t go this far, but we could also look at the names starting with Am. I think this would serve to solidify Amanda as the much more likely choice.
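The step from the first table to the second is just conditioning on the first letter: P(name | A) = P(name) / P(A). A minimal Python sketch, using only the proportions printed in the two tables, backs out the implied share of A names among girls born in the eighties. That share is not reported in the post, so treat it as a derived, approximate figure; the variable names are illustrative.

```python
# Proportions copied from the two tables (girls, 1980s).
p_amanda = 0.0215329          # P(Amanda | 1980s)
p_amanda_given_a = 0.1575295  # P(Amanda | A name, 1980s)
p_amelia = 0.0005670          # P(Amelia | 1980s)
p_amelia_given_a = 0.0041478  # P(Amelia | A name, 1980s)

# P(name | A) = P(name) / P(A)  =>  P(A) = P(name) / P(name | A)
p_a_from_amanda = p_amanda / p_amanda_given_a
p_a_from_amelia = p_amelia / p_amelia_given_a

# Relative odds of Amanda vs. Amelia, before and after conditioning on "A".
odds_before = p_amanda / p_amelia
odds_after = p_amanda_given_a / p_amelia_given_a

print(f"implied share of A names (from Amanda): {p_a_from_amanda:.4f}")
print(f"implied share of A names (from Amelia): {p_a_from_amelia:.4f}")
print(f"Amanda:Amelia odds before {odds_before:.1f}, after {odds_after:.1f}")
```

Both rows should give roughly the same share of A names (a bit under 14%). The Amanda-to-Amelia odds are the same before and after conditioning, since both names start with A: knowing the first letter raises both probabilities by the same factor.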

So, for those of you who have felt bad about getting my name wrong in the past: the data supports you!

I’m not sure if doing this analysis has made me feel better about being called Amanda. For that, I just tell myself “they’re probably mixing me up with Amanda Cox.”

Your name

Does this trend hold up with the name people are always calling you? Are you a Jacob who always gets called Jason? A Kirstin who gets called Kristen? (I’m guilty of that one.)

[This is the part of the post that is responsive, so you probably want to head to my shiny server.]

Data appendix

I can get off-track when doing analyses, so here are a couple more thoughts.

How full is the data?

One thing I was worried about was how well the data really represented the population. Uncommon names are excluded for privacy purposes, and I thought maybe people were getting more (or less) creative with names over time. It turns out that may be the case, but only slightly.

[Figure: Missing data]

Min.    1st Qu.  Median  Mean    3rd Qu.  Max.
2.462   3.535    5.592   5.406   6.793    9.374

On average, only about 5% of people are missing from the data. That feels pretty good to me. The 538 article mentioned above estimated that only 1% of the data was missing, though I’m not sure how they arrived at that number.
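One way a percentage like this could be computed, sketched in Python: compare the number of babies covered by the published name counts with the total number of births over the same period. This is an assumption about the method, not necessarily how the summary above was produced. The function name and the example figures are hypothetical (they are not SSA numbers).

```python
def percent_missing(named_births, total_births):
    """Percent of births not covered by the published name counts."""
    if total_births <= 0:
        raise ValueError("total_births must be positive")
    return 100.0 * (1.0 - named_births / total_births)


# Hypothetical figures, NOT real SSA data:
# (births with a published name, all births) for two made-up periods.
toy_periods = {
    "period 1": (950, 1000),
    "period 2": (1880, 2000),
}

for label, (named, total) in toy_periods.items():
    print(label, round(percent_missing(named, total), 1))
```

Running this kind of calculation for every year and summarizing the results would give a table like the one above.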

How many baby boys are named Amelia?

Answer: not many. In 2004, the year with the most male Amelias, 14 baby boys were named Amelia. Or someone checked the wrong box on a birth certificate.

year  sex  name    n
2004  M    Amelia  14

Creativity in naming

Of course, I could dig a lot deeper on this analysis. Just by scrolling through the data I noticed there are many other ways to spell Amanda, which I didn’t take into account. (Maybe the creative spellings of Amelia balance it out.)

name      number   proportion
Aamanda       10   0.0000006
Amamda       178   0.0000104
Amanda    369690   0.0215329
Amandah       16   0.0000009
Amannda       23   0.0000013
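To get a rough sense of how much the alternate spellings matter, this Python sketch adds up the rows of the table above. It covers only the five spellings listed here (other variants may exist in the full data), and the names used in the code are illustrative.

```python
# (count, proportion) for each spelling, copied from the table above.
spellings = {
    "Aamanda": (10, 0.0000006),
    "Amamda": (178, 0.0000104),
    "Amanda": (369690, 0.0215329),
    "Amandah": (16, 0.0000009),
    "Amannda": (23, 0.0000013),
}

# Pool every listed spelling together.
total_n = sum(n for n, _ in spellings.values())
total_p = sum(p for _, p in spellings.values())

# Girls with a non-standard spelling only.
variant_n = total_n - spellings["Amanda"][0]

print(f"all listed spellings: {total_n} girls, proportion {total_p:.7f}")
print(f"non-standard spellings only: {variant_n} girls")
```

The listed variants add only a couple hundred girls to Amanda's total, so the 2% figure barely moves.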

A is for eighties

You can actually see the eighties pretty clearly when you look at plots of the first letters of names. Girls’ names starting with A had been on the rise since the 1960s, but you see a local maximum in about 1984 and then a small decline before the rise continues.

[Figure: Letters over time]

[Figure: A over time]