Collective action for gender pay equality

Despite women’s advances in the labor market, women on average still earn less than men, and women of color in particular are paid less for equal work than their white counterparts. The gap in pay between women and men is referred to as the gender pay gap. This lab investigates the factors that might lead people of all genders to be willing to participate in collective action aimed at reducing this pay gap. These factors include people’s gender identity (i.e., how satisfied people feel with their gender group and how central this identity is to their self-concept), social dominance orientation (SDO; i.e., the extent to which people think some groups should have more social value than other groups), and their level of hostile sexism. Each of these constructs is measured with multiple items in the data set you will be using.

Getting Started

Load packages

In this lab we will explore the data using the dplyr package and visualize them using the ggplot2 package. The data are stored in an SPSS file, so we will use the haven package to load them. Lastly, we will use the psych and lavaan packages to conduct factor analysis and SEM.

Let’s load the packages.
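The calls below attach the five packages named above. This is a minimal sketch that assumes each package is already installed; if one is missing, install.packages() is the usual way to get it.

```r
# Attach the packages used in this lab (assumes they are installed).
library(dplyr)    # data exploration, including glimpse()
library(ggplot2)  # data visualization
library(haven)    # read_sav() for SPSS files
library(psych)    # factor analysis
library(lavaan)   # SEM and path analysis
```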


Creating a reproducible lab report

Please write your lab code and answers in an R Markdown file. You are to submit the knitted HTML file to Moodle when your lab is complete.

The data

In this lab we will analyze the data from a study investigating what factors lead people to engage in collective action behaviors specifically surrounding gender pay equality.

The data were gathered from 272 respondents living in the United States. There are men (gender = 1) and women (gender = 0) in the data set. Collective action was measured with 5 items, collact1 to collact5; the average of these items has already been computed and is called collact. Hostile sexism was measured with 11 items (hs1 to hs11), SDO with 4 items (sdo1 to sdo4), and gender identity with 14 items (id1 to id14). As with collective action, scale scores for hostile sexism (hsexism) and SDO (sdo) have already been created. The gender identity scale comprises 5 subscales (idsat, idsolid, idcent, idih, idiss), which can be averaged into 2 second-order subscales (idselfinv, idselfdef) and finally into one overall scale score of all 14 items combined (identification). The data also contain a self-efficacy scale that you can explore if you have time.
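To make "scale score" concrete, here is a minimal sketch of how an item average such as collact could be computed. It uses a small made-up data frame, not the lab data, and the handling of missing values (na.rm = TRUE) is an assumption rather than a description of how the existing scores were built.

```r
# Toy stand-in for the five collective action items; values are invented.
toy <- data.frame(
  collact1 = c(1, 4, 7),
  collact2 = c(2, 4, 6),
  collact3 = c(3, 4, 5),
  collact4 = c(2, 4, 6),
  collact5 = c(2, 4, 6)
)

# Names of the item columns: collact1 ... collact5.
items <- paste0("collact", 1:5)

# Scale score = mean of the items for each respondent (row).
# na.rm = TRUE is an assumption: it averages over whichever items were answered.
toy$collact <- rowMeans(toy[, items], na.rm = TRUE)
toy$collact
```

The same pattern would apply to the hs, sdo, and id items, with the item names swapped in.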

Let’s load the data:

sdo <- read_sav("")

variable_view <- variable_view(sdo)

We have observations on 52 different variables, some categorical and some numerical. The meaning of each variable can be found by bringing up the variable_view data frame.
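Typing variable_view at the console, or calling View(variable_view) in RStudio, brings the data frame up. If you ever need to assemble a similar lookup by hand, haven stores each SPSS variable label in the column's "label" attribute. The sketch below shows the idea on a made-up data frame; the labels and the helper name get_label are invented for illustration.

```r
# Toy stand-in for data read with read_sav(); labels are hypothetical.
toy <- data.frame(gender = c(0, 1, 1), sdo1 = c(2, 5, 3))
attr(toy$gender, "label") <- "Gender (hypothetical label)"
attr(toy$sdo1, "label") <- "SDO item 1 (hypothetical label)"

# Hypothetical helper: return a column's label, or NA if it has none.
get_label <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) NA_character_ else as.character(lab)
}

# One row per variable: its name and its label.
toy_view <- data.frame(
  variable = names(toy),
  label = vapply(toy, get_label, character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
toy_view
```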


Exploring the data

We can first take a look at all of the variables with glimpse().
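A minimal sketch of the call, shown on a small made-up data frame so that it runs on its own; with the lab data loaded, the equivalent call would be glimpse(sdo).

```r
library(dplyr)

# Toy stand-in for the lab data; values are invented.
toy <- data.frame(
  gender  = c(0, 1, 1),
  sdo1    = c(2, 5, 3),
  collact = c(4.2, 3.0, 5.6)
)

# Prints one line per variable: its name, type, and first few values.
glimpse(toy)
```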


We might also make some univariate histograms and scatterplots of the relationships between pairs of study variables.

We might want to look at the items individually, but also the scale scores.

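# Histogram of a single SDO item
qplot(x = sdo3, data = sdo, bins = 5)
# Histogram of the SDO scale score (the sdo column inside the sdo data frame)
qplot(x = sdo, data = sdo, bins = 5)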

The sdo scale score appears to be skewed to the right. That would be a concern if sdo were an endogenous variable in a path analysis, because maximum likelihood estimation assumes the endogenous variables are normally distributed. If we keep sdo exogenous, the skew is not a problem.
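One way to put a number on that visual impression is the sample skewness, where positive values indicate a right skew. The sketch below uses simulated scores rather than the lab data, and the moment-based formula is only one of several common definitions.

```r
set.seed(1)

# Simulated stand-in for a right-skewed scale score; not the lab data.
x <- 1 + rexp(272, rate = 1)

# Moment-based sample skewness:
# third central moment divided by the second central moment to the power 3/2.
m <- mean(x)
skewness <- mean((x - m)^3) / (mean((x - m)^2))^(3 / 2)
skewness
```

With the lab data, the same formula could be applied to sdo$sdo after dropping any missing values.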

  1. Describe the shape of the collective action items as well as the collact scale score.

  2. Create a data visualization that shows the relationship between two variables from this list: sdo, collact, hsexism, and identification. Next, create a second visualization that allows you to see if this relationship is different by gender.

Path Analysis

Recall that for path analysis, each construct is a measured variable, that is, we’re only using the scale scores (means of items) in our model. Each endogenous variable has an error or “disturbance.”