Is Big Worse than Bad?

So far on StatisticalBullshit.com, I’ve written about general Statistical Bullshit, Statistical Bullshit that I’ve come across in consulting, or Statistical Bullshit that readers have sent in to the website.  I don’t *think* that I’ve written about Statistical Bullshit that was pointed out by an academic article.  For this reason, today’s post is about how being big can be worse than being bad with regard to Equal Employment Opportunity (EEO) enforcement policies.  Most of the material for this post comes from Jacobs, Murphy, and Silva’s (2012) article, entitled “Unintended Consequences of EEO Enforcement Policies: Being Big is Worse than Being Bad,” which was published in the Journal of Business and Psychology.  Rick Jacobs was my advisor at Penn State, so I am happy to have his article discussed on StatisticalBullshit.com.  For more information about this concept, please email me at MHoward@SouthAlabama.edu or check out Jacobs et al. (2012).


As stated by Jacobs et al. (2012), “The Equal Employment Opportunity Commission (EEOC) is the chief Federal agency charged with enforcing the Civil Rights Acts of 1964 and 1991 and other federal laws that forbid discrimination against a job applicant or an employee because of the person’s race, color, religion, sex (including pregnancy), national origin, age (40 or older), disability, or genetic information” (p. 467).  In other words, the EEOC ensures that businesses do not discriminate against protected classes, including in their employment practices.

When a disproportionately low number of people from a protected class is hired relative to the majority class, this is called disparate impact.  But how do we know whether a “disproportionately low number” has occurred?  In EEO cases, many methods exist, but the 80% rule and statistical significance testing are among the most popular.

The 80% rule specifies that disparate impact occurs when members of a protected class are hired at a rate that is less than 80% the rate of the majority class.  Let’s take a look at the following example to figure out what this means:

                    Hired   Applied   Ratio       80% Rule
Caucasian             20        40    1:2 (.50)   .50 * .80 = .40
African American       5        20    1:4 (.25)   .40 > .25

In this example, 40 Caucasian people and 20 African Americans applied for the same job.  The organization selected 20 Caucasians and 5 African Americans for the job.  This results in 50% of the Caucasians being hired, but only 25% of the African Americans being hired.  To determine whether disparate impact occurred, we take .50 (ratio for Caucasians) and multiply it by .80 (80% rule).  This results in .40.  We then compare this number to the ratio of African Americans hired, .25.  Since .40 is greater than .25, we can determine that disparate impact occurred on the basis of the 80% rule.
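To make the arithmetic concrete, here is a minimal Python sketch of the 80% rule check.  The function and its name are my own illustration, not something from Jacobs et al. (2012):

```python
def eighty_percent_rule(hired_protected, applied_protected,
                        hired_majority, applied_majority):
    """Flag disparate impact under the 80% (four-fifths) rule."""
    protected_rate = hired_protected / applied_protected  # e.g., 5/20 = .25
    majority_rate = hired_majority / applied_majority     # e.g., 20/40 = .50
    threshold = 0.80 * majority_rate                      # e.g., .50 * .80 = .40
    return protected_rate < threshold                     # .25 < .40 -> True

# The example above: the 80% rule flags disparate impact
print(eighty_percent_rule(5, 20, 20, 40))  # True
```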

Although it may seem relatively simple, the 80% rule works quite well across most situations.  But what about the other method – statistical significance testing?

Many different tests could be used to identify disparate impact, but the chi-square test may be the simplest.  The chi-square test can be used to determine whether the association between two categorical variables is significant, such as whether the association between race and hiring decisions is significant.  So, we can use this to test whether disparate impact may have occurred.

To do so, you can use the following calculator: https://www.graphpad.com/quickcalcs/contingency2/.  Let’s enter the data above, which would look like the following in a chi-square calculator:

                    Hired   Not Hired
Caucasian             20        20
African American       5        15

The resultant p-value is .06, which is greater than .05.  Not statistically significant!  Although this is the exact same data as in the 80% rule example, the chi-square test does not flag disparate impact.  Interesting!
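If you’d rather compute this in code than in an online calculator, here is a minimal sketch using scipy.  One assumption on my part: the continuity correction has to be turned off to reproduce the p-value reported above.

```python
from scipy.stats import chi2_contingency

# Rows: Caucasian, African American; columns: hired, not hired
table = [[20, 20],
         [5, 15]]

# correction=False disables the Yates continuity correction,
# which matches the p-value reported above (p ~ .06)
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(round(chi2, 2), round(p, 3))  # ~3.43, ~0.064
```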

But what happens when we double the size of each group?  The 80% rule table would look like this:

                    Hired   Applied   Ratio       80% Rule
Caucasian             40        80    1:2 (.50)   .50 * .80 = .40
African American      10        40    1:4 (.25)   .40 > .25

Again, the resultant ratio for Caucasians is .50, which is .40 when multiplied by .80 (80% rule).  The resultant ratio for African Americans is .25, which is smaller than .40.  This suggests that disparate impact occurred on the basis of the 80% rule.

On the other hand, let’s enter this data into the chi-square calculator, which would look like this:

                    Hired   Not Hired
Caucasian             40        40
African American      10        30

The resultant p-value is .009, which is much less than .05.  Statistically significant!  While the selection ratios are identical in the two examples, the chi-square test now determines that disparate impact occurred.

This is the idea behind “Being Big is Worse than Being Bad.”  Although both examples had the same selection ratios, and were thereby equally “bad,” the chi-square test indicated that disparate impact occurred only in the latter example, which was bigger.  Thus, significance testing is problematic for identifying disparate impact, because sample size strongly influences whether a result is significant.
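A quick sketch makes the sample-size effect explicit.  Scaling the same table up by a multiplier (the multipliers here are arbitrary choices of mine, purely for illustration) keeps the .50 and .25 selection ratios fixed while the p-value shrinks:

```python
from scipy.stats import chi2_contingency

for k in [1, 2, 4, 8]:
    # Same .50 vs. .25 selection ratios at every sample size
    table = [[20 * k, 20 * k],
             [5 * k, 15 * k]]
    _, p, _, _ = chi2_contingency(table, correction=False)
    print(f"k={k}: p = {p:.4f}")

# p falls as the sample grows, even though the "badness" is constant:
# k=1: p ~ .0641, k=2: p ~ .0088, k=4: p ~ .0002, ...
```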

So, do we just apply the 80% rule?  Not necessarily.  Jacobs et al. (2012) call for “a more dynamic definition of adverse impact, one that considers sample size in light of other important factors in the specific selection situation” (p. 470), and they also call for a more nuanced view of disparate and adverse impact.  While the 80% rule avoids certain problems that significance testing encounters, it cannot satisfy this call on its own.

Issues like these are why we need more statistics-savvy people in the world.  Disparate and adverse impact are huge issues that affect millions of people.  And many of these decisions are not made by statisticians.  Instead, they are made by courts and companies.  Even if you aren’t interested in becoming a statistician, the world still needs people who understand statistics – and know how to watch out for Statistical Bullshit in significance testing!

That’s all for today.  If you have any questions, please email me at MHoward@SouthAlabama.edu.  Until next time, watch out for Statistical Bullshit!

Excel Statistics Help

Sorry, but there is no Statistical Bullshit this week!  No, the world did not run out of it – trust me, there is still plenty.  Instead, I’ve been developing a section of another one of my websites: MattCHoward.com.  The Statistics Help Page of my academic website has been getting lengthier and (fortunately) more popular.  One of the most common requests that I receive is for guides on calculating statistics in Excel.  This is understandable.  Other stats programs can be expensive, whereas almost everyone has access to Excel in their workplace or home.  So, I’ve been spending time writing short guides on my Excel Statistics Help Page.

A primary method to avoid Statistical Bullshit is to understand statistics yourself.  If you are unsure about calculating statistics in Excel, be sure to check out this page.  I’ll be updating it regularly throughout the current academic semester.  So, if you need a guide on a certain topic, let me know by emailing MHoward@SouthAlabama.edu.  I’d be happy to create a guide sooner rather than later!

Correlation Does NOT Equal Causation

Your variables may be related, but does one really cause the other?

Most readers have probably heard the phrase, “correlation does not equal causation.”  Recently, however, I heard someone confess that they’ve always pretended to know the significance of this phrase, but they truly didn’t know what it meant.  So, I thought that it’d be a good idea to make a post on the meaning behind “correlation ≠ causation.”

Imagine that you are the president of your own company.  You notice one day that your highly paid employees perform much better than your lower-paid employees.  To test whether this is true, you create a database that includes employee salaries and their performance ratings.  What do you find?  There is a strong correlation between employee pay and performance ratings.  Success!  Based on this information, you decide to improve your employees’ performance by increasing their pay.  You’re certain that this will improve their performance. . . right?

Not so fast.  While there is a correlation between pay and performance, there may not be a causal relationship between the two – or, at least, not one in which pay directly influences performance.  It is entirely possible that increasing pay has little effect on performance.  But why is there a correlation?  Well, it is also possible that employees receive raises due to their prior performance, as the organization has to provide benefits in order to keep good employees.  In that case, an employee’s high performance is not due to their salary; rather, their salary is due to their prior high performance.  This results in current performance and pay having a strong correlational relationship, but not a causal relationship in which pay drives performance.  In other words, current performance and pay may be correlated because they share a common antecedent (past performance).

This is the idea behind the phrase, “correlation does not equal causation.”  Variables do not necessarily have a causal relationship just because they are correlated.  Instead, many other types of underlying relationships could exist, such as both having a common antecedent.
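To make the common-antecedent idea concrete, here is a toy simulation (all effect sizes and dollar figures are invented for illustration): past performance drives both pay and current performance, and the two end up strongly correlated even though pay never causes performance anywhere in the simulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Common antecedent: past performance drives BOTH variables
past_perf = rng.normal(size=n)
pay = 50_000 + 10_000 * past_perf + rng.normal(scale=5_000, size=n)
current_perf = 0.8 * past_perf + rng.normal(scale=0.5, size=n)

# Pay never enters the equation for current performance,
# yet the two are strongly correlated through past performance
print(np.corrcoef(pay, current_perf)[0, 1])  # roughly .75
```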

Still don’t quite get it?  Let’s use a different example.  Prior research has shown that ice-cream sales and murder rates are strongly correlated, but does that mean that ice cream causes people to murder each other?  Hopefully not.  Instead, warm weather (i.e., the summer) causes people to (a) buy ice cream and (b) be more aggressive.  This drives up both ice-cream sales and murder rates.  Once again, these two variables are correlated because they have a common antecedent – not because there is a causal relationship between the two.


Hopefully you now understand why correlation does not equal causation.  If you don’t, please check out one of my favorite websites: Spurious Correlations.  This website is a collection of remarkably strong correlations that almost assuredly do not reflect causal relationships – thereby providing repeated examples of why correlation does not equal causation.  If you do understand, beware of this fallacy in the future!  Organizations can make disastrous decisions based on falsely assuming causality.  Make sure that you are not one of these organizations!

Until next time, watch out for Statistical Bullshit!  And email me at MHoward@SouthAlabama.edu with any questions, comments, or stories.  I’d love to include your content on the website!