To many, the idea that tenure-track hiring favors women is not only a surprising finding but a disturbing one; hence the impatience of the following blogger with a commenter who asked for evidence of anti-female hiring bias: "As to whether there is gender discrimination in hiring in academia and elsewhere, come on! This is well documented in Virginia Valian's book ‘Why So Slow?’, which reviews the many, many studies which show how changing the gender on an application gets fewer offers, lower salaries, and so on."
“I have seen study upon study — including metastudies — that show that women's CVs are consistently evaluated as less impressive than men's even when they are identical and only the names are changed; that they are consistently less likely to be promoted, their research is consistently viewed as less good/relevant, etc.”
This view of bias against hiring women is sufficiently ingrained that numerous university policies are predicated on it, such as this one from Boston University: “research studies have shown that biases and assumptions can affect the evaluation and hiring of candidates for academic positions. These studies show that the assessment of résumés and postdoctoral applications, evaluation of journal articles, and the language and structure of letters of recommendation are significantly influenced by the sex of the person being evaluated.”
Googling strings such as “Wenneras and Wold bias against hiring women” will yield many similar claims based on experimental findings from social psychology. We reviewed many of these studies in our past work, including two meta-analyses.
Given the dominant narrative, which our findings contravened, it was unsurprising that the reaction was swift and ugly. Critics assailed not only our methods and interpretation but also what they presumed to be our sociopolitical agenda and character. (The latter is the basis of a paper we are writing and will not be remarked on further here.)
Here we limit discussion to critiques of our methods and interpretation. Elsewhere, we have posted a series of responses in the Huffington Post and American Scientist that addressed these criticisms.
In this post we describe our methods and interpretation and welcome SPSP members to weigh in.
In the online supplement to our article we described audits of actual academic hiring. Typically, they show that women apply less often than men for professorial jobs, but when they do apply, they are more likely to be interviewed and hired. For example, in one NRC study of tenure-track hiring at R1s during 2002-2004, 20% of applicants for mathematics professorships were women, but 32% of those offered the job were women. Similar pro-woman preferences in tenure-track hiring were found in all of the fields in which women are underrepresented in the academy: chemistry, physics, engineering, computer science, and mathematics. They were also found, albeit to a lesser extent, in biology and psychology, fields in which women are well represented. In our article we described eight large-scale audit studies, and all of them showed a pro-female hiring bias; none ever showed a pro-male bias. Since publishing our study, numerous correspondents have shared with us additional audit studies from their own universities and national organizations, and all of these accord with a pro-female preference. Usually, the preference is sizable. For example, in a national audit of U.S. and Canadian computer science hiring conducted by the Computing Research Association, new women recipients of PhDs submitted far fewer job applications than men but received many more offers per application. Female new hires obtained 0.77 interviews per application (vs. 0.37 for men) and received 0.55 offers per application (vs. 0.19 for men), prompting the report’s authors to conclude, “Obviously women were much more selective in where they applied, and also much more successful in the application process.”
There has been some speculation as to why women seem to be advantaged in academic hiring at the entry level. A popular view is that women are usually stronger applicants, having endured sexism, reduced access to resources, and a lack of same-sex mentors in graduate school; these pressures winnowed their numbers, leaving only the strongest PhD recipients to compete for tenure-track positions. This view is widespread. For example:
“(Some) assume that if a field has X percent of women applicants, then if women get X percent of grants, we know bias is not operating. But is that right? Perhaps the women who survive training in a field where they have few mentors and surmount barriers most men may have little knowledge of, might actually be better. At least we cannot assume they aren’t.”
“Given qualified women drop out of math-intensive fields at higher rates than their male peers, there probably is sampling bias among those who remain. Thus, the women who remain are probably, on average, better than their male colleagues and should be having better outcomes on average…(and) indicates gender discrimination still exists, not that this problem has been solved.”
In our five experiments we addressed this claim by asking whether women would continue to be preferred when competing against identically qualified men. If so, then the preference must result from other factors, such as valuing gender diversity when two applicants are otherwise identical. To address this issue, we constructed nationally representative samples of faculty from four fields, two with good representation of women (psychology, biology) and two without (economics, engineering). We stratified sampling of tenure-track faculty in these four disciplines by faculty gender and institutional Carnegie classification (large doctoral-granting, baccalaureate/master's degree, and small liberal arts). Across the five experiments, 872 tenured or tenure-track faculty provided full data (34% response). Each was asked to evaluate between one and three hypothetical job candidates for a tenure-track post in their department. Using standard polling methodology, we assigned sample weights to the 34% of respondents in our main experiment, based on the numbers of women and men in their departments, the number of similar departments nationally, the number of similar institutions in the respondent sample, and the number in the nonrespondent sample. Respondents and nonrespondents were similarly distributed across all strata, and all analyses were run both with sample weights and unweighted; the results were nearly identical. In addition to scientifically sampling faculty and using sample weights, we also tested a paid sample of psychologists who provided a 91% response rate (82 of 90); their preferences were the same as those of psychologists who were not paid.
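The weighting logic above can be illustrated with a toy post-stratification sketch. The strata, counts, and ratings below are entirely hypothetical (the actual weights depended on departmental gender counts and national tallies of departments and institutions, as described above); the sketch only shows the general idea of rescaling each stratum of respondents to its national share:

```python
# Hypothetical national faculty counts per stratum
# (field x Carnegie class x gender) -- made-up numbers for illustration.
population = {"psych_R1_f": 3000, "psych_R1_m": 4000,
              "engr_R1_f": 1000, "engr_R1_m": 6000}

# Hypothetical respondent counts per stratum (the fraction who replied).
respondents = {"psych_R1_f": 60, "psych_R1_m": 70,
               "engr_R1_f": 20, "engr_R1_m": 90}

n_pop = sum(population.values())
n_samp = sum(respondents.values())

# Each respondent's weight = (stratum's population share) / (stratum's
# sample share), so over- or under-represented strata are rescaled to
# their national proportion.
weights = {s: (population[s] / n_pop) / (respondents[s] / n_samp)
           for s in population}

# A weighted mean of any respondent-level outcome then estimates the
# national value; per-stratum mean ratings here are also made up.
ratings = {"psych_R1_f": 8.1, "psych_R1_m": 7.9,
           "engr_R1_f": 8.3, "engr_R1_m": 8.0}
weighted_mean = (sum(weights[s] * respondents[s] * ratings[s] for s in ratings)
                 / sum(weights[s] * respondents[s] for s in ratings))
```

With weights defined this way, the weighted respondent total in each stratum is proportional to that stratum's national population, so the weighted mean reduces to a population-share average; comparing it to the unweighted mean is one simple check of whether weighting changes the results, as reported above.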
Faculty in our study were told to imagine that their colleagues had attended the job talks of applicants for a tenure-track assistant professorship, read their CVs, met with them, and read their letters of reference; in four of the five experiments, their colleagues short-listed three of these applicants. Which would they prefer to hire? In these four experiments faculty were not given the applicants' CVs, but instead narrative summaries, which can be found in the online supplement. There were good reasons, described in the paper, for not using actual CVs. They had to do with the non-comparability of what faculty at various institutions (and even within the same institution and field but in different subfields) regard as excellent. The same CV is often viewed differently by different subfields (e.g., mechanical engineers regard "proceedings" less favorably than do electrical engineers) and by institutions in different Carnegie classes (e.g., a given number and type of publications for an applicant at a small teaching-intensive college may be viewed differently by faculty at a large research-intensive institution). So we used text summaries describing the research rating their hypothetical colleagues gave each applicant, based on reading CVs, attending job talks, and reading reference letters; the colleagues gave an overall strength score on a 10-point scale. No publication counts, outlet types (journal, proceedings, chapters, etc.), or authorship order were given; only summaries describing the applicants' scholarship and quotes from reference letters, along with their colleagues' overall rating of the applicants' strength and some summative text. (In one experiment we did use actual CVs, but it was restricted to a single subfield of 35 mechanical engineering faculty; their data were statistically no different from those of this same subfield when given narrative summaries.)
Overall, we found a 2-to-1 preference for hiring women applicants over otherwise-identical men. This held in nearly all contrasts (field, type of institution, gender of faculty, rank of faculty, applicant lifestyle), with a few interesting exceptions in which faculty preferred men or expressed no preference. Critics understandably expressed concern about the ecological realism of these experiments, pointing out that in real tenure-track hiring, committees attend interviews and job talks and read CVs themselves, rather than being given colleagues' evaluations based on those activities. Therefore, the claim was made that our results may not generalize to real-world academic hiring. However, this misses the fact that women are preferred in real-world hiring, and we have known this for at least two decades; we reviewed eight such audit studies in our article (and more exist that accord with them). We wanted to know whether this pro-woman hiring preference was due to women being stronger applicants, as some claim, not whether women were preferred under actual hiring conditions.
Other concerns were expressed about whether our findings extend to applicants less impressive than the ones on our short-list, a reasonable point, although we designed it this way and justified doing so in the article: when 100+ applicants, all of whom successfully finished doctoral programs and garnered publications and strong letters, compete for a single tenure-track post, and the search committee selects the top three, those three are usually very impressive. (And in the one experiment with mechanical engineering faculty in which we used actual CVs, the same pro-woman preference was found; it was in fact slightly greater, though not significantly so.)
While we expected critics to challenge various decisions we made in designing these experiments, we were caught off guard by the vitriol in blogs, comment threads, and tweets. In some of the posts above we describe these comments, and SPSP readers can judge whether they fall into the category of criticism or character-bashing.