Some parts of machine learning are incredibly esoteric and hard to grasp, surprising even seasoned computer science pros; other parts of it are just the same problems that programmers have contended with since the earliest days of computation. The problem Amazon had with its machine-learning-based system for screening job applicants was the latter.
Amazon understood that it had a discriminatory hiring process: the unconscious biases of its technical leads resulted in the company passing on qualified women applicants. This isn't just unfair, it's also a major business risk, because qualified developers are the scarcest resource in modern business.
So they trained a machine-learning system to evaluate incoming resumes, hoping it would overcome the biases of the existing hiring system.
Of course, they trained it with the resumes of Amazon's existing stable of successful job applicants -- that is, the predominantly male workforce that had been hired under the discriminatory system they hoped to correct.
The computer science aphorism to explain this is "garbage in, garbage out," or GIGO. It is pretty self-explanatory, but just in case, GIGO is the phenomenon in which bad data put through a good system produces bad conclusions.
Amazon built the system in 2014 and scrapped it in 2017, after concluding that it was unsalvageable. Sources told Reuters that it downgraded graduates of two all-women's colleges, and downranked resumes that included the word "women's," as in "women's chess club captain." Amazon says it never relied on the system.
There is a "machine learning is hard" angle to this: while the flawed outcomes from the flawed training data were totally predictable, the system's self-generated discriminatory criteria were surprising and unpredictable. No one told it to downrank resumes containing "women's"; it arrived at that conclusion on its own, by noticing that this was a word that rarely appeared on the resumes of previous Amazon hires.
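To make that mechanism concrete, here is a minimal sketch of how a term can pick up a penalty purely from biased hiring history. It is not Amazon's system, whose internals are not public: the resumes, the labels, and the names HISTORY, term_weights and score are all invented for illustration, and the scoring rule (a smoothed log-odds ratio per term) is just one simple stand-in for whatever the real models did.

```python
import math
from collections import Counter

# Hypothetical toy history: (resume text, hired under the old process?).
# The data is invented; the skew against "women's" is deliberate.
HISTORY = [
    ("python java executed migration", True),
    ("python sql captured requirements", True),
    ("java sql executed rollout", True),
    ("python java chess club captain", True),
    ("python java women's chess club captain", False),
    ("python sql women's coding society", False),
    ("java sql women's college", False),
    ("python java", False),
]


def tokenize(text):
    """Split on whitespace and return the set of distinct lowercase terms."""
    return set(text.lower().split())


def term_weights(history):
    """Return a smoothed log-odds weight per term: hired vs. not hired."""
    hired, rejected = Counter(), Counter()
    n_hired = n_rejected = 0
    for text, was_hired in history:
        terms = tokenize(text)
        if was_hired:
            hired.update(terms)
            n_hired += 1
        else:
            rejected.update(terms)
            n_rejected += 1

    weights = {}
    for term in set(hired) | set(rejected):
        # Add-one smoothing so terms unseen in one group do not divide by zero.
        p_hired = (hired[term] + 1) / (n_hired + 2)
        p_rejected = (rejected[term] + 1) / (n_rejected + 2)
        weights[term] = math.log(p_hired / p_rejected)
    return weights


def score(resume, weights):
    """Sum the weights of the terms in a resume; unknown terms count as 0."""
    return sum(weights.get(term, 0.0) for term in tokenize(resume))


if __name__ == "__main__":
    weights = term_weights(HISTORY)
    for term in ("women's", "python", "executed"):
        print(term, round(weights[term], 3))
    print(score("python java chess club captain", weights))
    print(score("python java women's chess club captain", weights))
```

Nothing in the code mentions gender. In this toy history, "women's" ends up with a negative weight only because it never appears among the hires, skills that appear equally often in both groups get a weight of zero, and two resumes that differ by that single word get different scores.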
From the Reuters report:
The group created 500 computer models focused on specific job functions and locations. They taught each to recognize some 50,000 terms that showed up on past candidates’ resumes. The algorithms learned to assign little significance to skills that were common across IT applicants, such as the ability to write various computer codes, the people said.
Instead, the technology favored candidates who described themselves using verbs more commonly found on male engineers’ resumes, such as “executed” and “captured,” one person said.
Gender bias was not the only issue. Problems with the data that underpinned the models’ judgments meant that unqualified candidates were often recommended for all manner of jobs, the people said. With the technology returning results almost at random, Amazon shut down the project, they said.
Amazon scraps secret AI recruiting tool that showed bias against women [Jeffrey Dastin/Reuters]
(Image: Cryteria, CC-BY)