Why your voice assistant might be sexist
Chris Baraniuk / TheConversation:
From reinforcing entrenched gender roles to potentially even fuelling misogyny, choosing the right voice for a particular task can be a minefield.
James Bond flings open the door of his new BMW, which comes with hidden machine guns as standard, and immediately a feminine computerised voice announces, “Welcome! Please fasten seatbelt and obey all instructions for a safe trip.”
Bond’s MI6 colleague and master of gadgets, Q, pipes up to explain: “Thought you’d pay more attention to a female voice.” But, predictably, Bond later ignores repeated commands to wear his seatbelt and, using his mobile phone as a remote control, he subsequently drives the car off the top of a multi-storey car park in the 1997 blockbuster Tomorrow Never Dies.
Q’s market research was wrong – and so was BMW’s. The firm famously recalled a feminine-voiced GPS system from its cars when German drivers complained that they didn’t want to take instructions “from a woman”. But why won’t German men, British secret agents, or anyone else often follow directions delivered in feminine tones? Today, navigation systems with feminine voices are actually quite common. But multiple studies suggest that digital voices continue to reinforce deeply problematic gender stereotypes.
Consider the smart speaker with a feminine voice that politely does your bidding versus the masculine-sounding recorded message that takes charge and orders you to stay clear of a reversing truck. Gender bias is rife in artificial intelligence (AI) systems, according to a widely discussed 2019 report from Unesco. The report’s title, “I’d blush if I could,” refers to the response that Apple’s voice assistant Siri used to give when people remarked to it, “Hey Siri, you’re a b****.”
While some improvements have been made to AI voice systems since then, many argue there is still a way to go. So how did gender bias get so deeply embedded in these systems in the first place – and how do we go about getting rid of it? The history of digital voices, and how we have used and abused them, doesn’t make for easy reading. Take the computer systems in aircraft that talk to pilots and provide information or warnings. One such system, which used voice recordings made by the singer and actor Joan Elms, was dubbed “Sexy Sally”. A more recent system, originally featuring the voice of actor Kim Crow, was informally named “B** Betty”.
And in the UK, the term “Nagging Nora” is sometimes used. Similarly, staff on the London Underground are reported to refer to an automated announcement system as “Sonya” because it “gets onya nerves”. A masculine equivalent of aircraft voice systems is called “Barking Bob” by some pilots – though, notably, that phrase doesn’t connote the same gender-based prejudice as the other epithets. It’s not just whether or not people have accepted feminine voices in certain roles; it’s also how developers have designed synthetic voices in the past to perform those roles that’s an issue, says Verena Rieser at Heriot-Watt University. Voice assistants have sometimes been incapable of recognising and challenging inappropriate behaviour. “These systems are gendered and anthropomorphised,” she explains. “There is basically a reinforcement cycle here.”
The default voice of assistants such as Siri or Alexa was always feminine in the past, though in recent years Apple and Amazon have made other options available. Despite this, you might find feminine synthesised voices to be more common than masculine ones. But why? It’s partly down to the fact that companies spent decades acquiring many more recordings of women’s voices than of men’s. This mass of data has influenced subsequent technologies, including AI. Women have operated telephone exchanges and loaned their voices to lots of pre-digital message systems, meaning that a feminine voice is what many have come to expect from helpful, compliant technologies. Research indicates that this fits with our misogynistic expectations of what tasks are “suitable” for women versus men. And yet other work suggests that there is actually little or no practical reason to prioritise feminine voices over masculine ones for certain applications – the two are more or less equally intelligible and both are capable of delivering information effectively.
Despite that, instead of promoting gender equality, voice assistants have often done quite the opposite. Journalist Leah Fessler tested virtual voice assistants’ responses to sexual harassment back in 2017 and found multiple issues. When told “You’re hot”, Amazon’s Alexa replied obsequiously, “That’s nice of you to say”. To the remark, “You’re a slut”, Microsoft’s Cortana delivered a web search result with an article entitled: “30 signs you’re a slut”. In 2020, researchers from the Brookings Institution re-tested these interactions and found that they had improved somewhat. The voice assistants were more likely to push back against abuse than before, if not always very clearly. Big tech companies have also diversified the voices that users can select for their virtual assistants. There are more masculine options than before and Apple, for example, no longer pre-selects a feminine voice as the default for Siri.
But researchers who study gender argue that simply offering masculine voices as an alternative, and tweaking assistants’ responses to inappropriate language, still leaves us far from resolving the wider problem. A lack of diversity and sophistication remains in these disembodied voices, they say. Not least because of the identities that are often left out by virtual assistant systems. Some people identify as non-binary or gender fluid and it is increasingly accepted that gender identities are the sum of many factors, including social and cultural influences.
Can a digital voice capture this? The answer is “maybe”, partly because it’s difficult to synthesise an adult human voice that sounds anything other than either masculine or feminine. Plus, although we tend to think of masculine voices as deeper than feminine voices, this is not always true, says Selina Sutton at Northumbria University.
“There’s this middle range of pitches, fundamental frequencies, that’s the same for men and women,” she explains. Various projects have attempted to synthesise “gender-neutral” voices with varying success. Consulting firm Accenture produced an experimental gender-neutral voice, though it may simply sound masculine or feminine depending on how the listener perceives it.
In 2019, a team of designers and researchers came up with a project called Q (no connection to James Bond), which was billed as a “genderless voice”. Co-creator Ryan Sherman, who now works for design lab Space10, says he was inspired to develop Q after noticing the subservient feminine voices that often characterised virtual assistants – even when they were faced with aggression from human users. “It’s usually in a way that’s submissive and reinforcing this idea that women are available to help at the touch of a button,” he says.
Although Q is yet to be developed into a fully working synthetic voice system, the demo created by Sherman and his collaborators illustrated what it might sound like.
They recorded the voices of people who identify as non-binary and adjusted the pitch to between 145 Hz and 175 Hz, which straddles many feminine and masculine human voices. The result was played to 4,500 different people, Sherman says, and while some thought it sounded feminine and others perceived it as masculine, many judged it to be neither. Given this mix of responses, Sutton says Q is best described as “gender ambiguous” rather than neutral. While a greater diversity of voices in computer systems could help move technologies away from stereotypical feminine performances, just because a digital voice’s gender isn’t obvious doesn’t mean it can’t also be sexist, notes Sutton.
Much depends on what the voice says and the functions it performs.