The Coast Guard had a one-word problem: A hoax caller kept contacting them, saying “mayday” and disconnecting.
A couple of seconds of voice recordings were all the agency had to work with. Coast Guard investigators had been snagging some hoax callers with old-fashioned detective work, GPS-based technology and searches on social media, where braggarts sometimes post about the commotion they’ve caused. But they couldn’t figure out how, if at all, a suspected repeat caller might be identified from just the word mayday.
And it was important to find a way. These calls were among the 150 or so confirmed or suspected hoaxes that the Coast Guard typically receives in a year, out of more than 16,000 legitimate calls from boaters needing help. Responding to the hoaxes wastes hundreds of thousands of taxpayer dollars, pulls resources away from boaters who might be in harm’s way and can put responders’ lives on the line.
It’s a felony to make a hoax call. The maximum penalty includes prison time, a $250,000 fine, a $5,000 civil penalty and possible restitution for search-and-rescue costs.
When Fred Roberts heard about the Coast Guard’s one-word problem, he had no ready solution, and he’s a hard guy to stump. Since 2009, Roberts has directed the Command, Control and Interoperability Center for Advanced Data Analysis, a U.S. Department of Homeland Security Center of Excellence that uses data analysis to address threats. From his office at Rutgers University in New Jersey, Roberts had worked with the Coast Guard on numerous challenges, from using years’ worth of data to determine which boats are most likely violating fisheries rules, and therefore should be boarded, to figuring out more efficient ways to position Coast Guard fleets when the agency faces budget cuts — a challenge that, Roberts says, helped save $120 million through a data-analysis approach.
The Coast Guard figured CCICADA might be able to help with the one-word conundrum. “They said, ‘We have this person who seems to be a serial hoax caller, and all we have is one word: mayday. Is there anything you can tell us about this guy that might help us?’ ” Roberts says. “I’ll tell you the truth, the first thing I thought was, Wow. I’m not sure we can do anything. But I don’t want to say no to the Coast Guard, so we tried.”
The problem’s degree of difficulty, Roberts says, is beyond sky-high. Determining a person’s identity from a single recorded word, even with multiple recordings of that word, goes beyond anything the center has done using far larger data sets. It even goes beyond the cutting-edge technology used to track and apprehend cyberterrorists.
“The hoax calls, I would say, is up in the stratosphere as far as the challenge to develop new tools and techniques,” Roberts says. “It’s harder than cybersecurity in a different way. With cybersecurity, we have a lot of intelligent people working on it; a lot of the problem is developing best practices and sharing information. Technical is not the issue there so much as educating people.
“But when it came to the hoax calls and the voice forensics, that’s really totally new technology,” he adds. “A lot of it is still not developed. We’re just beginning to explore the use of this and how helpful it’s going to be.”
CCICADA is a consortium of universities and private entities, so Roberts thought about which partner might have something, anything, that the Coast Guard could use. He reached out to Rita Singh at Carnegie Mellon University in Pittsburgh. Singh’s specialty at the university’s School of Computer Science is using algorithms for voice recognition — specifically, applying artificial intelligence, or machine learning, to the field of voice forensics.
And she, like Roberts, had no initial answer to the Coast Guard’s question, for the simple reason that nobody had ever asked. Her team’s work had been rooted in the academic realm; applying it to a real-world problem was a novel concept. “It’s not something that we had ever looked at or thought about,” Singh says. “But we had 30 years of experience and scientific research into speech, so we looked at it and thought about it.”
The timing of the Coast Guard’s question, Singh says, was key. Coming up with an answer would have been impossible with machine-learning technology as recently as seven or eight years ago, she says, because artificial intelligence was not as developed as it is now. But with today’s capabilities, with just a single recorded word, some computers can take a reasonable guess at a surprising number of a hoax caller’s characteristics.
She and her team, based only on the word mayday, ended up giving the Coast Guard an amount of information that surprised everyone. “Rita Singh came back and said, ‘You know, based on what I heard, this person is about this tall and weighs this many pounds, is a native speaker and is probably talking from a warehouse with an electric fan in the background,’ ” Roberts says. “All of that just blew my mind. I was amazed.”
So were Singh and her colleagues, even as they were doing the analysis. “This problem forced us to put the pieces together and realize, whoa, this is something that we can actually do,” she says. “This is now an area of science that was started by the Coast Guard, and we’re frantically trying to build it up.”
Everyone working on the project acknowledges that the technology is still in its infancy, has yet to be tested in a court of law during a hoax caller’s prosecution and, admittedly, sounds just this side of the HAL 9000 from 2001: A Space Odyssey. The notion of being able to tell a person’s height, for instance, from hearing a single spoken word sounds like science fiction.
Singh says that understanding the technology and the way it breaks down a word such as mayday requires an understanding of how human speech occurs. Each person’s voice is unique. It is created when air comes out of the lungs and goes through the neck, where it makes the vocal cords vibrate. Vocal cords are membranes controlled by muscles, all of them slightly different in size, shape and other characteristics, inside each human’s body.
Also different is each person’s vocal tract, which is used to turn sounds into words and sentences. The nasal cavity and the trachea, or windpipe, are part of the vocal tract, and they differ in shape and size in every person, too. “It’s like a chamber,” Singh says. “If you go into a big hall and shout, you hear echoes. You hear reverberation. The sounds produced by your vocal cords reverberate in your vocal chamber. That’s what we hear.”
Each person also changes those sounds by moving the lips, jaws and tongue — again, body parts with different shapes, creating unique sounds. “If you’re standing in a building, and if the shape of the building were different, the reverberation would be different,” Singh says. “That’s how we change the reverberation pattern in the voice.”
Machine-learning technology can break down these items and more into fractions of a second that humans can’t comprehend. Singh’s computers can slice a word like mayday into segments 1/40 of a second long, creating a data set that gives a computer plenty of raw material to analyze.
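To give a rough sense of what slicing a word into 1/40-second segments could look like, here is a minimal Python sketch. It is an illustration only, not the software Singh’s team uses, which the article does not describe. The function name slice_into_frames, the 8,000-samples-per-second rate and the synthetic tone standing in for a recording are all assumptions made for the example.

```python
import math


def slice_into_frames(samples, sample_rate, frame_seconds=1 / 40):
    """Split audio samples into consecutive, non-overlapping frames.

    Hypothetical helper for illustration. A trailing partial frame
    (shorter than frame_seconds) is dropped.
    """
    frame_len = int(round(sample_rate * frame_seconds))
    if frame_len <= 0:
        raise ValueError("frame_seconds is too short for this sample rate")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


# Assumed stand-in for a recorded word: 4,800 samples (0.6 seconds)
# of a 220 Hz tone at an assumed rate of 8,000 samples per second.
sample_rate = 8000
samples = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(4800)]

frames = slice_into_frames(samples, sample_rate)
print(len(frames), "frames of", len(frames[0]), "samples each")
```

Under these assumptions, a word lasting a bit more than half a second yields a couple of dozen frames, each of which can be analyzed on its own. Real speech systems often use overlapping frames; the non-overlapping version is used here only to keep the sketch short.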
And from those fractional slices of the word mayday, the computer can start to figure out things such as the length of the windpipe in the vocal chamber, information that can be extrapolated to, say, take a reasonable guess at a hoax caller’s height. Some machines can determine the structure of the tissue in the body parts that made the sound, information that can help pin down, for instance, the speaker’s weight.
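One textbook way to see how a sound can hint at the size of the chamber that produced it is to treat the vocal tract as a uniform tube, closed at the vocal cords and open at the lips. In that simplified model, the tube’s resonant frequencies depend on its length. The Python sketch below applies that approximation; it is not the method Singh’s team uses, which the article does not describe, and the function name, the speed-of-sound constant and the example resonances are assumptions chosen for illustration. It stops at tube length and makes no attempt to guess height.

```python
# Assumed approximate speed of sound in warm, humid air, in cm per second.
SPEED_OF_SOUND_CM_PER_S = 35000.0


def tube_length_cm(resonances_hz):
    """Estimate the length of a uniform tube from its resonances.

    Hypothetical helper for illustration. For a tube closed at one end
    and open at the other, the n-th resonance is
    F_n = (2n - 1) * c / (4 * L), so each resonance gives an estimate
    L = (2n - 1) * c / (4 * F_n). The estimates are averaged.
    """
    if not resonances_hz:
        raise ValueError("at least one resonance is required")
    estimates = [
        (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * f)
        for n, f in enumerate(resonances_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Assumed example values: the first three resonances of an idealized tube.
length = tube_length_cm([500.0, 1500.0, 2500.0])
print(round(length, 1), "cm")
```

A real voice is not a uniform tube, and its resonances shift with every movement of the lips, jaw and tongue, which is part of why the research described here leans on machine learning rather than a single formula.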
“Because that sound is reverberating in your vocal chamber and your vocal chamber is unique to you — shaped by your skeletal structure and the moisture level inside you and the way your brain works — all of these things are so individual to you that, by a study of those reverberations and other features, we can pick out a lot of things about you,” Singh says.
The technology is advancing every day, she says, as computers learn new ways to process information that humans can’t. Some machines can already take reasonable guesses at the type of location where a hoax caller might be located.
“Your environment also affects your speech,” she says. “If you’re sitting in a room right now, your voice is reflecting all the objects in the room, especially the walls, the ceiling, the glass pane of the window, all of those things. All of those reverberations end up in the audio signal being recorded when you talk into the phone. If your office is near a highway, the sounds of those cars will end up on that recording. It’s possible to say you’re next to a highway, or you’re somewhere close to a water body.
“We can roughly tell what materials your walls and ceiling are made of, or if there’s a carpet in the room, if your room is crowded with things or not,” Singh continues. “We’re not at the pull-the-rabbit-out-of-the-hat level yet, but we’re making progress, and there’s a clear path ahead.”
Her belief is that when the Coast Guard calls five or 10 years from now, researchers will be able to create a 3-D hologram — the future’s version of a police sketch — from analysis of a single-word hoax call. “This is not 50 years or 500 years away,” she says. “You can count the years on your fingers because artificial intelligence, as a field, is progressing so fast.”
For now, the Coast Guard is adding the information that Singh’s team provides, in collaboration with CCICADA, to its other tools, especially in cases where the agency in the past had no hope of discerning the hoax caller’s identity. “What they’ve done has given us some really valuable investigative leads to help us identify the perpetrator or help us exclude people,” says Marty Martinez, special agent in charge with the Coast Guard Investigative Service, Chesapeake Region.
That’s true, Singh says, but she’s also looking at a far bigger picture. The Coast Guard, in asking for this type of help, set researchers on a path that could make the world better far beyond the marine environment. “This has the potential to help so many people,” she says. “Hoax callers are everywhere. July Fourth is the big day for hoax callers. They’ll pick up phones and call in bomb threats to stadiums, all kinds of things — that’s when it peaks every year. There are people calling emergency rooms and reporting emergencies that aren’t there. There are people who say, ‘I have a person in a basement at gunpoint.’ There are virtual kidnappings on the rise. They call and say they have your kid, and they play the sound of some kid screaming, and they want you to send money right now.
“We want to handle all of it,” she says. “As a team, with our colleagues across the world, I think we can make this happen. It will require lots and lots of scientists, but it will happen, and it will be thanks to the Coast Guard.”
This article originally appeared in the December 2017 issue.