Whale scientists could soon do themselves out of a job – or at least a tiring and repetitive one – by applying artificial intelligence (AI) to their research.
Using machine learning, a team from the Australian Antarctic Division, the K. Lisa Yang Center for Conservation Bioacoustics at Cornell University, and Curtin University has trained an algorithm to detect blue whale ‘D-calls’ in sound recordings with greater accuracy and speed than human experts.
Whale acoustician Dr Brian Miller said the technology will allow scientists to more easily analyse hundreds of thousands of hours of recordings of these elusive and difficult-to-study whales, to better understand trends in their populations as they recover from whaling.
“By analysing our recordings for D-calls and other sounds, we get a more complete picture of the behaviour of these animals, and the trends and potential changes in their behaviour,” Dr Miller said.
“The deep learning algorithm we’ve applied to this task outperforms experienced whale acousticians in accuracy, it’s much faster and it doesn’t get tired.
“So it frees us up to think about other big picture questions.”
Social calls
D-calls are thought to be ‘social’ calls made by male and female whales on feeding grounds. Unlike male blue whale ‘songs’, which have a regular and predictable pattern, D-calls are highly variable across individual whales and across seasons and years.
This variability makes automation of the recording analysis harder than it would be for a consistent sound.
To overcome this, the team trained the algorithm on a comprehensive library of about 5,000 D-calls, captured in 2,000 hours of sound recorded from sites around Antarctica between 2005 and 2017.
“The library covered different seasons and the range of habitats in which we’d expect to find Antarctic blue whales, to ensure we captured variability in the D-calls as well as the variable soundscapes through which the whales travel,” Dr Miller said.
Before the training could begin, however, six different human analysts went through the recordings and identified or ‘annotated’ the D-calls.
Rather than working with the raw audio, the team turned each call into a ‘spectrogram’, a visual representation of the call and its duration.
Using machine learning techniques, the algorithm trained itself to identify D-calls using 85% of the data in the library, with the remaining 15% held back to validate its performance and refine it.
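The study’s detector is a deep neural network trained on these annotated spectrograms. As a rough, minimal sketch of that general workflow, and not the team’s actual code, the Python below turns placeholder audio clips into spectrograms, splits them 85/15 into training and validation sets, and fits a tiny convolutional network. The sample rate, clip length, labels and network layout are all assumptions made for the illustration.

```python
# Minimal sketch (not the study's pipeline): audio clips -> spectrograms,
# 85/15 train/validation split, and a tiny CNN labelling each clip as
# "D-call" (1) or "noise" (0). All data here is placeholder/random.
import numpy as np
from scipy.signal import spectrogram
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn

def to_spectrogram(audio, sample_rate=250):
    # A low sample rate is assumed, as D-calls sit at low frequencies.
    freqs, times, sxx = spectrogram(audio, fs=sample_rate, nperseg=256, noverlap=192)
    return np.log10(sxx + 1e-10)  # log power, as spectrograms are usually viewed

# Placeholder "annotated clips": a few seconds of audio each, with a label.
rng = np.random.default_rng(0)
clips = rng.standard_normal((200, 2500))   # stand-in audio clips
labels = rng.integers(0, 2, size=200)      # stand-in annotations

specs = np.stack([to_spectrogram(c) for c in clips])[:, None, :, :]  # (N, 1, freq, time)

# 85% of the library for training, 15% held back for validation.
x_tr, x_va, y_tr, y_va = train_test_split(specs, labels, test_size=0.15, random_state=0)

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 2),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x_tr_t = torch.tensor(x_tr, dtype=torch.float32)
y_tr_t = torch.tensor(y_tr, dtype=torch.long)
for epoch in range(5):
    optimiser.zero_grad()
    loss = loss_fn(model(x_tr_t), y_tr_t)
    loss.backward()
    optimiser.step()

# Check the detector on the held-out 15%.
with torch.no_grad():
    preds = model(torch.tensor(x_va, dtype=torch.float32)).argmax(dim=1)
    accuracy = (preds == torch.tensor(y_va)).float().mean().item()
print(f"validation accuracy on held-out 15%: {accuracy:.2f}")
```

In the real study the library was the roughly 5,000 annotated D-calls described above, and the network was much larger and trained for far longer than this toy example.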
Human vs Machine
The trained AI was then given a test dataset: 187 hours of annotated recordings collected over a year at Casey station in 2019.
The research team compared the D-call detections made by the AI with those identified by the human experts, to see where they disagreed.
An independent human judge (Dr Miller) then adjudicated the disagreements, deciding whether each disputed detection was a D-call, to reach a final ruling on which was more accurate.
“The AI found about 90% of the D-calls and the human just over 70%, and the AI was better at detecting very quiet sounds,” Dr Miller said.
“It took about 10 hours of human effort to annotate the test data set, but it took the AI 30 seconds to analyse this data – 1,200 times faster.”
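As a rough illustration of how such a comparison can be scored, the sketch below computes each detector’s recall: the fraction of adjudicated true D-calls it found within a small time tolerance. The call times, detections and tolerance are invented for the example and are not the study’s data.

```python
# Illustrative only: score two detectors against an adjudicated list of
# true D-call times (in seconds). A detection counts as a hit if it falls
# within a small tolerance of a true call. Numbers are made up.
true_calls = [12.0, 45.5, 130.2, 300.9, 512.4]
ai_detections = [11.9, 45.6, 131.0, 512.5, 700.0]
human_detections = [12.1, 45.4, 300.7]

def recall(detections, truths, tolerance=1.0):
    hits = sum(any(abs(d - t) <= tolerance for d in detections) for t in truths)
    return hits / len(truths)

print(f"AI recall:    {recall(ai_detections, true_calls):.0%}")     # 80% in this toy example
print(f"human recall: {recall(human_detections, true_calls):.0%}")  # 60% in this toy example
```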
The team has made its AI available to other whale researchers around the world, so it can be trained on other whale sounds and soundscapes.
“Now that we have this power to analyse thousands of hours of sounds very quickly, it would be great to build more recording sites and bigger recording networks, and develop a long-term monitoring project to look at trends in blue whales and other species,” Dr Miller said.
The research is published in Remote Sensing in Ecology and Conservation.