Minke whale detection in underwater imagery using classification CNNs

Bibliographic Details
Published in: Global Oceans 2020: Singapore – U.S. Gulf Coast
Main Authors: Konovalov, Dmitry A., Swinhoe, Natalie, Efremova, Dina B., Birtles, R. Alastair, Kusetic, Martha, Adams, Kent, Hillcoat, Suzanne, Curnock, Matthew I., Williams, Genevieve, Sobtzick, Susan, Sheaves, Marcus
Format: Conference Object
Language: unknown
Published: Institute of Electrical and Electronics Engineers 2020
Subjects:
Online Access: https://researchonline.jcu.edu.au/67937/1/viewpaper_OCEANS2020.pdf
Description
Summary: A predictable aggregation of dwarf minke whales occurs annually in the Australian offshore waters of the northern Great Barrier Reef in June-July, which has been the subject of a long-term photo-identification study. Researchers from the Minke Whale Project (MWP) at James Cook University collect large volumes of underwater digital imagery each season (e.g. 1.8 TB in 2018), much of which is contributed by citizen scientists. Manual processing and analysis of this quantity of data had become infeasible, and Convolutional Neural Networks (CNNs) offered a potential solution. Our study sought to design and train a CNN that could detect whales from video footage in complex near-surface underwater surroundings containing multiple people, boats, research and recreational gear. We modified known classification CNNs to localize whales in video frames and digital still images. The required high detection accuracy was achieved by discovering an effective negative-labeling training technique. This resulted in a false-positive detection rate of less than 1% and a false-negative rate below 0.1%. The final operation-version CNN-pipeline processed all videos (at an interval of 10 frames) in approximately four days (running on two GPUs), delivering 1.95 million sorted images.
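The frame sampling and sorting steps mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0.5 score threshold, the choice to start at frame 0, and the stand-in scoring function are all assumptions made for illustration, and the rate definitions used here (FP / (FP + TN) and FN / (FN + TP)) may differ from those used in the paper. Only the interval of 10 frames comes from the summary.

```python
from typing import Callable, List, Sequence, Tuple

FRAME_INTERVAL = 10  # from the summary: videos were processed at an interval of 10 frames


def sample_frame_indices(total_frames: int, interval: int = FRAME_INTERVAL) -> List[int]:
    """Return the frame indices that would be passed to a classifier.

    Starting at frame 0 is an assumption; the summary only gives the interval.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return list(range(0, total_frames, interval))


def sort_frames(
    indices: Sequence[int],
    score_fn: Callable[[int], float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> Tuple[List[int], List[int]]:
    """Split sampled frames into (whale, no_whale) lists by classifier score.

    score_fn is a stand-in for a trained CNN returning a whale probability.
    """
    whale: List[int] = []
    no_whale: List[int] = []
    for idx in indices:
        if score_fn(idx) >= threshold:
            whale.append(idx)
        else:
            no_whale.append(idx)
    return whale, no_whale


def error_rates(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    Assumed definitions: FPR = FP / (FP + TN), FNR = FN / (FN + TP).
    """
    fp = sum(p and not a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tp = sum(p and a for p, a in zip(predicted, actual))
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr
```

For example, a 35-frame clip would yield the sampled indices 0, 10, 20 and 30, each of which is then scored and placed in one of the two sorted lists.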