Scott Veirs edited this page Mar 3, 2023 · 60 revisions

Welcome to the orcadata wiki!

This is a place to share and collaborate, especially on bioacoustic analysis of real-time and archived audio data from the Orcasound open source project. Here you can learn more about Orcasound machine learning resources related to orcas (training sets | test sets) and how to access Orcasound data, both archived training and testing data and real-time audio streams. You may also be interested in the synopses of projects that leverage these open data at ai4orcas.net.

Data Resources

Most-recent progress (within the last year)

2023

  • Mar: Completing transition to Amazon-sponsored S3 data buckets for Orcasound open data (Registry | Data Exchange on AWS Marketplace).
  • Feb: John Ford contributes historic recordings of SRKWs from a 1997 event in Dyes Inlet to the Orcasound open data repository. The recordings were acquired by the Center for Whale Research, a member of Orcasound, and are being shared with permission from their Research Director, Dr. Michael Weiss. Ben Hendricks and Jono Mendez request feedback on a prototype bioacoustic dashboard developed in collaboration with the BC Hydrophone Network, coordinated by Janie Wray. Rachael Cheng, Val Veirs, and David Bain collaborate on orca call autoencoders and similarity algorithms. HALLO project formalizes 4-year Canadian government grant for an AI-assisted SRKW movement monitoring and forecasting system, with Orcasound as a U.S. collaborator.
  • Jan: Scott initiates general orca-ai team in Orcasound organization on GitHub; UW MS data science student team finalizes project plan with Valentina to advance noise analysis with Orcasound open data archive; Valentina builds initial catalog of Orcasound archive; Orcasound labels more SRKW and humpback bouts, including S04 calls and whistles in first labeled bout from Point Robinson in southern Puget Sound. HALLO continues beta-testing new online catalogue with SRKW calls.

2022

  • Dec: Orcasound volunteer data scientist Zoe pioneers first semi-automated integration of SRKW movement data from Acartia and Chinook salmon counts from the Fraser and Columbia rivers.
  • Nov: Val delivers autoencoder talk for ONC/Meridian workshop; Ze spins up project to automate vessel image classification, a collaboration of Orcasound and Protected Seas using the M2 system deployed at Val's Orcasound Lab node; WDFW grants $25k for maintaining/expanding Orcasound nodes as real-time oil spill response equipment.
  • Oct: Rachael Cheng in Berlin joins Orcasound standups to share NRKW AI progress and open source code, along with Alex Barnhill and Christian Bergler; Val starts work on rough SRKW click classifier (buzz, slow, fast); Valentina proposes project to UW Data Science Capstone on historical noise analysis; Ben Hendricks joins Orcasound standup with updates on open source bioacoustic dashboard project, a collaboration of Orcasound and the BC Hydrophone Network.
  • Sep: Orcasound's GSoC 2022 contributors make final reports; DemocracyLab hackathon (9/10) connects Acartia.io to orcamap; Microsoft hackathon (9/20-22, GitHub Project) refines OrcaHello UI, model training/deployment/monitoring, and notifications, begins annotation of SRKW pod and call type, and establishes first Kaggle for orca calls.
  • Aug: HALLO workshop on open data for SRKW movement forecast modeling (Aug 31 - Sep 01); Orcasound applies for AWS Open Data sponsorship (2 years); planning for Microsoft and DemocracyLab hackathons in Sept.
  • Jul: First blog posts from Orcasound GSoC 2022 contributors regarding: open source approaches to de-noising and source separation; ingestion of OOI hydrophone data from Oregon; refinement of the Orca Active Learning tool code & deployment.
  • Jun: Orcasound Google Summer of Code (GSoC) 2022 students begin coding
  • May: At DCLDE 2022 workshop, Beam Reach extern Emily Vierling shares her Haro Humpback open data & dictionary project, including a humpback non-song vocalization dictionary based on recordings from Haro Strait, WA, and an annotated training data set for 12 humpback signal types.
  • Apr: Earth Day hackathon organizes Orcasound open data visualization opportunities; OrcaHello Azure subscription extended until Oct 2022.
  • Mar: OrcaHello Dashboard reaches 3,500 annotated 1-min candidates; Orcasound and HALLO project present at the DCLDE workshop in Hawaii
  • Feb: Orcasound accepted as 2022 GSoC host organization (3rd year)
  • Jan: OrcaHello tag cloud curated using standardized dictionary of labels.

2021

  • Dec: Orcasound presents at the Acoustical Society of America meeting in Seattle
  • Nov: SRKWs in Puget Sound, humpbacks in Haro! OrcaHello migrates to new Azure subscription; coordination with HALLO on ASA/DCLDE/SSEC talks; Orcasound extern Emily Vierling catalyzes humpback non-song vocalization label standardization.
  • Oct: Beluga in Puget Sound! OrcaHello team improves real-time inference system during annual hackathon (Oct 12-14), including re-training model, continuous integration, moderator UI enhancements, and documentation. MBARI publishes acoustic archive via AWS open data repository.

For more details, see the growing list of documentation pages for each Orcasound machine learning effort.

Deeper history of AI for Orcas project

Since the early 2000s, members of the Orcasound community have been contemplating the application of artificial intelligence to the problem of detecting orcas acoustically. Orcasound's AI for Orcas project page describes the evolution of our collective efforts. #ai4orcas