Harnessing the Pangeo ecosystem for delivering the cloud-based Global Fish Tracking System

We present our approach to leveraging the Pangeo software stack for developing the Global Fish Tracking System (GFTS). The GFTS project tackles the challenge of accurately modelling fish movement in the ocean from biologging data, with a primary focus on sea bass. Modelling fish movements is essential for understanding migration strategies and site fidelity, both of which are critical for fish stock management policy and marine life conservation efforts.

Estimating fish movements is a highly compute-intensive process. It involves matching pressure and temperature data from in-situ biologging sensors with high-resolution ocean temperature simulations over long time periods. The Pangeo software stack provides an ideal environment for this kind of modelling. While the primary target platform of the GFTS project is the new Destination Earth Service Platform (DESP), relying on the Pangeo ecosystem ensures that the GFTS project is a robust and portable solution that can be re-deployed on different infrastructure.
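
The matching step described above can be illustrated with a minimal sketch. It is not the pangeo-fish implementation: the function names, the Gaussian error model, the `sigma` value, the use of bathymetry as a mask, and the toy grid cells are all assumptions made for illustration, and the tag pressure is assumed to have already been converted to a depth in metres.

```python
import math


def gaussian_likelihood(observed, modelled, sigma):
    """Unnormalised Gaussian likelihood of an observation given a model value."""
    return math.exp(-0.5 * ((observed - modelled) / sigma) ** 2)


def position_likelihood(tag_temp, tag_depth, model_temp, bathymetry, sigma=0.5):
    """Return a normalised likelihood for each candidate grid cell.

    model_temp: dict mapping cell -> modelled temperature (deg C) at the tag depth
    bathymetry: dict mapping cell -> seabed depth in metres (positive downwards)
    sigma:      assumed combined sensor/model temperature error (deg C)
    """
    scores = {}
    for cell, temp in model_temp.items():
        if bathymetry[cell] < tag_depth:
            # The fish cannot be deeper than the seabed in this cell.
            scores[cell] = 0.0
        else:
            scores[cell] = gaussian_likelihood(tag_temp, temp, sigma)
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("no candidate cell is compatible with the observation")
    return {cell: score / total for cell, score in scores.items()}


# Toy example with three hypothetical grid cells.
model_temp = {"A": 12.0, "B": 13.0, "C": 15.0}
bathymetry = {"A": 20.0, "B": 80.0, "C": 120.0}
likelihood = position_likelihood(
    tag_temp=13.1, tag_depth=40.0, model_temp=model_temp, bathymetry=bathymetry
)
best_cell = max(likelihood, key=likelihood.get)
```

In this toy case, cell A is excluded because it is shallower than the recorded depth, and cell B is preferred because its modelled temperature is closest to the tag reading. Repeating such a comparison for every time step and every cell of a high-resolution grid is what makes the process compute intensive.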

One of the distinctive features of the GFTS project is its advanced data management approach, which builds on the capabilities of Pangeo. Diverse datasets, including climate change adaptation digital twin data, sea temperature observations, bathymetry, and biologging in-situ data from tagged fish, are integrated within the Pangeo environment. A dedicated software package, pangeo-fish, has been developed to streamline this complex modelling process. The technical framework of the GFTS project includes Pangeo core packages such as Xarray and Dask, which facilitate scalable computations.

Pangeo's added value in data management becomes apparent in its capability to optimise data access and enhance performance. The concept of "data visitation" is central to this approach. By strategically deploying Dask clusters close to the data sources, the GFTS project aims to significantly improve the performance of fish track modelling compared to traditional approaches. This optimised data access allows end-users to interact efficiently with large datasets and streamlines their analyses.

The cloud-based delivery of the GFTS project aligns with the overarching goal of Pangeo. In addition, the GFTS includes the development of a custom interactive Decision Support Tool (DST). The DST gives non-technical users an intuitive interface for understanding the results of the GFTS project, leading to more informed decision-making. Integrating with Pangeo and providing intuitive access to the GFTS data are not mere technicalities; they reflect a commitment to FAIR (Findable, Accessible, Interoperable and Reusable), TRUST (Transparency, Responsibility, User focus, Sustainability and Technology) and open science principles.

In short, the GFTS project, within the Pangeo ecosystem, exemplifies how advanced data management, coupled with the optimisation of data access through "data visitation", can significantly enhance the performance and usability of geoscience tools. This collaborative approach not only benefits the immediate goals of the GFTS project but also contributes to the evolving landscape of community-driven geoscience initiatives.

How to cite
Wiesmann Daniel, Odaka Tina, Fouilloux Anne, Autret Emmanuelle, Woillez Mathieu, Ragan-Kelley Benjamin (2024). Harnessing the Pangeo ecosystem for delivering the cloud-based Global Fish Tracking System. EGU General Assembly 2024. 14–19 April 2024, Vienna, Austria & Online. https://doi.org/10.5194/egusphere-egu24-10741, https://archimer.ifremer.fr/doc/00889/100110/

