Deep-Sea Fauna Segmentation: A Comparative Analysis of Convolutional and Vision Transformer Architectures at Lucky Strike Vent Field

Owing to recent technological developments, the acquisition and availability of deep-sea imagery have increased exponentially in recent years, producing a growing backlog of image annotation and processing attributable to limited specialized human resources. In this work, we investigate the performance of well-established convolutional neural networks and Vision Transformer (ViT)-based architectures, namely DeepLabv3+ and UNETR, for the segmentation of fauna in deep-sea images. The dataset consists of images captured at the Lucky Strike vent field, located on the Mid-Atlantic Ridge, of three edifices named Montsegur, White Castle, and Eiffel Tower. Our experimental investigation reveals that the Vision Transformer consistently outperforms the fully convolutional deep learning architecture by approximately 14% in terms of F1-score, demonstrating the effectiveness of ViTs in capturing the intricate patterns and long-range dependencies present in deep-sea imagery. These findings highlight the potential of ViTs as a promising approach to accurate semantic segmentation in challenging environmental contexts, paving the way for improved understanding and analysis of deep-sea ecosystems.
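As a minimal illustration of the F1-score metric used in the comparison above, the sketch below computes a per-class F1 (equivalently, the Dice score) between a predicted and a ground-truth segmentation mask. The toy masks and the class labels (0 = background, 1 = fauna) are hypothetical and are not taken from the paper's dataset.

```python
import numpy as np

def f1_score_mask(pred, gt, cls):
    """Per-class F1 (Dice) score between two integer label masks."""
    p = (pred == cls)
    g = (gt == cls)
    tp = np.logical_and(p, g).sum()   # pixels correctly labeled cls
    fp = np.logical_and(p, ~g).sum()  # pixels wrongly labeled cls
    fn = np.logical_and(~p, g).sum()  # cls pixels that were missed
    if tp + fp + fn == 0:
        return 1.0  # class absent from both masks: perfect by convention
    return 2 * tp / (2 * tp + fp + fn)

# Toy 4x4 masks (hypothetical): 0 = background, 1 = fauna
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt   = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
print(round(f1_score_mask(pred, gt, cls=1), 3))  # → 0.857
```

In practice, a score like this would be averaged over classes and images; how the paper aggregates its reported ~14% F1-score gap is not specified here.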

Keywords

Fauna Segmentation, Deep Learning, Hydrothermal Vents, Vision Transformers, Convolutional Networks

How to cite
Soto Vega Pedro J., Andrade-Miranda Gustavo X., Da Costa Gilson A. O. P., Papadakis Panagiotis, Matabos Marjolaine, Napoleon Thibault, Karine Ayoub, Fagundes Gasparoto Henrique (2024). Deep-Sea Fauna Segmentation: A Comparative Analysis of Convolutional and Vision Transformer Architectures at Lucky Strike Vent Field. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. X-3-2024. 387-395. https://doi.org/10.5194/isprs-annals-X-3-2024-387-2024, https://archimer.ifremer.fr/doc/00918/103033/