The Pangeo Big Data Ecosystem and its use at CNES
Pangeo[1] is a community-driven effort for open-source big data analysis, initially focused on the Earth System Sciences. One of its primary goals is to enable scientists to analyze petascale datasets both on classical high-performance computing (HPC) systems and on public cloud infrastructure. In only a few years, Pangeo has grown into a very productive community collaborating on the development of open-source analysis tools for science. It provides a set of example deployments based on open-source Scientific Python packages like Jupyter[2], Dask[3], and Xarray[4] that bring together scientists and developers around their actual use cases. In this paper, we first describe the Pangeo ecosystem and community. We then present its impact on the work of scientists at CNES using the HPC deployment there. We conclude with a future outlook for Pangeo in this agency and beyond.
Keyword(s)
Pangeo, Dask, Jupyter, HPC, Cloud, Big Data, Analysis, Open Source
Full Text
Alternative access
File | Pages | Size | Access |
---|---|---|---|
Publisher's official version | 6 | 446 KB | |