Addressing Big Data Challenges in Space Science

Speaker and affiliation: 
Sandor Kruk (European Space Agency)
Tue, 2024-01-09 12:30 to 13:30
ul. Pasteura 7, room 404

Space science missions, both ongoing and upcoming, will generate petascale data in the near future. For instance, the Euclid mission is projected to produce a staggering 20 PB of data during its operational lifetime. Analysis and processing of data on this scale require specialised infrastructure and toolchains. The European Space Agency is developing the ESA Datalabs platform, which offers the essential infrastructure for accessing data from missions like the Hubble Space Telescope, James Webb Space Telescope, Gaia, and Euclid. With software tools such as JupyterLab, users can work with mission data directly, without downloading it. ESA Datalabs fosters collaboration among scientists through team workspaces and provides a streamlined environment for users to create, deploy, and share their software and pipelines. In this manner, ESA Datalabs provides an accessible and powerful framework for high-performance computing and direct access to the data in the ESA archives.

In this presentation, I will introduce the ESA Datalabs platform and discuss current and foreseen scientific applications, such as galaxy classification with machine learning and large-scale exploration of the data in the ESA science archives through data mining and machine learning.
