Pipeline for Ocean Features Detection with Sentinel-2.
POS2IDON is a tool to detect suspected locations of floating marine debris and other ocean features (e.g., floating macroalgae, ships, turbid water) in Sentinel-2 satellite imagery using Machine Learning. The pipeline includes modules for data acquisition, pre-processing, and pixel-based classification using different ML models (e.g., Random Forest, XGBoost, Unet). The available models were trained with spectral signatures from events reported in the literature, in particular from the MARIDA library, and show satisfactory metrics. The data pipeline can detect features large enough to be suspected aggregations of floating plastic litter, so its results can be used to alert and inform stakeholders. POS2IDON outputs include the classification maps for all the available Sentinel-2 imagery of a user-specified region of interest and time period. By providing the source code, we aim to share transparent, easy-to-examine, and flexible code that is decomposed into several modules, and in this way stimulate improvements and new implementations from the scientific community.
In this repository we propose an open-policy data pipeline framework for ocean features detection (e.g. marine debris, floating vegetation, foam and water) using Sentinel-2 satellite imagery and machine learning methods. The presented workflow consists of three main steps:
- Search and download Level-1C Sentinel-2 imagery for a given region of interest and specified time period, from Google Cloud Storage using FeLS - Fetch Landsat & Sentinel Data from Google Cloud, from the Copernicus Data Space Ecosystem using CDSETool, or from the Copernicus Open Access Hub using sentinelsat (discontinued).
- Image pre-processing: application of the ACOLITE atmospheric correction module to obtain Rayleigh-corrected reflectances and surface reflectances, application of a land mask based on ESA World Cover 2021, application of a cloud mask computed with Sentinel Hub's cloud detector for Sentinel-2 imagery, application of a “marine clear water” mask (NDWI-based, or based on NIR-reflectance thresholding) and a NaN mask.
- Pixel-based classification with machine learning methods on the downloaded set of Sentinel-2 images. The workflow supports three well-known machine learning algorithms (Random Forest, XGBoost and Unet) trained with spectral signatures, as well as spectral indices (e.g., NDVI - Normalized Difference Vegetation Index, FAI - Floating Algae Index, FDI - Floating Debris Index). As an additional option, the Unet classification step can also be run in the Julia programming language. Outputs include the classification maps and classification probability maps for the chosen region and time period. For large regions of interest, the image can optionally be split for classification and then mosaicked.
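The spectral indices named in the classification step can be sketched as follows. This is a minimal illustration, not the code used inside POS2IDON: the function names are hypothetical, the band choices (B3 green, B4 red, B6 red edge 2, B8 NIR, B11 SWIR1) and Sentinel-2A central wavelengths are assumptions, and the FAI and FDI expressions follow the forms commonly given in the literature. Because the functions use only arithmetic operators, they accept plain floats or NumPy arrays of reflectances.

```python
# Assumed Sentinel-2A central wavelengths in nanometres (not taken from POS2IDON).
LAMBDA_RED = 664.6     # B4
LAMBDA_NIR = 832.8     # B8
LAMBDA_SWIR1 = 1613.7  # B11

# Wavelength ratio used to interpolate a baseline reflectance at the NIR band.
_NIR_FRACTION = (LAMBDA_NIR - LAMBDA_RED) / (LAMBDA_SWIR1 - LAMBDA_RED)


def ndvi(red, nir):
    """Normalized Difference Vegetation Index (undefined when nir + red == 0)."""
    return (nir - red) / (nir + red)


def ndwi(green, nir):
    """Normalized Difference Water Index (undefined when green + nir == 0)."""
    return (green - nir) / (green + nir)


def fai(red, nir, swir1):
    """Floating Algae Index: NIR minus a red-to-SWIR linear baseline at the NIR band."""
    baseline = red + (swir1 - red) * _NIR_FRACTION
    return nir - baseline


def fdi(red_edge2, nir, swir1):
    """Floating Debris Index: NIR minus a red-edge-to-SWIR baseline scaled by 10."""
    baseline = red_edge2 + (swir1 - red_edge2) * _NIR_FRACTION * 10
    return nir - baseline
```

For example, `ndvi(0.1, 0.3)` evaluates to approximately 0.5. With NumPy arrays, zero denominators should be masked beforehand, which is one reason a NaN mask is useful in pre-processing.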
POS2IDON is coded in Python 3.9. In the terminal, create a Python environment using conda, activate it, and update pip:
conda create -n pos2idon-env python=3.9
conda activate pos2idon-env
pip install --upgrade pip
and install libraries in the following order (takes approx. 8-15 minutes):
macOS:
conda install -c conda-forge gdal=3.5.0 geopandas=0.11.1 s2cloudless=1.7.0 lightgbm=3.3.2
pip install python-dotenv==0.20.0 cdsetool==0.1.3 zipfile36==0.1.3 netCDF4==1.5.8 pyproj==3.3.1 scikit-image==0.19.2 pyhdf==0.10.5 matplotlib==3.5.2 pandas==1.4.3 scikit-learn==1.1.1 ubelt==1.1.2 rasterio==1.3.0.post1 hummingbird-ml==0.4.5 xgboost==1.7.3 juliacall==0.9.14 pyarrow==14.0.1
pip install --extra-index-url https://artifactory.vgt.vito.be/api/pypi/python-packages/simple terracatalogueclient==0.1.11
conda install -c pytorch pytorch=1.13.1 torchvision=0.14.1 torchaudio=0.13.1
Windows:
conda install -c conda-forge gdal=3.5.0 geopandas=0.11.1 lightgbm=3.3.2
pip install python-dotenv==0.20.0 cdsetool==0.1.3 zipfile36==0.1.3 netCDF4==1.5.8 pyproj==3.3.1 scikit-image==0.19.2 pyhdf==0.10.5 matplotlib==3.5.2 pandas==1.4.3 scikit-learn==1.1.1 ubelt==1.1.2 rasterio==1.3.0.post1 hummingbird-ml==0.4.5 xgboost==1.7.3 s2cloudless==1.7.0 juliacall==0.9.14 pyarrow==14.0.1
pip install --extra-index-url https://artifactory.vgt.vito.be/api/pypi/python-packages/simple terracatalogueclient==0.1.11
conda install -c pytorch pytorch=1.13.1 torchvision=0.14.1 torchaudio=0.13.1
Ubuntu:
pip install --find-links=https://girder.github.io/large_image_wheels --no-cache GDAL==3.5.0
pip install geopandas==0.11.1 s2cloudless==1.7.0
pip install lightgbm==3.3.2
pip install python-dotenv==0.20.0 cdsetool==0.1.3 zipfile36==0.1.3 netCDF4==1.5.8 pyproj==3.3.1 scikit-image==0.19.2 pyhdf==0.10.5 matplotlib==3.5.2 pandas==1.4.3 scikit-learn==1.1.1 ubelt==1.1.2 rasterio==1.3.0.post1 hummingbird-ml==0.4.5 xgboost==1.7.3 juliacall==0.9.14 pyarrow==14.0.1
pip install --extra-index-url https://artifactory.vgt.vito.be/api/pypi/python-packages/simple terracatalogueclient==0.1.11
pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1
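After installation, it can be useful to confirm that the environment matches the pinned versions. The sketch below is not part of POS2IDON: `PINNED` and `check_pins` are hypothetical names, and the dictionary lists only a subset of the pins shown above. It uses the standard-library `importlib.metadata` module and assumes it is run inside the activated `pos2idon-env` environment.

```python
from importlib import metadata

# A subset of the versions pinned in the installation commands (extend as needed).
PINNED = {
    "pandas": "1.4.3",
    "scikit-learn": "1.1.1",
    "rasterio": "1.3.0.post1",
    "xgboost": "1.7.3",
    "torch": "1.13.1",
}


def check_pins(pins):
    """Return a {package: status} report comparing installed and pinned versions."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if installed == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch (installed {installed}, pinned {wanted})"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINNED).items():
        print(f"{package}: {status}")
```

A package reported as "missing" or "mismatch" usually means the commands were run in a different environment or in a different order than listed.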
To run the Unet classification step in Julia with models trained with Flux.jl, you need Julia installed and added to PATH. If Julia is already installed, check your version before proceeding and upgrade it if necessary. Download here (the code is tested with Julia version 1.11.0). If not already installed:
1- Run POS2IDON, selecting in User Inputs a model in .bson format. The first time you run it, juliacall will install the latest version of Julia;
2- On Windows, open Environment Variables and add the full path to the julia binary folder located in the conda environment folder at C:/users/../anaconda/envs/pos2idon-env/julia_env/pyjuliapkg/install/bin;
4- Open a terminal inside the POS2IDON Julia environment, usually located in the conda environments folder at envs/pos2idon-env/julia_env, and type julia;
5- Type ] and write:
activate .
add [email protected] [email protected] [email protected] [email protected] [email protected]
6- Run POS2IDON again.
You only need to do this the first time you run POS2IDON.
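To check whether Julia is reachable on PATH before selecting a .bson model, a small standard-library sketch such as the following may help. The function name `julia_version` is hypothetical and is not part of POS2IDON. It only reports what the `julia --version` command prints, so comparing the result against the tested version is left to the reader.

```python
import shutil
import subprocess


def julia_version():
    """Return the output of `julia --version`, or None if Julia is not on PATH."""
    executable = shutil.which("julia")
    if executable is None:
        return None
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            check=False,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The binary exists but could not be run in a reasonable time.
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = julia_version()
    print(version if version else "Julia was not found on PATH")
```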
We recommend a machine with a dedicated GPU.
- Get credentials for the data providers you plan to use and enter them in the file configs/Environments/.env
- Place your saved Machine Learning models in:
  configs/MLmodels/YourModelFolder/YourModel.pkl (for scikit-learn RF and XGB)
  configs/MLmodels/YourModelFolder/YourModel.zip (for PyTorch)
  configs/MLmodels/YourModelFolder/YourModel.pth (for Python Unet)
  configs/MLmodels/YourModelFolder/YourModel.bson (for Julia Unet)
- Execute the script workflow.py. This will automatically clone the following repositories:
  - FeLS - Fetch Landsat & Sentinel Data from Google Cloud (private) repository, in the folder /configs/fetchLandsatSentinelFromGoogleCloud-master
  - ACOLITE - generic atmospheric correction module (20221114.0) repository, in the folder /configs/acolite-main
- If the cloning does not start automatically or if the repositories were corrupted during cloning, you can manually download them using the previous links.
The first time you run FeLS it will download a csv table, this process may take a few minutes.
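The model file extensions listed above determine which kind of model is expected. A minimal sketch of that mapping is shown below. `MODEL_KINDS` and `model_kind` are hypothetical names used for illustration and do not reflect how POS2IDON selects its loaders internally.

```python
from pathlib import Path

# File extension -> model type, following the list of supported model files.
MODEL_KINDS = {
    ".pkl": "scikit-learn RF and XGB",
    ".zip": "PyTorch",
    ".pth": "Python Unet",
    ".bson": "Julia Unet",
}


def model_kind(model_path):
    """Return the model type implied by the file extension of model_path."""
    suffix = Path(model_path).suffix.lower()
    if suffix not in MODEL_KINDS:
        supported = ", ".join(sorted(MODEL_KINDS))
        raise ValueError(f"Unsupported model file '{model_path}'; expected one of: {supported}")
    return MODEL_KINDS[suffix]


if __name__ == "__main__":
    print(model_kind("configs/MLmodels/YourModelFolder/YourModel.bson"))
```

Checking the extension early gives a clear error message before any imagery is downloaded or processed.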
Open configs/User_Inputs.py and follow the descriptions to set up the desired workflow options: insert the region of interest and sensing period, select the download service, and define the masking and classification options. Execute the script workflow.py to run the workflow.
To test the classification workflow, we provide a Random Forest model based on the MARIDA spectral signatures library and trained as described in Kikaki et al., 2022. You can download the model folder using this link and place it in configs/MLmodels. By default, User_Inputs.py is configured to perform a classification on a plastic debris event case study that occurred in the Gulf of Honduras on 18th September 2020.
The outputs can be visualized with QGIS; a color palette is provided inside configs/QGIScolorpalettes.
If you find POS2IDON useful in your research, acknowledge us using the following reference:
- A. Valente, E. Castanho, A. Giusti, J. Pinelo and P. Silva, "An Open-Source Data Pipeline Framework to Detect Floating Marine Plastic Litter Using Sentinel-2 Imagery and Machine Learning," IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 2023, pp. 4108-4111, doi: 10.1109/IGARSS52108.2023.10281415.
The POS2IDON tool has been tested in the framework of different EU projects (LabPlas and EcoBlue) and under different data approaches.
POS2IDON is provided by AIR Centre as an experimental tool, without explicit or implied warranty. Use of the tool is at your own discretion and risk.