diff --git a/README.md b/README.md
index 240e3d17..d1267c59 100644
--- a/README.md
+++ b/README.md
@@ -19,25 +19,25 @@ Pre-trained models are provided.
 * New workflows
 
 ## Getting Started
-The recommended way of installing plantseg is via the conda package, 
+The recommended way of installing plantseg is via the conda package,
 which is currently supported on Linux and Windows.
 
-For detailed usage documentation checkout our [**wiki**](https://github.com/hci-unihd/plant-seg/wiki) 
+For detailed usage documentation check out our [**wiki**](https://github.com/hci-unihd/plant-seg/wiki)
 [📖](https://github.com/hci-unihd/plant-seg/wiki):
 
 ### Prerequisites for conda package
 
 * Linux or Windows (Might work on MacOS but it's not tested).
-* (Optional) Nvidia GPU with official Nvidia drivers installed. 
+* (Optional) Nvidia GPU with official Nvidia drivers installed.
 
 ### Install on Linux
 #### Install Anaconda python
 First step required to use the pipeline is installing anaconda python.
-If you already have a working anaconda setup you can go directly to next item. Anaconda can be downloaded for all 
-platforms from here [anaconda](https://www.anaconda.com/products/individual). We suggest to use Miniconda, 
+If you already have a working anaconda setup you can go directly to the next item. Anaconda can be downloaded for all
+platforms from here [anaconda](https://www.anaconda.com/products/individual). We suggest using Miniconda,
 because it is lighter and installs fewer unnecessary packages.
 
 To download Anaconda Python open a terminal and type
@@ -74,16 +74,16 @@ The above command will create new conda environment `plant-seg` together with al
 ### Install on Windows
 #### Install Anaconda python
 First step required to use the pipeline is installing anaconda python.
-If you already have a working anaconda setup you can go directly to next item. Anaconda can be downloaded for all 
-platforms from here [anaconda](https://www.anaconda.com/products/individual). We suggest to use Miniconda, 
+If you already have a working anaconda setup you can go directly to the next item. Anaconda can be downloaded for all
+platforms from here [anaconda](https://www.anaconda.com/products/individual). We suggest using Miniconda,
 because it is lighter and installs fewer unnecessary packages.
-Miniconda can be downloaded from [miniconda](https://docs.conda.io/en/latest/miniconda.html). Download the 
+Miniconda can be downloaded from [miniconda](https://docs.conda.io/en/latest/miniconda.html). Download the
 executable `.exe` for your Windows version and follow the installation instructions.
 
 #### Install PlantSeg using mamba
 The tool can be installed directly by executing in the anaconda prompt the following commands
-(***For installing and running plantseg this is equivalent to a linux terminal***). 
+(***For installing and running plantseg this is equivalent to a linux terminal***).
 First step is to install mamba, which is an alternative to conda:
 ```bash
 conda install -c conda-forge mamba
 ```
@@ -134,7 +134,7 @@ First, activate the newly created conda environment with:
 ```bash
 conda activate plant-seg
 ```
-then, one can just start the pipeline with 
+then, one can just start the pipeline with
 ```bash
 plantseg --config CONFIG_PATH
 ```
@@ -143,8 +143,8 @@ file and our [wiki](https://github.com/hci-unihd/plant-seg/wiki/PlantSeg-Classic
 detailed description of the parameters.
 
 ## Update PlantSeg
-The easiest way to update plantseg to the latest version is to reinstall the conda environment from scratch. 
-To do so on a freshly open terminal: 
+The easiest way to update plantseg to the latest version is to reinstall the conda environment from scratch.
+To do so on a freshly opened terminal:
 ```bash
 mamba create -n plant-seg [The command that matches your OS]
 ```
@@ -153,19 +153,19 @@ In the headless mode (i.e. when invoked with `plantseg --config CONFIG_PATH`) th
 If prediction on all available GPUs is not desirable, restrict the number of GPUs using `CUDA_VISIBLE_DEVICES`, e.g.
 ```bash
 CUDA_VISIBLE_DEVICES=0,1 plantseg --config CONFIG_PATH
-``` 
+```
 
 ### Optional dependencies (not fully tested on Windows)
-Some types of compressed tiff files require an additional package to be read correctly (eg: Zlib, 
-ZSTD, LZMA, ...). To run plantseg on those stacks you need to install `imagecodecs`. 
+Some types of compressed tiff files require an additional package to be read correctly (e.g. Zlib,
+ZSTD, LZMA, ...). To run plantseg on those stacks you need to install `imagecodecs`.
 In the terminal:
 ```bash
 conda activate plant-seg
 pip install imagecodecs
 ```
 
-Experimental support for SimpleITK watershed segmentation has been added to PlantSeg version 1.1.8. This features can be used only 
-after installing the SimpleITK package: 
+Experimental support for SimpleITK watershed segmentation has been added to PlantSeg version 1.1.8. This feature can be used only
+after installing the SimpleITK package:
 ```bash
 conda activate plant-seg
 pip install SimpleITK
 ```
@@ -176,7 +176,7 @@ The PlantSeg repository is organised as follows:
 * **plantseg**: Contains the source code of PlantSeg.
 * **conda-recipe**: Contains all necessary code and configuration to create the anaconda package.
 * **Documentation-GUI**: Contains a more in-depth documentation of PlantSeg functionality.
-* **evaluation**: Contains all script required to reproduce the quantitative evaluation in 
+* **evaluation**: Contains all scripts required to reproduce the quantitative evaluation in
 [Wolny et al.](https://www.biorxiv.org/content/10.1101/2020.01.17.910562v1).
 * **examples**: Contains the files required to test PlantSeg.
 * **tests**: Contains automated tests that ensure the PlantSeg functionality is not compromised during an update.
@@ -185,7 +185,7 @@ The PlantSeg repository is organised as follows:
 We publicly release the datasets used for training the networks which are available as part of the _PlantSeg_ package.
 Please refer to [our publication](https://www.biorxiv.org/content/10.1101/2020.01.17.910562v1) for more details about the datasets:
 - _Arabidopsis thaliana_ ovules dataset (raw confocal images + ground truth labels)
-- _Arabidopsis thaliana_ lateral root (raw light sheet images + ground truth labels) 
+- _Arabidopsis thaliana_ lateral root (raw light sheet images + ground truth labels)
 
 Both datasets can be downloaded from [our OSF project](https://osf.io/uzq3w/)
 
@@ -207,8 +207,8 @@ The following pre-trained networks are provided with PlantSeg package out-of-the
 * `lightsheet_2D_unet_root_nuclei_ds1x` - a variant of 2D U-Net trained on light-sheet images of _Arabidopsis_ lateral root nuclei. Training the 2D U-Net is done on the Z-slices (pixel size: 0.1625x0.1625 µm^2) with BCEDiceLoss.
 * `confocal_3D_unet_sa_meristem_cells` - a variant of 3D U-Net trained on confocal images of shoot apical meristem dataset from: Jonsson, H., Willis, L., & Refahi, Y. (2017). Research data supporting Cell size and growth regulation in the Arabidopsis thaliana apical stem cell niche. https://doi.org/10.17863/CAM.7793.
voxel size: (0.25x0.25x0.25 µm^3) (ZYX)
 * `confocal_2D_unet_sa_meristem_cells` - a variant of 2D U-Net trained on confocal images of shoot apical meristem dataset from: Jonsson, H., Willis, L., & Refahi, Y. (2017). Research data supporting Cell size and growth regulation in the Arabidopsis thaliana apical stem cell niche. https://doi.org/10.17863/CAM.7793. pixel size: (0.25x0.25 µm^2) (YX)
-* `lightsheet_3D_unet_mouse_embryo_cells` - A variant of 3D U-Net trained to predict the cell boundaries in live light-sheet images of ex-vivo developing mouse embryo. Voxel size: (0.2×0.2×1 µm^3) (XYZ) 
-* `confocal_3D_unet_mouse_embryo_nuclei` - A variant of 3D U-Net trained to predict the cell boundaries in live light-sheet images of ex-vivo developing mouse embryo. Voxel size: (0.2×0.2×1 µm^3) (XYZ) 
+* `lightsheet_3D_unet_mouse_embryo_cells` - A variant of 3D U-Net trained to predict the cell boundaries in live light-sheet images of ex-vivo developing mouse embryo. Voxel size: (0.2×0.2×1 µm^3) (XYZ)
+* `confocal_3D_unet_mouse_embryo_nuclei` - A variant of 3D U-Net trained to predict the nuclei in live light-sheet images of ex-vivo developing mouse embryo. Voxel size: (0.2×0.2×1 µm^3) (XYZ)
 
 Selecting a given network name (either in the config file or GUI) will download the network into the `~/.plantseg_models`
 directory.
@@ -220,9 +220,9 @@ export PLANTSEG_HOME="/path/to/plantseg/home"
 ```
 
 ## Training on New Data
-For training new models we rely on the [pytorch-3dunet](https://github.com/wolny/pytorch-3dunet). 
+For training new models we rely on [pytorch-3dunet](https://github.com/wolny/pytorch-3dunet).
 A similar configuration file can be used for training on new data and all the instructions can be found in the repo.
-When the network is trained it is enough to create `~/.plantseg_models/MY_MODEL_NAME` directory 
+When the network is trained it is enough to create the `~/.plantseg_models/MY_MODEL_NAME` directory
 and copy the following files into it:
 * configuration file used for training: `config_train.yml`
 * snapshot of the best model across training: `best_checkpoint.pytorch`
@@ -231,13 +231,13 @@ and copy the following files into it:
 The latter two files are automatically generated during training and contain all neural network parameters.
 Now you can simply use your model for prediction by setting the [model_name](examples/config.yaml) key to `MY_MODEL_NAME`.
- 
+
 If you want your model to be part of the open-source model zoo provided with this package, please contact us.
 
 ## Using LiftedMulticut segmentation
 As reported in our [paper](https://elifesciences.org/articles/57613), if one has a nuclei signal imaged together with
-the boundary signal, we could leverage the fact that one cell contains only one nucleus and use the `LiftedMultict` 
-segmentation strategy and obtain improved segmentation. This workflow can be used from our napari gui and from our 
+the boundary signal, we could leverage the fact that one cell contains only one nucleus and use the `LiftedMulticut`
+segmentation strategy and obtain improved segmentation. This workflow can be used from our napari GUI and from our
 [CLI](https://github.com/hci-unihd/plant-seg/wiki/PlantSeg-Classic-CLI/_edit#liftedmulticut-segmentation).
 
 ## Troubleshooting
@@ -256,7 +256,7 @@ or:
 raise AssertionError("Torch not compiled with CUDA enabled")
 AssertionError: Torch not compiled with CUDA enabled
 ```
-It means that your cuda installation does not match the default in plantseg. 
+It means that your CUDA installation does not match the default in plantseg.
 You can check your current CUDA version by typing in the terminal
 ```
 cat /usr/local/cuda/version.txt
 ```
@@ -265,7 +265,7 @@ Then you can re-install the pytorch version compatible with your cuda by activat
 ```
 conda activate plant-seg
 ```
-and 
+and
 ```
 conda install -c pytorch torchvision cudatoolkit=<CUDA_VERSION> pytorch
 ```
@@ -297,7 +297,7 @@ $ rm ~/.plantseg_models/configs/config_gui_last.yaml
 RuntimeError: key : 'crop_volume' is missing, plant-seg requires 'crop_volume' to run
 ```
-Please make sure that your configuration has the correct formatting and contains all required keys. 
+Please make sure that your configuration has the correct formatting and contains all required keys.
 An updated example can be found inside the directory `examples`, in this repository.
 
 * If when trying to execute the Lifted Multicut pipeline you receive an error like:
 ```
 conda install -c conda-forge python-elf
 ```
 
 * PlantSeg is under active development, so it may happen that the models/configuration files saved in `~/.plantseg_models`
-are outdated. In case of errors related to loading the configuration file, please close the PlantSeg app, 
+are outdated. In case of errors related to loading the configuration file, please close the PlantSeg app,
 remove the `~/.plantseg_models` directory and try again.
 
 ## Tests
-In order to run tests make sure that `pytest` is installed in your conda environment. You can run your tests 
+In order to run tests make sure that `pytest` is installed in your conda environment. You can run the tests
 simply with `python -m pytest` or `pytest`. For the latter to work you need to install `plantseg` locally in "develop mode"
-with `pip install -e .`. 
## Citation ``` diff --git a/docs/_images/main_figure.png b/docs/_images/main_figure.png deleted file mode 100644 index 361adf2a..00000000 Binary files a/docs/_images/main_figure.png and /dev/null differ diff --git a/docs/_toc.yml b/docs/_toc.yml index c73e6132..8890518f 100644 --- a/docs/_toc.yml +++ b/docs/_toc.yml @@ -6,6 +6,30 @@ root: intro parts: - caption: Getting Started chapters: - - file: chapters/installation - - file: chapters/quick_start - - file: chapters/troubleshooting + - file: chapters/getting_started/installation + - file: chapters/getting_started/quick_start + - file: chapters/getting_started/troubleshooting + - caption: Interactive PlantSeg with Napari + chapters: + - file: chapters/plantseg_interactive_napari/index + - file: chapters/plantseg_interactive_napari/data_processing + - file: chapters/plantseg_interactive_napari/unet_gasp_workflow + - file: chapters/plantseg_interactive_napari/unet_training + - file: chapters/plantseg_interactive_napari/extra_pred + - file: chapters/plantseg_interactive_napari/extra_seg + - file: chapters/plantseg_interactive_napari/headless_batch_processing + - caption: Classic PlantSeg GUI + chapters: + - file: chapters/plantseg_classic_gui/index + - file: chapters/plantseg_classic_gui/data_processing + - file: chapters/plantseg_classic_gui/cnn_predictions + - file: chapters/plantseg_classic_gui/segmentation + - caption: Classic PlantSeg CLI + chapters: + - file: chapters/plantseg_classic_cli/index + - caption: PlantSeg Python API + chapters: + - file: chapters/python_api/index + - file: chapters/python_api/data_processing + - file: chapters/python_api/cnn_predictions + - file: chapters/python_api/segmentation diff --git a/docs/chapters/installation.md b/docs/chapters/getting_started/installation.md similarity index 100% rename from docs/chapters/installation.md rename to docs/chapters/getting_started/installation.md diff --git a/docs/chapters/quick_start.md b/docs/chapters/getting_started/quick_start.md similarity index 100% rename from docs/chapters/quick_start.md rename to docs/chapters/getting_started/quick_start.md diff --git a/docs/chapters/troubleshooting.md b/docs/chapters/getting_started/troubleshooting.md similarity index 100% rename from docs/chapters/troubleshooting.md rename to docs/chapters/getting_started/troubleshooting.md diff --git a/docs/chapters/plantseg_classic_cli/index.md b/docs/chapters/plantseg_classic_cli/index.md new file mode 100644 index 00000000..0484bba0 --- /dev/null +++ b/docs/chapters/plantseg_classic_cli/index.md @@ -0,0 +1,189 @@ +# PlantSeg Classic CLI + +## Guide to Custom Configuration File + +The configuration file defines all the operations in our pipeline together with the data to be processed. +Please refer to [config.yaml](../../../examples/config.yaml) for a sample pipeline configuration and a detailed explanation +of all parameters. + +## Main Keys/Steps +* `path` attribute: is used to define either the file to process or the directory containing the data. +* `preprocessing` attribute: contains a simple set of possible operations one would need to run on their data before calling the neural network. +This step can be skipped if data is ready for neural network processing. +Detailed instructions can be found at [Data Processing](https://github.com/hci-unihd/plant-seg/wiki/Data-Processing). +* `cnn_prediction` attribute: contains all parameters relevant for predicting with a neural network. +Description of all pre-trained models provided with the package is described below. 
+Detailed instructions can be found at [Predictions](https://github.com/hci-unihd/plant-seg/wiki/Predictions).
+* `segmentation` attribute: contains all parameters needed to run the partitioning algorithm (i.e., the final segmentation).
+Detailed instructions can be found at [Segmentation](https://github.com/hci-unihd/plant-seg/wiki/Segmentation.md).
+
+## Additional information
+
+The PlantSeg-related files (models, configs) will be placed inside your home directory under `~/.plantseg_models`.
+
+Our pipeline uses the PyTorch library for CNN predictions. PlantSeg can be run on systems without GPU, however
+for maximum performance, we recommend that the application is run on a machine with a high-performance GPU for deep learning.
+If the `CUDA_VISIBLE_DEVICES` environment variable is not specified, the prediction task will be distributed on all available GPUs.
+E.g. run: `CUDA_VISIBLE_DEVICES=0 plantseg --config CONFIG_PATH` to restrict prediction to a given GPU.
+
+## Configuration file example
+This modality of using PlantSeg is particularly suited for high-throughput processing and for running
+PlantSeg on a remote server.
+To use PlantSeg from command line mode, you will need to create a configuration file using a standard text editor
+or using the save option of the PlantSeg GUI.
+
+Here is an example configuration:
+
+```
+path: /home/USERNAME/DATA.tiff # Contains the path to the directory or file to process
+
+preprocessing:
+  # enable/disable preprocessing
+  state: True
+  # create a new sub folder where all results will be stored
+  save_directory: "PreProcessing"
+  # rescaling the volume is essential for the generalization of the networks. The rescaling factor can be computed as the resolution
+  # of the volume at hand divided by the resolution of the dataset used in training. Be careful: if the difference is too large, check for a different model.
+  factor: [1.0, 1.0, 1.0]
+  # the order of the spline interpolation
+  order: 2
+  # optional: perform Gaussian smoothing or median filtering on the input.
+  filter:
+    # enable/disable filtering
+    state: False
+    # Accepted values: 'gaussian'/'median'
+    type: gaussian
+    # sigma (gaussian) or disc radius (median)
+    param: 1.0
+
+cnn_prediction:
+  # enable/disable UNet prediction
+  state: True
+  # Trained model name, more info on available models and custom models in the README
+  model_name: "generic_confocal_3D_unet"
+  # If a CUDA-capable GPU is available and correctly set up, use "cuda"; if not, use "cpu" for CPU-only inference (slower)
+  device: "cpu"
+  # how many subprocesses to use for data loading
+  num_workers: 8
+  # patch size given to the network (adapt to fit in your GPU mem)
+  patch: [32, 128, 128]
+  # stride between patches will be computed as `stride_ratio * patch`
+  # recommended values are in range `[0.5, 0.75]` to make sure the patches have enough overlap to get smooth prediction maps
+  stride_ratio: 0.75
+  # If "True" forces downloading networks from the online repos
+  model_update: False
+
+cnn_postprocessing:
+  # enable/disable cnn post processing
+  state: False
+  # if True convert the result to tiff
+  tiff: False
+  # rescaling factor
+  factor: [1, 1, 1]
+  # spline order for rescaling
+  order: 2
+
+segmentation:
+  # enable/disable segmentation
+  state: True
+  # Name of the algorithm to use for inference. Options: MultiCut, MutexWS, GASP, DtWatershed
+  name: "MultiCut"
+  # Segmentation specific parameters here
+  # balance under-/over-segmentation; 0 - aim for undersegmentation, 1 - aim for oversegmentation (not active for DtWatershed)
+  beta: 0.5
+  # directory where to save the results
+  save_directory: "MultiCut"
+  # enable/disable watershed
+  run_ws: True
+  # use 2D instead of 3D watershed
+  ws_2D: True
+  # probability maps threshold
+  ws_threshold: 0.5
+  # set the minimum superpixels size
+  ws_minsize: 50
+  # sigma for the gaussian smoothing of the distance transform
+  ws_sigma: 2.0
+  # sigma for the gaussian smoothing of boundary
+  ws_w_sigma: 0
+  # set the minimum segment size in the final segmentation (not active for DtWatershed)
+  post_minsize: 50
+
+segmentation_postprocessing:
+  # enable/disable segmentation post processing
+  state: False
+  # if True convert the result to tiff
+  tiff: False
+  # rescaling factor
+  factor: [1, 1, 1]
+  # spline order for rescaling (keep 0 for segmentation post processing)
+  order: 0
+```
+This configuration can be found at [config.yaml](https://github.com/hci-unihd/plant-seg/blob/master/examples/config.yaml).
+
+## Pipeline Usage (command line)
+To start PlantSeg from the command line:
+First, activate the newly created conda environment with:
+```bash
+conda activate plant-seg
+```
+then, one can just start the pipeline with
+```bash
+plantseg --config CONFIG_PATH
+```
+where `CONFIG_PATH` is the path to a YAML configuration file.
+
+## Results
+
+The results are stored together with the source input files inside a nested directory structure.
+As an example, if we want to run PlantSeg inside a directory with two stacks, we will obtain the following
+outputs:
+```
+/file1.tif
+/file2.tif
+/PreProcessing/
+------------>/file1.h5
+------------>/file1.yaml
+------------>/file2.h5
+------------>/file2.yaml
+------------>/generic_confocal_3d_unet/
+------------------------------------->/file1_predictions.h5
+------------------------------------->/file1_predictions.yaml
+------------------------------------->/file2_predictions.h5
+------------------------------------->/file2_predictions.yaml
+------------------------------------->/GASP/
+------------------------------------------>/file_1_predictions_gasp_average.h5
+------------------------------------------>/file_1_predictions_gasp_average.yaml
+------------------------------------------>/file_2_predictions_gasp_average.h5
+------------------------------------------>/file_2_predictions_gasp_average.yaml
+------------------------------------------>/PostProcessing/
+--------------------------------------------------------->/file_1_predictions_gasp_average.tiff
+--------------------------------------------------------->/file_1_predictions_gasp_average.yaml
+--------------------------------------------------------->/file_2_predictions_gasp_average.tiff
+--------------------------------------------------------->/file_2_predictions_gasp_average.yaml
+```
+The use of this hierarchical directory structure allows PlantSeg to find the necessary files quickly and can be used
+to test different segmentation algorithm/parameter combinations minimizing the memory overhead on the disk.
+For the sake of reproducibility, every file is associated with a configuration file ".yaml" that saves all parameters used
+to produce the result.
+
+## LiftedMulticut segmentation
+As reported in our [paper](https://elifesciences.org/articles/57613), if one has a nuclei signal imaged together with
+the boundary signal, we could leverage the fact that one cell contains only one nucleus and use the `LiftedMulticut`
+segmentation strategy and obtain improved segmentation.
+We will use the _Arabidopsis thaliana_ lateral root as an example.
+The `LiftedMulticut` strategy consists of running PlantSeg two times:
+1. Using PlantSeg to predict the nuclei probability maps using the `lightsheet_unet_bce_dice_nuclei_ds1x` network.
+In this case, only the pre-processing and CNN prediction steps are enabled in the config. See [example config](../../../plantseg/resources/nuclei_predictions_example.yaml).
+```bash
+plantseg --config nuclei_predictions_example.yaml
+```
+2. Using PlantSeg to segment the input image with the `LiftedMulticut` algorithm given the nuclei probability maps from the 1st step.
+See [example config](../../../plantseg/resources/lifted_multicut_example.yaml). The notable difference is that in the `segmentation`
+part of the config, we set `name: LiftedMulticut` and the `nuclei_predictions_path` as the path to the directory where the nuclei pmaps
+were saved in step 1. Also, make sure that the `path` attribute points to the raw files containing the cell boundary staining (NOT THE NUCLEI).
+```bash
+plantseg --config lifted_multicut_example.yaml
+```
+
+In case the nuclei segmentation is already given, one should skip step 1, add the `is_segmentation=True` flag in the [config](../../../plantseg/resources/lifted_multicut_example.yaml)
+and directly run step 2.
diff --git a/docs/chapters/plantseg_classic_gui/cnn_predictions.md b/docs/chapters/plantseg_classic_gui/cnn_predictions.md
new file mode 100644
index 00000000..d0d32085
--- /dev/null
+++ b/docs/chapters/plantseg_classic_gui/cnn_predictions.md
@@ -0,0 +1,31 @@
+# CNN Predictions
+
+![alt text](https://github.com/hci-unihd/plant-seg/raw/assets/images/cnn-predictions.png)
+
+The CNN predictions widget processes the stacks at hand with a Convolutional Neural Network. The output is
+a boundary classification image, where every voxel gets a value between 0 (not a cell boundary) and 1 (cell boundary).
+
+The input image can be a raw stack "tiff"/"h5" or the output of the PreProcessing widget.
+
+* The **Model Name** menu shows all available models. There are two main basic models available:
+  1. **Generic confocal** is a generic model for all confocal datasets.
+  Some examples:
+  ![alt text](https://github.com/hci-unihd/plant-seg/raw/assets/images/confocal.png)
+
+  2. **Generic lightsheet** is a generic model for all lightsheet datasets.
+  Some examples:
+  ![alt text](https://github.com/hci-unihd/plant-seg/raw/assets/images/cos_root_mc_raw.png)
+
+* Due to memory constraints, usually a complete stack does not fit the GPU's memory,
+  therefore the **Patch size** can be used to optimize the performance of the pipeline.
+  Usually, larger patches cost more memory but can slightly improve performance.
+  For 2D segmentation, the **Patch size** relative to the z-axis has to be set to 1.
+
+* To minimize the boundary effect due to the sliding-window patching, we can use a different **stride**:
+  1. Accurate: corresponding to a stride of 50% of the patch size (yields the best prediction/segmentation accuracy)
+  2. Balanced: corresponding to a stride of 75% of the patch size
+  3. Draft: corresponding to a stride of 95% of the patch size (yields the fastest runtime)
+
+* The **Device type** menu can be used to enable or disable GPU acceleration. CUDA greatly accelerates the network
+predictions on Nvidia GPUs. At the moment, we don't support other GPU manufacturers.
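+
+For reference, the widget's settings map onto the `cnn_prediction` section of the classic configuration file
+(see the PlantSeg Classic CLI chapter). A minimal sketch, with illustrative values:
+```yaml
+cnn_prediction:
+  state: True
+  model_name: "generic_confocal_3D_unet"  # the Model Name menu
+  device: "cuda"                          # the Device type menu ("cpu" or "cuda")
+  patch: [32, 128, 128]                   # the Patch size (use 1 on z for 2D segmentation)
+  stride_ratio: 0.75                      # Accurate = 0.50, Balanced = 0.75, Draft = 0.95
+```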
+
diff --git a/docs/chapters/plantseg_classic_gui/data_processing.md b/docs/chapters/plantseg_classic_gui/data_processing.md
new file mode 100644
index 00000000..8c5cb701
--- /dev/null
+++ b/docs/chapters/plantseg_classic_gui/data_processing.md
@@ -0,0 +1,45 @@
+# Classic Data Processing
+
+![alt text](https://github.com/hci-unihd/plant-seg/raw/assets/images/preprocessing.png)
+**PlantSeg** includes essential utilities for data pre-processing and post-processing.
+
+## Pre-Processing
+
+The input for this widget can be either a "raw" image or a "prediction" image.
+Input formats allowed are tiff and h5, while output is always h5.
+
+* **Save Directory** can be used to define the output directory.
+
+* The most critical setting is the **Rescaling**. It is important to rescale the image to
+  match the resolution of the data used for training the Neural Network.
+This operation can be done automatically by clicking on the GUI on **Guided**.
+Be careful to use this function only in case of data considerably different from
+the reference resolution.
+```
+As an example:
+  - if your data has the voxel size of 0.3 x 0.1 x 0.1 (ZYX).
+  - and the network was trained on 0.3 x 0.2 x 0.2 data (reference resolution).
+
+The required rescaling factor can be obtained by computing the ratio between the voxel size of your data
+and that of the reference train dataset. In the example the rescaling factor = 1 x 0.5 x 0.5.
+```
+
+* The **Interpolation** field controls the interpolation type (0 for nearest neighbors, 1 for linear spline,
+2 for quadratic).
+
+* The last field defines a **Filter** operation. Implemented there are:
+  1. **Gaussian** Filtering: The parameter is a float and defines the sigma value for the gaussian smoothing.
+The higher the sigma, the wider the filtering kernel.
+  2. **Median** Filtering: Applies a median operation convolutionally over the image.
+  The kernel is a sphere of the size defined in the parameter field.
+
+## Post-Processing
+
+A post-processing step can be performed after the **CNN-Predictions** and the **Segmentation**.
+The post-processing options are:
+ * Converting the output to the tiff file format (default is h5).
+
+ * Casting the **CNN-Predictions** output to *data_uint8* drastically reduces the memory footprint of the output
+   file.
+
+Additionally, the post-processing will scale back your outputs to the original voxel resolution.
\ No newline at end of file
diff --git a/docs/chapters/plantseg_classic_gui/index.md b/docs/chapters/plantseg_classic_gui/index.md
new file mode 100644
index 00000000..a472958a
--- /dev/null
+++ b/docs/chapters/plantseg_classic_gui/index.md
@@ -0,0 +1,77 @@
+# PlantSeg from GUI
+
+The graphical user interface is the easiest way to configure and run PlantSeg.
+Currently the GUI does not allow visualizing or interacting with the data.
+We recommend using [MorphoGraphX](https://www.mpipz.mpg.de/MorphoGraphX) or
+[Fiji](https://fiji.sc/) in order to assess the success and quality of the pipeline results.
+
+![alt text](https://github.com/hci-unihd/plant-seg/raw/assets/images/plantseg_overview.png)
+
+## File Browser Widget
+The file browser can be used to select the input files for the pipeline.
+PlantSeg can run on a single file (button A) or in batch mode for all files inside a directory (button B).
+If a directory is selected PlantSeg will run on all compatible files inside the directory.
+
+## Main Pipeline Configurator
+The central panel of PlantSeg (C) is the core of the pipeline configuration.
+It can be used for customizing and tuning the pipeline according to the data at hand.
+Detailed information for each stage can be found at:
+* [Data-Processing](data_processing.md)
+* [CNN-Predictions](cnn_predictions.md)
+* [Segmentation](segmentation.md)
+
+Any of the above widgets can be run singly or in sequence (left to right). The order of execution cannot be
+modified.
+
+## Run
+The last panel has two main functions.
+Running the pipeline (D): once the run button is pressed, the
+pipeline starts. The button is inactive until the process is finished.
+Adding a custom model (E): custom trained models can be added using the dedicated popup. Training a new model can be
+done following the instructions at [pytorch-3dunet](https://github.com/wolny/pytorch-3dunet).
+
+## Results
+
+The results are stored together with the source input files inside a nested directory structure.
+As an example, if we want to run PlantSeg inside a directory with two stacks, we will obtain the following
+outputs:
+```
+/file1.tif
+/file2.tif
+/PreProcessing/
+------------>/file1.h5
+------------>/file1.yaml
+------------>/file2.h5
+------------>/file2.yaml
+------------>/generic_confocal_3d_unet/
+------------------------------------->/file1_predictions.h5
+------------------------------------->/file1_predictions.yaml
+------------------------------------->/file2_predictions.h5
+------------------------------------->/file2_predictions.yaml
+------------------------------------->/GASP/
+------------------------------------------>/file_1_predictions_gasp_average.h5
+------------------------------------------>/file_1_predictions_gasp_average.yaml
+------------------------------------------>/file_2_predictions_gasp_average.h5
+------------------------------------------>/file_2_predictions_gasp_average.yaml
+------------------------------------------>/PostProcessing/
+--------------------------------------------------------->/file_1_predictions_gasp_average.tiff
+--------------------------------------------------------->/file_1_predictions_gasp_average.yaml
+--------------------------------------------------------->/file_2_predictions_gasp_average.tiff
+--------------------------------------------------------->/file_2_predictions_gasp_average.yaml
+```
+The use of this hierarchical directory structure allows PlantSeg to easily find the necessary files and can be used
+to test different combinations of segmentation algorithms/parameters minimizing the memory overhead on the disk.
+For the sake of reproducibility, every file is associated with a configuration file ".yaml" that saves all parameters used
+to produce the result.
+
+## Start PlantSeg GUI
+In order to start the PlantSeg app in GUI mode:
+First, activate the newly created conda environment with:
+```bash
+conda activate plant-seg
+```
+
+then, run the GUI by simply typing:
+```bash
+$ plantseg --gui
+```
diff --git a/docs/chapters/plantseg_classic_gui/segmentation.md b/docs/chapters/plantseg_classic_gui/segmentation.md
new file mode 100644
index 00000000..bef0ca57
--- /dev/null
+++ b/docs/chapters/plantseg_classic_gui/segmentation.md
@@ -0,0 +1,48 @@
+# Segmentation
+
+The segmentation widget allows using very powerful graph partitioning techniques to obtain a segmentation from the
+input stacks.
+The input of this widget should be the output of the [CNN-predictions widget](https://github.com/hci-unihd/plant-seg/wiki/CNN-Predictions).
+If the boundary prediction stage fails for any reason, a raw image could be used (especially if the cell boundaries are
+very sharp, and the noise is low) but usually does not yield satisfactory results.
+
+* The **Algorithm** menu can be used to choose the segmentation algorithm. Available choices are:
+  1. GASP (average): a generalization of the classical hierarchical clustering. It usually delivers very
+  reliable and accurate segmentation. It is the default in PlantSeg.
+  2. MutexWS: the Mutex Watershed is a derivative of the standard Watershed, where we do not need seeds for the
+  segmentation. This algorithm performs very well on certain types of complex morphology.
+  3. MultiCut: in contrast to the other algorithms, it is not based on a greedy agglomeration but tries to find the
+  optimal global segmentation. This is, in practice, very hard, and it can be infeasible for huge stacks.
+  4. DtWatershed: our implementation of the distance transform Watershed. From the input, we extract a distance map
+  from the boundaries. Based on this distance map, seeds are placed at local minima. Then those seeds are used for
+  computing the Watershed segmentation. To speed up the computation of GASP, MutexWS, and MultiCut, an over-segmentation
+  is obtained using DtWatershed.
+
+* **Save Directory** defines the sub-directory's name where the segmentation results will be stored.
+
+* The **Under/Over-segmentation factor** is the most critical parameter for tuning the segmentation of GASP,
+MutexWS and MultiCut. A small value steers the segmentation towards under-segmentation, while a high value biases the
+segmentation towards over-segmentation. This parameter does not affect the distance transform Watershed.
+
+* If the **Run Watershed in 2D** value is True, the superpixels are created in 2D (slice by slice along z), while if it
+is False the superpixels are created in the whole 3D volume. 3D superpixels are much slower and more memory intensive
+but can improve the segmentation accuracy.
+
+* The **CNN Predictions Threshold** is used for the superpixels extraction and the Distance Transform Watershed. It has a
+crucial role for the watershed seeds extraction and can be used similarly to the **Under/Over-segmentation factor**
+to bias the final result.
+A high value translates to fewer seeds being placed (more under-segmentation), while with a low value, more seeds are
+placed (more over-segmentation).
+
+* The input is used by the distance transform Watershed to extract the seeds and find the segmentation boundaries.
+If **Watershed Seeds Sigma** and **Watershed Boundary Sigma** are larger than
+zero, a gaussian smoothing is applied on the input before the operations above. This is mainly helpful for
+the seeds computation but, in most cases, does not impact segmentation quality.
+
+* The **Superpixels Minimum Size** applies a size filter to the initial superpixels over-segmentation. The
+Watershed often produces small segments, and removing them is usually helpful for the subsequent agglomeration.
+Segments smaller than the threshold will be merged with the nearest neighbor segment.
+
+* Even though GASP, MutexWS, and MultiCut are not very prone to producing small segments, the **Cell Minimum Size** can
+be used as a final size processing filter. Segments smaller than the threshold will be merged with the nearest
+neighbor cell.
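+
+For reference, these widget fields correspond to the keys of the `segmentation` section in the classic
+configuration file (see the PlantSeg Classic CLI chapter). A minimal sketch, with illustrative values:
+```yaml
+segmentation:
+  state: True
+  name: "GASP"        # the Algorithm menu (MultiCut, MutexWS, GASP, DtWatershed)
+  beta: 0.5           # Under/Over-segmentation factor
+  run_ws: True
+  ws_2D: True         # Run Watershed in 2D
+  ws_threshold: 0.5   # CNN Predictions Threshold
+  ws_minsize: 50      # Superpixels Minimum Size
+  ws_sigma: 2.0       # Watershed Seeds Sigma
+  ws_w_sigma: 0       # Watershed Boundary Sigma
+  post_minsize: 50    # Cell Minimum Size
+  save_directory: "GASP"
+```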
diff --git a/docs/chapters/plantseg_interactive_napari/data_processing.md b/docs/chapters/plantseg_interactive_napari/data_processing.md
new file mode 100644
index 00000000..dee7e0f7
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/data_processing.md
@@ -0,0 +1,3 @@
+# Data Processing
+
+TODO
diff --git a/docs/chapters/plantseg_interactive_napari/extra_pred.md b/docs/chapters/plantseg_interactive_napari/extra_pred.md
new file mode 100644
index 00000000..bb19a7dd
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/extra_pred.md
@@ -0,0 +1,3 @@
+# Extra Pred
+
+TODO
diff --git a/docs/chapters/plantseg_interactive_napari/extra_seg.md b/docs/chapters/plantseg_interactive_napari/extra_seg.md
new file mode 100644
index 00000000..366cad7a
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/extra_seg.md
@@ -0,0 +1,3 @@
+# Extra Seg
+
+TODO
diff --git a/docs/chapters/plantseg_interactive_napari/headless_batch_processing.md b/docs/chapters/plantseg_interactive_napari/headless_batch_processing.md
new file mode 100644
index 00000000..0a44ea28
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/headless_batch_processing.md
@@ -0,0 +1,3 @@
+# Headless Batch Processing
+
+TODO
diff --git a/docs/chapters/plantseg_interactive_napari/index.md b/docs/chapters/plantseg_interactive_napari/index.md
new file mode 100644
index 00000000..846a6980
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/index.md
@@ -0,0 +1,3 @@
+# PlantSeg Interactive - Napari
+
+TODO
diff --git a/docs/chapters/plantseg_interactive_napari/unet_gasp_workflow.md b/docs/chapters/plantseg_interactive_napari/unet_gasp_workflow.md
new file mode 100644
index 00000000..4596e317
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/unet_gasp_workflow.md
@@ -0,0 +1,3 @@
+# UNet GASP Workflow
+
+TODO
diff --git a/docs/chapters/plantseg_interactive_napari/unet_training.md b/docs/chapters/plantseg_interactive_napari/unet_training.md
new file mode 100644
index 00000000..af5260f0
--- /dev/null
+++ b/docs/chapters/plantseg_interactive_napari/unet_training.md
@@ -0,0 +1,3 @@
+# UNet Training
+
+TODO
diff --git a/docs/chapters/python_api/cnn_predictions.md b/docs/chapters/python_api/cnn_predictions.md
new file mode 100644
index 00000000..824e0924
--- /dev/null
+++ b/docs/chapters/python_api/cnn_predictions.md
@@ -0,0 +1,52 @@
+# PlantSeg CNN Predictions
+
+In this section we will describe how to use the PlantSeg CNN Predictions workflow from the Python API.
+
+## API-Reference: [plantseg.predictions.functional.predictions](https://github.com/hci-unihd/plant-seg/blob/master/plantseg/predictions/functional/predictions.py)
+### ***unet_predictions***
+```python
+def unet_predictions(raw: np.array,
+                     model_name: str,
+                     patch: Tuple[int, int, int] = (80, 160, 160),
+                     stride: Union[str, Tuple[int, int, int]] = 'Accurate (slowest)',
+                     device: str = 'cuda',
+                     version: str = 'best',
+                     model_update: bool = False,
+                     mirror_padding: Tuple[int, int, int] = (16, 32, 32),
+                     disable_tqdm: bool = False) -> np.array:
+    """
+    Predict boundary probability maps from raw data using a 3D U-Net model.
+
+    Args:
+        raw (np.array): raw data, must be a 3D array of shape (Z, Y, X) normalized between 0 and 1.
+        model_name (str): name of the model to use. A complete list of available models can be found here:
+        patch (tuple[int, int, int], optional): patch size to use for prediction. Defaults to (80, 160, 160).
+        stride (Union[str, tuple[int, int, int]], optional): stride to use for prediction.
+            If stride is defined as a string, it must be one of ['Accurate (slowest)', 'Balanced', 'Draft (fastest)'].
+            Defaults to 'Accurate (slowest)'.
+        device (str, optional): device to use for prediction. Must be one of ['cpu', 'cuda', 'cuda:1', etc.].
+            Defaults to 'cuda'.
+        version (str, optional): version of the model to use, must be either 'best' or 'last'. Defaults to 'best'.
+        model_update (bool, optional): if True will update the model to the latest version. Defaults to False.
+        mirror_padding (tuple[int, int, int], optional): padding to use for prediction. Defaults to (16, 32, 32).
+        disable_tqdm (bool, optional): if True will disable the tqdm progress bar. Defaults to False.
+
+    Returns:
+        np.array: predictions, 3D array of shape (Z, Y, X) with values between 0 and 1.
+
+    """
+
+    ...
+```
+```python
+# Minimal example
+from plantseg.predictions.functional.predictions import unet_predictions
+import numpy as np
+
+raw = np.random.random((128, 256, 256))
+predictions = unet_predictions(raw,
+                               model_name='generic_confocal_3d_unet',
+                               patch=(80, 160, 160),
+                               device='cuda')
+
+```
\ No newline at end of file
diff --git a/docs/chapters/python_api/data_processing.md b/docs/chapters/python_api/data_processing.md
new file mode 100644
index 00000000..dee7e0f7
--- /dev/null
+++ b/docs/chapters/python_api/data_processing.md
@@ -0,0 +1,3 @@
+# Data Processing
+
+TODO
diff --git a/docs/chapters/python_api/index.md b/docs/chapters/python_api/index.md
new file mode 100644
index 00000000..47df20fe
--- /dev/null
+++ b/docs/chapters/python_api/index.md
@@ -0,0 +1,3 @@
+# PlantSeg Python API
+
+TODO
diff --git a/docs/chapters/python_api/segmentation.md b/docs/chapters/python_api/segmentation.md
new file mode 100644
index 00000000..aa4b68aa
--- /dev/null
+++ b/docs/chapters/python_api/segmentation.md
@@ -0,0 +1,134 @@
+# PlantSeg Segmentation
+
+In this section we will describe how to use the PlantSeg segmentation workflows from the Python API.
+
+## API-Reference: [plantseg.segmentation.functional.segmentation](https://github.com/hci-unihd/plant-seg/blob/master/plantseg/segmentation/functional/segmentation.py)
+* ***dt_watershed***
+```python
+def dt_watershed(boundary_pmaps: np.array,
+                 threshold: float = 0.5,
+                 sigma_seeds: float = 1.,
+                 stacked: bool = False,
+                 sigma_weights: float = 2.,
+                 min_size: int = 100,
+                 alpha: float = 1.0,
+                 pixel_pitch: tuple[int, ...] = None,
+                 apply_nonmax_suppression: bool = False,
+                 n_threads: int = None,
+                 mask: np.array = None) -> np.array:
+    """ Wrapper around elf.distance_transform_watershed
+    Args:
+        boundary_pmaps (np.ndarray): input height map.
+        threshold (float): value for the threshold applied before the distance transform.
+        sigma_seeds (float): smoothing factor for the watershed seed map.
+        stacked (bool): if true the ws will be executed in 2D slice by slice, otherwise in 3D.
+        sigma_weights (float): smoothing factor for the watershed weight map (default: 2).
+        min_size (int): minimal size of watershed segments (default: 100)
+        alpha (float): alpha used to blend input_ and distance_transform in order to obtain the
+            watershed weight map (default: 1.0)
+        pixel_pitch (list-like[int]): anisotropy factor used to compute the distance transform (default: None)
+        apply_nonmax_suppression (bool): whether to apply non-maximum suppression to filter out seeds.
+            Needs nifty. (default: False)
+        n_threads (int): if not None, parallelize the 2D stacked ws. (default: None)
+        mask (np.ndarray)
+    Returns:
+        np.ndarray: watershed segmentation
+    """
+
+    ...
+```
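+
+A minimal usage sketch (the import path follows the module linked above; the input array here is a random
+placeholder standing in for a real boundary probability map):
+```python
+import numpy as np
+from plantseg.segmentation.functional.segmentation import dt_watershed
+
+# (Z, Y, X) boundary probability map in [0, 1], e.g. the output of unet_predictions
+boundary_pmaps = np.random.random((64, 128, 128)).astype('float32')
+superpixels = dt_watershed(boundary_pmaps, threshold=0.5, sigma_seeds=1.0, min_size=100)
+```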
+
+* ***gasp***
+```python
+def gasp(boundary_pmaps: np.array,
+         superpixels: np.array = None,
+         gasp_linkage_criteria: str = 'average',
+         beta: float = 0.5,
+         post_minsize: int = 100,
+         n_threads: int = 6) -> np.array:
+    """
+    Implementation of the GASP algorithm for segmentation from affinities.
+    Args:
+        boundary_pmaps (np.ndarray): cell boundary predictions.
+        superpixels (np.ndarray): superpixel segmentation. If None, GASP will be run from the pixels. (default: None)
+        gasp_linkage_criteria (str): Linkage criteria for GASP. (default: 'average')
+        beta (float): beta parameter for GASP. A small value steers the segmentation towards under-segmentation,
+            while a high value biases the segmentation towards over-segmentation. (default: 0.5)
+        post_minsize (int): minimal size of the segments after GASP. (default: 100)
+        n_threads (int): number of threads used for GASP. (default: 6)
+    Returns:
+        np.ndarray: GASP output segmentation
+    """
+
+    ...
+```
+
+* ***mutex_ws***
+```python
+def mutex_ws(boundary_pmaps: np.array,
+             superpixels: np.array = None,
+             beta: float = 0.5,
+             post_minsize: int = 100,
+             n_threads: int = 6) -> np.array:
+    """
+    Wrapper around gasp with mutex_watershed as linkage criteria.
+    Args:
+        boundary_pmaps (np.ndarray): cell boundary predictions. 3D array of shape (Z, Y, X) with values between 0 and 1.
+        superpixels (np.ndarray): superpixel segmentation. Must have the same shape as boundary_pmaps.
+            If None, GASP will be run from the pixels. (default: None)
+        beta (float): beta parameter for GASP. A small value steers the segmentation towards under-segmentation,
+            while a high value biases the segmentation towards over-segmentation. (default: 0.5)
+        post_minsize (int): minimal size of the segments after GASP. (default: 100)
+        n_threads (int): number of threads used for GASP. (default: 6)
+    Returns:
+        np.ndarray: GASP output segmentation
+    """
+
+    ...
+```
+
+* ***multicut***
+```python
+def multicut(boundary_pmaps: np.array,
+             superpixels: np.array,
+             beta: float = 0.5,
+             post_minsize: int = 50) -> np.array:
+
+    """
+    Multicut segmentation from boundary predictions.
+    Args:
+        boundary_pmaps (np.ndarray): cell boundary predictions, 3D array of shape (Z, Y, X) with values between 0 and 1.
+        superpixels (np.ndarray): superpixel segmentation. Must have the same shape as boundary_pmaps.
+        beta (float): beta parameter for the Multicut. A small value steers the segmentation towards
+            under-segmentation, while a high value biases the segmentation towards over-segmentation. (default: 0.5)
+        post_minsize (int): minimal size of the segments after Multicut. (default: 50)
+    Returns:
+        np.ndarray: Multicut output segmentation
+    """
+
+    ...
+```
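+
+A sketch of how these functions typically combine (same assumptions as in the `dt_watershed` example above):
+superpixels from `dt_watershed` are agglomerated by `gasp`, or partitioned globally by `multicut`.
+```python
+import numpy as np
+from plantseg.segmentation.functional.segmentation import dt_watershed, gasp, multicut
+
+boundary_pmaps = np.random.random((64, 128, 128)).astype('float32')  # placeholder input
+superpixels = dt_watershed(boundary_pmaps, threshold=0.5)
+
+# greedy agglomeration of the superpixels with GASP (average linkage)
+gasp_segmentation = gasp(boundary_pmaps, superpixels=superpixels, gasp_linkage_criteria='average', beta=0.5)
+
+# or solve the (more expensive) global Multicut problem on the same superpixels
+multicut_segmentation = multicut(boundary_pmaps, superpixels, beta=0.5, post_minsize=50)
+```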
+
+* ***lifted_multicut_from_nuclei_segmentation***
+```python
+def lifted_multicut_from_nuclei_segmentation(boundary_pmaps: np.array,
+                                             nuclei_seg: np.array,
+                                             superpixels: np.array,
+                                             beta: float = 0.5,
+                                             post_minsize: int = 50) -> np.array:
+    """
+    Lifted Multicut segmentation from boundary predictions and nuclei segmentation.
+    Args:
+        boundary_pmaps (np.ndarray): cell boundary predictions, 3D array of shape (Z, Y, X) with values between 0 and 1.
+        nuclei_seg (np.array): nuclei segmentation. Must have the same shape as boundary_pmaps.
+        superpixels (np.ndarray): superpixel segmentation. Must have the same shape as boundary_pmaps.
+        beta (float): beta parameter for the Multicut. A small value steers the segmentation towards
+            under-segmentation, while a high value biases the segmentation towards over-segmentation. (default: 0.5)
+        post_minsize (int): minimal size of the segments after the Lifted Multicut. (default: 50)
+    Returns:
+        np.ndarray: Lifted Multicut output segmentation
+    """
+
+    ...
+```
\ No newline at end of file
diff --git a/docs/intro.md b/docs/intro.md
index e0638e55..6dde548b 100644
--- a/docs/intro.md
+++ b/docs/intro.md
@@ -4,7 +4,7 @@ PlantSeg is a tool for 3D and 2D segmentation.
 The methods used are very generic and can be used for any instance segmentation workflow,
 but they are tuned towards cell segmentation in plant tissue. The tool is fundamentally composed of two main steps.
 
-![Main Figure](_images/main_figure.png)
+![Main Figure](https://github.com/hci-unihd/plant-seg/raw/assets/images/main_figure.png)
 
 * ***Cell boundary predictions***: Where a convolutional neural network is used to extract a voxel-wise boundary
 classification. The neural network can filter out very different types/intensities of
@@ -15,7 +15,7 @@ segmentation. We implemented four different algorithms for segmentation, each wi
 This approach is especially well suited for segmenting densely packed cells.
 
 For a complete description of the methods used, please check out our
-[manuscript](https://elifesciences.org/articles/57613)
+[manuscript](https://elifesciences.org/articles/57613).
 
 If you find PlantSeg useful, please cite {cite:p}`wolny2020accurate`.