Official implementation of the paper published at MIDL 2024
git clone https://github.com/CVPR-KIT/NucleiSeg-in-Histopathology-Images.git
pip install -r requirements.txt
The dataset for this challenge was obtained by carefully annotating tissue images from several patients with tumors of different organs, diagnosed at multiple hospitals. It was created by downloading H&E-stained tissue images captured at 40x magnification from the TCGA archive. H&E staining is a routine protocol for enhancing the contrast of a tissue section and is commonly used for tumor assessment (grading, staging, etc.). Given the diversity of nuclei appearances across organs and patients, and the variety of staining protocols adopted at different hospitals, the training dataset enables the development of robust and generalizable nuclei segmentation techniques that work out of the box.
Training data containing 30 images and around 22,000 nuclear boundary annotations was previously released to the public, described in a dataset article in IEEE Transactions on Medical Imaging in 2017.
Test set images with an additional 7,000 nuclear boundary annotations are available here: MoNuSeg 2018 Testing data.
A training sample with a segmentation mask from the training set can be seen below:
| Tissue Image | Segmentation Mask (Ground Truth) |
|---|---|
The dataset provides the nuclei annotations as XML files, which must be converted into ground-truth images for segmentation. This can be done using the xmlParser.py file.
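As a rough illustration of what that conversion involves, the sketch below rasterizes MoNuSeg-style polygon annotations into a binary mask. It assumes each nucleus is stored as a `Region` element containing `Vertex` elements with `X`/`Y` attributes; the actual xmlParser.py may handle additional cases and output formats.

```python
import xml.etree.ElementTree as ET

import numpy as np
from PIL import Image, ImageDraw


def xml_to_mask(xml_path: str, height: int, width: int) -> np.ndarray:
    """Rasterize polygon annotations from a MoNuSeg-style XML file.

    Assumes Region -> Vertex elements with X/Y attributes (a common
    MoNuSeg layout); the repository's xmlParser.py may differ.
    """
    tree = ET.parse(xml_path)
    mask = Image.new("L", (width, height), 0)
    draw = ImageDraw.Draw(mask)
    for region in tree.iter("Region"):
        polygon = [
            (float(v.get("X")), float(v.get("Y")))
            for v in region.iter("Vertex")
        ]
        if len(polygon) >= 3:
            # Fill each nucleus polygon as foreground (255).
            draw.polygon(polygon, outline=255, fill=255)
    return np.array(mask)
```

Each nucleus polygon is filled independently, so overlapping nuclei simply merge into one foreground region in this simplified version.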
The following structure should be followed for dataset directory configuration:
MonuSegData/
├── Test/
│ ├── GroundTruth/
│ └── TissueImages/
└── Training/
├── GroundTruth/
└── TissueImages/
All modifiable parameters related to the experiment and augmentation are present in the config.sys file. Set up all parameters here before proceeding.
The images in the MoNuSeg dataset are H&E stained images that have the following properties:
For more information, refer to this guide. It is recommended to perform staining normalization before augmentation. This can be done using the stainNormalization.py file:
python auxilary/stainNormalization.py --config config.sys
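To convey the idea behind stain normalization, here is a minimal sketch of a Reinhard-style per-channel mean/std transfer, done directly in RGB for simplicity. The actual stainNormalization.py may use a different method (e.g. Macenko) and a proper color space such as LAB, so treat this only as an illustration.

```python
import numpy as np


def normalize_stain(image: np.ndarray, target_mean: np.ndarray,
                    target_std: np.ndarray) -> np.ndarray:
    """Match the per-channel mean/std of `image` to a reference slide.

    Simplified Reinhard-style transfer in RGB; real pipelines usually
    operate in LAB or estimate stain vectors (Macenko et al.).
    """
    img = image.astype(np.float64)
    mean = img.reshape(-1, 3).mean(axis=0)
    std = img.reshape(-1, 3).std(axis=0) + 1e-8
    # Z-score each channel, then rescale to the reference statistics.
    out = (img - mean) / std * target_std + target_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```

The reference statistics would typically be computed once from a well-stained template image and reused for the whole dataset.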
After preparing the data, run the following command to generate metadata files for the training and testing image sets:
python auxilary/dataValidity.py --config config.sys
python slidingAug.py --config config.sys
It will create a folder named “slidingAug” containing the sliding-window crops. The crop parameters can be changed in the config.sys file.
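The core of sliding-window augmentation can be sketched as below. The patch size and stride here are illustrative; slidingAug.py reads its actual values from config.sys.

```python
import numpy as np


def sliding_crops(image: np.ndarray, size: int, stride: int):
    """Yield square patches covering the image with a fixed stride.

    A sketch of the sliding-window idea; the repository's script may
    also pad borders or save patches to disk.
    """
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield image[y:y + size, x:x + size]
```

For example, a 1000x1000 MoNuSeg image with 256-pixel patches and a stride of 128 yields a 6x6 grid of 36 overlapping patches.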
python auxilary/trainValsplit.py --config config.sys
It will create the Train and Validation folders and their files from the augmented folder.
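Conceptually, the split amounts to shuffling the file names and cutting them into two lists, as in the sketch below. The validation fraction and seed are illustrative assumptions; the real trainValsplit.py also moves the images into the Train and Validation folders according to config.sys.

```python
import random


def train_val_split(files, val_fraction=0.2, seed=0):
    """Shuffle file names and split them into train/val lists.

    Deterministic for a fixed seed, so image/mask pairs stay matched
    if the same file list is used for both.
    """
    files = sorted(files)
    rng = random.Random(seed)
    rng.shuffle(files)
    n_val = max(1, int(len(files) * val_fraction))
    return files[n_val:], files[:n_val]
```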
python image_augmentation.py --config config.sys
It applies augmentations to the sliding-window patches, writing the results to a new folder named “augmentated”. Parameters can be changed in the config.sys file.
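Typical geometric augmentations for nuclei patches are flips and 90-degree rotations, since they preserve nuclei shapes exactly. The sketch below shows such a set of variants; image_augmentation.py may additionally apply color or intensity transforms configured in config.sys.

```python
import numpy as np


def basic_augmentations(patch: np.ndarray):
    """Return simple geometric variants of a patch.

    Flips and right-angle rotations are lossless for segmentation;
    the same transform must also be applied to the mask.
    """
    return [
        patch,
        np.fliplr(patch),   # horizontal flip
        np.flipud(patch),   # vertical flip
        np.rot90(patch, 1),
        np.rot90(patch, 2),
        np.rot90(patch, 3),
    ]
```

Whatever transform is applied to a tissue patch must be mirrored on its ground-truth mask so the pair stays aligned.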
After checking the dataset information in the config.sys file, run:
python main.py --config config.sys |& tee log/log-08-07.txt
The parameters can be changed as per requirement in the config.sys file. A log file is created in the log folder.
For testing or running inference on images, ensure they are in the correct format and that the directory information is set correctly in the config.sys file:
python train_test.py --img_dir all --expt_dir <Outputs/experiment_dir>
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
git checkout -b feature/newFeature
git commit -m 'Added some new feature'
git push origin feature/newFeature
Distributed under the MIT License. See LICENSE for more information.