{ "cells": [ { "cell_type": "markdown", "id": "bec25bff", "metadata": {}, "source": [ "# MONet Bundle\n", "\n", "In this notebook, we demonstrate how to create a MONAI Bundle that supports nnUNet experiments for training and inference. This step-by-step tutorial describes how to create all the Python code and YAML configuration files needed to train and evaluate an nnUNet model using the MONAI Bundle format.\n", "\n", "The tutorial assumes that the Spleen dataset has already been downloaded and preprocessed as described in the [MONet Bundle Tutorial Notebook](06_monet_bundle.ipynb)." ] }, { "cell_type": "markdown", "id": "a597f118", "metadata": {}, "source": [ "## Setup environment" ] }, { "cell_type": "code", "execution_count": null, "id": "6e6fdd58", "metadata": {}, "outputs": [], "source": [ "!python -c \"import monai\" || pip install -q \"monai-weekly[pillow, tqdm]\"\n", "!python -c \"import nnunetv2\" || pip install -q nnunetv2" ] }, { "cell_type": "markdown", "id": "70a2adb6", "metadata": {}, "source": [ "## Setup imports" ] }, { "cell_type": "code", "execution_count": null, "id": "63d65cf0-56ea-4c45-88bc-e271a5e2195c", "metadata": {}, "outputs": [], "source": [ "import torch\n", "from monai.data import Dataset, DataLoader\n", "from monai.handlers import (\n", " StatsHandler,\n", " from_engine,\n", " MeanDice,\n", " ValidationHandler,\n", " LrScheduleHandler,\n", " CheckpointSaver,\n", " CheckpointLoader,\n", " TensorBoardStatsHandler,\n", " MLFlowHandler,\n", ")\n", "from monai.engines import SupervisedTrainer, SupervisedEvaluator\n", "from monai.transforms import Compose, Lambdad, Activationsd, AsDiscreted, Transposed, SaveImaged, LoadImaged, Decollated\n", "\n", "import re\n", "import pathlib\n", "import os\n", "import yaml\n", "import json\n", "from monai.bundle import ConfigParser\n", "import monai\n", "from pathlib import Path\n", "from odict import odict\n", "\n", "from monai.apps.nnunet import get_nnunet_trainer, 
get_nnunet_monai_predictor, convert_nnunet_to_monai_bundle, convert_monai_bundle_to_nnunet\n", "\n", "from monai.apps.nnunet import nnUNetV2Runner\n", "\n", "#from nnunetv2.utilities.dataset_name_id_conversion import maybe_convert_to_dataset_name\n", "#from nnunetv2.training.logging.nnunet_logger import nnUNetLogger\n", "import shutil\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "9d9c2ae8", "metadata": {}, "outputs": [], "source": [ "os.environ[\"MONAI_DATA_DIRECTORY\"] = \"/home/maia-user/Documents/MONAI/Data\"\n", "\n", "work_dir = os.path.join(os.environ[\"MONAI_DATA_DIRECTORY\"], \"nnUNet\")\n", "\n", "nnunet_raw = os.path.join(work_dir, \"nnUNet_raw_data_base\")\n", "nnunet_preprocessed = os.path.join(\".\", work_dir, \"nnUNet_preprocessed\")\n", "nnunet_results = os.path.join(\".\", work_dir, \"nnUNet_trained_models\")\n", "\n", "if not os.path.exists(nnunet_raw):\n", " os.makedirs(nnunet_raw)\n", "\n", "if not os.path.exists(nnunet_preprocessed):\n", " os.makedirs(nnunet_preprocessed)\n", "\n", "if not os.path.exists(nnunet_results):\n", " os.makedirs(nnunet_results)\n", "\n", "# claim environment variable\n", "os.environ[\"nnUNet_raw\"] = nnunet_raw\n", "os.environ[\"nnUNet_preprocessed\"] = nnunet_preprocessed\n", "os.environ[\"nnUNet_results\"] = nnunet_results\n", "os.environ[\"OMP_NUM_THREADS\"] = str(1)" ] }, { "cell_type": "markdown", "id": "c6757489", "metadata": {}, "source": [ "## nnUNet Trainer\n", "\n", "The core component for the nnUNet MONAI Bundle is the `get_nnunet_trainer` function. This function is responsible for creating the nnUNet trainer object from the native nnUNetv2 implementation. From the nnUNet trainer object, we can access the training components, such as the data loaders, model, learning rate scheduler, optimizer, and loss function, and perform training and inference tasks." 
] }, { "cell_type": "code", "execution_count": null, "id": "b4090170-b7c4-402b-9d70-9b59c463354b", "metadata": {}, "outputs": [], "source": [ "nnunet_config = {\n", " \"dataset_name_or_id\": \"009\",\n", " \"configuration\": \"3d_fullres\",\n", " \"trainer_class_name\": \"nnUNetTrainer_10epochs\",\n", " \"plans_identifier\": \"nnUNetPlans\",\n", " \"fold\": 0,\n", "}\n", "\n", "\n", "nnunet_trainer = get_nnunet_trainer(**nnunet_config)" ] }, { "cell_type": "markdown", "id": "f355baa4", "metadata": {}, "source": [ "The function `get_nnunet_trainer` accepts the following parameters:\n", "\n", "- `dataset_name_or_id`: The dataset name or ID to be used for training and evaluation.\n", "- `fold`: The fold number for the cross-validation experiment.\n", "- `configuration`: The training configuration for the nnUNet trainer, usually `3d_fullres`.\n", "- `trainer_class_name`: The nnUNet trainer class name to be used for training, e.g. `nnUNetTrainer`.\n", "- `plans_identifier`: The nnUNet plans identifier for the dataset, e.g. `nnUNetPlans`." 
] }, { "cell_type": "markdown", "id": "765619ea", "metadata": {}, "source": [ "## Train and Val Data Loaders" ] }, { "cell_type": "code", "execution_count": null, "id": "fb31b29c", "metadata": {}, "outputs": [], "source": [ "dataset_key = \"case_identifier\"" ] }, { "cell_type": "code", "execution_count": null, "id": "f8e60cdb", "metadata": {}, "outputs": [], "source": [ "train_dataloader = nnunet_trainer.dataloader_train\n", "train_data = [{dataset_key: k} for k in nnunet_trainer.dataloader_train.generator._data.identifiers]\n", "train_dataset = Dataset(data=train_data)" ] }, { "cell_type": "code", "execution_count": null, "id": "9b80cc9a", "metadata": {}, "outputs": [], "source": [ "val_dataloader = nnunet_trainer.dataloader_val\n", "val_data = [{dataset_key: k} for k in nnunet_trainer.dataloader_val.generator._data.identifiers]\n", "val_dataset = Dataset(data=val_data)" ] }, { "cell_type": "markdown", "id": "3c7756d7", "metadata": {}, "source": [ "## Network, Optimizer, and Loss Function" ] }, { "cell_type": "code", "execution_count": null, "id": "26b88a8e", "metadata": {}, "outputs": [], "source": [ "device = nnunet_trainer.device\n", "\n", "network = nnunet_trainer.network\n", "optimizer = nnunet_trainer.optimizer\n", "lr_scheduler = nnunet_trainer.lr_scheduler\n", "loss = nnunet_trainer.loss" ] }, { "cell_type": "markdown", "id": "5d7d6023", "metadata": {}, "source": [ "## Prepare Batch Function\n", "\n", "The nnUNet `DataLoader` returns a dictionary with the `data` and `target` keys. Since the `SupervisedTrainer` used in the MONAI Bundle expects the data and target as separate tensors, we need a custom prepare-batch function that extracts the data and target tensors from the dictionary and moves them to the target device." 
] }, { "cell_type": "code", "execution_count": null, "id": "155e9460-9f69-4b15-bfb9-eae032afbc92", "metadata": {}, "outputs": [], "source": [ "def prepare_nnunet_batch(batch, device, non_blocking):\n", " data = batch[\"data\"].to(device, non_blocking=non_blocking)\n", " if isinstance(batch[\"target\"], list):\n", " target = [i.to(device, non_blocking=non_blocking) for i in batch[\"target\"]]\n", " else:\n", " target = batch[\"target\"].to(device, non_blocking=non_blocking)\n", " return data, target" ] }, { "cell_type": "code", "execution_count": null, "id": "70774a2a-84ff-4297-98e5-6452567f13a1", "metadata": {}, "outputs": [], "source": [ "image, label = prepare_nnunet_batch(next(iter(train_dataloader)), device=\"cpu\", non_blocking=True)" ] }, { "cell_type": "markdown", "id": "54b5c684", "metadata": {}, "source": [ "## MONAI Supervised Trainer\n", "\n", "The `SupervisedTrainer` class from MONAI is used to train the nnUNet model. For a minimal setup, we need to provide the model, optimizer, loss function, data loaders, number of epochs and the device to run the training." 
] }, { "cell_type": "code", "execution_count": null, "id": "d480fbe6", "metadata": {}, "outputs": [], "source": [ "train_handlers = [StatsHandler(output_transform=from_engine([\"loss\"], first=True), tag_name=\"train_loss\")]" ] }, { "cell_type": "code", "execution_count": null, "id": "844bb28a", "metadata": {}, "outputs": [], "source": [ "iterations = 10\n", "epochs = 1" ] }, { "cell_type": "code", "execution_count": null, "id": "415bfc68", "metadata": {}, "outputs": [], "source": [ "trainer = SupervisedTrainer(\n", " amp=True,\n", " device=device,\n", " epoch_length=iterations,\n", " loss_function=loss,\n", " max_epochs=epochs,\n", " network=network,\n", " prepare_batch=prepare_nnunet_batch,\n", " optimizer=optimizer,\n", " train_data_loader=train_dataloader,\n", " train_handlers=train_handlers,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "ba2ce831", "metadata": {}, "outputs": [], "source": [ "trainer.run()" ] }, { "cell_type": "markdown", "id": "c41fcf2a", "metadata": {}, "source": [ "## Adding Validation and Validation Metrics\n", "\n", "For a complete training setup, we need to add the validation data loader and the validation metrics to the `SupervisedTrainer`. Using the MONAI class `SupervisedEvaluator`, we can evaluate the model on the validation data loader and calculate the validation metrics (`Dice Score`)." 
] }, { "cell_type": "code", "execution_count": null, "id": "e5f713a2", "metadata": {}, "outputs": [], "source": [ "val_key_metric = MeanDice(output_transform=from_engine([\"pred\", \"label\"]), reduction=\"mean\", include_background=False)\n", "\n", "additional_metrics = {\n", " \"Val_Dice_Per_Class\": MeanDice(\n", " output_transform=from_engine([\"pred\", \"label\"]),\n", " reduction=\"mean_batch\",\n", " include_background=False,\n", " )\n", "}" ] }, { "cell_type": "markdown", "id": "fa8287fb", "metadata": {}, "source": [ "Additionally, to compute the Mean Dice score over the batch, we need to apply a post-processing transformation to the nnUNet model output. Since `MeanDice` accepts `y` and `y_pred` as batch-first tensors (BCHW[D]), we create a custom post-processing transform that converts the nnUNet model output to the required format." ] }, { "cell_type": "code", "execution_count": null, "id": "9e28ec37", "metadata": {}, "outputs": [], "source": [ "num_classes = 2\n", "\n", "postprocessing = Compose(\n", " transforms=[\n", " ## Extract only high-res predictions from Deep Supervision\n", " Lambdad(keys=[\"pred\", \"label\"], func=lambda x: x[0]),\n", " ## Apply Softmax to the predictions\n", " Activationsd(keys=\"pred\", softmax=True),\n", " ## Binarize the predictions\n", " AsDiscreted(keys=\"pred\", threshold=0.5),\n", " ## Convert the labels to one-hot\n", " AsDiscreted(keys=\"label\", to_onehot=num_classes),\n", " ]\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "7a90e728", "metadata": {}, "outputs": [], "source": [ "val_handlers = [StatsHandler(iteration_log=False)]" ] }, { "cell_type": "code", "execution_count": null, "id": "e9586476", "metadata": {}, "outputs": [], "source": [ "val_iterations = 10\n", "val_interval = 1" ] }, { "cell_type": "code", "execution_count": null, "id": "e8081ce4", "metadata": {}, "outputs": [], "source": [ "evaluator = SupervisedEvaluator(\n", " amp=True,\n", " device=device,\n", " 
epoch_length=val_iterations,\n", " network=network,\n", " key_val_metric={\"Val_Dice\": val_key_metric},\n", " prepare_batch=prepare_nnunet_batch,\n", " val_data_loader=val_dataloader,\n", " val_handlers=val_handlers,\n", " postprocessing=postprocessing,\n", " additional_metrics=additional_metrics,\n", ")" ] }, { "cell_type": "markdown", "id": "aadfd315", "metadata": {}, "source": [ "And finally, we add the evaluator to the `SupervisedTrainer` to calculate the validation metrics during training." ] }, { "cell_type": "code", "execution_count": null, "id": "e2bc29ad", "metadata": {}, "outputs": [], "source": [ "train_handlers.append(ValidationHandler(epoch_level=True, interval=val_interval, validator=evaluator))" ] }, { "cell_type": "markdown", "id": "3c904b0a", "metadata": {}, "source": [ "We can also add the `MeanDice` metric to the `SupervisedTrainer` to calculate the mean dice score over the batch during training." ] }, { "cell_type": "code", "execution_count": null, "id": "a9d44de3", "metadata": {}, "outputs": [], "source": [ "train_key_metric = MeanDice(output_transform=from_engine([\"pred\", \"label\"]), reduction=\"mean\", include_background=False)\n", "\n", "additional_metrics = {\n", " \"Train_Dice_Per_Class\": MeanDice(\n", " output_transform=from_engine([\"pred\", \"label\"]),\n", " reduction=\"mean_batch\",\n", " include_background=False,\n", " )\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "a339901b", "metadata": {}, "outputs": [], "source": [ "trainer = SupervisedTrainer(\n", " amp=True,\n", " device=device,\n", " epoch_length=iterations,\n", " loss_function=loss,\n", " max_epochs=epochs,\n", " network=network,\n", " prepare_batch=prepare_nnunet_batch,\n", " optimizer=optimizer,\n", " train_data_loader=train_dataloader,\n", " train_handlers=train_handlers,\n", " key_train_metric={\"Train_Dice\": train_key_metric},\n", " postprocessing=postprocessing,\n", " additional_metrics=additional_metrics,\n", ")" ] }, { "cell_type": "code", 
"execution_count": null, "id": "8f1869f5", "metadata": {}, "outputs": [], "source": [ "trainer.run()" ] }, { "cell_type": "markdown", "id": "fbfb0762", "metadata": {}, "source": [ "## Learning Rate Scheduler\n", "\n", "One last component to add to the `SupervisedTrainer`, in order to replicate the training behaviour of the native nnUNet, is the learning rate scheduler." ] }, { "cell_type": "code", "execution_count": null, "id": "9b92598c", "metadata": {}, "outputs": [], "source": [ "train_handlers.append(LrScheduleHandler(lr_scheduler=lr_scheduler, print_lr=True))" ] }, { "cell_type": "code", "execution_count": null, "id": "54efe274", "metadata": {}, "outputs": [], "source": [ "trainer = SupervisedTrainer(\n", " amp=True,\n", " device=device,\n", " epoch_length=iterations,\n", " loss_function=loss,\n", " max_epochs=epochs,\n", " network=network,\n", " prepare_batch=prepare_nnunet_batch,\n", " optimizer=optimizer,\n", " train_data_loader=train_dataloader,\n", " train_handlers=train_handlers,\n", " key_train_metric={\"Train_Dice\": train_key_metric},\n", " postprocessing=postprocessing,\n", " additional_metrics=additional_metrics,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "1f687bc6", "metadata": {}, "outputs": [], "source": [ "trainer.run()" ] }, { "cell_type": "code", "execution_count": null, "id": "fac3c8c5", "metadata": {}, "outputs": [], "source": [ "train_handlers[-1].lr_scheduler.get_last_lr()" ] }, { "cell_type": "markdown", "id": "52a36367", "metadata": {}, "source": [ "## Checkpointing\n", "\n", "To save the model weights during training, we can use the `CheckpointSaver` callback from MONAI. This callback saves the model weights after each epoch.\n", "We can later use the `CheckpointLoader` to load the model weights and perform inference or resume training." 
] }, { "cell_type": "code", "execution_count": null, "id": "54c4fb21", "metadata": {}, "outputs": [], "source": [ "ckpt_dir = \"MONetBundle/models/fold_0\"\n", "\n", "val_handlers.append(\n", " CheckpointSaver(\n", " save_dir=ckpt_dir,\n", " save_dict={\n", " \"network_weights\": nnunet_trainer.network._orig_mod,\n", " \"optimizer_state\": nnunet_trainer.optimizer,\n", " \"scheduler\": nnunet_trainer.lr_scheduler,\n", " },\n", " # save_final= True,\n", " save_interval=1,\n", " save_key_metric=True,\n", " # final_filename= \"model_final.pt\",\n", " #key_metric_filename= \"model.pt\",\n", " n_saved=1,\n", " )\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "977f723d", "metadata": {}, "outputs": [], "source": [ "trainer = SupervisedTrainer(\n", " amp=True,\n", " device=device,\n", " epoch_length=iterations,\n", " loss_function=loss,\n", " max_epochs=epochs+1,\n", " network=network,\n", " prepare_batch=prepare_nnunet_batch,\n", " optimizer=optimizer,\n", " train_data_loader=train_dataloader,\n", " train_handlers=train_handlers,\n", " key_train_metric={\"Train_Dice\": train_key_metric},\n", " postprocessing=postprocessing,\n", " additional_metrics=additional_metrics,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "3f0d6c31", "metadata": {}, "outputs": [], "source": [ "trainer.run()" ] }, { "cell_type": "markdown", "id": "2a10c5d9", "metadata": {}, "source": [ "## Reload Checkpoint\n", "\n", "When resuming training from a checkpoint, we also want to restart from the same epoch. To do this, we load the checkpoint and update the `trainer.state.epoch` and `trainer.state.iteration` attributes of the `SupervisedTrainer`." 
] }, { "cell_type": "code", "execution_count": null, "id": "91c0a0ce", "metadata": {}, "outputs": [], "source": [ "def subfiles(folder, prefix=None, suffix=None, join=True, sort=True):\n", " files = [f for f in os.listdir(folder) if os.path.isfile(os.path.join(folder, f))]\n", " if prefix is not None:\n", " files = [f for f in files if f.startswith(prefix)]\n", " if suffix is not None:\n", " files = [f for f in files if f.endswith(suffix)]\n", " if sort:\n", " files.sort()\n", " if join:\n", " files = [os.path.join(folder, f) for f in files]\n", " return files\n", "\n", "\n", "def get_checkpoint(epoch, ckpt_dir):\n", " if epoch == \"latest\":\n", "\n", " latest_checkpoints = subfiles(ckpt_dir, prefix=\"checkpoint_epoch\", sort=True, join=False)\n", " epochs = []\n", " for latest_checkpoint in latest_checkpoints:\n", " epochs.append(int(latest_checkpoint[len(\"checkpoint_epoch=\") : -len(\".pt\")]))\n", "\n", " epochs.sort()\n", " if len(epochs) == 0:\n", " return None\n", " latest_epoch = epochs[-1]\n", " return latest_epoch\n", " else:\n", " return epoch\n", "\n", "\n", "def reload_checkpoint(trainer, epoch, num_train_batches_per_epoch, ckpt_dir, lr_scheduler=None):\n", "\n", " epoch_to_load = get_checkpoint(epoch, ckpt_dir)\n", " trainer.state.epoch = epoch_to_load\n", " trainer.state.iteration = (epoch_to_load * num_train_batches_per_epoch) + 1\n", "\n", " if lr_scheduler is not None:\n", " lr_scheduler.ctr = epoch_to_load\n", " lr_scheduler.step(epoch_to_load)" ] }, { "cell_type": "code", "execution_count": null, "id": "ece9e988", "metadata": {}, "outputs": [], "source": [ "reload_checkpoint_epoch = \"latest\"\n", "\n", "train_handlers.append(\n", " CheckpointLoader(\n", " load_path=os.path.join(\n", " ckpt_dir, \"checkpoint_epoch=\" + str(get_checkpoint(reload_checkpoint_epoch, ckpt_dir)) + \".pt\"\n", " ),\n", " load_dict={\n", " \"network_weights\": nnunet_trainer.network._orig_mod,\n", " \"optimizer_state\": nnunet_trainer.optimizer,\n", " \"scheduler\": 
nnunet_trainer.lr_scheduler,\n", " },\n", " map_location=device,\n", " )\n", ")" ] }, { "cell_type": "markdown", "id": "78db7528", "metadata": {}, "source": [ "## Initial nnUNet Checkpoint\n", "\n", "To provide compatibility with the native nnUNet, we need to save the nnUNet-specific configuration together with the regular MONAI checkpoint. This is done only once, before the training starts. At the end of the training, we will have a MONAI checkpoint and an nnUNet checkpoint. Combining the two lets us convert the MONAI checkpoint to an nnUNet checkpoint at any time." ] }, { "cell_type": "code", "execution_count": null, "id": "8131d003", "metadata": {}, "outputs": [], "source": [ "checkpoint = {\n", " \"inference_allowed_mirroring_axes\": nnunet_trainer.inference_allowed_mirroring_axes,\n", " \"init_args\": nnunet_trainer.my_init_kwargs,\n", " \"trainer_name\": nnunet_trainer.__class__.__name__,\n", "}\n", "checkpoint_filename = os.path.join(Path(ckpt_dir).parent, \"nnunet_checkpoint.pth\")\n", "\n", "torch.save(checkpoint, checkpoint_filename)" ] }, { "cell_type": "markdown", "id": "c0e26cdb", "metadata": {}, "source": [ "## MLFlow and Tensorboard Monitoring\n", "\n", "To monitor the training process, we can use MLFlow and Tensorboard. We can log the training metrics, hyperparameters, and model weights to MLFlow, and visualize the training metrics using Tensorboard." 
] }, { "cell_type": "code", "execution_count": null, "id": "b2976402", "metadata": {}, "outputs": [], "source": [ "log_dir = \"MONetBundle/logs\"\n", "\n", "train_handlers.append(\n", " TensorBoardStatsHandler(log_dir=log_dir, output_transform=from_engine([\"loss\"], first=True), tag_name=\"train_loss\")\n", ")\n", "\n", "val_handlers.append(TensorBoardStatsHandler(log_dir=log_dir, iteration_log=False))" ] }, { "cell_type": "code", "execution_count": null, "id": "86d54487", "metadata": {}, "outputs": [], "source": [ "def mlflow_transform(state_output):\n", " return state_output[0][\"loss\"]\n", "\n", "\n", "class MLFlownnUNetHandler(MLFlowHandler):\n", " def __init__(self, label_dict, **kwargs):\n", " super(MLFlownnUNetHandler, self).__init__(**kwargs)\n", " self.label_dict = label_dict\n", "\n", " def _default_epoch_log(self, engine) -> None:\n", " \"\"\"\n", " Execute epoch level log operation.\n", " Default to track the values from Ignite `engine.state.metrics` dict and\n", " track the values of specified attributes of `engine.state`.\n", "\n", " Args:\n", " engine: Ignite Engine, it can be a trainer, validator or evaluator.\n", "\n", " \"\"\"\n", " log_dict = engine.state.metrics\n", " if not log_dict:\n", " return\n", "\n", " current_epoch = self.global_epoch_transform(engine.state.epoch)\n", "\n", " new_log_dict = {}\n", "\n", " for metric in log_dict:\n", " # Per-class metrics arrive as a tensor: log one entry per foreground label\n", " if isinstance(log_dict[metric], torch.Tensor):\n", " for i, val in enumerate(log_dict[metric]):\n", " new_log_dict[metric+\"_{}\".format(self.label_dict[list(self.label_dict.keys())[i+1]])] = val\n", " else:\n", " new_log_dict[metric] = log_dict[metric]\n", " self._log_metrics(new_log_dict, step=current_epoch)\n", "\n", " if self.state_attributes is not None:\n", " attrs = {attr: getattr(engine.state, attr, None) for attr in self.state_attributes}\n", " self._log_metrics(attrs, step=current_epoch)" ] }, { "cell_type": "code", 
"execution_count": null, "id": "6e8fdc2e", "metadata": {}, "outputs": [], "source": 
[ "def create_mlflow_experiment_params(params_file, custom_params=None):\n", " params_dict = {}\n", " config_values = monai.config.deviceconfig.get_config_values()\n", " for k in config_values:\n", " params_dict[re.sub(\"[()]\", \" \", str(k))] = config_values[k]\n", "\n", " optional_config_values = monai.config.deviceconfig.get_optional_config_values()\n", " for k in optional_config_values:\n", " params_dict[re.sub(\"[()]\", \" \", str(k))] = optional_config_values[k]\n", "\n", " gpu_info = monai.config.deviceconfig.get_gpu_info()\n", " for k in gpu_info:\n", " params_dict[re.sub(\"[()]\", \" \", str(k))] = str(gpu_info[k])\n", "\n", " yaml_config_files = [params_file]\n", " # %%\n", " monai_config = {}\n", " for config_file in yaml_config_files:\n", " with open(config_file, \"r\") as file:\n", " monai_config.update(yaml.safe_load(file))\n", "\n", " monai_config[\"bundle_root\"] = str(Path(Path(params_file).parent).parent)\n", "\n", " parser = ConfigParser(monai_config, globals={\"os\": \"os\", \"pathlib\": \"pathlib\", \"json\": \"json\", \"ignite\": \"ignite\"})\n", "\n", " parser.parse(True)\n", "\n", " for k in monai_config:\n", " params_dict[k] = parser.get_parsed_content(k, instantiate=True)\n", "\n", " if custom_params is not None:\n", " for k in custom_params:\n", " params_dict[k] = custom_params[k]\n", " return params_dict" ] }, { "cell_type": "code", "execution_count": null, "id": "8fb3858f", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/mlflow_params.yaml\n", "\n", "dataset_name_or_id: \"009\"\n", "nnunet_trainer_class_name: \"nnUNetTrainer\"\n", "nnunet_plans_identifier: \"nnUNetPlans\"\n", "\n", "num_classes: 2\n", "label_dict:\n", " 0: \"background\"\n", " 1: \"spleen\"\n", " \n", "tracking_uri: \"http://localhost:5000\"\n", "mlflow_experiment_name: \"MONet_Bundle_Spleen\"\n", "mlflow_run_name: \"MONet_Bundle_Spleen\"\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "174505fe", "metadata": {}, "outputs": 
[], "source": [ "mlflow_experiment_name = \"MONet_Bundle_Spleen\"\n", "mlflow_run_name = \"MONet_Bundle_Spleen\"\n", "label_dict = {0: \"background\", 1: \"Spleen\"}\n", "tracking_uri = \"http://localhost:5000\"\n", "\n", "params_file = \"MONetBundle/mlflow_params.yaml\"\n", "\n", "\n", "train_handlers.append(\n", " MLFlownnUNetHandler(\n", " dataset_dict={\"train\": train_dataset},\n", " dataset_keys=dataset_key,\n", " experiment_param=create_mlflow_experiment_params(params_file),\n", " experiment_name=mlflow_experiment_name,\n", " label_dict=label_dict,\n", " output_transform=mlflow_transform,\n", " run_name=mlflow_run_name,\n", " state_attributes=[\"best_metric\", \"best_metric_epoch\"],\n", " tag_name=\"Train_Loss\",\n", " tracking_uri=tracking_uri,\n", " )\n", ")\n", "\n", "val_handlers.append(\n", " MLFlownnUNetHandler(\n", " experiment_name=mlflow_experiment_name,\n", " iteration_log=False,\n", " label_dict=label_dict,\n", " output_transform=mlflow_transform,\n", " run_name=mlflow_run_name,\n", " state_attributes=[\"best_metric\", \"best_metric_epoch\"],\n", " tracking_uri=tracking_uri,\n", " )\n", ")" ] }, { "cell_type": "markdown", "id": "20722504", "metadata": {}, "source": [ "To start the MLFlow server, we can run the following command in the terminal:\n", "\n", "```bash\n", "cd MLFlow && mlflow server\n", "```\n", "To run Tensorboard, we can use the following command:\n", "\n", "```bash\n", "tensorboard --logdir MONetBundle/logs\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "0d11d8c8", "metadata": {}, "outputs": [], "source": [ "trainer = SupervisedTrainer(\n", " amp=True,\n", " device=device,\n", " epoch_length=iterations,\n", " loss_function=loss,\n", " max_epochs=epochs+2,\n", " network=network,\n", " prepare_batch=prepare_nnunet_batch,\n", " optimizer=optimizer,\n", " train_data_loader=train_dataloader,\n", " train_handlers=train_handlers,\n", " key_train_metric={\"Train_Dice\": train_key_metric},\n", " 
postprocessing=postprocessing,\n", " additional_metrics=additional_metrics,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "fcc921bf", "metadata": {}, "outputs": [], "source": [ "trainer.run()" ] }, { "cell_type": "markdown", "id": "b353be42", "metadata": {}, "source": [ "## Create MONAI Bundle" ] }, { "cell_type": "code", "execution_count": null, "id": "564700ff", "metadata": {}, "outputs": [], "source": [ "%%bash \n", "\n", "python -m monai.bundle init_bundle MONetBundle\n", "\n", "mkdir -p MONetBundle/nnUNet\n", "mkdir -p MONetBundle/src\n", "mkdir -p MONetBundle/nnUNet/evaluator\n", "which tree && tree MONetBundle || true" ] }, { "cell_type": "code", "execution_count": null, "id": "70a37817", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/configs/logging.conf\n", "[loggers]\n", "keys=root\n", "\n", "[handlers]\n", "keys=consoleHandler\n", "\n", "[formatters]\n", "keys=fullFormatter\n", "\n", "[logger_root]\n", "level=INFO\n", "handlers=consoleHandler\n", "\n", "[handler_consoleHandler]\n", "class=StreamHandler\n", "level=INFO\n", "formatter=fullFormatter\n", "args=(sys.stdout,)\n", "\n", "[formatter_fullFormatter]\n", "format=%(asctime)s - %(name)s - %(levelname)s - %(message)s\n" ] }, { "cell_type": "code", "execution_count": null, "id": "4c25feaf", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/configs/metadata.json\n", "\n", "{\n", " \"schema\": \"https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json\",\n", " \"version\": \"0.1.0\",\n", " \"changelog\": {\n", " \"0.1.0\": \"Initial release\"\n", " },\n", " \"monai_version\": \"1.4.0\",\n", " \"pytorch_version\": \"2.3.0\",\n", " \"numpy_version\": \"1.21.2\",\n", " \"required_packages_version\": {\"nnunetv2\": \"2.6.0\"},\n", " \"task\": \"Decathlon spleen segmentation with nnUNet\",\n", " \"description\": \"A pre-trained nnUNet model for volumetric (3D) segmentation of the spleen from CT 
image\",\n", " \"authors\": \"Simone Bendazzoli\",\n", " \"copyright\": \"Copyright (c) MONAI Consortium\",\n", " \"data_source\": \"Task09_Spleen.tar from http://medicaldecathlon.com/\",\n", " \"data_type\": \"nifti\",\n", " \"image_classes\": \"single channel data, intensity scaled to [0, 1]\",\n", " \"label_classes\": \"single channel data, 1 is spleen, 0 is everything else\",\n", " \"pred_classes\": \"2 channels OneHot data, channel 1 is spleen, channel 0 is background\",\n", " \"eval_metrics\": {\n", " \"mean_dice\": 0.97\n", " },\n", " \"intended_use\": \"This is an example, not to be used for diagnostic purposes\",\n", " \"references\": [\n", " \"Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2), 203-211.\"\n", " ],\n", " \"network_data_format\":{\n", " \"inputs\": {\n", " \"image\": {\n", " \"type\": \"image\",\n", " \"format\": \"hounsfield\",\n", " \"modality\": \"CT\",\n", " \"num_channels\": 1,\n", " \"spatial_shape\": [\"*\", \"*\", \"*\"],\n", " \"dtype\": \"float32\",\n", " \"value_range\": [-1024, 1024],\n", " \"is_patch_data\": false,\n", " \"channel_def\": {\"0\": \"image\"}\n", " }\n", " },\n", " \"outputs\":{\n", " \"pred\": {\n", " \"type\": \"image\",\n", " \"format\": \"segmentation\",\n", " \"num_channels\": 1,\n", " \"spatial_shape\": [\"*\", \"*\", \"*\"],\n", " \"dtype\": \"float32\",\n", " \"value_range\": [0,1],\n", " \"is_patch_data\": false,\n", " \"channel_def\": {\"0\": \"background\", \"1\": \"spleen\"}\n", " }\n", " }\n", " }\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "cb7aa3da", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/global.yaml\n", "\n", "iterations: $@nnunet_trainer.num_iterations_per_epoch\n", "device: $@nnunet_trainer.device\n", "epochs: $@nnunet_trainer.num_epochs\n", "\n", "fold_id: 0\n", "\n", "bundle_root: .\n", 
"ckpt_dir: 
\"$@bundle_root + '/models/fold_'+str(@fold_id)\"" ] }, { "cell_type": "code", "execution_count": null, "id": "33f32c17", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/params.yaml\n", "\n", "\n", "dataset_name_or_id: \"100\"\n", "nnunet_trainer_class_name: \"nnUNetTrainer\"\n", "nnunet_plans_identifier: \"nnUNetPlans\"\n", "nnunet_configuration: \"3d_fullres\"\n", "\n", "num_classes: 2\n", "label_dict:\n", " 0: \"background\"\n", " 1: \"class1\"\n", " \n", "tracking_uri: \"http://localhost:5000\"\n", "mlflow_experiment_name: \"nnUNet_Bundle\"\n", "mlflow_run_name: \"nnUNet_Bundle\"\n", "log_dir: \"$@bundle_root + '/logs'\"" ] }, { "cell_type": "code", "execution_count": null, "id": "e31c3314", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/imports.yaml\n", "\n", "imports:\n", "- $import glob\n", "- $import os\n", "- $import ignite\n", "- $import torch\n", "- $import shutil\n", "- $import json\n", "- $import src\n", "- $import nnunetv2\n", "- $import src.mlflow\n", "- $import src.trainer\n", "- $from pathlib import Path" ] }, { "cell_type": "code", "execution_count": null, "id": "dc7fc76e", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/run.yaml\n", "\n", "run:\n", "- \"$torch.save(@checkpoint,@checkpoint_filename)\"\n", "- \"$shutil.copy(Path(@nnunet_model_folder).joinpath('dataset.json'), @bundle_root+'/models/dataset.json')\"\n", "- \"$shutil.copy(Path(@nnunet_model_folder).joinpath('plans.json'), @bundle_root+'/models/plans.json')\"\n", "- \"$@train#pbar.attach(@train#trainer,output_transform=lambda x: {'loss': x[0]['loss']})\"\n", "- \"$@validate#pbar.attach(@validate#evaluator,metric_names=['Val_Dice'])\"\n", "- $@train#trainer.run()\n", "\n", "initialize:\n", "- $monai.utils.set_determinism(seed=123)" ] }, { "cell_type": "code", "execution_count": null, "id": "c6302d85", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/nnunet_trainer.yaml\n", "\n", 
"nnunet_trainer:\n", " _target_ : get_nnunet_trainer\n", " dataset_name_or_id: \"@dataset_name_or_id\"\n", " configuration: \"@nnunet_configuration\"\n", " fold: '@fold_id'\n", " trainer_class_name: \"@nnunet_trainer_class_name\"\n", " plans_identifier: \"@nnunet_plans_identifier\"\n", "\n", "loss: $@nnunet_trainer.loss\n", "lr_scheduler: $@nnunet_trainer.lr_scheduler\n", "\n", "network: $@nnunet_trainer.network\n", "\n", "optimizer: $@nnunet_trainer.optimizer\n", "\n", "checkpoint:\n", " init_args: '$@nnunet_trainer.my_init_kwargs'\n", " trainer_name: '$@nnunet_trainer.__class__.__name__'\n", " inference_allowed_mirroring_axes: '$@nnunet_trainer.inference_allowed_mirroring_axes'\n", "\n", "checkpoint_filename: \"$@bundle_root+'/models/nnunet_checkpoint.pth'\"\n", "\n", "dataset_name: \"$nnunetv2.utilities.dataset_name_id_conversion.maybe_convert_to_dataset_name(@dataset_name_or_id)\"\n", "nnunet_model_folder: \"$os.path.join(os.environ['nnUNet_results'], @dataset_name, @nnunet_trainer_class_name+'__'+@nnunet_plans_identifier+'__'+@nnunet_configuration)\"" ] }, { "cell_type": "code", "execution_count": null, "id": "f8718d0b", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/train_metrics.yaml\n", "\n", "train_key_metric:\n", " Train_Dice:\n", " _target_: \"MeanDice\"\n", " include_background: False\n", " output_transform: \"$monai.handlers.from_engine(['pred', 'label'])\"\n", " reduction: \"mean\"\n", "\n", "train_additional_metrics:\n", " Train_Dice_per_class:\n", " _target_: \"MeanDice\"\n", " include_background: False\n", " output_transform: \"$monai.handlers.from_engine(['pred', 'label'])\"\n", " reduction: \"mean_batch\"" ] }, { "cell_type": "code", "execution_count": null, "id": "8bd28f89", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/train_postprocessing.yaml\n", "\n", "train_postprocessing:\n", " _target_: \"Compose\"\n", " transforms:\n", " - _target_: Lambdad\n", " keys:\n", " - \"pred\"\n", " - 
\"label\"\n", " func: \"$lambda x: x[0]\"\n", " - _target_: Activationsd\n", " keys:\n", " - \"pred\"\n", " softmax: True\n", " - _target_: AsDiscreted\n", " keys:\n", " - \"pred\"\n", " threshold: 0.5\n", " - _target_: AsDiscreted\n", " keys:\n", " - \"label\"\n", " to_onehot: \"@num_classes\"\n", " \n", "train_postprocessing_region_based:\n", " _target_: \"Compose\"\n", " transforms:\n", " - _target_: Lambdad\n", " keys:\n", " - \"pred\"\n", " - \"label\"\n", " func: \"$lambda x: x[0]\"\n", " - _target_: Activationsd\n", " keys:\n", " - \"pred\"\n", " sigmoid: True\n", " - _target_: AsDiscreted\n", " keys:\n", " - \"pred\"\n", " threshold: 0.5" ] }, { "cell_type": "code", "execution_count": null, "id": "7268a30a", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/train.yaml\n", "\n", "dataset_key: \"case_identifier\"\n", "train:\n", " pbar:\n", " _target_: \"ignite.contrib.handlers.tqdm_logger.ProgressBar\"\n", " dataloader: $@nnunet_trainer.dataloader_train\n", " train_data: \"$[{'@dataset_key':k} for k in @nnunet_trainer.dataloader_train.generator._data.identifiers]\"\n", " train_dataset:\n", " _target_: Dataset\n", " data: \"@train#train_data\"\n", " inferer:\n", " _target_: SimpleInferer\n", " trainer:\n", " _target_: SupervisedTrainer\n", " amp: true\n", " device: '@device'\n", " additional_metrics: \"@train_additional_metrics\"\n", " epoch_length: \"@iterations\"\n", " inferer: '@train#inferer'\n", " key_train_metric: '@train_key_metric'\n", " loss_function: '@loss'\n", " max_epochs: '@epochs'\n", " network: '@network'\n", " prepare_batch: \"$src.trainer.prepare_nnunet_batch\"\n", " optimizer: '@optimizer'\n", " postprocessing: '@train_postprocessing'\n", " train_data_loader: '@train#dataloader'\n", " train_handlers: '@train_handlers#handlers'" ] }, { "cell_type": "code", "execution_count": null, "id": "944f75b6", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/train_handlers.yaml\n", "\n", 
"train_handlers:\n", " handlers:\n", " - _target_: \"$src.mlflow.MLFlownnUNetHandler\"\n", " label_dict: \"@label_dict\"\n", " tracking_uri: \"@tracking_uri\"\n", " experiment_name: \"@mlflow_experiment_name\"\n", " run_name: \"@mlflow_run_name\"\n", " output_transform: \"$src.mlflow.mlflow_transform\"\n", " dataset_dict:\n", " train: \"@train#train_dataset\"\n", " dataset_keys: '@dataset_key'\n", " state_attributes:\n", " - \"iteration\"\n", " - \"epoch\"\n", " tag_name: 'Train_Loss'\n", " experiment_param: \"$src.mlflow.create_mlflow_experiment_params( @bundle_root + '/nnUNet/params.yaml')\"\n", " #artifacts=None\n", " optimizer_param_names: 'lr'\n", " #close_on_complete: False\n", " - _target_: LrScheduleHandler\n", " lr_scheduler: '@lr_scheduler'\n", " print_lr: true\n", " - _target_: ValidationHandler\n", " epoch_level: true\n", " interval: '@val_interval'\n", " validator: '@validate#evaluator'\n", " #- _target_: StatsHandler\n", " # output_transform: $monai.handlers.from_engine(['loss'], first=True)\n", " # tag_name: train_loss\n", " - _target_: TensorBoardStatsHandler\n", " log_dir: '@log_dir'\n", " output_transform: $monai.handlers.from_engine(['loss'], first=True)\n", " tag_name: train_loss" ] }, { "cell_type": "code", "execution_count": null, "id": "4933773b", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/configs/train_resume.yaml\n", "\n", "run:\n", "- '$src.trainer.reload_checkpoint(@train#trainer,@reload_checkpoint_epoch,@iterations,@ckpt_dir)'\n", "- \"$@train#pbar.attach(@train#trainer,output_transform=lambda x: {'loss': x[0]['loss']})\"\n", "- \"$@validate#pbar.attach(@validate#evaluator,metric_names=['Val_Dice'])\"\n", "- $@train#trainer.run()\n", "\n", "train_handlers:\n", " handlers:\n", " - _target_: \"$src.mlflow.MLFlownnUNetHandler\"\n", " label_dict: \"@label_dict\"\n", " tracking_uri: \"@tracking_uri\"\n", " experiment_name: \"@mlflow_experiment_name\"\n", " run_name: \"@mlflow_run_name\"\n", " output_transform: 
\"$src.mlflow.mlflow_transform\"\n", " dataset_dict:\n", " train: \"@train#train_dataset\"\n", " dataset_keys: '@dataset_key'\n", " state_attributes:\n", " - \"iteration\"\n", " - \"epoch\"\n", " tag_name: 'Train_Loss'\n", " experiment_param: \"$src.mlflow.create_mlflow_experiment_params( @bundle_root + '/nnUNet/params.yaml')\"\n", " #artifacts=None\n", " optimizer_param_names: 'lr'\n", " #close_on_complete: False\n", " - _target_: LrScheduleHandler\n", " lr_scheduler: '@lr_scheduler'\n", " print_lr: true\n", " - _target_: ValidationHandler\n", " epoch_level: true\n", " interval: '@val_interval'\n", " validator: '@validate#evaluator'\n", " #- _target_: StatsHandler\n", " # output_transform: $monai.handlers.from_engine(['loss'], first=True)\n", " # tag_name: train_loss\n", " - _target_: TensorBoardStatsHandler\n", " log_dir: '@log_dir'\n", " output_transform: $monai.handlers.from_engine(['loss'], first=True)\n", " tag_name: train_loss\n", " - _target_: CheckpointLoader\n", " load_dict:\n", " network_weights: '$@nnunet_trainer.network'\n", " optimizer_state: '$@nnunet_trainer.optimizer'\n", " scheduler: '$@nnunet_trainer.lr_scheduler'\n", " load_path: '$@ckpt_dir+\"/checkpoint_epoch=\"+str(src.trainer.get_checkpoint(@reload_checkpoint_epoch, @ckpt_dir))+\".pt\"'\n", " map_location: '@device'" ] }, { "cell_type": "code", "execution_count": null, "id": "f55c6569", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/val_metrics.yaml\n", "\n", "val_key_metric:\n", " Val_Dice:\n", " _target_: \"MeanDice\"\n", " output_transform: \"$monai.handlers.from_engine(['pred', 'label'])\"\n", " reduction: \"mean\"\n", " include_background: False\n", " \n", "val_additional_metrics:\n", " Val_Dice_per_class:\n", " _target_: \"MeanDice\"\n", " output_transform: \"$monai.handlers.from_engine(['pred', 'label'])\"\n", " reduction: \"mean_batch\"\n", " include_background: False" ] }, { "cell_type": "code", "execution_count": null, "id": "21af5ce1", "metadata": {}, 
"outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/val_handlers.yaml\n", "\n", "val_handlers:\n", " handlers:\n", " - _target_: StatsHandler\n", " iteration_log: false\n", " - _target_: TensorBoardStatsHandler\n", " iteration_log: false\n", " log_dir: '@log_dir'\n", " - _target_: \"$src.mlflow.MLFlownnUNetHandler\"\n", " label_dict: \"@label_dict\"\n", " tracking_uri: \"@tracking_uri\"\n", " experiment_name: \"@mlflow_experiment_name\"\n", " run_name: \"@mlflow_run_name\"\n", " output_transform: \"$src.mlflow.mlflow_transform\"\n", " iteration_log: False\n", " state_attributes:\n", " - \"best_metric\"\n", " - \"best_metric_epoch\"\n", " - _target_: \"CheckpointSaver\"\n", " save_dir: \"@ckpt_dir\"\n", " save_interval: 1\n", " n_saved: 1\n", " save_key_metric: true\n", " save_dict:\n", " network_weights: '$@nnunet_trainer.network'\n", " optimizer_state: '$@nnunet_trainer.optimizer'\n", " scheduler: '$@nnunet_trainer.lr_scheduler'" ] }, { "cell_type": "code", "execution_count": null, "id": "4d3b2a5f", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/nnUNet/validate.yaml\n", "\n", "val_interval: 1\n", "validate:\n", " pbar:\n", " _target_: \"ignite.contrib.handlers.tqdm_logger.ProgressBar\"\n", " dataloader: $@nnunet_trainer.dataloader_val\n", " evaluator:\n", " _target_: SupervisedEvaluator\n", " additional_metrics: '@val_additional_metrics'\n", " amp: true\n", " epoch_length: $@nnunet_trainer.num_val_iterations_per_epoch\n", " device: '@device'\n", " inferer: '@validate#inferer'\n", " key_val_metric: '@val_key_metric'\n", " network: '@network'\n", " postprocessing: '@train_postprocessing'\n", " val_data_loader: '@validate#dataloader'\n", " val_handlers: '@val_handlers#handlers'\n", " prepare_batch: \"$src.trainer.prepare_nnunet_batch\"\n", " inferer:\n", " _target_: SimpleInferer\n" ] }, { "cell_type": "code", "execution_count": null, "id": "51fae1b1", "metadata": {}, "outputs": [], "source": [ "%%writefile 
MONetBundle/nnUNet/evaluator/evaluator.yaml\n",
"\n",
"run:\n",
"- \"$@validate#pbar.attach(@validate#evaluator,metric_names=['Val_Dice'])\"\n",
"- $@validate#evaluator.run()\n",
"\n",
"initialize:\n",
"- \"$setattr(torch.backends.cudnn, 'benchmark', True)\""
] }, { "cell_type": "markdown", "id": "fd192db9", "metadata": {}, "source": [
"## Adding Python Utility Scripts\n",
"\n",
"We finally add the MLFlow and Training utility scripts to the MONAI Bundle."
] }, { "cell_type": "code", "execution_count": null, "id": "0f4fb6da", "metadata": {}, "outputs": [], "source": [
"%%writefile MONetBundle/src/__init__.py\n",
"\n"
] }, { "cell_type": "code", "execution_count": null, "id": "7cea0e36", "metadata": {}, "outputs": [], "source": [
"%%writefile MONetBundle/src/mlflow.py\n",
"\n",
"import re\n",
"from monai.handlers import MLFlowHandler\n",
"import yaml\n",
"from monai.bundle import ConfigParser\n",
"from pathlib import Path\n",
"import monai\n",
"import torch\n",
"\n",
"def mlflow_transform(state_output):\n",
"    \"\"\"\n",
"    Extracts the 'loss' value from the first element of the state_output list.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    state_output : list of dict\n",
"        A list where each element is a dictionary containing various metrics, including 'loss'.\n",
"\n",
"    Returns\n",
"    -------\n",
"    float\n",
"        The 'loss' value from the first element of the state_output list.\n",
"    \"\"\"\n",
"    return state_output[0]['loss']\n",
"\n",
"class MLFlownnUNetHandler(MLFlowHandler):\n",
"    \"\"\"\n",
"    A handler for logging nnUNet metrics to MLFlow.\n",
"    Parameters\n",
"    ----------\n",
"    label_dict : dict\n",
"        A dictionary mapping label indices to label names.\n",
"    **kwargs : dict\n",
"        Additional keyword arguments passed to the parent class.\n",
"    \"\"\"\n",
"    def __init__(self, label_dict, **kwargs):\n",
"        super(MLFlownnUNetHandler, self).__init__(**kwargs)\n",
"        self.label_dict = label_dict\n",
"\n",
"    def _default_epoch_log(self, engine) -> None:\n",
"        \"\"\"\n",
"        
Logs the metrics and state attributes at the end of each epoch.\n",
"\n",
"        Parameters\n",
"        ----------\n",
"        engine : Engine\n",
"            The engine object that contains the state and metrics to be logged.\n",
"\n",
"        Returns\n",
"        -------\n",
"        None\n",
"        \"\"\"\n",
"        log_dict = engine.state.metrics\n",
"        if not log_dict:\n",
"            return\n",
"\n",
"        current_epoch = self.global_epoch_transform(engine.state.epoch)\n",
"\n",
"        new_log_dict = {}\n",
"\n",
"        for metric in log_dict:\n",
"            if type(log_dict[metric]) == torch.Tensor:\n",
"                for i,val in enumerate(log_dict[metric]):\n",
"                    new_log_dict[metric+\"_{}\".format(self.label_dict[list(self.label_dict.keys())[i+1]])] = val\n",
"            else:\n",
"                new_log_dict[metric] = log_dict[metric]\n",
"        self._log_metrics(new_log_dict, step=current_epoch)\n",
"\n",
"        if self.state_attributes is not None:\n",
"            attrs = {attr: getattr(engine.state, attr, None) for attr in self.state_attributes}\n",
"            self._log_metrics(attrs, step=current_epoch)\n",
"\n",
"def create_mlflow_experiment_params(params_file, custom_params=None):\n",
"    \"\"\"\n",
"    Create a dictionary of parameters for an MLflow experiment.\n",
"\n",
"    This function reads configuration values from MONAI, GPU information, and a YAML configuration file,\n",
"    and combines them into a single dictionary. 
Optionally, custom parameters can also be added to the dictionary.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    params_file : str\n",
"        Path to the YAML configuration file.\n",
"    custom_params : dict, optional\n",
"        A dictionary of custom parameters to be added to the final parameters dictionary (default is None).\n",
"\n",
"    Returns\n",
"    -------\n",
"    dict\n",
"        A dictionary containing all the combined parameters.\n",
"    \"\"\"\n",
"    params_dict = {}\n",
"    config_values = monai.config.deviceconfig.get_config_values()\n",
"    for k in config_values:\n",
"        params_dict[re.sub(\"[()]\",\" \",str(k))] = config_values[k]\n",
"\n",
"    optional_config_values = monai.config.deviceconfig.get_optional_config_values()\n",
"    for k in optional_config_values:\n",
"        params_dict[re.sub(\"[()]\",\" \",str(k))] = optional_config_values[k]\n",
"\n",
"    gpu_info = monai.config.deviceconfig.get_gpu_info()\n",
"    for k in gpu_info:\n",
"        params_dict[re.sub(\"[()]\",\" \",str(k))] = str(gpu_info[k])\n",
"\n",
"    yaml_config_files = [params_file]\n",
"    # %%\n",
"    monai_config = {}\n",
"    for config_file in yaml_config_files:\n",
"        with open(config_file, 'r') as file:\n",
"            monai_config.update(yaml.safe_load(file))\n",
"\n",
"    monai_config[\"bundle_root\"] = str(Path(Path(params_file).parent).parent)\n",
"\n",
"    parser = ConfigParser(monai_config, globals={\"os\": \"os\",\n",
"                                                 \"pathlib\": \"pathlib\",\n",
"                                                 \"json\": \"json\",\n",
"                                                 \"ignite\": \"ignite\"\n",
"                                                 })\n",
"\n",
"    parser.parse(True)\n",
"\n",
"    for k in monai_config:\n",
"        params_dict[k] = parser.get_parsed_content(k,instantiate=True)\n",
"\n",
"    if custom_params is not None:\n",
"        for k in custom_params:\n",
"            params_dict[k] = custom_params[k]\n",
"    return params_dict\n"
] }, { "cell_type": "code", "execution_count": null, "id": "d679a387", "metadata": {}, "outputs": [], "source": [
"%%writefile MONetBundle/src/trainer.py\n",
"\n",
"import os\n",
"\n",
"def subfiles(directory, prefix=None, suffix=None, join=True, sort=True):\n",
"    
\"\"\"\n", " List files in a directory with optional filtering by prefix and/or suffix.\n", " \n", " Parameters\n", " ----------\n", " directory : str\n", " The path to the directory to list files from.\n", " prefix : str, optional\n", " If specified, only files starting with this prefix will be included.\n", " suffix : str, optional\n", " If specified, only files ending with this suffix will be included.\n", " join : bool, optional\n", " If True, the directory path will be joined with the filenames. Default is True.\n", " sort : bool, optional\n", " If True, the list of files will be sorted. Default is True.\n", " \n", " Returns\n", " -------\n", " list of str\n", " A list of filenames (with full paths if `join` is True) that match the specified criteria.\n", " \"\"\"\n", "\n", " \n", " files = [f for f in os.listdir(directory) if os.path.isfile(os.path.join(directory, f))]\n", " if prefix is not None:\n", " files = [f for f in files if f.startswith(prefix)]\n", " if suffix is not None:\n", " files = [f for f in files if f.endswith(suffix)]\n", " if join:\n", " files = [os.path.join(directory, f) for f in files]\n", " if sort:\n", " files.sort()\n", " return files\n", "\n", "def prepare_nnunet_batch(batch, device, non_blocking):\n", " \"\"\"\n", " Prepares a batch of data and targets for nnU-Net training by transferring them to the specified device.\n", "\n", " Parameters\n", " ----------\n", " batch : dict\n", " A dictionary containing the data and target tensors. 
The key \"data\" corresponds to the input data tensor,\n",
"        and the key \"target\" corresponds to the target tensor or a list of target tensors.\n",
"    device : torch.device\n",
"        The device to which the data and target tensors should be transferred (e.g., 'cuda' or 'cpu').\n",
"    non_blocking : bool\n",
"        If True, allows non-blocking data transfer to the device.\n",
"\n",
"    Returns\n",
"    -------\n",
"    tuple\n",
"        A tuple containing the data tensor and the target tensor(s) after being transferred to the specified device.\n",
"    \"\"\"\n",
"    data = batch[\"data\"].to(device, non_blocking=non_blocking)\n",
"    if isinstance(batch[\"target\"], list):\n",
"        target = [i.to(device, non_blocking=non_blocking) for i in batch[\"target\"]]\n",
"    else:\n",
"        target = batch[\"target\"].to(device, non_blocking=non_blocking)\n",
"    return data, target\n",
"\n",
"def get_checkpoint(epoch, ckpt_dir):\n",
"    \"\"\"\n",
"    Retrieves the checkpoint for a given epoch from the checkpoint directory.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    epoch : int or str\n",
"        The epoch number to retrieve. If 'latest', the function will return the latest checkpoint.\n",
"    ckpt_dir : str\n",
"        The directory where checkpoints are stored.\n",
"\n",
"    Returns\n",
"    -------\n",
"    int\n",
"        The epoch number of the checkpoint to be retrieved. 
If 'latest', returns the latest epoch number.\n",
"    \"\"\"\n",
"    if epoch == \"latest\":\n",
"\n",
"        latest_checkpoints = subfiles(ckpt_dir, prefix=\"checkpoint_epoch\", sort=True,\n",
"                                      join=False)\n",
"        epochs = []\n",
"        for latest_checkpoint in latest_checkpoints:\n",
"            epochs.append(int(latest_checkpoint[len(\"checkpoint_epoch=\"):-len(\".pt\")]))\n",
"\n",
"        epochs.sort()\n",
"        latest_epoch = epochs[-1]\n",
"        return latest_epoch\n",
"    else:\n",
"        return epoch\n",
"\n",
"def reload_checkpoint(trainer, epoch, num_train_batches_per_epoch, ckpt_dir, lr_scheduler=None):\n",
"    \"\"\"\n",
"    Reloads the checkpoint for a given epoch and updates the trainer's state.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    trainer : object\n",
"        The trainer object whose state needs to be updated.\n",
"    epoch : int\n",
"        The epoch number to load the checkpoint from.\n",
"    num_train_batches_per_epoch : int\n",
"        The number of training batches per epoch.\n",
"    ckpt_dir : str\n",
"        The directory where the checkpoints are stored.\n",
"    lr_scheduler : object, optional\n",
"        The learning rate scheduler to be updated (default is None).\n",
"\n",
"    Returns\n",
"    -------\n",
"    None\n",
"    \"\"\"\n",
"\n",
"    epoch_to_load = get_checkpoint(epoch, ckpt_dir)\n",
"    trainer.state.epoch = epoch_to_load\n",
"    trainer.state.iteration = (epoch_to_load* num_train_batches_per_epoch) +1\n",
"    if lr_scheduler is not None:\n",
"        lr_scheduler.ctr = epoch_to_load\n",
"        lr_scheduler.step(epoch_to_load)\n"
] }, { "cell_type": "code", "execution_count": null, "id": "7122458f", "metadata": {}, "outputs": [], "source": [
"def create_config(config_folder, output_file):\n",
"    config_files = [f.path for f in os.scandir(config_folder) if f.path.endswith(\".yaml\")]\n",
"    config = {}\n",
"    for config_file in config_files:\n",
"        with open(config_file, \"r\") as file:\n",
"            config.update(yaml.safe_load(file))\n",
"\n",
"    if output_file.endswith(\".yaml\"):\n",
"        with open(output_file, \"w\") as file:\n",
"            
yaml.dump(config, file)\n",
"    if output_file.endswith(\".json\"):\n",
"        with open(output_file, \"w\") as file:\n",
"            json.dump(config, file)\n",
"\n",
"    return config"
] }, { "cell_type": "code", "execution_count": null, "id": "8218fc23", "metadata": {}, "outputs": [], "source": [
"import os\n",
"import yaml\n",
"config = create_config(\"MONetBundle/nnUNet\", \"MONetBundle/configs/train.yaml\")"
] }, { "cell_type": "code", "execution_count": null, "id": "6453aa35", "metadata": {}, "outputs": [], "source": [
"%%bash\n",
"\n",
"export nnUNet_results=/home/maia-user/Data/nnUNet/nnUNet_trained_models\n",
"export nnUNet_raw=/home/maia-user/Data/nnUNet/nnUNet_raw_data_base\n",
"export nnUNet_preprocessed=/home/maia-user/Data/nnUNet/nnUNet_preprocessed\n",
"export nnUNet_def_n_proc=2\n",
"export nnUNet_n_proc_DA=2\n",
"export BUNDLE_ROOT=MONetBundle\n",
"export PYTHONPATH=$PYTHONPATH:$BUNDLE_ROOT\n",
"\n",
"python -m monai.bundle run \\\n",
"    --bundle_root $BUNDLE_ROOT \\\n",
"    --reload_checkpoint_epoch \"latest\" \\\n",
"    --iterations 10 \\\n",
"    --epochs 10 \\\n",
"    --config_file $BUNDLE_ROOT/configs/train.yaml\n",
"\n",
"\n",
"# Option to resume training\n",
"#--config_file \"['$BUNDLE_ROOT/configs/train.yaml','$BUNDLE_ROOT/configs/train_resume.yaml']\"\n",
"#\n",
"# Log to Local MLFlow\n",
"#--tracking_uri mlruns\n"
] }, { "cell_type": "markdown", "id": "3f221410", "metadata": {}, "source": [
"## Inference\n",
"\n",
"After training the nnUNet model, we can then perform inference on new data. We use a `ModelnnUNetWrapper` as a wrapper around the nnUNet model to perform inference from the MONAI Bundle. 
In this way, the nnUNet preprocessing, inference, and postprocessing steps are handled by the `ModelnnUNetWrapper`; the Bundle blocks only need to load the input data, send it to the nnUNet block, and postprocess the nnUNet prediction.\n",
"\n",
"The `ModelnnUNetWrapper` receives as input the data dictionary loaded by the DataLoader, and returns the model predictions as a MetaTensor.\n",
"\n",
"To get the `ModelnnUNetWrapper` object, we can use the `get_nnunet_monai_predictor` function, which receives the following parameters:\n",
"\n",
"- `model_folder`: The path to the nnUNet model folder.\n",
"- `model_name`: [Optional] The name of the model to be loaded. If not provided, the function will load the checkpoint named `model.pt`."
] }, { "cell_type": "code", "execution_count": null, "id": "f22a7868", "metadata": {}, "outputs": [], "source": [
"# Select the latest checkpoint\n",
"\n",
"from MONetBundle.src.trainer import get_checkpoint\n",
"\n",
"ckpt_epoch = get_checkpoint(\"latest\", \"MONetBundle/models/fold_0\")"
] }, { "cell_type": "code", "execution_count": null, "id": "6adfa0a0", "metadata": {}, "outputs": [], "source": [
"nnunet_config = {\n",
"    \"model_folder\": \"MONetBundle/models/fold_0\",\n",
"}\n",
"\n",
"monai_predictor = get_nnunet_monai_predictor(**nnunet_config, model_name=f\"checkpoint_epoch={ckpt_epoch}.pt\")"
] }, { "cell_type": "markdown", "id": "d9258c3b", "metadata": {}, "source": [
"## Test Data Preparation\n",
"\n",
"The Bundle accepts the test dataset in the following format:\n",
"\n",
"```bash\n",
"Dataset\n",
"├── Case1\n",
"│   └── Case1.nii.gz\n",
"├── Case2\n",
"│   └── Case2.nii.gz\n",
"└── Case3\n",
"    └── Case3.nii.gz\n",
"```"
] }, { "cell_type": "code", "execution_count": null, "id": "07229e86", "metadata": {}, "outputs": [], "source": [
"%%bash\n",
"\n",
"mkdir -p nnUNetBundle/test_input/spleen_1\n",
"mkdir -p nnUNetBundle/test_output\n",
"\n",
"cp 
/home/maia-user/Documents/MONAI/Data/Task09_Spleen/imagesTs/spleen_1.nii.gz nnUNetBundle/test_input/spleen_1"
] }, { "cell_type": "code", "execution_count": null, "id": "308abf67", "metadata": {}, "outputs": [], "source": [
"%%bash\n",
"\n",
"tree nnUNetBundle/test_input"
] }, { "cell_type": "markdown", "id": "6d7e5b73", "metadata": {}, "source": [
"### Data Loading"
] }, { "cell_type": "code", "execution_count": null, "id": "f10d59d7", "metadata": {}, "outputs": [], "source": [
"def get_subfolder_dataset(data_dir, modality_conf):\n",
"    data_list = []\n",
"    for f in os.scandir(data_dir):\n",
"\n",
"        if f.is_dir():\n",
"            subject_dict = {\n",
"                key: str(pathlib.Path(f.path).joinpath(f.name + modality_conf[key][\"suffix\"])) for key in modality_conf\n",
"            }\n",
"            data_list.append(subject_dict)\n",
"    return data_list"
] }, { "cell_type": "code", "execution_count": null, "id": "5f3972b9", "metadata": {}, "outputs": [], "source": [
"modalities = {\n",
"    \"image\": {\"suffix\": \".nii.gz\"},\n",
"}\n",
"\n",
"# Read from the folder populated in the Test Data Preparation step\n",
"data = get_subfolder_dataset(\"nnUNetBundle/test_input\", modalities)"
] }, { "cell_type": "code", "execution_count": null, "id": "9e24a629", "metadata": {}, "outputs": [], "source": [
"preprocessing = LoadImaged(keys=[\"image\"], ensure_channel_first=True, image_only=False)\n",
"\n",
"\n",
"test_dataset = Dataset(data, transform=preprocessing)\n",
"\n",
"test_loader = DataLoader(test_dataset, batch_size=1)"
] }, { "cell_type": "markdown", "id": "8c637fba", "metadata": {}, "source": [
"### Test ModelnnUNetWrapper\n",
"\n",
"To test the `ModelnnUNetWrapper`, we can provide a test case to the `ModelnnUNetWrapper` and extract the model predictions returned by the wrapper." 
] }, { "cell_type": "code", "execution_count": null, "id": "2386fc9c", "metadata": {}, "outputs": [], "source": [
"batch = next(iter(test_loader))\n",
"\n",
"pred = monai_predictor(batch[\"image\"])"
] }, { "cell_type": "markdown", "id": "3e010b7e", "metadata": {}, "source": [
"### Postprocessing and Save Predictions\n",
"\n",
"After obtaining the model predictions, we can apply postprocessing transformations to the predictions and save the results to disk.\n",
"\n",
"The `Transposed` transform is required to unify the axis order convention between MONAI and nnUNet. The nnUNet model uses the `zyx` axis order, while MONAI uses the `xyz` axis order."
] }, { "cell_type": "code", "execution_count": null, "id": "ccd5a438", "metadata": {}, "outputs": [], "source": [
"postprocessing = Compose(\n",
"    [\n",
"        #Decollated(keys=None, detach=True),\n",
"        #Transposed(keys=\"pred\", indices=[0, 3, 2, 1]),\n",
"        SaveImaged(\n",
"            keys=\"pred\",\n",
"            output_dir=\"nnUNetBundle/test_output\",\n",
"            output_postfix=\"prediction\",\n",
"            meta_keys=\"image_meta_dict\",\n",
"        ),\n",
"    ]\n",
")"
] }, { "cell_type": "code", "execution_count": null, "id": "d82112a2", "metadata": {}, "outputs": [], "source": [
"postprocessing({\"pred\": pred})"
] }, { "cell_type": "markdown", "id": "9c85dd88", "metadata": {}, "source": [
"## Evaluator\n",
"\n",
"Combining everything together, we can create an `Evaluator` that encapsulates the data loading, model inference, postprocessing, and evaluation steps. The `Evaluator` can be used to evaluate the model on the test dataset." 
] }, { "cell_type": "code", "execution_count": null, "id": "bbf1fec9", "metadata": {}, "outputs": [], "source": [
"validator = SupervisedEvaluator(\n",
"    val_data_loader=test_loader, device=\"cuda:0\", network=monai_predictor, postprocessing=postprocessing\n",
")"
] }, { "cell_type": "code", "execution_count": null, "id": "67970bc2", "metadata": {}, "outputs": [], "source": [
"validator.run()"
] }, { "cell_type": "code", "execution_count": null, "id": "f63d8dd8", "metadata": {}, "outputs": [], "source": [
"%%writefile MONetBundle/configs/inference.yaml\n",
"\n",
"imports:\n",
"  - $import json\n",
"  - $from pathlib import Path\n",
"  - $import os\n",
"  - $from ignite.contrib.handlers.tqdm_logger import ProgressBar\n",
"  - $import shutil\n",
"  - $import src\n",
"  - $import src.dataset\n",
"\n",
"\n",
"output_dir: \".\"\n",
"bundle_root: \".\"\n",
"data_list_file : \".\"\n",
"data_dir: \".\"\n",
"\n",
"fold_id: 0\n",
"model_name: \"model.pt\"\n",
"prediction_suffix: \"prediction\"\n",
"\n",
"\n",
"modality_conf:\n",
"  image:\n",
"    suffix: \".nii.gz\"\n",
"\n",
"test_data_list: \"$src.dataset.get_subfolder_dataset(@data_dir,@modality_conf)\"\n",
"#test_data_list: \"$monai.data.load_decathlon_datalist(@data_list_file, is_segmentation=True, data_list_key='testing', base_dir=@data_dir)\"\n",
"image_modality_keys: \"$list(@modality_conf.keys())\"\n",
"image_key: \"image\"\n",
"image_suffix: \"@image_key\"\n",
"\n",
"preprocessing:\n",
"  _target_: Compose\n",
"  transforms:\n",
"  - _target_: LoadImaged\n",
"    keys: \"image\"\n",
"    ensure_channel_first: True\n",
"    image_only: False\n",
"\n",
"test_dataset:\n",
"  _target_: Dataset\n",
"  data: \"$@test_data_list\"\n",
"  transform: \"@preprocessing\"\n",
"\n",
"test_loader:\n",
"  _target_: DataLoader\n",
"  dataset: \"@test_dataset\"\n",
"  batch_size: 1\n",
"\n",
"\n",
"device: \"$torch.device('cuda')\"\n",
"\n",
"nnunet_trainer_class_name: nnUNetTrainer\n",
"# nnunet_config_ckpt is defined once, at the end of this file\n",
"plans:\n",
"dataset_json:\n",
"\n",
"nnunet_config_dict:\n", " model_folder: \"$@bundle_root + '/models/fold_'+str(@fold_id)\"\n", " model_name: \"@model_name\"\n", " nnunet_config: \"@nnunet_config_ckpt\"\n", " plans: \"@plans\"\n", " dataset_json: \"@dataset_json\"\n", "\n", "network_def: \"$monai.apps.nnunet.nnunet_bundle.get_nnunet_monai_predictor(**@nnunet_config_dict)\"\n", "\n", "postprocessing:\n", " _target_: \"Compose\"\n", " transforms:\n", " #- _target_: Transposed\n", " # keys: \"pred\"\n", " # indices:\n", " # - 0\n", " # - 3\n", " # - 2\n", " # - 1\n", " - _target_: SaveImaged\n", " keys: \"pred\"\n", " resample: False\n", " output_postfix: \"@prediction_suffix\"\n", " output_dir: \"@output_dir\"\n", " meta_keys: \"image_meta_dict\"\n", "\n", "\n", "testing:\n", " dataloader: \"$@test_loader\"\n", " pbar:\n", " _target_: \"ignite.contrib.handlers.tqdm_logger.ProgressBar\"\n", " test_inferer: \"$@inferer\"\n", "\n", "inferer: \n", " _target_: \"SimpleInferer\"\n", "\n", "validator:\n", " _target_: \"SupervisedEvaluator\"\n", " postprocessing: \"$@postprocessing\"\n", " device: \"$@device\"\n", " inferer: \"$@testing#test_inferer\"\n", " val_data_loader: \"$@testing#dataloader\"\n", " network: \"@network_def\"\n", " val_handlers:\n", " - _target_: \"CheckpointLoader\"\n", " load_path: \"$@bundle_root+'/models/fold_'+str(@fold_id)+'/'+@model_name\"\n", " load_dict:\n", " network_weights: '$@network_def.network_weights'\n", "run:\n", " - \"$@testing#pbar.attach(@validator)\"\n", " - \"$@validator.run()\"\n", "\n", "nnunet_config_ckpt:\n", " trainer_name: \"@nnunet_trainer_class_name\"\n", " inference_allowed_mirroring_axes:\n", " - 0\n", " - 1\n", " - 2\n", " configuration: \"3d_fullres\"" ] }, { "cell_type": "code", "execution_count": null, "id": "62668710", "metadata": {}, "outputs": [], "source": [ "%%writefile MONetBundle/src/dataset.py\n", "\n", "import pathlib\n", "import os\n", "\n", "def get_subfolder_dataset(data_dir,modality_conf):\n", " data_list = []\n", " for f in 
os.scandir(data_dir):\n",
"\n",
"        if f.is_dir():\n",
"            subject_dict = {key:str(pathlib.Path(f.path).joinpath(f.name+modality_conf[key]['suffix'])) for key in modality_conf}\n",
"            data_list.append(subject_dict)\n",
"    return data_list"
] }, { "cell_type": "code", "execution_count": null, "id": "b218dfd3", "metadata": {}, "outputs": [], "source": [
"%%bash\n",
"\n",
"export BUNDLE_ROOT=MONetBundle\n",
"export PYTHONPATH=$PYTHONPATH:$BUNDLE_ROOT\n",
"\n",
"python -m monai.bundle run \\\n",
"    --config-file $BUNDLE_ROOT/configs/inference.yaml \\\n",
"    --bundle-root $BUNDLE_ROOT \\\n",
"    --data-dir $HOME/Data/Samples/NIFTI/Spleen \\\n",
"    --output-dir $HOME/Data/Samples/NIFTI/Spleen_pred \\\n",
"    --model-name \"checkpoint_epoch=1000.pt\" \\\n",
"    --logging-file $BUNDLE_ROOT/configs/logging.conf"
] }, { "cell_type": "markdown", "id": "0f80f44e", "metadata": {}, "source": [
"## Utilities"
] }, { "cell_type": "markdown", "id": "e7d75b56-80b8-412c-97d8-b79e1111caa0", "metadata": {}, "source": [
"### MONAI Bundle to nnUNet Conversion"
] }, { "cell_type": "markdown", "id": "d65a50b3-063d-412b-a8e0-5ec45c003925", "metadata": {}, "source": [
"To convert a MONAI Bundle into an nnUNet model, we need to combine the MONAI checkpoint with the nnUNet checkpoint. This is done by loading both checkpoints and updating the nnUNet model weights with the MONAI model weights." 
] }, { "cell_type": "code", "execution_count": null, "id": "de36e0d6", "metadata": {}, "outputs": [], "source": [ "os.environ[\"nnUNet_results\"] = \"MONAI/Data/nnUNet/nnUNet_trained_models\"\n", "os.environ[\"nnUNet_raw\"] = \"MONAI/Data/nnUNet/nnUNet_raw_data_base\"\n", "os.environ[\"nnUNet_preprocessed\"] = \"MONAI/Data/nnUNet/nnUNet_preprocessed\"\n", "\n", "nnunet_config = {\n", " \"dataset_name_or_id\": \"009\",\n", " \"nnunet_trainer\": \"nnUNetTrainer\",\n", "}\n", "\n", "convert_monai_bundle_to_nnunet(nnunet_config, \"MONetBundle\")" ] }, { "cell_type": "markdown", "id": "47355821", "metadata": {}, "source": [ "### Testing the nnUNet Model\n", "\n", "We now test the nnUNet model by performing inference on the test dataset and evaluating the model predictions." ] }, { "cell_type": "code", "execution_count": null, "id": "d2dd2920", "metadata": {}, "outputs": [], "source": [ "root_dir = \"MONAI/Data\"\n", "nnunet_root_dir = os.path.join(root_dir, \"nnUNet\")\n", "\n", "os.makedirs(nnunet_root_dir, exist_ok=True)\n", "\n", "data_src_cfg = os.path.join(nnunet_root_dir, \"data_src_cfg.yaml\")\n", "data_src = {\n", " \"modality\": \"CT\",\n", " \"dataset_name_or_id\": \"09\",\n", " \"datalist\": os.path.join(root_dir, \"Task09_Spleen/msd_task09_spleen_folds.json\"),\n", " \"dataroot\": os.path.join(root_dir, \"Task09_Spleen\"),\n", "}\n", "\n", "ConfigParser.export_config_file(data_src, data_src_cfg)\n", "\n", "runner = nnUNetV2Runner(input_config=data_src_cfg, trainer_class_name=\"nnUNetTrainer\", work_dir=nnunet_root_dir)" ] }, { "cell_type": "code", "execution_count": null, "id": "3559f95d", "metadata": {}, "outputs": [], "source": [ "runner.train_single_model(config=\"3d_fullres\", fold=0, val=\"\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e4802257", "metadata": {}, "outputs": [], "source": [ "runner.find_best_configuration(configs=[\"3d_fullres\"], folds=[0], allow_ensembling=False, num_processes=1)" ] }, { "cell_type": "code", 
"execution_count": null, "id": "efd19e64", "metadata": {}, "outputs": [], "source": [ "runner.predict_ensemble_postprocessing(folds=[0], run_ensemble=False, run_postprocessing=False)" ] }, { "cell_type": "markdown", "id": "8190d533", "metadata": {}, "source": [ "### nnUNet to MONAI Bundle Conversion\n", "\n", "To convert a trained nnUNet model to a MONAI Bundle, we need to separate the MONAI checkpoint from the nnUNet checkpoint. This is done by loading the nnUNet checkpoint and the MONAI checkpoint, and updating the MONAI model weights with the nnUNet model weights." ] }, { "cell_type": "code", "execution_count": null, "id": "d4e79429", "metadata": {}, "outputs": [], "source": [ "os.environ[\"nnUNet_results\"] = \"MONAI/Data/nnUNet/nnUNet_trained_models\"\n", "os.environ[\"nnUNet_raw\"] = \"MONAI/Data/nnUNet/nnUNet_raw_data_base\"\n", "os.environ[\"nnUNet_preprocessed\"] = \"MONAI/Data/nnUNet/nnUNet_preprocessed\"\n", "\n", "nnunet_config = {\n", " \"dataset_name_or_id\": \"009\",\n", " \"nnunet_trainer\": \"nnUNetTrainer_10epochs\",\n", "}\n", "\n", "bundle_root = \"MONetBundle\"\n", "\n", "# The third argument is the fold index of the trained nnUNet model to convert\n", "convert_nnunet_to_monai_bundle(nnunet_config, bundle_root, 0)" ] }, { "cell_type": "markdown", "id": "053b657f", "metadata": {}, "source": [ "## Integration with NVFlare\n", "\n", "\n", "At the beginning of the NVFlare ScatterAndGather workflow, the server creates and distributes the global model that the clients use for local training. When using nnUNet in federated learning (FL), the global model needs to match the chosen nnUNet model architecture. For this reason, we adapt the nnUNet MONAI Bundle on the server side so that it can create the global model from the nnUNet plans and dataset files produced during the nnUNet planning and preprocessing phase (`nnUNetv2_plan_and_preprocess`), and then distribute it to the clients."
] }, { "cell_type": "markdown", "id": "e3a777f6", "metadata": {}, "source": [ "In `train.yaml`:\n", "\n", "```yaml\n", "network: $@nnunet_trainer.network._orig_mod\n", "network_def_fl:\n", " _target_: $monai.apps.nnunet.nnunet_bundle.get_network_from_nnunet_plans\n", " plans_file: \"$@bundle_root+'/models/plans.json'\"\n", " dataset_file: \"$@bundle_root+'/models/dataset.json'\"\n", " configuration: '@nnunet_configuration'\n", "```" ] }, { "cell_type": "markdown", "id": "f53ce240", "metadata": {}, "source": [ "## Integration with MONAI Deploy\n", "\n", "\n", "When using the nnUNet MONAI Bundle with MONAI Deploy, we need to specify where the checkpoint weights should be loaded within the nnUNet network definition." ] }, { "cell_type": "markdown", "id": "cd8ee64e", "metadata": {}, "source": [ "In `inference.yaml`:\n", "\n", "```yaml\n", "network_def_predictor: \"$@network_def.network_weights\"\n", "```" ] }, { "cell_type": "markdown", "id": "c943860f", "metadata": {}, "source": [ "## Integration with MONAI Label\n", "\n", "```yaml\n", "displayable_configs:\n", " dataset_name_or_id: '@dataset_name_or_id'\n", " fold_id: '@fold_id'\n", " mlflow_run_name: '@mlflow_run_name'\n", " nnunet_configuration: '@nnunet_configuration'\n", " nnunet_plans_identifier: '@nnunet_plans_identifier'\n", " nnunet_trainer_class_name: '@nnunet_trainer_class_name'\n", " num_classes: '@num_classes'\n", " region_class_order: ''\n", " tracking_experiment_name: '@mlflow_experiment_name'\n", " tracking_uri: '@tracking_uri'\n", " modality_list: 'CT'\n", " Label_0: '@label_dict.0'\n", " iterations: '@iterations'\n", "```" ] }, { "cell_type": "markdown", "id": "b467a02e", "metadata": {}, "source": [ "## Prepare the Bundle for Packaging\n", "\n", "To prepare the Bundle for packaging, we need to create a `metadata.json` file that describes the Bundle and its contents. 
The `metadata.json` file should follow the official MONAI Bundle format and include the following fields:\n", "```json\n", "\n", "{\n", " \"schema\": \"https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json\",\n", " \"version\": \"0.1.0\",\n", " \"changelog\": {\n", " \"0.1.0\": \"Initial release\"\n", " },\n", " \"monai_version\": \"1.4.0\",\n", " \"pytorch_version\": \"2.3.0\",\n", " \"numpy_version\": \"1.21.2\",\n", " \"required_packages_version\": {\"nnunetv2\": \"2.6.0\"},\n", " \"task\": \"Decathlon spleen segmentation with nnUNet\",\n", " \"description\": \"A pre-trained nnUNet model for volumetric (3D) segmentation of the spleen from CT image\",\n", " \"authors\": \"Simone Bendazzoli\",\n", " \"copyright\": \"Copyright (c) MONAI Consortium\",\n", " \"data_source\": \"Task09_Spleen.tar from http://medicaldecathlon.com/\",\n", " \"data_type\": \"nifti\",\n", " \"image_classes\": \"single channel data, intensity scaled to [0, 1]\",\n", " \"label_classes\": \"single channel data, 1 is spleen, 0 is everything else\",\n", " \"pred_classes\": \"2 channels OneHot data, channel 1 is spleen, channel 0 is background\",\n", " \"eval_metrics\": {\n", " \"mean_dice\": 0.97\n", " },\n", " \"intended_use\": \"This is an example, not to be used for diagnostic purposes\",\n", " \"references\": [\n", " \"Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. 
Nature methods, 18(2), 203-211.\"\n", " ],\n", " \"network_data_format\":{\n", " \"inputs\": {\n", " \"image\": {\n", " \"type\": \"image\",\n", " \"format\": \"hounsfield\",\n", " \"modality\": \"CT\",\n", " \"num_channels\": 1,\n", " \"spatial_shape\": [\"*\", \"*\", \"*\"],\n", " \"dtype\": \"float32\",\n", " \"value_range\": [-1024, 1024],\n", " \"is_patch_data\": false,\n", " \"channel_def\": {\"0\": \"image\"}\n", " }\n", " },\n", " \"outputs\":{\n", " \"pred\": {\n", " \"type\": \"image\",\n", " \"format\": \"segmentation\",\n", " \"num_channels\": 1,\n", " \"spatial_shape\": [\"*\", \"*\", \"*\"],\n", " \"dtype\": \"float32\",\n", " \"value_range\": [0,1],\n", " \"is_patch_data\": false,\n", " \"channel_def\": {\"0\": \"background\", \"1\": \"spleen\"}\n", " }\n", " }\n", " }\n", "}\n", "```\n", "For more details on the MONAI Bundle format, please refer to the [MONAI Bundle documentation](https://docs.monai.io/en/stable/mb_specification.html). " ] }, { "cell_type": "markdown", "id": "2dacf2e0", "metadata": {}, "source": [ "## Generate TorchScript\n", "\n", "To convert the MONAI Bundle checkpoints to the TorchScript format, you can use the [convert_ckpt_to_ts.py](./convert_ckpt_to_ts.py) script. This script takes the MONAI Bundle checkpoint and converts it to the TorchScript format, which can be used for inference in production environments. The script accepts the following parameters:\n", "```bash\n", "python convert_ckpt_to_ts.py --bundle_root --checkpoint_name --nnunet_trainer_name --fold \n", "```" ] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 5 }