
Cross domain - Medical Image Segmentation



Hyo__ni 2025. 2. 3. 22:21

Cross-modality medical image segmentation

This approach analyzes several kinds of medical images together (e.g., MRI, CT, X-ray, ...), so overcoming the differences between modalities and extracting shared features is crucial. A single modality is rarely accurate enough on its own, so combining multiple imaging modalities enables a more accurate diagnosis.

 

Research Paper

Mutual Information-based Disentangled Neural Networks for Classifying Unseen Categories in Different Domains: Application to Fetal Ultrasound Imaging

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7116845/


Abstract

Learning generalizable features that can form universal categorical decision boundaries across domains is a difficult challenge. This problem occurs frequently in medical imaging when deep learning models are deployed and improved across different image acquisition devices, or when some classes are unavailable in new training databases.

 

To solve this problem, we propose Mutual Information-based Disentangled Neural Networks (MIDNet),

which extract generalizable categorical features to transfer knowledge to unseen categories in a target domain.

 

MIDNet adopts a semi-supervised learning paradigm to alleviate the dependency on labeled data.

 

Introduction

  - We introduce supervision from labeled images for an enhanced disentanglement via a feature clustering module, which estimates the similarity of categorical features from both domains. (That is, it learns from labeled data to measure the similarity of per-class features between the two domains!)

  - We structure a categorical feature space by considering inter-class relationships. (That is, a new feature space is built that takes the relationships between classes into account!)

  - The method is non-adversarial → which mitigates the difficulty and instability of adversarial model training. (That is, it avoids the instability of GANs, and since it is semi-supervised learning, it can be trained with only a small amount of labeled data!)

  - We use sparsely labeled data during training. (Training works when only part of the data is labeled, so a small amount of labeled data suffices, since this is semi-supervised learning!)

 

 

Domain Adaptation

 - One-to-one domain adaptation : single source domain, single target domain

 - Unsupervised domain adaptation : the typical one-to-one domain adaptation, which requires plenty of labeled samples from the source domain during training; the categories in both domains are the same.

 - Multi-source domain adaptation : learns universal knowledge from multiple source domains for a single target domain.

 

Domain generalization is a special case of multi-source domain adaptation,

which learns knowledge for an unseen target domain from many labeled samples of multiple source domains.

 

 - Multi-target domain adaptation : learns knowledge from a single source domain to multiple target domains.

 

Domain agnostic learning is a multi-target domain adaptation method, which also requires plenty of labeled samples from the source domain. 

In domain agnostic learning, all categories in the target domains have been seen in the source domain. (That is, every category in the target must have been seen in the source.)

(The paper includes a table summarizing the settings above.)

In this paper, we consider a one-to-one domain adaptation setting where the categories in the target domain are a subset of the categories in the source domain.

Our goal is to learn categorical-discriminative knowledge from the available categories in both domains to separate unseen categories in the target domain.

  ⇒ Knowledge learned on the source domain is used to distinguish new classes in the target domain!

 

 

Transfer Learning

Transfer learning is a broader field that tackles domain shift and transfers knowledge between different datasets. 

  → Domain adaptation is a subcategory of transfer learning.

 

Problem settings with different tasks between source and target domains are close to inductive transfer learning (akin to supervised learning..?).

Unsupervised transfer learning is a special case of inductive transfer learning, where only unlabeled data is available in both domains during training. (That is, only unlabeled data is used in both the source and the target domain!)

 

We focus on transferring knowledge from a source domain to a target domain for a single task and tackle covariate shift.

Method

The model can classify categories in the target domain that have not been seen during training.

To solve this task, we propose MIDNet in combination with semi-supervised learning. The architecture of the model is shown in the figure below.

 - Two independent encoders 'E' are utilized to respectively extract categorical features 'Fc' and domain features 'Fd' from labeled data {Xl, Yl} and unlabeled data 'Xu'.

 

 - The classifier 'C' is responsible for predicting class distributions from 'Fc' for both 'Xl' and 'Xu', while the decoder 'D' combines 'Fc' and 'Fd' for the reconstruction of input images.

 

 - The mixer 'M' aims to linearly mix labeled and unlabeled samples → so that the model is trained to exhibit linear behavior between samples, further leveraging the unlabeled data.

  ⇒ That is, a mixed sample lies on a simple line between existing samples (the model learns linear behavior), which makes it possible to use the unlabeled data more naturally.
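The mixing step can be sketched as a mixup-style interpolation. This is a minimal sketch under common mixup conventions (a Beta-distributed mixing coefficient), not necessarily the paper's exact mixer 'M':

```python
import numpy as np

def mix_samples(x_labeled, x_unlabeled, alpha=0.75, rng=None):
    """Linearly mix a labeled and an unlabeled sample (mixup-style sketch).

    lam is drawn from Beta(alpha, alpha); the mixed sample is
    lam * x_labeled + (1 - lam) * x_unlabeled.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1 - lam)  # common mixup trick: keep the mix closer to the labeled sample
    return lam * x_labeled + (1 - lam) * x_unlabeled, lam
```

Because the mixed sample sits on the line between its two parents, the model is encouraged to produce predictions that also interpolate linearly between them.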

 

 - For representation disentanglement (learning to separate the different factors of the data into independent representations), the mutual information between 'Fc' and 'Fd' is minimized to encourage 'Fc' to become domain-invariant and maximally informative for categorical classification.

   Why minimize mutual information? To keep 'Fc' (category-related features) from depending on 'Fd' (domain-related features), the mutual information between the two is minimized. This lets the model classify classes well without being affected by the domain.

 

 - Feature clustering contains feature alignment and distance metric learning.

 - Feature alignment aims at keeping the feature consistency between labeled images to promote the independence of 'Fc'.

   That is, samples of the same class are made to have similar features regardless of domain, so that 'Fc' ends up containing purely categorical information, independent of domain information.

 

 - Distance metric learning considers inter-class relationships: it clusters similar samples while separating dissimilar samples, optimizing 'Fc' to improve classification performance.
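The two feature-clustering terms can be illustrated with a small sketch. The function below is hypothetical (the paper's exact loss formulations may differ): the alignment term compares per-class mean features across the two domains, and the metric term is a contrastive-style pairwise loss. It assumes every class appears in both domains:

```python
import numpy as np

def feature_clustering_losses(feats, labels, domains, margin=1.0):
    """Illustrative sketch of the two feature-clustering terms.

    - alignment: squared distance between per-class mean features of the
      two domains (same class should look alike across domains).
    - metric: pull same-class pairs together, push different-class pairs
      at least `margin` apart (contrastive-style).
    """
    feats, labels, domains = map(np.asarray, (feats, labels, domains))

    # Alignment across the two domains (assumes each class occurs in both).
    align = 0.0
    for c in np.unique(labels):
        m0 = feats[(labels == c) & (domains == 0)].mean(axis=0)
        m1 = feats[(labels == c) & (domains == 1)].mean(axis=0)
        align += np.sum((m0 - m1) ** 2)

    # Contrastive distance metric over all sample pairs.
    metric = 0.0
    n = len(feats)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(feats[i] - feats[j])
            if labels[i] == labels[j]:
                metric += d ** 2                      # pull together
            else:
                metric += max(0.0, margin - d) ** 2   # push apart
    return align, metric
```

Driving both terms down clusters 'Fc' by class rather than by domain, which is exactly what the feature clustering module is after.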

 

 

Mutual Information Disentanglement

We minimize the mutual information between 'Fc' and 'Fd'. 

This minimization forces 'Fc' to contain less domain information and thus separates categorical features from domain features.

Mutual information is defined as,
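The equation image did not survive here; in its standard form, the mutual information between 'Fc' and 'Fd' is:

```latex
I(F_c; F_d)
  = \mathbb{E}_{p(f_c, f_d)}\!\left[\log \frac{p(f_c, f_d)}{p(f_c)\,p(f_d)}\right]
  = D_{\mathrm{KL}}\big(p(f_c, f_d) \,\|\, p(f_c)\,p(f_d)\big)
```

It is the KL divergence between the joint distribution and the product of the marginals, and equals zero exactly when 'Fc' and 'Fd' are independent, which is what driving it down encourages.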

 
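As a numerical sanity check, mutual information can be computed directly from a discrete joint distribution. The helper below is illustrative only (MIDNet works with continuous features, where MI must be estimated rather than tabulated):

```python
import numpy as np

def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution.

    `joint[i, j]` is (proportional to) P(X = i, Y = j); it is normalized,
    the marginals are taken by summing rows/columns, and the MI sum runs
    over nonzero entries only.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of X (column vector)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y (row vector)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px * py)[mask])))
```

For a perfectly dependent pair (identity joint) this gives log 2, while an independent joint (outer product of marginals) gives 0, matching the definition above.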

Reference

 

GitHub - VisionLearningGroup/DAL: Domain agnostic learning with disentangled representations
