SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM

Shuang Chen, Haozheng Zhang, Amir Atapour-Abarghouei and Hubert P. H. Shum
Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025

Oral Paper (Top 8.2% of 2458 Submissions) · H5-Index: 131# · Core A Conference‡

‡ According to Core Ranking 2023
# According to Google Scholar 2025

Abstract

Image inpainting aims to repair a partially damaged image based on information from the known regions of the image. Achieving semantically plausible inpainting results is particularly challenging because it requires the reconstructed regions to exhibit patterns similar to those of the semantically consistent regions. This requires a model with a strong capacity to capture long-range dependencies. Existing models struggle in this regard: the receptive field of Convolutional Neural Network (CNN) based methods grows slowly, while Transformer-based methods rely on patch-level interactions, both of which are ineffective for capturing long-range dependencies. Motivated by this, we propose SEM-Net, a novel State Space Model (SSM) based vision network that models corrupted images at the pixel level while capturing long-range dependencies (LRDs) in state space, achieving linear computational complexity. To address the inherent lack of spatial awareness in SSMs, we introduce the Snake Mamba Block (SMB) and the Spatially-Enhanced Feedforward Network. These innovations enable SEM-Net to outperform state-of-the-art inpainting methods on two distinct datasets, showing significant improvements in capturing LRDs and in spatial consistency. Additionally, SEM-Net achieves state-of-the-art performance on motion deblurring, demonstrating its generalizability.
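To make the scanning idea concrete, below is a minimal, self-contained PyTorch sketch (not the authors' released code) of pixel-level state-space scanning with a snake (boustrophedon) ordering: even rows are traversed left-to-right and odd rows right-to-left, so consecutive tokens in the 1D scan remain spatially adjacent in 2D. The function names and the toy linear recurrence standing in for the selective SSM scan are illustrative assumptions, not the paper's exact implementation.

import torch

def snake_flatten(x: torch.Tensor) -> torch.Tensor:
    """Flatten a (B, C, H, W) feature map to (B, H*W, C) in snake order."""
    b, c, h, w = x.shape
    x = x.permute(0, 2, 3, 1).contiguous()   # (B, H, W, C), fresh copy
    x[:, 1::2] = x[:, 1::2].flip(dims=[2])   # reverse every odd row
    return x.reshape(b, h * w, c)

def snake_unflatten(seq: torch.Tensor, h: int, w: int) -> torch.Tensor:
    """Inverse of snake_flatten: (B, H*W, C) back to (B, C, H, W)."""
    b, n, c = seq.shape
    x = seq.reshape(b, h, w, c).clone()
    x[:, 1::2] = x[:, 1::2].flip(dims=[2])   # undo the odd-row reversal
    return x.permute(0, 3, 1, 2)

def toy_ssm_scan(seq: torch.Tensor, a: float = 0.9, b: float = 0.1) -> torch.Tensor:
    """Toy stand-in for a selective SSM scan: h_t = a * h_{t-1} + b * x_t."""
    h = torch.zeros_like(seq[:, 0])
    out = []
    for t in range(seq.shape[1]):
        h = a * h + b * seq[:, t]
        out.append(h)
    return torch.stack(out, dim=1)

if __name__ == "__main__":
    feat = torch.randn(1, 8, 4, 4)            # (B, C, H, W)
    seq = snake_flatten(feat)                 # (B, 16, 8)
    assert torch.allclose(snake_unflatten(seq, 4, 4), feat)
    out = snake_unflatten(toy_ssm_scan(seq), 4, 4)
    print(out.shape)                          # torch.Size([1, 8, 4, 4])

Compared with a plain raster scan, the snake ordering removes the large spatial jump at each row boundary, which is exactly the kind of discontinuity that undermines an SSM's spatial awareness when a 2D image is modelled as a 1D pixel sequence.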



Cite This Research

Plain Text

Shuang Chen, Haozheng Zhang, Amir Atapour-Abarghouei and Hubert P. H. Shum, "SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM," in WACV '25: Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 461-471, Arizona, USA, IEEE/CVF, 2025.

BibTeX

@inproceedings{chen25sem,
 author={Chen, Shuang and Zhang, Haozheng and Atapour-Abarghouei, Amir and Shum, Hubert P. H.},
 booktitle={Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision},
 series={WACV '25},
 title={SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM},
 year={2025},
 pages={461--471},
 doi={10.1109/WACV61041.2025.00055},
 publisher={IEEE/CVF},
 location={Arizona, USA},
}

RIS

TY  - CONF
AU  - Chen, Shuang
AU  - Zhang, Haozheng
AU  - Atapour-Abarghouei, Amir
AU  - Shum, Hubert P. H.
T2  - Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision
TI  - SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM
PY  - 2025
SP  - 461
EP  - 471
DO  - 10.1109/WACV61041.2025.00055
PB  - IEEE/CVF
ER  - 



Similar Research

Shuang Chen, Amir Atapour-Abarghouei, Haozheng Zhang and Hubert P. H. Shum, "MxT: Mamba x Transformer for Image Inpainting", Proceedings of the 2024 British Machine Vision Conference (BMVC), 2024
Shuang Chen, Amir Atapour-Abarghouei and Hubert P. H. Shum, "HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention", IEEE Transactions on Multimedia (TMM), 2024
