XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Ho Kei Cheng, Alexander Schwing. Published in ECCV, 2022.

Also presented at the Computer Vision for Metaverse Workshop 2022, the Workshop on Computer Vision in the Wild, the Workshop on AI for Creative Video Editing and Understanding, and the In-vehicle Sensing and Monitorization Workshop.

We develop a multi-store memory model to decouple accuracy from memory consumption -- achieving good results on both short and long videos.

Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang. Published in NeurIPS, 2021.

Rethinks mask tracking as an image correspondence problem and uses L2 similarity to encourage diversified voting -- simpler, better, and more efficient.

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang. Published in CVPR, 2021.

A more user-friendly and efficient paradigm of iVOS in which interactions and propagations are decoupled, with the user’s intention captured by a novel difference-aware fusion module.
Used by: [Sieve], [Trioscope]

CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement

Ho Kei Cheng*, Jihoon Chung*, Yu-Wing Tai, Chi-Keung Tang. Published in CVPR, 2020.

Refines segmentations at very high resolution (4K and beyond) in a class-agnostic manner, without using any high-resolution training data, through a set of carefully designed cascade operations.
