The official implementation of NarVid — a framework that enhances text-video retrieval by leveraging frame-level captions (narration) to improve semantic understanding and retrieval accuracy. NarVid ...
A state-of-the-art face liveness detection system using multi-stage ensemble architecture with quality-aware processing. The pipeline combines Global (MiniFASNetV2) and Local (DeepPixBiS) branches to ...