Syed Hesham Syed Ariff | Video Understanding & Efficient AI

Education & Experience

Jan 2024 — Present

Doctor of Philosophy (PhD)

Nanyang Technological University, Singapore

School of Electrical and Electronic Engineering
Topic: Effective and Label-Efficient Visual Perception with Large Foundation Models
Advisor: Prof. Xudong Jiang

GPA: 5.0/5.0 | A*STAR CIS Scholar

Jan 2024 — Present

A*STAR Research Scholar (ACIS)

A*STAR - Agency for Science, Technology and Research

Institute for Infocomm Research (I²R), Fusionopolis, Singapore
Computing & Intelligence Systems Programme

Conducting research on deep-learning-based video understanding
Developing image and video understanding methods built on foundation models

Jul 2020 — Dec 2023

Bachelor of Engineering (Honours)

Nanyang Technological University, Singapore

Electrical and Electronic Engineering
Final Year Project: Deep Features based Real-Time SLAM

GPA: >4.9/5.0 | First-Class Honours | Dean's List ×4

Nov 2020 — Jan 2024

Research Staff (Computer Vision)

Republic Polytechnic, School of Engineering

Concurrent with Bachelor's studies (full-time work + part-time degree)

Led an R&D team developing a cost-effective near-field pose-estimation system, securing >$200K in government funding
Published multiple conference and journal papers in computer vision and AI applications
Supervised and mentored 40+ students, achieving competition wins
Implemented real-time SLAM visualization pipeline using Python, C++, C#, and ROS

Mar 2018 — Aug 2018

Research Intern

Continental Automotive R&D, Singapore

Collaborated on 3+ research projects, including UV bus passenger monitoring, an autonomous pick-and-place robot, iOS indoor localization with Bluetooth beacons, and RTK GPS evaluation
Managed project showcases during company open house events

2016 — 2019

Diploma in Electrical & Electronic Engineering

Republic Polytechnic, Singapore

Final Year Project: Improving Recognition Performance for Low-Resolution Images Using DBPN

GPA: >3.9/4.0 | Director's Roll of Honor | 4× Module Prize | 15/19 Distinctions

Publications

🏆 Top-Tier Venues

TV3S: Exploiting Temporal State Space Sharing for Video Semantic Segmentation

Syed Ariff Syed Hesham, Yun Liu, Guolei Sun, Henghui Ding, Jing Yang, Ender Konukoglu, Xue Geng, Xudong Jiang

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025 (22.1% acceptance rate)

Efficient video semantic segmentation that shares temporal state spaces within a Mamba architecture to reduce computational cost.

PIX2PT Map for Transfer-Based Few-Shot Learning

Syed Ariff Syed Hesham, Sui JinZhou*, Gabrielle Ee Song Xin, Lek Chen Ping, Lijun Jiang

IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2021

Pixel-to-prototype mapping framework for improving transfer learning in few-shot classification tasks.

📚 Journal Publications

Evaluating SAM2 for Video Semantic Segmentation

Syed Hesham Syed Ariff, Yun Liu, Guolei Sun, Jing Yang, Henghui Ding, Xue Geng, Xudong Jiang

Machine Intelligence Research, 2025

Benchmark and analysis of Segment Anything Model 2 for video-level semantic segmentation tasks.

Leveraging on Few-Shot Learning for Tire Pattern Classification in Forensics

Lijun Jiang, Syed Ariff Syed Hesham*, Keng Pang Lim, Changyun Wen

Journal of Automation and Intelligence, 2023

Few-shot learning approach for tire pattern identification in forensic investigations with limited training samples.

📝 Preprints & Under Review

STAC: Selective Spatiotemporal Aggregation and Compression for Video Reasoning Segmentation

Syed Ariff Syed Hesham, Yun Liu, Guolei Sun, Jing Yang, Henghui Ding, Xue Geng, Xudong Jiang

CVPR 2026 Conference Submission, December 2025 (Under Review)

Efficient state-space compression framework that reduces tokens by 85% and delivers a 1.8× speedup in both training and inference while maintaining competitive accuracy.

A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects

Guohuan Xie, Syed Ariff Syed Hesham*, Wenya Guo, Bing Li, Ming-Ming Cheng, Guolei Sun, Yun Liu

arXiv:2506.13552, 2025 (Under Review)

Holistic review of video scene parsing, covering semantic, instance, and panoptic segmentation as well as open-vocabulary methods.

🎤 Conference Publications

Tracking and Monitoring of Underwater Object with SLAM

Lijun Jiang, Syed Ariff Syed Hesham*, Lan JunHang, Seah Kai Wen Kelvin, Yuhan Jiang, Yao Mengdi, Wei Dongliang, Bo Jiang

IEEE 19th Conference on Industrial Electronics and Applications (ICIEA), 2024

Visual SLAM system for real-time tracking and monitoring of underwater objects in challenging aquatic environments.

Adapted Lightweight MobileNet for Tire Pattern Classification

Syed Ariff Syed Hesham, Lijun Jiang, Keng Pang Lim, et al.

IEEE 18th Conference on Industrial Electronics and Applications (ICIEA), 2023

Efficient mobile-optimized deep learning model for automated tire pattern recognition and classification.

A Versatile Application for Visual SLAM with Object Detection

Lijun Jiang, Syed Ariff Syed Hesham*, Keng Pang Lim, Yusong Wang, Hongkai Lin, Yuhang Zhao

IEEE 17th Conference on Industrial Electronics and Applications (ICIEA), 2022

Integrated framework combining visual SLAM with real-time object detection for versatile robotic applications.

YOLO Based Thermal Screening Using AI for Instinctive Human Facial Detection

Lijun Jiang, Syed Ariff Syed Hesham, Keng Pang Lim, Krishnadas Manoj*, et al.

IEEE 17th Conference on Industrial Electronics and Applications (ICIEA), 2022

YOLO-based AI system for automated thermal screening and facial detection in public health monitoring.

Borescope Tracking and Visualization of Internal Aero-Structure with VSLAM

Lijun Jiang, Syed Ariff Syed Hesham*, Keng Pang Lim, Sui JinZhou, Bo Jiang

IEEE 17th Conference on Industrial Electronics and Applications (ICIEA), 2022

Visual SLAM-based borescope navigation system for 3D reconstruction and inspection of internal aircraft structures.

Improving Recognition Performance for Low-Resolution Images Using DBPN

Lijun Jiang, Keng Pang Lim, Syed Ariff Syed Hesham*

IEEE 16th Conference on Industrial Electronics and Applications (ICIEA), 2021

Deep Back-Projection Network for super-resolution enhancement of low-resolution images to improve recognition accuracy.

*denotes corresponding/equal contribution

Advancing Scalable Video Understanding with Specialized Spatiotemporal Architectures
