Ehsan Aghapour will defend his thesis 'Efficient deep learning inference on end devices'. Supervisor: Prof. Dr. A.D. Pimentel. Co-supervisors: Dr. ing. A. Pathania, Dr. D. Sapra.
Event details of PhD Defence Ehsan Aghapour
Date
7 May 2025
Time
16:00 - 17:30
Location
Agnietenkapel

Abstract

Deep Learning (DL) has become a cornerstone of modern Artificial Intelligence (AI), powering applications across healthcare, computer vision, and autonomous systems. However, executing DL inference on resource-constrained end devices—such as smartphones and IoT hardware—poses challenges due to limited computational resources, energy constraints, and real-time requirements.
This thesis addresses the optimization of DL inference on Heterogeneous Multi-Processing System-on-Chips (HMPSoCs), which integrate CPUs, GPUs, and Neural Processing Units (NPUs). It explores strategies that use these processors collaboratively to improve inference latency, throughput, and power efficiency.
A layer-wise switching strategy is proposed that assigns each layer of a DL model to the processor that minimizes its inference latency, improving responsiveness for time-sensitive applications such as AR/VR. To enhance power efficiency, Dynamic Voltage and Frequency Scaling (DVFS) is combined with layer switching, keeping performance within the device's battery constraints. Additionally, selective quantization is introduced to leverage NPUs without sacrificing model accuracy, assigning quantized and full-precision layers to the most suitable processors. For throughput, a pipelined execution approach partitions models across CPU clusters, GPUs, and NPUs so that frames are processed concurrently, meeting high-FPS demands.
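
The layer-wise switching idea can be illustrated with a small, hypothetical sketch (not the thesis code): given per-layer latencies profiled on each processor, a greedy mapper assigns every layer to the processor that is fastest for it, charging an assumed fixed overhead whenever consecutive layers move between processors. The processor names, latency values, and overhead below are illustrative placeholders.

```python
# Illustrative sketch of layer-wise processor switching (placeholder data, not the thesis code).
PROCESSORS = ["CPU", "GPU", "NPU"]   # processors on a hypothetical HMPSoC
SWITCH_OVERHEAD_MS = 0.2             # assumed cost of moving data between processors

# Hypothetical per-layer latencies in milliseconds, as obtained from profiling.
profiled_latency_ms = [
    {"CPU": 1.8, "GPU": 0.9, "NPU": 0.4},  # layer 0
    {"CPU": 2.5, "GPU": 1.1, "NPU": 0.6},  # layer 1
    {"CPU": 0.7, "GPU": 1.0, "NPU": 0.9},  # layer 2
]

def assign_layers(latencies, switch_overhead):
    """Greedy layer-wise mapping: pick the fastest processor per layer,
    adding a switching penalty when consecutive layers change processor."""
    mapping, total, prev = [], 0.0, None
    for layer in latencies:
        best = min(
            PROCESSORS,
            key=lambda p: layer[p] + (switch_overhead if prev is not None and p != prev else 0.0),
        )
        total += layer[best] + (switch_overhead if prev is not None and best != prev else 0.0)
        mapping.append(best)
        prev = best
    return mapping, total

mapping, latency = assign_layers(profiled_latency_ms, SWITCH_OVERHEAD_MS)
print(mapping, f"total latency: {latency:.2f} ms")
```

A real mapper would also need to weigh data-transfer costs and, when quantization is involved, per-processor accuracy; this example keeps only the latency term to show the per-layer assignment decision.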
As a key outcome, the ARM-CO-UP framework is developed to support profiling, processor switching, pipelining, and DVFS, enabling flexible and cooperative execution across processors.
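
To give a rough feel for how pipelined execution raises throughput, the following sketch (illustrative only, not the ARM-CO-UP API) splits a model into three partitions, runs each partition in its own worker thread standing in for a CPU cluster, GPU, or NPU, and streams frames through the stages concurrently; the stage latencies are made-up placeholders.

```python
# Illustrative pipelined-inference sketch: partitions run concurrently in worker threads.
import queue
import threading
import time

def make_stage(name, stage_latency_s, inbox, outbox):
    """Worker that pulls a frame, 'runs' its partition, and forwards the result."""
    def run():
        while True:
            frame = inbox.get()
            if frame is None:            # sentinel: shut down and propagate downstream
                outbox.put(None)
                return
            time.sleep(stage_latency_s)  # placeholder for the partition's compute time
            outbox.put(f"{frame}->{name}")
    return threading.Thread(target=run, daemon=True)

# Hypothetical three-way partition mapped to three processor types.
q0, q1, q2, q3 = (queue.Queue() for _ in range(4))
stages = [
    make_stage("CPU", 0.010, q0, q1),
    make_stage("GPU", 0.012, q1, q2),
    make_stage("NPU", 0.008, q2, q3),
]
for s in stages:
    s.start()

start = time.time()
for i in range(20):                      # feed 20 frames into the pipeline
    q0.put(f"frame{i}")
q0.put(None)

done = 0
while q3.get() is not None:
    done += 1
elapsed = time.time() - start
print(f"{done} frames in {elapsed:.2f} s -> {done / elapsed:.1f} FPS")
```

Steady-state throughput is bounded by the slowest partition, which is why balancing partitions across the available processors matters for meeting FPS targets.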
This work contributes toward enabling efficient DL deployment on everyday devices, balancing performance, energy, and accuracy. The proposed methods and framework provide a practical foundation for continued research in efficient AI computing at the edge.

Agnietenkapel

Oudezijds Voorburgwal 229 - 231
1012 EZ Amsterdam