Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs

Gil-Costa, Veronica; Loor, Fernando; Molina, Romina; Maria Nardini, Franco; Perego, Raffaele; Trani, Salvatore; Hagen, M; Verberne, S; Macdonald, C; Seifert, C; Balog, K; Norvag, K; Setty, V

Abstract

We investigate novel SoC-FPGA solutions for fast and energy-efficient ranking based on machine-learned ensembles of decision trees. Since the memory footprint of ranking ensembles limits the effective exploitation of programmable logic for large-scale inference tasks, we investigate binning and quantization techniques to reduce the memory occupation of the learned model, and we optimize the state-of-the-art ensemble-traversal algorithm for deployment on low-cost, energy-efficient FPGA devices. Experiments conducted on publicly available Learning-to-Rank datasets show that our model compression techniques do not significantly impact accuracy. Moreover, the reduced space requirements allow the models and the logic to be replicated on the FPGA device in order to execute several inference tasks in parallel. We discuss in detail the experimental settings and the feasibility of deploying the proposed solution in a real setting. The experimental results show that our FPGA solution achieves state-of-the-art performance and consumes between 9x and 19.8x less energy than an equivalent multi-threaded CPU implementation.
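
To make the idea of binning and quantization concrete, here is a minimal Python sketch. The abstract does not specify the schemes actually used, so this is an illustrative assumption only: node thresholds are replaced by their rank among the sorted unique thresholds of each feature, and leaf scores are mapped to fixed-point integers. All function names and the `scale` parameter are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def build_bins(thresholds_per_feature):
    """For each feature, keep the sorted unique split thresholds.

    A threshold's position in this list serves as its small integer code.
    (Hypothetical helper; not the paper's method.)
    """
    return {f: sorted(set(ts)) for f, ts in thresholds_per_feature.items()}


def encode_threshold(bins, feature, threshold):
    """Integer code of a threshold: its rank among the feature's thresholds."""
    return bins[feature].index(threshold)


def encode_value(bins, feature, x):
    """Integer code of a feature value: number of thresholds strictly below x.

    With this encoding, `x <= t` holds exactly when
    `encode_value(x) <= encode_threshold(t)`, so node tests are preserved.
    """
    return bisect_left(bins[feature], x)


def quantize_leaf(value, scale, bits=16):
    """Fixed-point leaf score: round(value * scale), clipped to a signed range."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(lo, min(hi, round(value * scale)))


# Example: feature 0 is split on thresholds 0.5, 1.25 and 3.0 across the ensemble.
bins = build_bins({0: [0.5, 1.25, 0.5, 3.0]})
code_t = encode_threshold(bins, 0, 1.25)
checks = [(x <= 1.25) == (encode_value(bins, 0, x) <= code_t)
          for x in [0.1, 0.5, 1.0, 1.25, 2.0, 3.0, 4.0]]
```

In this sketch the threshold encoding is exact as long as each feature has few enough distinct thresholds to fit the chosen code width; a lossy variant would merge nearby thresholds into a fixed number of bins, trading some accuracy for a smaller footprint. The leaf quantization is lossy by construction, since scores are rounded and clipped.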

More information

Title according to WOS: ID WOS:000784672700018 Not found in local WOS DB
Journal title: BIO-INSPIRED SYSTEMS AND APPLICATIONS: FROM ROBOTICS TO AMBIENT INTELLIGENCE, PT II
Volume: 13185
Publisher: SPRINGER INTERNATIONAL PUBLISHING AG
Publication date: 2022
Start page: 260
End page: 273
DOI: 10.1007/978-3-030-99736-6_18

Notes: ISI