Efficient traversal of decision tree ensembles with FPGAs
Abstract
System-on-Chip (SoC) based Field Programmable Gate Arrays (FPGAs) provide a hardware acceleration technology that can be rapidly deployed and tuned, offering a flexible solution that adapts to specific design requirements and to changing demands. In this paper, we present three SoC architecture designs for speeding up inference tasks based on machine-learned ensembles of decision trees. We focus on QUICKSCORER, the state-of-the-art algorithm for the efficient traversal of tree ensembles, and present the issues and advantages related to its deployment on two SoC devices with different capacities. The results of experiments conducted on publicly available datasets show that the proposed solution is very efficient and scalable. More importantly, it provides almost constant inference times, independently of the number of trees in the model and the number of instances to score. This allows the deployed SoC solution to be fine-tuned on the basis of the accuracy and latency constraints of the application scenario considered. (C) 2021 Elsevier Inc. All rights reserved.
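For readers unfamiliar with QUICKSCORER, the following is a minimal software sketch of its bitvector-based traversal idea, not the SoC/FPGA designs presented in the paper. It assumes each internal node tests `x[feature] <= threshold` (true goes left) and that each node carries a precomputed mask with zeros on the leaves of its left subtree. The function names, the tuple layout, and the toy two-tree ensemble are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_feature_index(nodes):
    """Group internal nodes by feature, sorted by ascending threshold.

    Each node is a tuple (feature, threshold, tree, mask). In `mask`,
    bit i refers to leaf i of that tree (leaves numbered left to right)
    and is 0 for every leaf in the node's left subtree.
    """
    by_feature = defaultdict(list)
    for feature, threshold, tree, mask in nodes:
        by_feature[feature].append((threshold, tree, mask))
    for entries in by_feature.values():
        entries.sort(key=lambda entry: entry[0])
    return by_feature


def quickscorer_score(x, by_feature, leaf_values):
    """Score one instance by masking out unreachable leaves in every tree."""
    # One bitvector per tree, initially with every leaf marked as a candidate.
    bitvectors = [(1 << len(values)) - 1 for values in leaf_values]
    for feature, entries in by_feature.items():
        for threshold, tree, mask in entries:
            if x[feature] <= threshold:
                # Thresholds are sorted, so all remaining tests are true too.
                break
            # The test is false: the leaves of the left subtree are unreachable.
            bitvectors[tree] &= mask
    score = 0.0
    for tree, bits in enumerate(bitvectors):
        # The exit leaf is the leftmost surviving leaf (lowest set bit here).
        exit_leaf = (bits & -bits).bit_length() - 1
        score += leaf_values[tree][exit_leaf]
    return score


# Toy ensemble (hypothetical):
# tree 0: root x[0] <= 0.5 ? leaf0 : (x[1] <= 2.0 ? leaf1 : leaf2)
# tree 1: root x[1] <= 1.0 ? leaf0 : leaf1
NODES = [
    (0, 0.5, 0, 0b110),
    (1, 2.0, 0, 0b101),
    (1, 1.0, 1, 0b10),
]
LEAF_VALUES = [[1.0, 2.0, 3.0], [10.0, 20.0]]
INDEX = build_feature_index(NODES)
```

Calling `quickscorer_score([0.7, 1.5], INDEX, LEAF_VALUES)` visits the nodes feature by feature rather than tree by tree, which is the access pattern that makes the algorithm attractive for hardware acceleration.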
More information
Title according to WOS: | ID WOS:000656871100004 Not found in local WOS DB |
Journal title: | JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING |
Volume: | 155 |
Publisher: | ACADEMIC PRESS INC ELSEVIER SCIENCE |
Publication date: | 2021 |
Start page: | 38 |
End page: | 49 |
DOI: | 10.1016/j.jpdc.2021.04.008 |
Notes: | ISI |