A scalable and energy efficient GPU thread map for m-simplex domains

Navarro, Cristobal A.; Quezada, Felipe A.; Bustos, Benjamin; Hitschfeld, Nancy; Kindelan, Rolando

Abstract

This work proposes a new GPU thread map for m-simplex domains whose speedup improves as the dimension m grows and which is energy efficient compared to other state-of-the-art approaches. The main contributions of this work are (i) the formulation of an improved block-space map $\mathcal{H} : \mathbb{Z}^m \mapsto \mathbb{Z}^m$ for regular orthogonal simplex domains, which is analyzed in terms of resource usage, and (ii) the experimental evaluation in terms of speedup and energy efficiency with respect to a bounding box approach. Results from the analysis show that $\mathcal{H}$ has a potential speedup of up to 2x and 6x for 2- and 3-simplices, respectively. Experimental evaluation shows that $\mathcal{H}$ is competitive for 2-simplices, reaching 1.2x to 2.0x of speedup for different tests, which is on par with the fastest state-of-the-art approaches. For 3-simplices, $\mathcal{H}$ reaches 1.3x to 6.0x of speedup, making it the fastest. The extension of $\mathcal{H}$ to higher-dimensional m-simplices is feasible and has a potential speedup that scales as m!, given a proper selection of the parameters r and $\beta$, which are the scaling and replication factors of the geometry, respectively. In terms of energy consumption, although $\mathcal{H}$ is among the highest in power consumption, it compensates with its short running time, making it one of the most energy efficient approaches. The results of this work show that $\mathcal{H}$ is a scalable and energy efficient map that improves the efficiency of GPU applications that need to process m-simplex domains, such as Cellular Automata or PDE simulations, among others. (c) 2022 Elsevier B.V. All rights reserved.
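
The m! scaling mentioned in the abstract can be illustrated with a simple counting argument. The sketch below is not the proposed map, whose definition the abstract does not give. It only compares how many cells a bounding box launches against how many cells a discrete simplex actually contains. It assumes the discrete orthogonal m-simplex of side n is the set of integer points with 0 <= x_1 <= ... <= x_m < n, and that a bounding box approach spends one unit of work per cell of the enclosing n^m box. The function names are hypothetical and chosen for illustration only.

```python
from math import comb, factorial


def simplex_elements(n: int, m: int) -> int:
    """Cells in a discrete orthogonal m-simplex of side n.

    Assumption: the simplex is the set of integer points
    0 <= x_1 <= ... <= x_m < n, which has C(n + m - 1, m) elements
    (triangular numbers for m = 2, tetrahedral numbers for m = 3).
    """
    return comb(n + m - 1, m)


def bounding_box_elements(n: int, m: int) -> int:
    """Cells in the n^m bounding box that encloses the simplex."""
    return n ** m


def potential_speedup(n: int, m: int) -> float:
    """Ratio of bounding-box work to useful simplex work.

    This is an upper bound under the idealized assumption that all
    cells cost the same and the map itself has no overhead.
    """
    return bounding_box_elements(n, m) / simplex_elements(n, m)


if __name__ == "__main__":
    n = 1024
    for m in (2, 3, 4):
        ratio = potential_speedup(n, m)
        print(f"m={m}: ratio={ratio:.3f}, limit m!={factorial(m)}")
```

Under these assumptions the ratio stays below m! and approaches it as n grows, which is consistent with the potential 2x and 6x figures quoted for 2- and 3-simplices. Measured speedups are lower than this bound whenever computing the map adds cost.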

More information

WOS ID: WOS:000911453900001
Journal title: FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
Volume: 141
Publisher: Elsevier
Publication date: 2023
Start page: 651
End page: 662
DOI: 10.1016/j.future.2022.12.020

Notes: ISI