Multi-core processors with tens of cores are becoming pervasive in areas where power consumption, computational throughput, or both are key requirements. These processors usually execute workloads with a high degree of thread-level parallelism, which suits a multi-core architecture in which each thread is typically paired with a core. There are several ways to pair a thread with a particular core; the simplest either follow a round-robin (RRB) order or select the available core with the lowest index (LIF). These load-balancing techniques result in quite different power, performance, and thermal behavior of the processor, especially when low-power techniques such as power gating are applied to individual cores.
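The two simple pairing policies can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the function names `pick_lif`, `pick_rrb`, and `dispatch_counts`, the boolean `busy` list, and the toy workload (each thread finishes before the next one arrives) are all assumptions made for illustration.

```python
from typing import List, Optional


def pick_lif(busy: List[bool]) -> Optional[int]:
    """LIF: return the lowest-indexed idle core, or None if all are busy."""
    for core, is_busy in enumerate(busy):
        if not is_busy:
            return core
    return None


def pick_rrb(busy: List[bool], last: int) -> Optional[int]:
    """RRB: scan from the core after `last`, wrapping around once."""
    n = len(busy)
    for step in range(1, n + 1):
        core = (last + step) % n
        if not busy[core]:
            return core
    return None


def dispatch_counts(policy: str, n_cores: int, n_threads: int) -> List[int]:
    """Count threads per core for a toy workload (assumption: every
    thread completes before the next arrives, so all cores stay idle)."""
    busy = [False] * n_cores
    counts = [0] * n_cores
    last = -1  # no core has been used yet
    for _ in range(n_threads):
        if policy == "LIF":
            core = pick_lif(busy)
        else:
            core = pick_rrb(busy, last)
        counts[core] += 1
        last = core
    return counts
```

Under this toy workload, `dispatch_counts("LIF", 4, 8)` sends every thread to core 0, whereas `dispatch_counts("RRB", 4, 8)` spreads the threads evenly across the four cores, which hints at why the two policies can differ in thermal behavior.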
This work presents a load-balancing technique that incurs low performance and energy overhead with respect to LIF, the highest-performance policy, while achieving a smooth temperature distribution close to that of RRB, which is optimal in this respect. An uneven temperature distribution leads to thermal hot spots, which affect both the reliability of the processor (by stressing some parts of the die more than others) and its cost (since the package has to be designed to handle the worst hot spot).