Brock University Researcher Pioneers Open-Source Approach to Enhance Scientific Computing
A Brock University physicist has created a free open-source package, or collection of reusable programming components, that enables researchers to improve the performance of their scientific software.
Assistant Professor of Physics Barak Shoshany says computers haven’t been getting significantly faster in the last few decades. Instead, what makes it seem like they’ve sped up is that they can do more things simultaneously.
“Early computers could only execute tasks sequentially, one step at a time,” he says. “They have since evolved to contain multiple processing units, or cores, which can execute tasks in parallel — one task per core.”
He says the number of cores can range from a few in budget home computers to hundreds of thousands in supercomputers, with more cores meaning faster performance.
But programmers must write specialized code to take advantage of these multiple cores. Shoshany’s recently published package allows programmers to do this quickly and easily using the programming language C++, which is commonly used in high-performance computing.
For a program to run on multiple cores in parallel, he says, it must spawn separate threads of execution. A thread can be thought of as a sub-program that runs independently of the main program.
“This functionality already exists in the C++ language but requires creating a new thread for each individual task — an inefficient and time-consuming process,” says Shoshany.
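As a rough illustration of the approach Shoshany describes, the sketch below uses only the C++ standard library to create one new thread for every task. The function name and the squaring workload are hypothetical, chosen for illustration, and do not come from his package.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical workload: square every input, spawning one new thread per element.
std::vector<long long> square_each_in_own_thread(const std::vector<long long>& inputs)
{
    std::vector<long long> results(inputs.size());
    std::vector<std::thread> threads;
    threads.reserve(inputs.size());

    for (std::size_t i = 0; i < inputs.size(); ++i)
    {
        // Every task pays the cost of creating a brand-new thread.
        // Each thread writes to its own element, so no locking is needed.
        threads.emplace_back([&inputs, &results, i] { results[i] = inputs[i] * inputs[i]; });
    }

    // Wait for all threads to finish before returning the results.
    for (std::thread& t : threads)
        t.join();

    return results;
}
```

With a handful of tasks this works well enough, but with thousands of small tasks the time spent creating and destroying threads can outweigh the work itself.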
His package uses a thread pool to mitigate this issue. Instead of creating a new thread every time the program needs to execute a task, the program creates a pool of threads just once, typically one thread per core.
The threads can then be “recycled” to execute multiple tasks, considerably increasing performance, he says.
Using Shoshany’s package, programmers can split calculations, simulations, data analysis and other scientific algorithms into individual tasks and submit these tasks to a queue. The thread pool package then handles the execution of the tasks in the individual threads in the most efficient way possible.
“When a thread finishes a task, the pool will automatically give the next task in the queue to that thread, so there is no wasted idle time,” he says.
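The general idea of a thread pool with a task queue can be sketched in a few dozen lines of standard C++. The class below, called MiniPool here, is a simplified, hypothetical example written for this article; it is not the interface or the implementation of Shoshany’s package, and it handles only tasks that return an int.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Illustrative only: a hypothetical MiniPool, not the API of any real package.
class MiniPool
{
public:
    // Create the worker threads once; they are reused for every task.
    explicit MiniPool(std::size_t thread_count)
    {
        if (thread_count == 0)
            thread_count = 1; // fall back to a single worker
        for (std::size_t i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    // Let the workers drain the queue, then join them.
    ~MiniPool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        task_available_.notify_all();
        for (std::thread& worker : workers_)
            worker.join();
    }

    MiniPool(const MiniPool&) = delete;
    MiniPool& operator=(const MiniPool&) = delete;

    // Add a task to the queue; the returned future yields its result.
    std::future<int> submit(std::function<int()> task)
    {
        auto packaged = std::make_shared<std::packaged_task<int()>>(std::move(task));
        std::future<int> result = packaged->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([packaged] { (*packaged)(); });
        }
        task_available_.notify_one();
        return result;
    }

private:
    // Each worker repeatedly takes the next task from the queue and runs it.
    void worker_loop()
    {
        for (;;)
        {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                task_available_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return; // stopping, and nothing is left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task(); // run outside the lock so other workers can keep going
        }
    }

    std::mutex mutex_;
    std::condition_variable task_available_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};

// Hypothetical usage: add up 0*0 + 1*1 + ... + (n-1)*(n-1), one task per term.
int sum_of_squares(int n)
{
    // hardware_concurrency() may return 0 if the core count is unknown.
    MiniPool pool(std::thread::hardware_concurrency());
    std::vector<std::future<int>> futures;
    for (int i = 0; i < n; ++i)
        futures.push_back(pool.submit([i] { return i * i; }));

    int total = 0;
    for (std::future<int>& f : futures)
        total += f.get(); // blocks until that task has finished
    return total;
}
```

In this sketch the threads are created once in the constructor, typically one per core, and each worker goes back to the queue for more work as soon as it finishes a task. A production-quality pool would also need to handle arbitrary return types, exceptions and other details that are left out here.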
While the scientific community has used C++ thread pool packages before, Shoshany’s package uses modern programming practices that make it easier to use and faster to run. It also comes with user-friendly instructions, including many examples.
Details of Shoshany’s thread pool software are explained in his paper “A C++17 thread pool for high-performance scientific computing,” published in the May edition of the computer science journal SoftwareX.
His thread pool package is freely available on GitHub, a hosting service for open-source software, where Shoshany also provides ongoing user support. It has already been starred, or favourited, by 2,000 GitHub users.
Shoshany says he makes all of his scientific papers, data and software freely available online in the interest of keeping science open.
“I enjoy seeing how scientists are making use of my C++ package to speed up their scientific software, and it feels good to know that I am doing my part to advance all fields of science — not just physics,” he says.