Communication-Avoiding Algorithms and Fast Matrix Multiplication

955
14.5
Следующее
22.06.16 – 97050:40
Non-Convex Robust PCA
Популярные
176 дней – 21 3845:48
AutoGen Update: Complex Tasks and Agents
Опубликовано 22 июня 2016, 0:01
As our computing capabilities grow, the size and complexity of numerical simulations and data analysis that today's computational scientists conduct continue to increase. The gap between the peak capabilities of the hardware and the achieved performance of many of these computations is caused in large part by the high cost of communication, or movement of data, between processors and throughout the memory hierarchy of a single processor. As expected, much of this communication cannot be avoided when using parallelism to solve our problems; however, many standard algorithms in linear algebra move more data than necessary. We have proved lower bounds on the communication required by a wide class of algorithms and can determine whether standard approaches attain these bounds. In several cases where gaps exist between algorithms and lower bounds, we have developed new algorithms that communicate asymptotically less than previous ones and outperform them on a variety of platforms. These include algorithms for solving linear systems, least squares problems, eigenvalue problems, and parallelization of Strassen's matrix multiplication algorithm. In particular, not only does Strassen's algorithm require less computation that the classical matrix multiplication algorithm, it also requires less communication. In fact, fast matrix multiplication algorithms with smaller exponent than Strassen's in their computational complexity require even less communication. I'll talk about recent development in implementations of fast algorithms that exploit this fact and where fast matrix multiplication, often thought of as a theoretical pursuit, can be practically useful.
автотехномузыкадетское