1Cademy - Example of a Synchronous vs. Asynchronous Increment Benchmark in MXNet


A practical demonstration of the performance benefit of asynchronous scheduling in MXNet increments a variable by 1 a total of 10,000 times, once in synchronous mode and once in asynchronous mode, using the d2l.Benchmark context manager to measure elapsed time. The synchronous version calls wait_to_read() after every addition, so the frontend blocks until each individual y = x + 1 operation completes before issuing the next; in the reported run this took approximately 3.16 seconds. The asynchronous version enqueues all 10,000 additions with no per-iteration barrier and calls a single global npx.waitall() after the loop; in the reported run this completed in roughly 0.93 seconds, about 3.4 times faster. The speedup arises because asynchronous execution lets the frontend keep feeding tasks into the backend queue while the backend works through them concurrently, which eliminates the per-iteration round-trip cost of synchronization.
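The difference between a barrier after every operation and a single barrier after the loop can be illustrated with a toy model built only on the Python standard library. This is a sketch under stated assumptions, not MXNet's engine: ToyBackend, enqueue_increment, wait_all, run_synchronous, and run_asynchronous are hypothetical names introduced here, a worker thread stands in for the backend, and wait_all plays the role that wait_to_read() and npx.waitall() play in the MXNet benchmark. Timings from this model are not comparable to the MXNet figures.

```python
import queue
import threading
import time


class ToyBackend:
    """Hypothetical stand-in for an asynchronous backend (not MXNet's engine):
    a single worker thread that drains a queue of tasks."""

    def __init__(self):
        self.tasks = queue.Queue()
        self.value = 0
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        # Backend loop: execute queued tasks one at a time, in order.
        while True:
            task = self.tasks.get()
            task()
            self.tasks.task_done()

    def _increment(self):
        self.value += 1

    def enqueue_increment(self):
        # Frontend call: returns immediately, the worker applies it later.
        self.tasks.put(self._increment)

    def wait_all(self):
        # Barrier: block until every queued task has finished.
        self.tasks.join()


def run_synchronous(n):
    """Enqueue n increments with a barrier after each one."""
    backend = ToyBackend()
    start = time.perf_counter()
    for _ in range(n):
        backend.enqueue_increment()
        backend.wait_all()  # per-iteration barrier
    return backend.value, time.perf_counter() - start


def run_asynchronous(n):
    """Enqueue n increments with a single barrier after the loop."""
    backend = ToyBackend()
    start = time.perf_counter()
    for _ in range(n):
        backend.enqueue_increment()
    backend.wait_all()  # one global barrier
    return backend.value, time.perf_counter() - start


if __name__ == "__main__":
    n = 10_000
    sync_value, sync_time = run_synchronous(n)
    async_value, async_time = run_asynchronous(n)
    print(f"synchronous:  value={sync_value}, {sync_time:.4f} sec")
    print(f"asynchronous: value={async_value}, {async_time:.4f} sec")
```

Both functions end with the same value; only the number of barriers differs (n in the synchronous version, one in the asynchronous version), which is the structural difference the benchmark measures.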


Updated 2026-07-03


References

Dive into Deep Learning
