Example

Example of Synchronous vs. Asynchronous Increment Benchmark

A practical demonstration of the performance benefit of asynchronous scheduling involves incrementing a variable by 11 a total of 1000010000 times, comparing synchronous and asynchronous modes. Using the d2l.Benchmark context manager to measure elapsed time, the synchronous version inserts a wait_to_read() barrier after every addition, forcing the frontend to block until each individual y = x + 1 operation completes before issuing the next; this took approximately 3.163.16 seconds. In the asynchronous version, all 1000010000 additions are enqueued without any per-iteration barrier, and only a single global npx.waitall() is called after the loop; this completed in roughly 0.930.93 seconds—over three times faster. The speedup arises because asynchronous execution allows the frontend to continuously feed tasks into the backend queue while the backend processes them in parallel, eliminating the per-iteration round-trip overhead of synchronization.

0

1

Updated 2026-05-18

Contributors are:

Who are from:

Tags

D2L

Dive into Deep Learning @ D2L