
Control

Utilities for pipeline control flow and execution management. These modules handle asynchronous operations and data prefetching so that data loading can overlap with computation.

Components

Component     Purpose               Benefit
----------    ------------------    ----------------
Prefetcher    Async data loading    Hide I/O latency

★ Insight ─────────────────────────────────────

  • Prefetching loads the next batch while the GPU processes the current one
  • Overlapping I/O and compute improves throughput
  • Most useful when I/O is the bottleneck
  • Works automatically with Pipeline

─────────────────────────────────────────────────
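
As a rough illustration of the overlap described above, the following sketch estimates wall-clock time for one pass over the data with and without prefetching. It is an idealized model, not part of datarax: the function name `epoch_time` and its parameters are hypothetical, and it assumes constant per-batch load and process times and a buffer large enough that the loader never stalls.

```python
def epoch_time(n_batches: int, load_s: float, process_s: float, prefetch: bool) -> float:
    """Idealized wall-clock time (seconds) for one pass over n_batches.

    Hypothetical helper for illustration only; assumes constant per-batch
    load and process times.
    """
    if n_batches == 0:
        return 0.0
    if not prefetch:
        # Loading and processing alternate, so their costs add up per batch.
        return n_batches * (load_s + process_s)
    # The first load cannot be hidden and the last process step has nothing
    # left to overlap with; in between, the slower stage sets the pace.
    return load_s + (n_batches - 1) * max(load_s, process_s) + process_s


# Example: 100 batches, 0.5 s to load and 0.25 s to process each one.
sequential = epoch_time(100, load_s=0.5, process_s=0.25, prefetch=False)
overlapped = epoch_time(100, load_s=0.5, process_s=0.25, prefetch=True)
print(f"sequential: {sequential:.2f} s, overlapped: {overlapped:.2f} s")
```

Under this model the overlapped run is bounded by the slower of the two stages, so prefetching hides the faster stage but cannot make a slow loader itself any faster.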

Quick Start

from datarax.control import Prefetcher

# Wrap iterator with prefetching
prefetcher = Prefetcher(
    iterator=pipeline,
    prefetch_count=2,  # Keep 2 batches ready
)

for batch in prefetcher:
    # Next batch loads while this one processes
    train_step(batch)

Modules

  • prefetcher - Asynchronous data prefetching for pipeline optimization

How Prefetching Works

Without prefetching:
[Load B1] [Process B1] [Load B2] [Process B2] ...
                       ^-- GPU idle during load

With prefetching:
[Load B1] [Load B2   ] [Load B3   ] ...
          [Process B1] [Process B2] ...
          ^-- GPU always busy
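
The timeline above can be reproduced with a background thread and a bounded queue. The sketch below is a generic illustration of the technique and makes no claim about how datarax's `Prefetcher` is implemented. The function name `prefetch_iter` is hypothetical; its `prefetch_count` parameter only mirrors the name used in the Quick Start.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def prefetch_iter(source: Iterable[T], prefetch_count: int = 2) -> Iterator[T]:
    """Yield items from `source` while a background thread loads ahead.

    Hypothetical helper for illustration; not the datarax API.
    """
    # Bounded buffer: the worker blocks once prefetch_count items are waiting.
    buffer: queue.Queue = queue.Queue(maxsize=prefetch_count)

    def worker() -> None:
        # Each message is (is_final, value). A final message carries either
        # None (normal end of data) or the exception raised by the source.
        try:
            for item in source:
                buffer.put((False, item))
        except Exception as exc:
            buffer.put((True, exc))
            return
        buffer.put((True, None))

    threading.Thread(target=worker, daemon=True).start()

    while True:
        is_final, value = buffer.get()
        if is_final:
            if value is not None:
                raise value  # re-raise the loader's error in the consumer
            return
        yield value


for batch in prefetch_iter(range(3), prefetch_count=2):
    print(batch)
```

The queue size caps how far the loader may run ahead, which bounds memory use. This sketch does not stop the worker if the consumer abandons the loop early; the thread is marked as a daemon so it will not keep the process alive, but a production version would likely need an explicit shutdown signal.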

Integration with DAG

from flax import nnx

from datarax.pipeline import Pipeline

# Prefetching is built into Pipeline.
# `source` is assumed to be a data source created earlier.
pipeline = Pipeline(source=source, stages=[], batch_size=32, rngs=nnx.Rngs(0))

See Also