
Examples Overview¤

Datarax provides a full set of examples organized by complexity and topic. Each example follows a consistent structure with learning goals, prerequisites, and expected outcomes.

Quick Start¤

Example Categories¤

Core Pipeline¤

Essential examples for understanding Datarax fundamentals.

| Example | Level | Description |
|---|---|---|
| Simple Pipeline | Beginner | Basic pipeline with memory source and operators |
| Pipeline Tutorial | Intermediate | Thorough guide to operators and composition |
| Operators Tutorial | Intermediate | Deep dive into operator types and patterns |
| CIFAR-10 Quick Reference | Beginner | CIFAR-10 dataset loading and preprocessing |
| Augmentation Quick Reference | Beginner | Image augmentation techniques |
| MNIST Tutorial | Intermediate | Complete MNIST training pipeline |
| Fashion Augmentation | Intermediate | Fashion-MNIST with advanced augmentation |
| Composition Strategies | Intermediate | All 11 operator composition patterns |
| Advanced Operators | Intermediate | Probabilistic, selector, and patch dropout operators |
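
As a rough mental model of what the Simple Pipeline example covers (an in-memory source feeding a chain of operators), the sketch below uses plain Python only. It is not the Datarax API: `memory_source`, `apply_operators`, `normalize`, and `flip` are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, Iterator

Element = dict  # one record, e.g. {"image": [...], "label": 3}
Operator = Callable[[Element], Element]


def memory_source(records: list[Element], batch_size: int) -> Iterator[list[Element]]:
    """Yield consecutive batches from an in-memory list (hypothetical helper)."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def apply_operators(
    batches: Iterable[list[Element]], operators: list[Operator]
) -> Iterator[list[Element]]:
    """Run every element of every batch through the operators, in order."""
    for batch in batches:
        processed = []
        for element in batch:
            for op in operators:
                element = op(element)
            processed.append(element)
        yield processed


def normalize(element: Element) -> Element:
    """Scale 8-bit pixel values into the range [0, 1]."""
    return {**element, "image": [p / 255.0 for p in element["image"]]}


def flip(element: Element) -> Element:
    """Reverse the pixel order, standing in for a horizontal flip."""
    return {**element, "image": element["image"][::-1]}


# Five toy records, batched in pairs, then normalized and flipped.
records = [{"image": [0, 255], "label": i} for i in range(5)]
batches = list(apply_operators(memory_source(records, batch_size=2), [normalize, flip]))
```

The operators are ordinary functions applied in list order, so reordering the list changes the result. The real examples express the same idea with Datarax sources and operators.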

Integration¤

Connect Datarax with external data sources and libraries.

| Example | Level | Description |
|---|---|---|
| HuggingFace Quick Reference | Beginner | Load datasets from HuggingFace Hub |
| HuggingFace Tutorial | Intermediate | Advanced HF usage and training pipelines |
| IMDB Example | Beginner | Text classification with IMDB dataset |
| TFDS Quick Reference | Beginner | Load datasets from TensorFlow Datasets |
| ArrayRecord Quick Reference | Intermediate | Google's ArrayRecord format integration |

Differentiable Pipelines (Why Datarax)¤

Flagship examples demonstrating Datarax's unique differentiable pipeline capabilities.

  • Learned Augmentation (DADA)
    10,000x faster augmentation policy search via gradient descent
    Advanced Guide

  • Learned ISP for Detection
    End-to-end differentiable image signal processing pipeline
    Advanced Guide

  • DDSP Audio Synthesis
    Custom operators for differentiable digital signal processing
    Advanced Guide
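
The common thread in these examples is that a pipeline parameter can be tuned by gradient descent rather than by search. The sketch below shows that idea on a single brightness gain, in plain Python. It is an illustration only and is not taken from the DADA, ISP, or DDSP examples: the function names are hypothetical, and the hand-derived gradient stands in for what automatic differentiation in JAX would compute.

```python
def brightness(pixels, gain):
    """A one-parameter 'augmentation': scale every pixel by gain."""
    return [gain * p for p in pixels]


def loss(pixels, target, gain):
    """Mean squared error between the augmented pixels and a target."""
    out = brightness(pixels, gain)
    return sum((o - t) ** 2 for o, t in zip(out, target)) / len(pixels)


def loss_grad(pixels, target, gain):
    """Analytic derivative of the loss with respect to gain.

    d/dgain mean((gain * p - t)^2) = mean(2 * (gain * p - t) * p)
    """
    return sum(2.0 * (gain * p - t) * p for p, t in zip(pixels, target)) / len(pixels)


def fit_gain(pixels, target, gain=1.0, lr=0.5, steps=200):
    """Plain gradient descent on the single pipeline parameter."""
    for _ in range(steps):
        gain -= lr * loss_grad(pixels, target, gain)
    return gain


pixels = [0.1, 0.4, 0.8]
target = [0.15, 0.6, 1.2]  # the input scaled by 1.5
learned_gain = fit_gain(pixels, target)
final_loss = loss(pixels, target, learned_gain)
```

Because the loss is differentiable in `gain`, the optimizer recovers the scale factor directly instead of trying candidate values one by one.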

Advanced¤

Production-ready patterns and optimization techniques.

| Example | Level | Description |
|---|---|---|
| MixUp & CutMix Tutorial | Intermediate | Batch-level mixing augmentations |
| Checkpoint Quick Reference | Intermediate | Save and restore pipeline state |
| Resumable Training Guide | Advanced | Full checkpointing workflow |
| DAG Fundamentals Guide | Advanced | Deep dive into DAG pipeline architecture |
| Branching DAG Cookbook | Intermediate | Branch / Merge / Parallel recipes via Pipeline.from_dag |
| Sharding Quick Reference | Intermediate | Multi-device data distribution |
| Sharding Guide | Advanced | Advanced distributed training patterns |
| Interleaved Tutorial | Intermediate | Multiple data source mixing |
| Optimization Guide | Advanced | Performance tuning and profiling |
| Sampling Tutorial | Intermediate | Sequential, shuffle, range, and epoch-aware samplers |
| End-to-End CIFAR-10 | Advanced | Complete training pipeline with all features |
| DADA Learned Augmentation | Advanced | Differentiable augmentation policy search |
| Learned ISP Guide | Advanced | End-to-end ISP optimization for object detection |
| DDSP Audio Synthesis | Advanced | Custom operators for differentiable audio processing |
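
To make "save and restore pipeline state" concrete, here is a minimal sketch of a resumable shuffled iterator in plain Python. It does not use the Datarax checkpointing API. `ResumableShuffledIterator`, `state_dict`, and `from_state` are hypothetical names, and the assumption is that the state needed to resume is just the seed and the current position.

```python
import json
import random


class ResumableShuffledIterator:
    """Iterate over shuffled indices and allow resuming mid-epoch."""

    def __init__(self, num_items: int, seed: int):
        self.num_items = num_items
        self.seed = seed
        self.position = 0
        # The order depends only on the seed, so it can be rebuilt on restore.
        self._order = list(range(num_items))
        random.Random(seed).shuffle(self._order)

    def __iter__(self):
        return self

    def __next__(self) -> int:
        if self.position >= self.num_items:
            raise StopIteration
        index = self._order[self.position]
        self.position += 1
        return index

    def state_dict(self) -> dict:
        """Everything needed to continue from the current position."""
        return {"num_items": self.num_items, "seed": self.seed, "position": self.position}

    @classmethod
    def from_state(cls, state: dict) -> "ResumableShuffledIterator":
        restored = cls(state["num_items"], state["seed"])
        restored.position = state["position"]
        return restored


# An uninterrupted run, used as the reference order.
reference = list(ResumableShuffledIterator(10, seed=0))

# An interrupted run: consume four items, serialize the state, restore, continue.
iterator = ResumableShuffledIterator(10, seed=0)
first_part = [next(iterator) for _ in range(4)]
saved = json.dumps(iterator.state_dict())
restored = ResumableShuffledIterator.from_state(json.loads(saved))
resumed = first_part + list(restored)
```

Storing the seed and position rather than the full index order keeps the checkpoint small. The interrupted run yields the same sequence as the uninterrupted one.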

Documentation Tiers¤

Datarax examples follow a three-tier documentation pattern:

Tier 1: Quick Reference (~5-10 min)¤

  • Minimal code, maximum clarity
  • Single focused concept
  • Copy-paste ready snippets
  • Ideal for: Getting started, quick lookups

Tier 2: Tutorial (~30-60 min)¤

  • Step-by-step instruction
  • Multiple related concepts
  • Hands-on practice exercises
  • Ideal for: Learning new features

Tier 3: Advanced Guide (~60+ min)¤

  • Deep dive into internals
  • Performance optimization
  • Production considerations
  • Ideal for: Expert users, complex use cases

Feature Coverage¤

The examples cover all major Datarax features:

| Feature Area | Examples | Coverage |
|---|---|---|
| Data Sources | Memory, HuggingFace, TFDS, ArrayRecord | Complete |
| Operators | Element, Batch, Probabilistic, Selector, Patch Dropout | Complete |
| Composition | Linear stages and branching DAGs via Pipeline and Pipeline.from_dag | Complete |
| Samplers | Sequential, Shuffle, Range, EpochAware | Complete |
| DAG Pipeline | Linear stages=[...] and Pipeline.from_dag topologies | Complete |
| Distributed | Sharding, Multi-device | Complete |
| Checkpointing | State save/restore, Resumable training | Complete |
| Monitoring | Metrics, Reporters, Callbacks | Complete |
| Differentiable Pipelines | DADA, ISP, DDSP | Complete |
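
Two of the feature areas above, epoch-aware sampling and sharding, combine naturally. The sketch below illustrates the general technique in plain Python and is not the Datarax sampler or sharding API. `epoch_indices` and `shard` are hypothetical helpers, and the strided split is one common sharding scheme among several.

```python
import random


def epoch_indices(num_items: int, epoch: int, base_seed: int = 0) -> list[int]:
    """Return a shuffled index order that is reproducible for a given epoch."""
    # A string seed is hashed deterministically, so the same (seed, epoch)
    # pair always produces the same order.
    rng = random.Random(f"{base_seed}-{epoch}")
    order = list(range(num_items))
    rng.shuffle(order)
    return order


def shard(indices: list[int], shard_id: int, num_shards: int) -> list[int]:
    """Give each worker every num_shards-th index, starting at shard_id."""
    return indices[shard_id::num_shards]


order = epoch_indices(num_items=8, epoch=3)
worker_0 = shard(order, shard_id=0, num_shards=2)
worker_1 = shard(order, shard_id=1, num_shards=2)
```

Every worker derives the same epoch order from the shared seed and then takes a disjoint slice, so no coordination is needed to avoid duplicate samples.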

Running Examples¤

All examples are available as both Python scripts and Jupyter notebooks.

As Python Scripts¤

# Activate environment
source activate.sh

# Run any example
python examples/core/01_simple_pipeline.py

As Jupyter Notebooks¤

# Start Jupyter
uv run jupyter lab

# Navigate to examples/ directory

Generating Notebooks from Scripts¤

# Convert a single file
python scripts/jupytext_converter.py py-to-nb examples/core/01_simple_pipeline.py

# Batch convert directory
python scripts/jupytext_converter.py batch-py-to-nb examples/core/

Prerequisites¤

Before running examples, ensure you have:

  1. Datarax installed: uv pip install datarax
  2. JAX configured: GPU support recommended for performance
  3. Environment activated: source activate.sh

For external data sources:

  • HuggingFace: uv pip install "datarax[data]"
  • TFDS: uv pip install "datarax[data]"
  • ArrayRecord: uv pip install "datarax[data]" array-record
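
A quick way to check these prerequisites before running an example is to ask Python whether each package is importable. The sketch below uses only the standard library. The import names are assumptions: `datarax` is assumed to match the package name, while `datasets`, `tensorflow_datasets`, and `array_record` are the usual import names for the HuggingFace, TFDS, and ArrayRecord packages. `is_installed` and `environment_report` are hypothetical helper names.

```python
from importlib.util import find_spec

# Import names are assumptions; adjust them if your install differs.
CORE_MODULES = {"Datarax": "datarax", "JAX": "jax"}
OPTIONAL_MODULES = {
    "HuggingFace": "datasets",
    "TFDS": "tensorflow_datasets",
    "ArrayRecord": "array_record",
}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def environment_report() -> dict[str, bool]:
    """Map each prerequisite label to whether its module was found."""
    modules = {**CORE_MODULES, **OPTIONAL_MODULES}
    return {label: is_installed(name) for label, name in modules.items()}


if __name__ == "__main__":
    for label, found in environment_report().items():
        print(f"{label:12s} {'found' if found else 'missing'}")
```

A missing optional entry only matters for the examples that use that data source.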

For Contributors¤

Want to add your own examples? We welcome contributions!

  • Documentation Design Guide
    Complete standards for creating educational examples and tutorials
    Read the Guide

  • Example Template
    Start from our template with proper structure and formatting
    View Template

Quick Start for Contributors¤

  1. Read the Example Documentation Design Guide
  2. Copy the template from examples/_templates/example_template.py
  3. Follow the 7-part structure and quality checklist
  4. Submit a PR with .py, generated .ipynb, and .md documentation files
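
Before opening the PR in step 4, it can help to confirm that all three files exist. The sketch below is a hypothetical pre-submission check and is not part of the Datarax repository. It assumes the `.py`, `.ipynb`, and `.md` files share a file stem and a directory, which may not match where the project keeps its documentation files.

```python
import tempfile
from pathlib import Path

REQUIRED_SUFFIXES = (".py", ".ipynb", ".md")


def missing_files(example_script: Path) -> list[str]:
    """Return the names of required sibling files that do not exist yet."""
    candidates = [example_script.with_suffix(suffix) for suffix in REQUIRED_SUFFIXES]
    return [path.name for path in candidates if not path.exists()]


# Demonstration in a temporary directory: the script and notebook exist,
# but the Markdown documentation has not been written yet.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "my_example.py"
    script.write_text("# example body\n")
    script.with_suffix(".ipynb").write_text("{}")
    missing = missing_files(script)
```

An empty list means the example is ready to submit under the stated assumption about file layout.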

Next Steps¤