A literature review examining automatic differentiation (AD) theory and its implementation across modern machine learning frameworks, covering the design trade-offs between TensorFlow, PyTorch, and JAX.
Topics Covered
- Forward-mode and reverse-mode AD theory (a sketch of both modes follows this list)
- The connection between reverse-mode AD and backpropagation
- Framework design trade-offs (TensorFlow, PyTorch, JAX)
- Compilation-based AD (XLA, TorchInductor, AOT Autograd)
- Emerging directions: forward gradients, neural ODEs, SSA-form AD
- Type: Literature Review
- Year: 2026
- Scope: AD Theory, Framework Design, Compilation
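
To make the first two topics concrete, here is a minimal Python sketch of both AD modes; the `Dual` and `Var` classes are illustrative toys invented for this summary, not any framework's internals.

```python
# Forward mode: carry a tangent alongside each value (dual numbers).
class Dual:
    def __init__(self, val, tan):
        self.val, self.tan = val, tan          # primal value, derivative

    def __add__(self, o):
        return Dual(self.val + o.val, self.tan + o.tan)

    def __mul__(self, o):
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * o.val, self.tan * o.val + self.val * o.tan)

# Reverse mode: record the graph, then push adjoints from output to inputs.
class Var:
    def __init__(self, val, parents=()):
        self.val, self.parents, self.grad = val, parents, 0.0

    def __add__(self, o):
        return Var(self.val + o.val, ((self, 1.0), (o, 1.0)))

    def __mul__(self, o):
        return Var(self.val * o.val, ((self, o.val), (o, self.val)))

    def backward(self, seed=1.0):
        # Summing seed * local-partial over all paths is the chain rule;
        # applied to a scalar loss, this sweep is backpropagation.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

# f(x, y) = x*y + x, so df/dx = y + 1 and df/dy = x.
x, y = Dual(2.0, 1.0), Dual(3.0, 0.0)        # seed the tangent on x
print((x * y + x).tan)                        # 4.0 = df/dx at (2, 3)

x, y = Var(2.0), Var(3.0)
out = x * y + x
out.backward()                                # one sweep gives every input's grad
print(x.grad, y.grad)                         # 4.0 2.0
```

Forward mode costs one pass per input direction, while a single reverse sweep yields the gradient with respect to every input at once; for a scalar loss over many parameters, that reverse sweep is precisely backpropagation.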
Framework Analysis
- TensorFlow: Static computation graphs, XLA compilation, tf.function tracing
- PyTorch: Dynamic define-by-run graphs, TorchInductor/AOT Autograd compilation
- JAX: Functional, composable transformations (grad, jit, vmap, pmap); a composition sketch follows this list
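
To ground the JAX entry above, a brief sketch of how its transformations compose; the `loss` function and data are hypothetical stand-ins, while `jax.grad`, `jax.jit`, and `jax.vmap` are JAX's public API.

```python
import jax
import jax.numpy as jnp

def loss(w, x):
    return jnp.sum((x @ w) ** 2)               # toy per-example objective

grad_loss = jax.grad(loss)                     # reverse-mode gradient w.r.t. w
fast_grad = jax.jit(grad_loss)                 # JIT-compile via XLA
batched = jax.vmap(fast_grad, in_axes=(None, 0))  # map over a batch of x

w = jnp.ones(3)
xs = jnp.arange(12.0).reshape(4, 3)            # batch of 4 inputs
print(batched(w, xs).shape)                    # (4, 3): one gradient per example
```

PyTorch's torch.compile (backed by TorchInductor) and TensorFlow's tf.function play analogous roles, tracing Python code into graphs that a compiler can optimise.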
Advanced Topics
Explored differentiating through control flow and loops, efficient AD for linear algebra operations, and compilation-based approaches that fuse operations for better hardware utilisation. Discussed emerging research directions, including forward gradients (Baydin et al.; see the sketch below), neural ODEs (Chen et al.), and SSA-form differentiation for compiler-level AD.
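
As one illustration of the forward-gradient direction, the sketch below follows the idea described by Baydin et al.: estimate the gradient from a single forward-mode pass along a random tangent, with no reverse sweep. The objective `f` and the helper `forward_gradient` are hypothetical stand-ins; `jax.jvp` and `jax.random` are JAX's public API.

```python
import jax
import jax.numpy as jnp

def f(theta):
    return jnp.sum(theta ** 2)                 # toy objective, grad = 2*theta

def forward_gradient(f, theta, key):
    v = jax.random.normal(key, theta.shape)    # random tangent direction
    _, dfv = jax.jvp(f, (theta,), (v,))        # directional derivative, forward-only
    # (grad f . v) * v is an unbiased estimate of grad f when v ~ N(0, I).
    return dfv * v

key = jax.random.PRNGKey(0)
theta = jnp.array([1.0, -2.0, 3.0])
g = forward_gradient(f, theta, key)
print(g)                                       # E[g] equals the true gradient 2*theta
```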