L2 - System & Driver Runtime

L2 covers the runtime bridge between hardware and ML frameworks: drivers, kernel APIs, device memory management, container integration, scheduling hooks, and low-level libraries. It is where hardware becomes programmable by the rest of the stack.

L2 System & Driver Runtime: drivers, kernel APIs, memory management, cluster scheduling hooks.

What belongs here

L2 sits below model graph compilation (L3). It asks whether accelerator hardware can be addressed, scheduled, isolated, monitored, and shared safely by higher layers.
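As a rough illustration of the "can the hardware be addressed" question, the sketch below checks whether a Linux host exposes accelerator device nodes and a vendor management tool. The device paths (`/dev/nvidia*`, `/dev/kfd`, `/dev/dri/renderD*`) are common Linux conventions rather than guarantees, and the `probe_accelerators` and `has_tool` helpers are hypothetical names introduced here, not part of any project listed on this page.

```python
import glob
import shutil

# Assumed device-node patterns, typical on Linux hosts with vendor drivers loaded.
# Other operating systems and driver versions may expose devices differently.
DEVICE_NODE_PATTERNS = {
    "nvidia": ["/dev/nvidia[0-9]*"],      # NVIDIA kernel driver device nodes
    "amd": ["/dev/kfd"],                  # AMD compute interface used by ROCm
    "dri_render": ["/dev/dri/renderD*"],  # generic DRM render nodes
}


def probe_accelerators(patterns=DEVICE_NODE_PATTERNS):
    """Return, per label, the sorted device nodes that currently exist."""
    found = {}
    for label, globs in patterns.items():
        nodes = []
        for pattern in globs:
            nodes.extend(sorted(glob.glob(pattern)))
        found[label] = nodes
    return found


def has_tool(name):
    """Report whether a command-line tool is available on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    for label, nodes in probe_accelerators().items():
        print(label, nodes if nodes else "not found")
    print("nvidia-smi on PATH:", has_tool("nvidia-smi"))
```

A visible device node only shows that a driver has registered the hardware. Whether it can be scheduled, isolated, or shared safely is a separate question that this probe does not answer.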

Representative projects

Project	Why it might fit	Adjacent layers
NVIDIA CUDA	Programming model, libraries, and runtime for NVIDIA GPU computing.	L1 GPUs, L3 compilation
AMD ROCm	Open software stack for AMD GPU acceleration.	L1 accelerators, L3 frameworks
Apple Metal	Low-level graphics and compute API used by Apple silicon workloads.	L1 devices, L7 local inference
Intel oneAPI	Cross-architecture programming model and toolkits.	L1 CPU/GPU, L3 compilation
NVIDIA GPU Operator	Kubernetes integration for GPU drivers, runtime components, and monitoring.	L2 runtime, L12 policy

Boundary questions

Are scheduler plugins part of L2 because they expose devices, or L12 because they enforce placement policy?
When a runtime library performs graph optimization, should it move up to L3?
How should AILIS represent local-device inference where L1 and L2 are hidden inside the operating system?
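The first question above can be made concrete with a Kubernetes pod specification, sketched here as a Python dictionary. The `nvidia.com/gpu` resource name is the one advertised by the NVIDIA device plugin. The `gpu_pod_manifest` helper, the image name, and the `example.com/gpu-pool` label are hypothetical and only serve the illustration.

```python
import json


def gpu_pod_manifest(name, image, gpus=1, node_selector=None):
    """Build a minimal Pod manifest that requests GPUs as an extended resource."""
    manifest = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Device exposure: the device plugin advertises this resource.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }
    if node_selector:
        # Placement policy: restricts which nodes the scheduler may pick.
        manifest["spec"]["nodeSelector"] = dict(node_selector)
    return manifest


if __name__ == "__main__":
    pod = gpu_pod_manifest(
        "trainer",
        "example.com/trainer:latest",            # hypothetical image
        gpus=2,
        node_selector={"example.com/gpu-pool": "shared"},  # hypothetical label
    )
    print(json.dumps(pod, indent=2))
```

In this sketch the resource limit depends on a device being exposed to the cluster, while the node selector expresses where the workload may run. A single manifest carries both concerns, which is why the boundary between L2 and L12 is hard to draw.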

Signals to watch

Better accelerator virtualization and multi-tenant isolation.
Runtime support for confidential or attested AI workloads.
Scheduling APIs that expose power, memory, and accelerator topology to higher layers.
