AILIS: A Proposed Layer Model for AI Systems¶

Version 0.1 - Early Draft for Discussion

What if we had a common language for the AI stack?¶

The AI ecosystem is evolving rapidly, with countless tools, platforms, and frameworks emerging daily. Yet we lack a shared mental model—a common way to discuss where different capabilities sit, how they interact, and where the gaps might be.

AILIS (AI Layer Interface Specification) is a proposal exploring whether a layered model—similar to the OSI model in networking—might help us better understand and discuss AI system architectures.

/// note | 🌱 This is exploratory We don't claim to have all the answers. This proposal is intentionally incomplete and likely wrong in places. We're sharing it early because we believe the best ideas emerge from open discussion. ///

Why explore this?¶

In networking, the OSI model gave us a shared vocabulary. When someone says "Layer 3," everyone understands we're discussing routing and addressing, not physical cables or application logic.

Could something similar help in AI? We're not sure, but we think it's worth exploring together.

Quick Start¶

/// tip "New to AILIS?" Get oriented in 3 steps:

Read the Primer (10 min) - Full layer model overview
Check the Cheat Sheet (2 min) - Quick reference
Join the Discussion - Share your thoughts ///

Contributing

See CONTRIBUTING.md for the RFC process and how to submit proposals
Feedback Needed

Check FEEDBACK.md for specific areas where we need community input
Issues & Ideas

Browse open issues to join active discussions
Local Development

Run mkdocs serve to build and preview the site locally (setup guide)

The AILIS 16+ Proposal¶

We're proposing a 16-layer model (plus cross-cutting concerns) that attempts to map the AI stack from physical infrastructure up through application logic:

Infrastructure Foundation (L0-L2)¶

L0 – Facilities & Power: Datacenters, power/cooling, physical security
L1 – Compute Fabric: GPUs/TPUs/NPUs/CPUs, memory, interconnects
L2 – System & Driver Runtime: CUDA/ROCm/Metal, device memory management

Model & Inference Stack (L3-L7)¶

L3 – ML Graph & Compilation: XLA/TVM/TensorRT-LLM/ONNX Runtime
L4 – Numeric & Quantization: FP16/FP8/INT4, sparsity, calibration
L5 – Tokenization & Encoders: BPE tokenizers, CLIP, audio patchifiers
L6 – Model Parameters & Architecture: Base/foundation weights, MoE, diffusion
L7 – Inference Engine & Decoding: Serving runtimes, caching, speculative decoding

AI Application Interface (L8-L10)¶

L8 – Context Construction & Prompting: System prompts, templates, few-shot
L9 – Knowledge & Retrieval: Vector/graph indexes, rerankers, grounding, citations
L10 – Tool & Function Invocation: Typed tool I/O (MCP), function calling, API bindings

The Orchestration Layers (L11-L15)¶

L11 – Addressing & Registry: Signed manifests, discovery, capability vectors, fingerprints
L12 – Routing, Planning & Policy: Rule DSL + bandits, budgets, privacy, fallback/parallel
L13 – Transport & Flow Semantics: Idempotent runs, streaming, CANCEL/RESUME, multiplex
L14 – Session, Identity & Memory: Portable session envelope, capability tokens, memory tiers
L15 – Governance, Safety & Schema: Redaction, validation/repair, schema change control, audit

Application Layer (L16)¶

L16 – Application & Domain Logic: Product UX, workflows, agent frameworks

Cross-Cutting Planes¶

Control: Policy/configuration management
Management/Observability: Telemetry, evaluations, monitoring
Security: mTLS, key management, PII protection

What we're asking¶

We're particularly interested in:

/// question "Questions for the community"

Does this framing resonate or feel forced?
What's missing or miscategorized?
Are there better ways to think about these boundaries?
What existing work should we learn from?

///

Explore the Proposal¶

AILIS Primer

Comprehensive overview of the 16+ layer model
Quick Reference

One-page cheat sheet for the layer model
Reference Code

Implementation examples and tools
Case Studies

Real-world system mappings and analyses

Join the Conversation¶

This is an open invitation to think together about how we might better organize our understanding of AI systems. Whether you're building infrastructure, developing applications, or researching new approaches, your perspective would be valuable.

How to Contribute¶

We follow an RFC-style process for proposals:

💡 Have an idea? Open an issue to discuss
📝 Ready to propose? Create a draft/your-proposal branch
👀 Get feedback During 4-week review period
✅ Reach consensus Proposal accepted or declined with rationale

Quick ways to help:

/// tip "Ways to contribute"

Share feedback - Use our issue templates
Submit use cases - Show us real-world examples
Propose alternatives - Challenge our assumptions
Review proposals - Help evaluate new ideas

///

See CONTRIBUTING.md for the full process, and FEEDBACK.md for specific areas where we're seeking input.

Context and Origins¶

This proposal emerges from practical experience building AI tools and observing the challenges of interoperability in the current ecosystem. We found ourselves wishing for a clearer map to understand how different capabilities relate to each other.

Rather than create Yet Another Stack Diagram™, we wondered: could we contribute something more broadly useful to the community?

License¶

Documentation and specifications: CC-BY 4.0
Code and examples: Apache 2.0

We chose these licenses to enable the widest possible collaboration and adoption, should any of these ideas prove useful.

Acknowledgments¶

This proposal draws inspiration from:

The OSI network model
The work of countless AI infrastructure teams
Open specifications like OpenAPI and GraphQL
The Model Context Protocol (MCP) community

/// warning "Early Draft Status" Version 0.1 - Everything here is subject to change based on community feedback. We're not trying to create a standard—we're trying to start a useful conversation. ///

Questions? Thoughts? We'd love to hear from you. Start a discussion →