At WWDC 2025, Apple introduced the Foundation Models framework, offering a powerful, privacy-first Swift API for harnessing large‑language‑model capabilities. With this, developers can integrate features like text generation, guided structured output, tool calling, multi‑turn conversations, and streaming—all running on-device (or privately in the cloud).
Model Variants
- On‑device model: A compact (~3 billion parameter, 2‑bit quantized) LLM optimized for Apple silicon. It handles summarization, classification, extraction, creative text, and short dialogue, without inflating app size or requiring internet access.
- Private Cloud Compute model: A larger model for complex or up-to-date knowledge tasks, securely accessible via Apple’s encrypted private cloud.
System Requirements
- OS: iOS/iPadOS 26, macOS Tahoe 26, and visionOS 26.
- Hardware: Apple Intelligence–enabled devices (A17 Pro or later, M-series).
- Framework: Use import FoundationModels.
- Privacy: On-device by default. Cloud inference requires user permission and is encrypted.
- Languages: Supports 15+ languages (initially English, plus Simplified Chinese).
Example Code
import FoundationModels

// Check that the on-device model is available before creating a session.
let systemModel = SystemLanguageModel.default
guard systemModel.isAvailable else {
    if case .unavailable(let reason) = systemModel.availability {
        print("Model unavailable: \(reason)")
    }
    return
}

// LanguageModelSession is the entry point for prompting the model.
let session = LanguageModelSession()

// streamResponse(to:) yields cumulative snapshots of the generated text.
for try await partial in session.streamResponse(to: "Describe the advantages of Swift.") {
    print(partial)
}
✅ This covers availability checks, session setup, and streaming output.
Advanced Features
🔧 Guided (Structured) Generation
Use the @Generable macro with @Guide annotations to constrain the model’s output to typed Swift data structures. This ensures reliable, structured responses via constrained decoding.
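A minimal sketch, following the macro syntax Apple showed at WWDC 2025 (the SearchSuggestions type and its prompt are illustrative):

```swift
import FoundationModels

// Illustrative type; @Generable lets the model produce it via constrained decoding.
@Generable
struct SearchSuggestions {
    @Guide(description: "A list of suggested search terms", .count(4))
    var searchTerms: [String]
}

let session = LanguageModelSession()
let response = try await session.respond(
    to: "Generate search suggestions for a hiking app.",
    generating: SearchSuggestions.self
)
// response.content is a fully typed SearchSuggestions value—no JSON parsing needed.
print(response.content.searchTerms)
```

Because decoding is constrained to the schema, you get back a real Swift value rather than free-form text you have to parse.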
🧩 Tool Calling
Define app-specific tools with the Tool protocol. The model can autonomously invoke functions (e.g., fetch POIs or call APIs), then integrate the tool responses into its output.
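A sketch of a custom tool following the shape of the Tool protocol (WeatherTool, its name, and its canned result are illustrative; a real implementation would query an actual weather service):

```swift
import FoundationModels

// Illustrative tool; the model decides when to invoke it during a session.
struct WeatherTool: Tool {
    let name = "getWeather"
    let description = "Retrieve the current weather for a city."

    @Generable
    struct Arguments {
        @Guide(description: "The city to look up")
        var city: String
    }

    func call(arguments: Arguments) async throws -> String {
        // A real app would call a weather API here.
        "Sunny, 22°C in \(arguments.city)"
    }
}

// Register the tool when creating the session; the model can then call it
// autonomously and fold the result into its answer.
let session = LanguageModelSession(tools: [WeatherTool()])
let answer = try await session.respond(to: "What's the weather in Cupertino?")
print(answer.content)
```

The Arguments type is itself @Generable, so the model fills in the tool’s parameters with the same constrained decoding used for structured output.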
🧠 Stateful Sessions
LanguageModelSession maintains conversation context across multiple prompts, tracks transcripts, and handles context limits. Supports instructions separate from prompts to improve reliability and security.
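For example (the instructions and prompts here are illustrative):

```swift
import FoundationModels

// Instructions are supplied separately from user prompts.
let session = LanguageModelSession(
    instructions: "You are a concise assistant for a travel-planning app."
)

// Each respond(to:) call is appended to the session's transcript,
// so follow-up prompts can rely on earlier turns.
let first = try await session.respond(to: "Suggest a weekend trip from Paris.")
print(first.content)

// "there" resolves against the previous answer thanks to session state.
let second = try await session.respond(to: "How would I get there by train?")
print(second.content)

// Inspect the accumulated conversation.
print(session.transcript)
```

Keeping instructions out of the prompt stream means untrusted user input can’t easily override the session’s role.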
⏱ Streaming Snapshots
Instead of token “deltas,” FoundationModels streams structured “snapshots”—partial objects matching your schema—allowing UI updates as content evolves.
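A sketch combining streaming with a generable type (Itinerary is illustrative); each streamed element is a partial value whose properties are optional until the model fills them in:

```swift
import FoundationModels

@Generable
struct Itinerary {
    @Guide(description: "A short title for the trip")
    var title: String
    @Guide(description: "One activity per day", .count(3))
    var days: [String]
}

let session = LanguageModelSession()
let stream = session.streamResponse(
    to: "Plan a three-day trip to Kyoto.",
    generating: Itinerary.self
)

// Each snapshot is an Itinerary.PartiallyGenerated; fields appear as the
// model produces them, so the UI can update incrementally.
for try await partial in stream {
    print(partial.title ?? "…")
}
```

Binding these snapshots to SwiftUI state gives the familiar “content filling in live” effect without hand-assembling token deltas.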
🛠 Debugging & Profiling
Use the new Foundation Models instrument in Xcode’s Instruments to prewarm sessions, optimize schema inclusion, and measure latency for a smoother experience.
⚙️ Fine-tuning via Adapters
Advanced users can train small adapters that add task-specific skills to the ~3B on-device model, using Apple’s Python-based adapter training toolkit.
Pros & Cons
| ✅ Strengths | ⚠️ Limitations |
|---|---|
| Privacy-first: local inference, offline support | Requires newer hardware (A17 Pro, M-series) |
| Native Swift API; structured output; tool use | Smaller scale than GPT‑4—suited to in-app tasks |
| Session support and streaming snapshots | Debugging and tooling still evolving |
| Cloud inference is encrypted and free | Language support still growing |
Summary
Apple’s Foundation Models framework empowers developers to embed intelligent features—like guided structured generation, dynamic tool interactions, and conversational flow—directly into apps, all within a privacy-first, on-device paradigm.