Reference — Complete Foundation Models framework guide covering LanguageModelSession, @Generable, @Guide, Tool protocol, streaming, dynamic schemas, built-in use cases, and all WWDC 2025 code examples
This skill inherits all available tools. When active, it can use any tool Claude has access to.
The Foundation Models framework provides access to Apple's on-device Large Language Model (3 billion parameters, 2-bit quantized) with a Swift API. This reference covers every API, all WWDC 2025 code examples, and comprehensive implementation patterns.
Technical Details:
SystemLanguageModel.default.supportedLanguages
Optimized For:
NOT Optimized For:
Privacy & Performance:
Use this reference when:
Related Skills:
foundation-models: Discipline skill with anti-patterns, pressure scenarios, decision trees
foundation-models-diag: Diagnostic skill for troubleshooting issues
LanguageModelSession is the core class for interacting with the model. It maintains conversation history (transcript), handles multi-turn interactions, and manages model state.
Basic Creation:
import FoundationModels
let session = LanguageModelSession()
With Custom Instructions:
let session = LanguageModelSession(instructions: """
You are a friendly barista in a pixel art coffee shop.
Respond to the player's question concisely.
"""
)
With Tools:
let session = LanguageModelSession(
tools: [GetWeatherTool()],
instructions: "Help user with weather forecasts."
)
With Specific Model/Use Case:
let session = LanguageModelSession(
model: SystemLanguageModel(useCase: .contentTagging)
)
Instructions:
Prompts:
respond(to:) adds prompt to transcript
Security Consideration:
Basic Text Generation:
func respond(userInput: String) async throws -> String {
let session = LanguageModelSession(instructions: """
You are a friendly barista in a world full of pixels.
Respond to the player's question.
"""
)
let response = try await session.respond(to: userInput)
return response.content
}
Return Type: Response<String> with .content property
Structured Output with @Generable:
@Generable
struct SearchSuggestions {
@Guide(description: "A list of suggested search terms", .count(4))
var searchTerms: [String]
}
let prompt = """
Generate a list of suggested search terms for an app about visiting famous landmarks.
"""
let response = try await session.respond(
to: prompt,
generating: SearchSuggestions.self
)
print(response.content) // SearchSuggestions instance
Return Type: Response<SearchSuggestions> with .content property
Deterministic Output (Greedy Sampling):
let response = try await session.respond(
to: prompt,
options: GenerationOptions(sampling: .greedy)
)
Low Variance (Conservative):
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 0.5)
)
High Variance (Creative):
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 2.0)
)
Skip Schema in Prompt (Optimization):
let response = try await session.respond(
to: prompt,
generating: Person.self,
options: GenerationOptions(includeSchemaInPrompt: false)
)
Use when: Subsequent requests with same @Generable type. Reduces token count and latency.
let session = LanguageModelSession()
// First turn
let firstHaiku = try await session.respond(to: "Write a haiku about fishing")
print(firstHaiku.content)
// Silent waters gleam,
// Casting lines in morning mist—
// Hope in every cast.
// Second turn - model remembers context
let secondHaiku = try await session.respond(to: "Do another one about golf")
print(secondHaiku.content)
// Silent morning dew,
// Caddies guide with gentle words—
// Paths of patience tread.
print(session.transcript) // Shows full history
How it works:
respond() call adds entry to transcript
let transcript = session.transcript
for entry in transcript.entries {
print("Entry: \(entry.content)")
}
Use cases:
struct HaikuView: View {
@State private var session = LanguageModelSession()
@State private var haiku: String?
var body: some View {
if let haiku {
Text(haiku)
}
Button("Go!") {
Task {
haiku = try await session.respond(
to: "Write a haiku about something you haven't yet"
).content
}
}
// Gate on `isResponding`
.disabled(session.isResponding)
}
}
Why it matters: Gating on isResponding prevents multiple concurrent requests on the same session, which could cause errors or unexpected behavior.
@Generable enables structured output from the model using Swift types. The macro generates a schema at compile-time and uses constrained decoding to guarantee structural correctness.
On Structs:
@Generable
struct Person {
let name: String
let age: Int
}
let response = try await session.respond(
to: "Generate a person",
generating: Person.self
)
let person = response.content // Type-safe Person instance
On Enums:
@Generable
struct NPC {
let name: String
let encounter: Encounter
@Generable
enum Encounter {
case orderCoffee(String)
case wantToTalkToManager(complaint: String)
}
}
Primitives:
String
Int, Float, Double, Decimal
Bool
Collections:
[ElementType] (arrays)
Composed Types:
@Generable
struct Itinerary {
var destination: String
var days: Int
var budget: Float
var rating: Double
var requiresVisa: Bool
var activities: [String]
var emergencyContact: Person
var relatedItineraries: [Itinerary] // Recursive!
}
Natural Language Description:
@Generable
struct NPC {
@Guide(description: "A full name")
let name: String
}
Numeric Range:
@Generable
struct Character {
@Guide(.range(1...10))
let level: Int
}
Array Count:
@Generable
struct Suggestions {
@Guide(.count(3))
let attributes: [Attribute]
}
Array Maximum Count:
@Generable
struct Result {
@Guide(.maximumCount(3))
let topics: [String]
}
Regex Patterns:
import RegexBuilder // Needed for the Regex { ... } builder syntax below
@Generable
struct NPC {
@Guide(Regex {
Capture {
ChoiceOf {
"Mr"
"Mrs"
}
}
". "
OneOrMore(.word)
})
let name: String
}
let response = try await session.respond(to: "Generate a fun NPC", generating: NPC.self)
// > {name: "Mrs. Brewster"}
How it works:
@Generable macro generates schema at compile-time
From WWDC 286: "Constrained decoding prevents structural mistakes. Model is prevented from generating invalid field names or wrong types."
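As a rough intuition for the constrained decoding described above, the toy sketch below masks out candidates the schema does not allow before picking the best-scoring one. It is not Apple's implementation; `constrainedPick`, the candidate strings, and the scores are all hypothetical.

```swift
/// Toy model of one constrained decoding step: only candidates permitted by
/// the schema are eligible, and the highest-scoring eligible one wins.
func constrainedPick(scores: [String: Double], allowed: Set<String>) -> String? {
    scores
        .filter { allowed.contains($0.key) }   // mask out schema-invalid candidates
        .max { $0.value < $1.value }?          // take the best remaining candidate
        .key
}

// The misspelled field name scores highest but is not in the schema,
// so it can never be selected.
let candidateScores = ["nmae": 0.9, "name": 0.6, "age": 0.3]
let schemaFields: Set<String> = ["name", "age"]
let chosen = constrainedPick(scores: candidateScores, allowed: schemaFields)
```

The point of the sketch is only that invalid structure is excluded before selection rather than repaired afterwards.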
Benefits:
Properties generated in order declared:
@Generable
struct Itinerary {
var name: String // Generated FIRST
var days: [DayPlan] // Generated SECOND
var summary: String // Generated LAST
}
Why it matters:
Foundation Models uses snapshot streaming (not delta streaming). Instead of raw deltas, the framework streams PartiallyGenerated types with optional properties that fill in progressively.
The @Generable macro automatically creates a PartiallyGenerated nested type:
@Generable
struct Itinerary {
var name: String
var days: [DayPlan]
}
// Compiler generates:
extension Itinerary {
struct PartiallyGenerated {
var name: String? // All properties optional!
var days: [DayPlan]?
}
}
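To make the snapshot-versus-delta distinction concrete, here is a minimal plain-Swift sketch with no framework dependency. `PartialItinerary` is a hypothetical hand-written stand-in for the macro-generated `Itinerary.PartiallyGenerated`, and the arrays simulate what each streaming style would deliver.

```swift
// Hypothetical stand-in for Itinerary.PartiallyGenerated (every property optional).
struct PartialItinerary: Equatable {
    var name: String?
    var days: [String]?
}

// Delta streaming: the consumer must accumulate the fragments itself.
let deltas = ["Mt. ", "Fuji ", "Getaway"]
var accumulated = ""
for delta in deltas {
    accumulated += delta
}

// Snapshot streaming: each element is a complete (but partial) value,
// so the consumer simply replaces its state with the newest snapshot.
let snapshots = [
    PartialItinerary(),
    PartialItinerary(name: "Mt. Fuji Getaway"),
    PartialItinerary(name: "Mt. Fuji Getaway", days: ["Day 1"]),
]
var latest = PartialItinerary()
for snapshot in snapshots {
    latest = snapshot
}
```

With snapshots, UI code only needs to assign the newest value to its state, which is why the optional properties fill in progressively.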
@Generable
struct Itinerary {
var name: String
var days: [Day]
}
let stream = session.streamResponse(
to: "Craft a 3-day itinerary to Mt. Fuji.",
generating: Itinerary.self
)
for try await partial in stream {
print(partial) // Incrementally updated Itinerary.PartiallyGenerated
}
Return Type: AsyncSequence<Itinerary.PartiallyGenerated>
struct ItineraryView: View {
let session: LanguageModelSession
let dayCount: Int
let landmarkName: String
@State
private var itinerary: Itinerary.PartiallyGenerated?
var body: some View {
VStack {
if let name = itinerary?.name {
Text(name).font(.title)
}
if let days = itinerary?.days {
ForEach(days, id: \.self) { day in
DayView(day: day)
}
}
Button("Start") {
Task {
do {
let prompt = """
Generate a \(dayCount)-day itinerary \
to \(landmarkName).
"""
let stream = session.streamResponse(
to: prompt,
generating: Itinerary.self
)
for try await partial in stream {
self.itinerary = partial
}
} catch {
print(error)
}
}
}
}
}
}
1. Use SwiftUI animations:
if let name = itinerary?.name {
Text(name)
.transition(.opacity)
}
2. View identity for arrays:
// ✅ GOOD - Stable identity
ForEach(days, id: \.id) { day in
DayView(day: day)
}
// ❌ BAD - Identity changes
ForEach(days.indices, id: \.self) { index in
DayView(day: days[index])
}
3. Property order optimization:
// ✅ GOOD - Title first for streaming
@Generable
struct Article {
var title: String // Shows immediately
var summary: String // Shows second
var fullText: String // Shows last
}
Tools let the model autonomously execute your custom code to fetch external data or perform actions. Tools integrate with MapKit, WeatherKit, Contacts, EventKit, or any custom API.
protocol Tool {
var name: String { get }
var description: String { get }
associatedtype Arguments: Generable
func call(arguments: Arguments) async throws -> ToolOutput
}
import FoundationModels
import WeatherKit
import CoreLocation
struct GetWeatherTool: Tool {
let name = "getWeather"
let description = "Retrieve the latest weather information for a city"
@Generable
struct Arguments {
@Guide(description: "The city to fetch the weather for")
var city: String
}
func call(arguments: Arguments) async throws -> ToolOutput {
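let places = try await CLGeocoder().geocodeAddressString(arguments.city)
// Avoid force-unwrapping: geocoding can succeed without a usable location
guard let location = places.first?.location else {
return ToolOutput("Could not find a location for \(arguments.city).")
}
let weather = try await WeatherService.shared.weather(for: location)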
let temperature = weather.currentWeather.temperature.value
let content = GeneratedContent(properties: ["temperature": temperature])
let output = ToolOutput(content)
// Or if your tool's output is natural language:
// let output = ToolOutput("\(arguments.city)'s temperature is \(temperature) degrees.")
return output
}
}
let session = LanguageModelSession(
tools: [GetWeatherTool()],
instructions: "Help the user with weather forecasts."
)
let response = try await session.respond(
to: "What is the temperature in Cupertino?"
)
print(response.content)
// It's 71˚F in Cupertino!
How it works:
getWeather(city: "Tokyo")
call() method
From WWDC 301: "Model autonomously decides when and how often to call tools. Can call multiple tools per request, even in parallel."
import FoundationModels
import Contacts
struct FindContactTool: Tool {
let name = "findContact"
let description = "Finds a contact from a specified age generation."
@Generable
struct Arguments {
let generation: Generation
@Generable
enum Generation {
case babyBoomers
case genX
case millennial
case genZ
}
}
func call(arguments: Arguments) async throws -> ToolOutput {
let store = CNContactStore()
let keysToFetch = [CNContactGivenNameKey, CNContactBirthdayKey] as [CNKeyDescriptor]
let request = CNContactFetchRequest(keysToFetch: keysToFetch)
var contacts: [CNContact] = []
try store.enumerateContacts(with: request) { contact, stop in
if let year = contact.birthday?.year {
if arguments.generation.yearRange.contains(year) {
contacts.append(contact)
}
}
}
guard let pickedContact = contacts.randomElement() else {
return ToolOutput("Could not find a contact.")
}
return ToolOutput(pickedContact.givenName)
}
}
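The tool above reads `arguments.generation.yearRange`, which the sample does not define. A minimal sketch of what such a property could look like follows. It uses a standalone `Generation` enum so it compiles on its own (in the tool it would be an extension on the nested `Arguments.Generation`), and the birth-year boundaries are an assumption based on commonly cited cutoffs, not values from the original sample.

```swift
// Standalone mirror of the tool's nested Generation enum, without @Generable,
// so that this sketch has no framework dependency.
enum Generation {
    case babyBoomers
    case genX
    case millennial
    case genZ

    /// Birth-year span for each generation.
    /// Assumption: boundaries follow commonly cited cutoffs; adjust as needed.
    var yearRange: ClosedRange<Int> {
        switch self {
        case .babyBoomers: return 1946...1964
        case .genX:        return 1965...1980
        case .millennial:  return 1981...1996
        case .genZ:        return 1997...2012
        }
    }
}

// Usage mirrors the check inside call(arguments:).
let birthYear = 1990
let isMillennial = Generation.millennial.yearRange.contains(birthYear)
```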
Tools can maintain state across calls using class instead of struct:
class FindContactTool: Tool {
let name = "findContact"
let description = "Finds a contact from a specified age generation."
var pickedContacts = Set<String>() // State!
@Generable
struct Arguments {
let generation: Generation
@Generable
enum Generation {
case babyBoomers
case genX
case millennial
case genZ
}
}
func call(arguments: Arguments) async throws -> ToolOutput {
// Fetch contacts...
contacts.removeAll(where: { pickedContacts.contains($0.givenName) })
guard let pickedContact = contacts.randomElement() else {
return ToolOutput("Could not find a contact.")
}
pickedContacts.insert(pickedContact.givenName) // Update state
return ToolOutput(pickedContact.givenName)
}
}
Why: The tool instance persists for the session's lifetime, so it can track what has already been returned.
import FoundationModels
import EventKit
struct GetContactEventTool: Tool {
let name = "getContactEvent"
let description = "Get an event with a contact."
let contactName: String
@Generable
struct Arguments {
let day: Int
let month: Int
let year: Int
}
func call(arguments: Arguments) async throws -> ToolOutput {
// Fetch events from Calendar...
let eventStore = EKEventStore()
// ... implementation ...
return ToolOutput(/* event details */)
}
}
Two forms:
return ToolOutput("Temperature is 71°F")
let content = GeneratedContent(properties: ["temperature": 71])
return ToolOutput(content)
DO:
getWeather, findContact
get, find, fetch, create
DON'T:
gtWthr
From WWDC 301: "Tool name and description put verbatim in prompt. Longer strings mean more tokens, which increases latency."
let session = LanguageModelSession(
tools: [
GetWeatherTool(),
FindRestaurantTool(),
FindHotelTool()
],
instructions: "Plan travel itineraries."
)
// Model autonomously decides which tools to call and when
Key facts:
From WWDC 301: "When tools called in parallel, your call method may execute concurrently. Keep this in mind when accessing data."
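Because call methods may run concurrently, mutable state such as the pickedContacts set in the stateful tool shown earlier needs protection. One option, sketched here on the assumption that a Swift actor suits your tool, is to move that state into an actor. `PickedContactsStore` and `markPicked` are hypothetical names, not part of the framework.

```swift
// Hypothetical helper: serializes access to the set of already-picked names,
// so concurrent tool calls cannot race on it.
actor PickedContactsStore {
    private var picked = Set<String>()

    /// Records `name` and returns true if it had not been picked before.
    func markPicked(_ name: String) -> Bool {
        picked.insert(name).inserted
    }
}
```

Inside an async call method, a tool holding a `store` property of this type could write `await store.markPicked(name)` and skip the contact when the result is false.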
DynamicGenerationSchema enables creating schemas at runtime instead of compile-time. Useful for user-defined structures, level creators, or dynamic forms.
@Generable
struct Riddle {
let question: String
let answers: [Answer]
@Generable
struct Answer {
let text: String
let isCorrect: Bool
}
}
If this structure is only known at runtime:
struct LevelObjectCreator {
let name: String // Schema name, used by `root` and by referenceTo: lookups
var properties: [DynamicGenerationSchema.Property] = []
mutating func addStringProperty(name: String) {
let property = DynamicGenerationSchema.Property(
name: name,
schema: DynamicGenerationSchema(type: String.self)
)
properties.append(property)
}
mutating func addBoolProperty(name: String) {
let property = DynamicGenerationSchema.Property(
name: name,
schema: DynamicGenerationSchema(type: Bool.self)
)
properties.append(property)
}
mutating func addArrayProperty(name: String, customType: String) {
let property = DynamicGenerationSchema.Property(
name: name,
schema: DynamicGenerationSchema(
arrayOf: DynamicGenerationSchema(referenceTo: customType)
)
)
properties.append(property)
}
var root: DynamicGenerationSchema {
DynamicGenerationSchema(
name: name,
properties: properties
)
}
}
// Create riddle schema
var riddleBuilder = LevelObjectCreator(name: "Riddle")
riddleBuilder.addStringProperty(name: "question")
riddleBuilder.addArrayProperty(name: "answers", customType: "Answer")
// Create answer schema
var answerBuilder = LevelObjectCreator(name: "Answer")
answerBuilder.addStringProperty(name: "text")
answerBuilder.addBoolProperty(name: "isCorrect")
let riddleDynamicSchema = riddleBuilder.root
let answerDynamicSchema = answerBuilder.root
let schema = try GenerationSchema(
root: riddleDynamicSchema,
dependencies: [answerDynamicSchema]
)
let session = LanguageModelSession()
let response = try await session.respond(
to: "Generate a fun riddle about coffee",
schema: schema
)
let generatedContent = response.content // GeneratedContent
let question = try generatedContent.value(String.self, forProperty: "question")
let answers = try generatedContent.value([GeneratedContent].self, forProperty: "answers")
Use @Generable when:
Use Dynamic Schemas when:
From WWDC 301: "Compile-time @Generable gives type safety. Dynamic schemas give runtime flexibility. Both use same constrained decoding guarantees."
Random Sampling (Default):
let response = try await session.respond(to: prompt)
// Different output each time
Greedy Sampling (Deterministic):
let response = try await session.respond(
to: prompt,
options: GenerationOptions(sampling: .greedy)
)
// Same output for same prompt (given same model version)
Use greedy for:
Caveat: Output is deterministic only for a given model version. OS updates may change the model, and with it the output.
Low variance (focused, conservative):
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 0.5)
)
// Predictable, focused output
High variance (creative, diverse):
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 2.0)
)
// Varied, creative output
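For intuition about why lower temperatures give more focused output, the sketch below applies the textbook temperature-scaled softmax to a toy set of scores. This is a common formulation, assumed here for illustration; how the framework applies temperature internally is not described in this reference, and `softmax(_:temperature:)` and the scores are hypothetical.

```swift
import Foundation

/// Converts raw scores into probabilities after dividing by `temperature`.
/// Lower temperatures sharpen the distribution; higher ones flatten it.
func softmax(_ logits: [Double], temperature: Double) -> [Double] {
    precondition(temperature > 0, "temperature must be positive")
    let scaled = logits.map { $0 / temperature }
    let maxLogit = scaled.max() ?? 0              // subtract max for numerical stability
    let exps = scaled.map { exp($0 - maxLogit) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

let toyLogits = [2.0, 1.0, 0.0]
let focused = softmax(toyLogits, temperature: 0.5)   // top candidate dominates
let creative = softmax(toyLogits, temperature: 2.0)  // probability spreads out
```

Greedy sampling corresponds to always taking the highest-probability candidate, which is why it is repeatable for a fixed model version.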
Temperature scale:
0.1-0.5: Very focused
1.0 (default): Balanced
1.5-2.0: Very creative
Specialized adapter for:
@Generable
struct Result {
let topics: [String]
}
let session = LanguageModelSession(
model: SystemLanguageModel(useCase: .contentTagging)
)
let response = try await session.respond(
to: articleText,
generating: Result.self
)
With custom instructions:
@Generable
struct Top3ActionEmotionResult {
@Guide(.maximumCount(3))
let actions: [String]
@Guide(.maximumCount(3))
let emotions: [String]
}
let session = LanguageModelSession(
model: SystemLanguageModel(useCase: .contentTagging),
instructions: "Tag the 3 most important actions and emotions in the given input text."
)
let response = try await session.respond(
to: text,
generating: Top3ActionEmotionResult.self
)
exceededContextWindowSize:
do {
let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// Context limit (4096 tokens) exceeded
// Solution: Condense transcript, create new session
}
guardrailViolation:
do {
let response = try await session.respond(to: userInput)
} catch LanguageModelSession.GenerationError.guardrailViolation {
// Content policy triggered
// Solution: Show graceful message, don't generate
}
unsupportedLanguageOrLocale:
do {
let response = try await session.respond(to: userInput)
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
// Language not supported
// Solution: Check supported languages, show message
}
var session = LanguageModelSession()
do {
let response = try await session.respond(to: prompt)
print(response.content)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// New session, no history
session = LanguageModelSession()
}
do {
let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// New session with some history
session = newSession(previousSession: session)
}
private func newSession(previousSession: LanguageModelSession) -> LanguageModelSession {
let allEntries = previousSession.transcript.entries
var condensedEntries = [Transcript.Entry]()
if let firstEntry = allEntries.first {
condensedEntries.append(firstEntry) // Instructions
if allEntries.count > 1, let lastEntry = allEntries.last {
condensedEntries.append(lastEntry) // Recent context
}
}
let condensedTranscript = Transcript(entries: condensedEntries)
// Note: transcript includes instructions
return LanguageModelSession(transcript: condensedTranscript)
}
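The helper above keeps only the first and last entries. If you want to retain more of the recent turns, the same idea generalizes. The sketch below works on any array so it can be checked without the framework; `condensed(_:keepingLast:)` is a hypothetical helper, and how many entries actually fit in the context window is something you would need to measure.

```swift
/// Keeps the first element (the instructions, in a transcript) plus up to
/// `recentCount` of the most recent remaining elements, preserving order.
func condensed<Entry>(_ entries: [Entry], keepingLast recentCount: Int) -> [Entry] {
    guard let first = entries.first else { return [] }
    // dropFirst() ensures the first entry is never counted twice.
    let recent = entries.dropFirst().suffix(max(0, recentCount))
    return [first] + recent
}

// Integers stand in for transcript entries here.
let kept = condensed([1, 2, 3, 4, 5], keepingLast: 2)
```

The result could then be passed to the Transcript initializer in place of the two-entry array used in newSession.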
struct AvailabilityExample: View {
private let model = SystemLanguageModel.default
var body: some View {
switch model.availability {
case .available:
Text("Model is available").foregroundStyle(.green)
case .unavailable(let reason):
Text("Model is unavailable").foregroundStyle(.red)
Text("Reason: \(String(describing: reason))")
}
}
}
let supportedLanguages = SystemLanguageModel.default.supportedLanguages
guard supportedLanguages.contains(Locale.current.language) else {
// Show message
return
}
Device Requirements:
Region Requirements:
User Requirements:
Access: Instruments app → Foundation Models template
Metrics:
From WWDC 286: "New Instruments profiling template lets you observe areas of optimization and quantify improvements."
Problem: First request takes 1-2s to load model
Solution: Create session before user interaction
class ViewModel: ObservableObject {
// Create the session eagerly so it is never nil when generate() runs
private let session = LanguageModelSession(instructions: "...")
init() {
// Prewarm on init: ask the system to load model resources ahead of the first request
session.prewarm()
}
func generate(prompt: String) async throws -> String {
let response = try await session.respond(to: prompt)
return response.content
}
}
From WWDC 259: "Prewarming session before user interaction reduces initial latency."
Time saved: 1-2 seconds off first generation
Problem: Large @Generable schemas increase token count
Solution: Skip schema insertion for subsequent requests
// First request - schema inserted
let first = try await session.respond(
to: "Generate first person",
generating: Person.self
)
// Subsequent requests - skip schema
let second = try await session.respond(
to: "Generate another person",
generating: Person.self,
options: GenerationOptions(includeSchemaInPrompt: false)
)
From WWDC 259: "Setting includeSchemaInPrompt to false decreases token count and latency for subsequent requests."
Time saved: 10-20% per request
// ✅ GOOD - Important properties first
@Generable
struct Article {
var title: String // Shows in 0.2s (streaming)
var summary: String // Shows in 0.8s
var fullText: String // Shows in 2.5s
}
// ❌ BAD - Important properties last
@Generable
struct Article {
var fullText: String // User waits 2.5s
var summary: String
var title: String
}
UX impact: Perceived latency drops from 2.5s to 0.2s with streaming
let feedback = LanguageModelFeedbackAttachment(
input: [
// Input tokens/prompts
],
output: [
// Output tokens/content
],
sentiment: .negative,
issues: [
LanguageModelFeedbackAttachment.Issue(
category: .incorrect,
explanation: "Model hallucinated facts"
)
],
desiredOutputExamples: [
[
// Example of desired output
]
]
)
let data = try JSONEncoder().encode(feedback)
// Attach to Feedback Assistant report
Use for:
Xcode Playgrounds enable rapid iteration on prompts without rebuilding the entire app.
import FoundationModels
import Playgrounds
#Playground {
let session = LanguageModelSession()
let response = try await session.respond(
to: "What's a good name for a trip to Japan? Respond only with a title"
)
}
import FoundationModels
import Playgrounds
#Playground {
let session = LanguageModelSession()
for landmark in ModelData.shared.landmarks {
let response = try await session.respond(
to: "What's a good name for a trip to \(landmark.name)? Respond only with a title"
)
}
}
Benefit: Can access types defined in your app (like @Generable structs)
| Class | Purpose |
|---|---|
| LanguageModelSession | Main interface for model interaction |
| SystemLanguageModel | Access to model availability and use cases |
| GenerationOptions | Configure sampling, temperature, schema inclusion |
| ToolOutput | Return value from Tool.call() |
| GeneratedContent | Dynamic structured output |
| DynamicGenerationSchema | Runtime schema definition |
| Transcript | Conversation history |
| Protocol | Purpose |
|---|---|
| Tool | Define custom tools for model to call |
| Generable | (Not direct protocol) Macro-generated conformance |
| Macro | Purpose |
|---|---|
| @Generable | Enable structured output for types |
| @Guide | Add constraints to @Generable properties |
| Enum | Purpose |
|---|---|
| SystemLanguageModel.Availability | .available or .unavailable(reason) |
| GenerationError | Error types (context exceeded, guardrail, language) |
| SamplingMethod | .greedy or .random |
| Method | Return | Purpose |
|---|---|---|
| session.respond(to:) | Response<String> | Generate text |
| session.respond(to:generating:) | Response<T> | Generate structured output |
| session.streamResponse(to:generating:) | AsyncSequence<T.PartiallyGenerated> | Stream structured output |
| Property | Type | Purpose |
|---|---|---|
| session.transcript | Transcript | Conversation history |
| session.isResponding | Bool | Whether currently generating |
| SystemLanguageModel.default.availability | Availability | Model availability status |
| SystemLanguageModel.default.supportedLanguages | [Language] | Supported languages |
When to migrate:
When NOT to migrate:
Before:
let prompt = "Generate person as JSON"
let response = try await session.respond(to: prompt)
let data = response.content.data(using: .utf8)!
let person = try JSONDecoder().decode(Person.self, from: data)
After:
@Generable
struct Person {
let name: String
let age: Int
}
let response = try await session.respond(
to: "Generate a person",
generating: Person.self
)
Benefits:
WWDC 2025 Sessions:
Apple Documentation:
Axiom Skills:
foundation-models: Discipline skill with anti-patterns, pressure scenarios
foundation-models-diag: Diagnostic skill for troubleshooting
Last Updated: 2025-12-03
Version: 1.0.0
Skill Type: Reference
Content: All WWDC 2025 code examples included