Technology · 7 min read

How Némos Uses Apple's Foundation Models Framework — A Technical Breakdown

Némos is one of the first note apps built entirely on Apple's Foundation Models framework. Here's exactly how LanguageModelSession powers auto-titling, classification, Smart Spaces, and AI chat — all on your device, with zero server calls.

May 29, 2026 · By Taha Baalla

When Apple released the Foundation Models framework with iOS 26, most developers added it as a single feature — a smarter autocomplete here, a quick summary there. Némos took a different approach: every AI capability in the app runs on the Foundation Models framework. No cloud calls. No hybrid fallback. Zero server roundtrips.

This isn't a marketing claim. It's a technical constraint the team chose deliberately — and understanding *why* explains a lot about how Némos is built.

What Is the Apple Foundation Models Framework?

The Foundation Models framework (introduced in iOS 26) exposes Apple's on-device language models to third-party developers. These are the same models powering Apple Intelligence — Writing Tools, Smart Summaries, Priority Notifications — but available via API to apps like Némos.

Key technical properties:

Runs on the Neural Engine (A17 Pro, A18, M-series chips) — not CPU or GPU
~3B parameter language model running entirely on-device
LanguageModelSession API: streaming responses, plus guided generation that returns structured output into types marked with the @Generable macro
No internet required: works offline, whether in airplane mode or in a basement with no signal
No data leaves the device — Apple's privacy guarantee is architectural, not policy

What makes Foundation Models different from calling OpenAI or Anthropic from Swift: the model lives on your phone's chip. The API call never leaves the device. There is no network request. There is no server log of your content.
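To make the "no network request" point concrete, here is a minimal sketch of a local call. It assumes an iOS 26 target on a device with Apple Intelligence enabled. The function name `summarizeLocally` and the instruction text are hypothetical, not taken from the Némos codebase.

```swift
import FoundationModels

/// Hypothetical helper: returns a one-sentence summary, or nil when the
/// on-device model cannot be used on this device right now.
func summarizeLocally(_ text: String) async throws -> String? {
    // The system model reports whether it is usable (eligible device,
    // Apple Intelligence turned on, model assets downloaded).
    guard case .available = SystemLanguageModel.default.availability else {
        return nil
    }

    // A session carries the instructions and the running conversation.
    let session = LanguageModelSession(
        instructions: "Summarize the user's note in one sentence."
    )

    // Inference happens on the device; no URLSession or endpoint is involved.
    let response = try await session.respond(to: text)
    return response.content
}
```

Returning nil on unavailability is only one possible design. An app could equally hide the feature or show an explanation to the user.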

For note apps that handle personal content — voice memos from therapy sessions, meeting notes with sensitive strategy, private thoughts about work relationships — this architectural difference isn't minor. It's the whole product.

How Némos Uses Foundation Models — Feature by Feature

Némos uses the Foundation Models framework for five distinct AI tasks that fire silently as you use the app.

Auto-titling. When you create a memo, Némos generates a descriptive title automatically. No more "Untitled Note 47." The model reads the first 400 tokens of content and produces a title using structured output — the response schema forces the model to return a single String with no filler text. This uses LanguageModelSession with a system prompt specifying title format and a @Generable struct for output. Inference runs in approximately 200ms on A17 Pro.
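The structured-output approach described above might look roughly like the following sketch. The type `MemoTitle`, the helper `leadingWords`, and the prompt wording are assumptions made for illustration. The word-based cut is only a crude stand-in for "the first 400 tokens", since real tokenization differs.

```swift
import FoundationModels

// Hypothetical output schema. Guided generation constrains the model to
// fill in exactly this shape, so no introductory filler text comes back.
@Generable
struct MemoTitle {
    @Guide(description: "A short, descriptive title of at most eight words")
    var title: String
}

/// Keeps the first `limit` whitespace-separated words of the text.
/// Assumption: words approximate tokens closely enough for a sketch.
func leadingWords(of text: String, limit: Int = 400) -> String {
    text.split(whereSeparator: \.isWhitespace)
        .prefix(limit)
        .joined(separator: " ")
}

func generateTitle(for memo: String) async throws -> String {
    let session = LanguageModelSession(
        instructions: "You title personal notes. Use plain words, no quotes, no trailing punctuation."
    )
    let response = try await session.respond(
        to: "Title this note:\n\(leadingWords(of: memo))",
        generating: MemoTitle.self
    )
    return response.content.title
}
```

Wrapping the single string in a `@Generable` struct keeps the title format in the schema rather than relying on the model to follow a free-text formatting request.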

Voice transcription classification. After on-device transcription via the Speech framework, Némos passes the transcript through Foundation Models to extract: topic, intent (action item / idea / meeting note / reminder), and suggested Smart Space. This classification step fires silently in the background — by the time you finish recording, your memo is already filed.
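One way to express the topic, intent, and Smart Space extraction is a `@Generable` struct with an enum for the intent. This is a sketch under assumptions: the type names, field names, and instruction text below are hypothetical, and the transcript is taken as already produced by the Speech framework.

```swift
import Foundation
import FoundationModels

// Hypothetical enum mirroring the four intents named above.
@Generable
enum MemoIntent {
    case actionItem
    case idea
    case meetingNote
    case reminder
}

@Generable
struct MemoClassification {
    @Guide(description: "Main topic in two to four words")
    var topic: String

    @Guide(description: "What kind of memo this is")
    var intent: MemoIntent

    @Guide(description: "Name of the Smart Space this memo best fits")
    var suggestedSpace: String
}

/// Builds the session instructions, listing existing Smart Spaces so the
/// model can reuse one instead of inventing a near-duplicate.
func classificationInstructions(existingSpaces: [String]) -> String {
    let spaces = existingSpaces.isEmpty ? "none yet" : existingSpaces.joined(separator: ", ")
    return "Classify voice memo transcripts. Existing Smart Spaces: \(spaces). Reuse one when it fits."
}

func classify(transcript: String, existingSpaces: [String]) async throws -> MemoClassification {
    let session = LanguageModelSession(
        instructions: classificationInstructions(existingSpaces: existingSpaces)
    )
    return try await session.respond(to: transcript, generating: MemoClassification.self).content
}
```

Because the intent is an enum, the model can only return one of the four listed cases, which removes the need to parse or validate a free-text label.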

Smart Spaces auto-organization. Smart Spaces are AI-generated folders that cluster related memos without manual tagging. Foundation Models generates candidate clusters by comparing memo content semantically, then proposes Space names and membership. The user never sees this happen; memos just appear in the right place.
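The article does not describe the clustering mechanics, so the following is only one plausible shape: send short, numbered memo summaries in a single prompt and ask for named groups back. Everything here (`SpaceProposal`, `SpacePlan`, `numberedListing`, the instruction text) is hypothetical, and with a context window of about 4K tokens this only works for a small batch of summaries at a time.

```swift
import FoundationModels

@Generable
struct SpaceProposal {
    @Guide(description: "Short folder name for a group of related memos")
    var name: String

    @Guide(description: "Numbers of the memos that belong in this group")
    var memoNumbers: [Int]
}

@Generable
struct SpacePlan {
    @Guide(description: "Proposed groups of related memos")
    var spaces: [SpaceProposal]
}

/// Numbers each summary from 1 so the model can refer to memos by number.
func numberedListing(_ summaries: [String]) -> String {
    summaries.enumerated()
        .map { "\($0.offset + 1). \($0.element)" }
        .joined(separator: "\n")
}

func proposeSpaces(for summaries: [String]) async throws -> [(name: String, memoNumbers: [Int])] {
    guard !summaries.isEmpty else { return [] }

    let session = LanguageModelSession(
        instructions: "Group related notes into folders. Use only the memo numbers you are given."
    )
    let plan = try await session.respond(
        to: numberedListing(summaries),
        generating: SpacePlan.self
    ).content

    // Guided generation constrains the shape of the output, not its validity,
    // so discard any memo numbers outside the list that was sent.
    return plan.spaces.map { space in
        (name: space.name,
         memoNumbers: space.memoNumbers.filter { (1...summaries.count).contains($0) })
    }
}
```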

AI Chat with documents. In the chat interface, user questions are answered by passing the relevant memo content plus the question to a Foundation Models session with retrieval context. The session uses streaming output for real-time token display. Context window management is handled by the app — chunking long documents into retrievable segments.
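The chunking and retrieval step can be sketched in plain Swift, independent of the framework. This is an illustration, not the Némos implementation: the word-based chunk size, the overlap, and the keyword-overlap score are all assumptions, and a production app might rank chunks with embeddings instead.

```swift
/// Splits text into chunks of at most `maxWords` words. Neighbouring chunks
/// share `overlap` words, so text near a boundary appears in both.
func chunk(_ text: String, maxWords: Int, overlap: Int) -> [String] {
    precondition(overlap >= 0 && maxWords > overlap)
    let words = text.split(whereSeparator: \.isWhitespace).map(String.init)
    var chunks: [String] = []
    var start = 0
    while start < words.count {
        let end = min(start + maxWords, words.count)
        chunks.append(words[start..<end].joined(separator: " "))
        if end == words.count { break }
        start = end - overlap
    }
    return chunks
}

/// Lowercased set of alphanumeric words in a string.
func wordSet(_ text: String) -> Set<String> {
    Set(text.lowercased().split { !$0.isLetter && !$0.isNumber }.map(String.init))
}

/// Naive relevance: how many distinct question words occur in the chunk.
func score(_ chunk: String, against question: String) -> Int {
    wordSet(question).intersection(wordSet(chunk)).count
}

/// Returns the `k` highest-scoring chunks, best first.
func topChunks(in chunks: [String], for question: String, k: Int) -> [String] {
    chunks
        .map { (chunk: $0, score: score($0, against: question)) }
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map { $0.chunk }
}
```

The selected chunks plus the user's question would then form the prompt for the session, keeping the total comfortably under the model's context window.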

OCR content understanding. After Vision framework OCR extracts text from a screenshot, Foundation Models categorizes the content: receipt, article, social post, code snippet, map location. This category tag drives both display and searchability.
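The two-step pipeline (Vision for text, Foundation Models for the category) could be sketched as follows. The enum cases mirror the categories listed above. The type and function names are hypothetical, and the enum is wrapped in a struct as a conservative choice for this sketch.

```swift
import CoreGraphics
import FoundationModels
import Vision

// Hypothetical enum mirroring the categories listed above.
@Generable
enum ScreenshotCategory {
    case receipt
    case article
    case socialPost
    case codeSnippet
    case mapLocation
}

@Generable
struct ScreenshotTag {
    @Guide(description: "The kind of content the screenshot text comes from")
    var category: ScreenshotCategory
}

/// Step 1: Vision OCR. Returns the recognized lines joined by newlines.
func recognizeText(in image: CGImage) throws -> String {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    // perform(_:) is synchronous, so call this off the main thread.
    try VNImageRequestHandler(cgImage: image, options: [:]).perform([request])
    return (request.results ?? [])
        .compactMap { $0.topCandidates(1).first?.string }
        .joined(separator: "\n")
}

/// Step 2: text-only classification by the on-device language model.
func categorize(screenshot image: CGImage) async throws -> ScreenshotCategory {
    let text = try recognizeText(in: image)
    let session = LanguageModelSession(
        instructions: "Decide what kind of content this screenshot text comes from."
    )
    let tag = try await session.respond(to: text, generating: ScreenshotTag.self).content
    return tag.category
}
```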

Why Foundation Models Over a Cloud API?

The honest answer: the team had a choice.

Cloud AI (OpenAI, Claude, Gemini) gives you larger models, longer context windows, and multimodal input. For many apps, that trade-off makes sense. For Némos, it doesn't — for three reasons.

The content is personal. Voice memos from meetings. Screenshots of private messages. Travel plans. Journal entries. This content should not transit a server. Not because of paranoia, but because users shouldn't have to trust a company's infrastructure choices with intimate data. On-device processing removes the trust question entirely.

Offline-first is a real use case. Subways, planes, conference rooms with unreliable WiFi, international travel without a data plan. If the AI requires internet, half the capture moments fail. Némos AI features work in airplane mode because they run on the Neural Engine, not a data center.

There are no compute costs. Cloud AI costs money per token. Every AI feature in Némos runs on your chip, not theirs. This is why Némos can be free with unlimited saves — there is nothing to bill you for on the AI side.

Current Limitations

Foundation Models has real constraints that are worth being honest about.

The context window (~4K tokens) limits how much content can be processed at once — long documents require chunking. The on-device model isn't as capable as GPT-4o or Claude 4 for complex multi-step reasoning. Multimodal input isn't supported yet — images can't be passed directly to the Foundation Models API, which is why Némos uses a Vision framework OCR pre-processing step before passing text to the language model.
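When an input does overflow the window, the framework reports it as a `LanguageModelSession.GenerationError`. The sketch below shows one simple recovery strategy: trim the input and retry. The halving policy, the 200-character floor, and the function names are arbitrary choices for illustration. An app that chunks documents up front may rarely reach this path.

```swift
import FoundationModels

/// Halves the text, or returns nil once it is too short to be worth retrying.
/// The default floor of 200 characters is an arbitrary choice for this sketch.
func shortened(_ text: String, minimumLength: Int = 200) -> String? {
    text.count > minimumLength ? String(text.prefix(text.count / 2)) : nil
}

func answer(_ question: String, about note: String) async throws -> String {
    var excerpt = note
    while true {
        // A fresh session per attempt keeps each retry independent of earlier turns.
        let session = LanguageModelSession(
            instructions: "Answer the question using only the provided note."
        )
        do {
            let prompt = "Note:\n\(excerpt)\n\nQuestion: \(question)"
            return try await session.respond(to: prompt).content
        } catch let error as LanguageModelSession.GenerationError {
            // Retry only on context overflow, and only while trimming is still possible.
            guard case .exceededContextWindowSize = error,
                  let smaller = shortened(excerpt) else { throw error }
            excerpt = smaller
        }
    }
}
```

Cutting from the end is the bluntest possible policy. Selecting the most relevant segments before building the prompt usually preserves more of the answer.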

Némos works within these constraints by choosing tasks where Foundation Models excels: short-form summarization, classification, structured output generation, and single-turn Q&A with bounded context. It avoids tasks where the model struggles: long-document synthesis, multi-document comparison, or complex instruction-following chains.

What's Next

iOS 26.1 is expected to expand Foundation Models capabilities, with longer context windows and possibly richer tool calling beyond the Tool protocol the framework already includes. Némos is designed to adopt these immediately: no cloud infrastructure to migrate, no prompts engineered for a different model's quirks.

The on-device AI roadmap is clear: more capabilities, running on the same chip, with the same privacy guarantee. Every Apple chip generation makes the on-device model faster and more capable. The gap between on-device and cloud AI is narrowing with each chip revision.

Némos is one of the first apps to build its entire AI layer on Foundation Models. As the framework matures, the app gets meaningfully better without any changes to infrastructure.

---

For more on how on-device AI works in practice, see what on-device AI means for your privacy. For a full breakdown of what Némos captures and organizes, visit the AI note-taking app overview. The Taha Baalla author page has more context on the technical choices behind Némos.

FAQ

Does Némos use Apple Intelligence? Yes. Némos uses the Foundation Models framework, Apple's public API for the same on-device language model that powers Apple Intelligence features (Writing Tools, Smart Summaries, Priority Notifications). All AI processing in Némos (auto-titling, classification, Smart Spaces, AI chat) uses this framework. It requires an iPhone 15 Pro or later (A17 Pro chip), any iPhone 16 model, or an M-series iPad.

Does Némos send my notes to any server? No. The Foundation Models API is a local API — all calls are processed by the Neural Engine on your device. Némos has no server that receives your content. There is no account required, no sign-in, no cloud sync beyond iCloud (which you control). Your notes never leave your device except to iCloud, which is end-to-end encrypted.

What is the Foundation Models framework? Foundation Models is an Apple framework introduced in iOS 26 that lets developers use Apple's on-device language model (~3B parameters) via Swift API. It provides LanguageModelSession for text generation, structured output via @Generable, streaming token generation, and guided generation. It runs on the Neural Engine with no network connection required.

Which iPhones support Foundation Models? Foundation Models requires the A17 Pro chip or later: iPhone 15 Pro, iPhone 15 Pro Max, iPhone 16, iPhone 16 Plus, iPhone 16 Pro, iPhone 16 Pro Max, and all M-series iPad models. Older devices running iOS 26 will have Némos but without the AI features that require Foundation Models.

How does Némos auto-title voice memos? When you finish recording, Némos transcribes the audio on-device using Apple's Speech framework, then passes the transcript to a Foundation Models LanguageModelSession with a system prompt that specifies title format. The model returns a @Generable struct with a single title string. This runs in the background in approximately 200ms on A17 Pro — your memo has a title by the time the recording indicator disappears.

Taha Baalla·Founder, Némos

Taha built Némos after years of losing screenshots and voice memos across a dozen apps. He writes about on-device AI, personal knowledge management, and building privacy-first tools for iPhone.
