I prep 4 podcast episodes a week. Here's the iPhone-only system that replaced 3 apps.
How working podcasters capture guest research, episode ideas, soundbite voice memos, and competitor analysis in one place — without sending every transcript to Otter or Granola's servers.
If you make a podcast, you have an episode-prep stack that includes Otter for transcripts, Notion for the show bible, Apple Notes for the question list, Voice Memos for soundbite ideas, Pocket for the articles you'll quote, and a Google Doc for the episode outline. You also have a constant low-grade panic that the link between all six of those is your brain — and your brain is mostly thinking about the next episode.
The episode prep loop, in real life
You book a guest two weeks out. Here's how the prep actually goes for an organized indie podcaster:
- Day 1: You read their book / scroll their feed / listen to 3 of their other interviews. You take 30 screenshots and 8 voice memos along the way.
- Day 4: You read a competitor's interview with the same guest. You screenshot 6 of their best questions. You voice-memo: "ask her about the 2019 pivot — the one she keeps avoiding."
- Day 9: You build the question list. You search your notes. You find 4 of the 30 screenshots from Day 1. The other 26 are *somewhere* in your camera roll. You give up.
- Day 12: You record. The questions are fine. They're not your best. You leave the studio knowing you forgot at least one great angle.
The bottleneck is Day 9. You captured plenty. You couldn't retrieve. That's where a second brain pays for itself.
What an indie podcaster's second brain actually needs to do
Five jobs:
- OCR every guest-research screenshot. Quotes, bios, headlines, tweet text — all searchable.
- Transcribe every voice memo on-device. Searchable transcripts without sending audio to a vendor.
- Connect captures across modes. A voice memo tagged "Acme guest" should pull up the screenshots with the same tag.
- Surface old captures during prep. Three months later, when you book a similar guest, the old research should re-surface.
- Keep the IP private. Your prep, your reactions, your unaired thoughts — none of those belong on Otter's servers.
On-device AI is what makes the first, second, and fifth jobs possible at the same time. The rest is a UX problem.
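To make the tagging and retrieval jobs concrete, here is a minimal Python sketch of a capture library where screenshots and voice memos share tags and a single search. It is not how Némos is built. The `Capture` class, the `search` function, and the sample data are all hypothetical, and the sketch assumes OCR and transcription have already turned each capture into text.

```python
from dataclasses import dataclass, field


@dataclass
class Capture:
    kind: str  # "screenshot" or "voice_memo" (hypothetical labels)
    text: str  # OCR output or transcript, assumed already extracted
    tags: set[str] = field(default_factory=set)


def search(captures: list[Capture], query: str) -> list[Capture]:
    """Return captures whose text or tags contain every word in the query."""
    words = query.lower().split()
    hits = []
    for capture in captures:
        # Text and tags go into one searchable string, so a tag match
        # and a transcript match are treated the same way.
        haystack = (capture.text + " " + " ".join(sorted(capture.tags))).lower()
        if all(word in haystack for word in words):
            hits.append(capture)
    return hits


library = [
    Capture("screenshot", "Bio: founder of Acme, pivoted in 2019", {"acme guest"}),
    Capture("voice_memo", "ask her about the 2019 pivot", {"acme guest"}),
    Capture("voice_memo", "episode idea about craft beer", {"episode-pitch"}),
]

for hit in search(library, "acme guest"):
    print(hit.kind, "|", hit.text)
```

The point of the sketch is the shared index: once both kinds of capture live in one list with text and tags, one query pulls up the screenshot and the voice memo together.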
Why Otter and Granola aren't enough for prep
Otter and Granola are excellent at one specific thing: meeting transcription with speaker diarization. They are *not* second brain tools. They are *not* designed for prep. Here's the gap:
- Otter stores every transcript in the cloud. You can search. You cannot tag across types: your screenshots aren't there, your text notes aren't there, your reactions aren't there. "Némos vs Otter" walks through this.
- Granola is meeting-shaped. It assumes the audio you care about is a live meeting. Most podcast prep is not a live meeting. It's reading, scrolling, and voice memos in the car. "Némos vs Granola" covers the shape mismatch.
- AudioPen is voice-to-summary, which is genuinely great for capture, but flat: there's no connection layer underneath. "Némos vs AudioPen" expands on this.
You probably still want one of those for the actual episode recording. But for prep, the second-brain layer is the missing piece.
The 14-day pre-interview workflow
Picture a real interview prep, 14 days out. Here's the Némos-shaped version:
Day 1 — read. You scroll the guest's Twitter. You screenshot 18 tweets that catch your eye. Némos OCRs them on-device. Searchable.
Day 2 — listen. You play their last interview at 1.5x. Every time something pops, you hold the side button and dictate a 3-second voice memo. "Ask her about the part where she pivoted from agency work — that's the underexplored angle." 12 voice memos by the end of the interview. All transcribed on-device.
Day 5 — competitor research. You listen to two other shows that interviewed the same guest. More screenshots. More voice memos. Now you have ~50 captures, all tagged automatically with the guest's name (because Némos detected it).
Day 9 — question synthesis. You open Némos. You type the guest's name. Fifty captures come up. You skim. You ask the on-device AI: "give me 8 question angles based on these captures, grouped by theme." You get 8 clean angles. You write your actual question list on top of them — but the AI gave you the scaffolding.
Day 14 — record. You walk in with a question list you actually trust. You ask the unexpected angle. The guest lights up. The episode is your best in months.
This isn't AI writing your interview. This is AI organizing your own research, on-device, into a form you can use. Your judgment is still doing all the work that matters.
Why on-device AI matters specifically for podcasters
You are recording people. You are taking notes on people. Some of those people are journalists. Some are academics. Some are public figures with PR teams that read everything. If your prep notes — your *reactions* to their work, your gut takes, your "ask her about the divorce" — leak via a cloud-AI vendor breach, that is a *real* problem.
Cloud AI tools have leaked before. Notion AI had a notable incident in 2024. Otter had a smaller one. The pattern is well-documented. "Is Némos private?" is the architecture answer for why on-device is different. "On-device AI vs cloud AI" is the longer essay.
The soundbite voice memo (the secret weapon)
Most podcasters have a folder of voice memos titled "episode ideas." Most never get used because they're not searchable. The fix:
Every time you have an episode idea, dictate 20–60 seconds. *Pitch yourself the episode*. "Episode about how craft beer brands are getting acquired by Coke — angle is the founder who tried to refuse the acquisition. Three guests: the founder, the M&A lawyer, the buyer." The on-device transcription catches it. The on-device AI tags it. Six months later you search "craft beer acquisition" and the full voice memo plus any related screenshots you've collected since are there.
"Best voice recording app for iPhone" covers the voice memo stack more broadly. The trick for podcasters is treating voice memos as the idea inbox, not just the recording archive.
The Apple Watch flow
For podcasters, the Apple Watch is the *driving* second brain. You're driving. NPR is on. They mention something you want to follow up on. You raise your wrist, hold the side button, whisper "NPR Wednesday morning, story about [topic], get the original NYT piece." The audio captures on the watch, transcribes on the phone, and is ready when you sit down. "Apple Watch capture flow" is the deeper write-up.
The episode-post-mortem habit
After every episode, take 90 seconds to record a self-debrief into Némos. What worked. What didn't. What you'd ask differently. Tag it with the guest's name and the episode number. Three months later when you're prepping a similar guest, those debriefs surface and you avoid repeating the same mistake.
This is the "knowledge management system for personal use" pattern applied to interview craft. The episode-post-mortem is the highest-leverage 90 seconds in a podcaster's week.
What about Riverside, Descript, Cleanfeed, etc.?
Those are *production* tools. They handle the recording, the editing, the cleanup. They are not prep tools. You can keep using them. Némos doesn't compete with the production stack — it sits upstream of it, where the *thinking* happens.
The full stack for an indie podcaster, in 2026:
- Prep: Némos (capture, retrieval, on-device AI)
- Record: Riverside, Cleanfeed, or local studio
- Edit: Descript or Logic Pro
- Publish: Transistor, Buzzsprout, or your own RSS
Each layer optimizes for its own moment. Mixing them is the mistake.
Three moves for the next two weeks
- For your next interview, run the Némos prep loop end to end. Compare the question list quality to your last episode.
- Audit your existing voice memos. Anything you'd want findable in 6 months — re-record into Némos so it's transcribed and tagged.
- Read [the top 10 second brain apps for 2026](/blog/top-10-second-brain-apps-2026). If Némos isn't the right shape for you, one of the alternatives might be.
Guest research workflow: LinkedIn, Twitter, past podcasts, and the deep cuts
The difference between a memorable interview and a forgettable one is almost always the depth of pre-show research. Surface research — "I read their book and skimmed their Wikipedia" — produces surface questions. Deep research — "I listened to 8 of their past interviews and noticed they always pivot away from a specific topic" — produces the questions that make episodes go viral.
The Némos research stack for a typical 14-episodes-per-month podcaster: LinkedIn for professional history, Twitter for current voice, past podcasts for question gaps, books and long-form pieces for substance. Each layer produces its own captures. LinkedIn screenshots get tagged with the guest name and "career-history." Twitter screenshots get tagged with "current-takes." Past-podcast voice memos ("interviewer never followed up on the 2019 pivot — that's the gap") get tagged with "question-angles." By the time the interview day arrives, you have 50-80 captures organized by guest, each with a specific role in the conversation you're going to have.
The compounding effect over a year is real. By month 12, you've done 168 episodes' worth of guest research. The patterns surface — certain types of guests always avoid certain topics, certain professional backgrounds correlate with certain interview rhythms, certain industries produce guests with predictable narrative arcs. Your interview craft improves not because of talent but because your archive of guest patterns is bigger than any other podcaster's in your niche.
Episode outline capture (the workflow that prevents the 3am panic)
Every podcaster has had the 3am moment where you sit up in bed because you realized you don't have a clear narrative arc for tomorrow's interview. The questions are written but they don't add up to a story. The guest is going to be on for 75 minutes and you're going to spend most of that time on tangents.
The fix is an explicit outline capture habit. The week before each episode, dictate a 90-second voice memo describing the narrative arc you want the conversation to follow. "We open with her early career mistake, build through the pivot, climax with the controversial 2024 decision, resolve with the looking-forward question." The on-device transcription captures it. Tag with the episode number. The on-device AI summarizes your scattered captures (research, voice memos, screenshots) into a draft outline that fits the arc. You edit. You print. You walk into the studio with a story to tell, not just a list of questions to ask.
This pattern distinguishes interview-podcasters who get booked on bigger shows from interview-podcasters who plateau. The technical interview craft (mic technique, pacing, follow-up questions) is teachable. The *narrative arc instinct* is what makes an interview feel like an episode instead of a conversation. The arc has to be planned. The capture habit is how you plan it efficiently.
Soundbite voice-memo tagging
Every podcaster collects soundbite ideas — phrases, framings, openings, closings that they want to use *someday*. Most podcasters lose 80% of them because the storage layer is flat. Voice memos are stored chronologically. Apple Notes is stored chronologically. Neither makes "find me the opening I drafted six months ago about creator economy fragmentation" a solvable problem.
The Némos pattern: every soundbite idea gets a voice memo plus a one-word tag. "Opening" or "closing" or "transition" or "ad-spot." The on-device transcription captures the actual words. Tag with the topic theme. Six months later when the right episode shows up, you search "opening creator economy" and your draft from last spring surfaces. You polish it. It's the strongest opening you've done in months because it's been *cooking* in your library for half a year.
Soundbite cooking is real. The opening you draft today and use tomorrow is rarely as good as the opening you draft today, forget about, and rediscover three months later. Time makes openings better. The capture habit is what lets that time do its work.
Post-show editor notes (the workflow that saves the editor 4 hours per episode)
If you have an editor — and most indie podcasters at scale do — the post-show notes you send them are the single highest-leverage 10 minutes of your podcasting week. Good editor notes mean clean cuts, tight pacing, and the right soundbites surfaced. Bad editor notes mean three rounds of revisions and a delayed publish.
The pattern: immediately after the recording ends, take 10 minutes to dictate editor notes while the conversation is still fresh. "Around minute 18, she said something brilliant about platform incentives — pull that as the cold open. Around minute 34, we had a long tangent on personal finance — cut the whole 4-minute block, it doesn't fit the arc. Around minute 52, my laugh was way too loud — gate it down." Tag with the episode number. Send to the editor. The editor knows exactly what to do. The episode publishes on time. The audience hears a tighter show.
This single workflow has saved an embarrassing number of hours for editors I've talked to. The capture cost is 10 minutes. The output is roughly 4 hours of saved editor time per episode. At $40-80/hour for a decent podcast editor, that's $160-320 in real money per episode, before counting the value of faster turnaround.
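The dollar figure is just hours saved times the editor's rate. Here is a small Python sketch of that arithmetic, using the 4-hour and $40-80 numbers above. The function name `editor_savings` is made up for illustration.

```python
def editor_savings(hours_saved: float, hourly_rate: float) -> float:
    """Money saved per episode: editor hours avoided times the hourly rate."""
    return hours_saved * hourly_rate


low = editor_savings(hours_saved=4, hourly_rate=40)
high = editor_savings(hours_saved=4, hourly_rate=80)
print(f"${low:.0f}-{high:.0f} per episode")  # prints "$160-320 per episode"
```

Swap in your own editor's rate and your own estimate of hours saved. The 4-hour figure is a rough estimate, so treat the output as a range rather than a promise.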
Sponsor brief library
Indie podcasters with sponsorships end up managing 10-30 active sponsor relationships at any given time. Each sponsor has their own brief, their own talking points, their own forbidden topics, their own delivery preferences. The cognitive load of tracking all of this is real.
The pattern: every sponsor brief that arrives, save the PDF to Némos. Voice memo the key constraints: "BetterHelp brief — don't say 'mental health' explicitly per their guidance, use 'emotional well-being' instead, lead with the convenience angle, never compare to in-person therapy." Tag with the sponsor name. When you're recording an ad read, you search the sponsor name and the full brief plus your interpretation surface together. The ad reads cleaner. The sponsor renews. The CPM stays high.
This is also where the Riverside handoff gets cleaner — you record your ad reads with the brief context fresh in your head, which means fewer takes, which means a happier editor and a faster publish cycle.
Social-media-promo capture
Every episode produces 3-8 social media promo opportunities. The 60-second clip for Instagram Reels. The pull-quote for Twitter. The screenshot of the most compelling moment for LinkedIn. The behind-the-scenes still for the newsletter. Most podcasters capture none of these because by the time the episode is edited, the moments have faded from memory.
The pattern: while you're recording, voice-memo the promo moments as they happen. "Minute 23 — she just said something that's going to be the Twitter quote. Minute 41 — that laugh from both of us would be a great Instagram clip. Minute 58 — the lightning round questions are perfect for the newsletter teaser." Tag with the episode number and the platform. When the episode is published, you (or your social-media person) search the episode number and the full promo plan is there with timestamps. Cutting the clips becomes a 30-minute task instead of a 3-hour task. The promo schedule actually happens.
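If your dictated notes follow a consistent "minute N" phrasing, turning a transcript into a timestamped cut list is a small parsing job. The Python sketch below is one hedged way to do it. The `extract_timestamped_notes` function and its regex are my own illustration, not a Némos feature, and the sketch assumes the transcription renders numbers as digits.

```python
import re

# Matches "Minute 23" or "Around minute 18", plus any punctuation after the number.
MARKER = re.compile(r"(?:around\s+)?minute\s+(\d+)\W*", re.IGNORECASE)


def extract_timestamped_notes(transcript: str) -> list[tuple[int, str]]:
    """Split a dictated transcript into (minute, note) pairs."""
    matches = list(MARKER.finditer(transcript))
    notes = []
    for index, match in enumerate(matches):
        # Each note runs from the end of its marker to the start of the next one.
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(transcript)
        notes.append((int(match.group(1)), transcript[match.end():end].strip()))
    return notes


memo = (
    "Minute 23, she just said something that's going to be the Twitter quote. "
    "Minute 41, that laugh from both of us would be a great Instagram clip."
)

for minute, note in extract_timestamped_notes(memo):
    print(f"{minute:>3} min | {note}")
```

The same parsing works for editor notes, since they use the same "around minute N" habit. The cut list is only as good as the dictation, so say the minute first and the note second, every time.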
The compounding effect over six months is significant. Podcasters who run this loop reliably post 3-5 promo clips per episode and see download growth that podcasters relying on memory don't. The capture habit is the unlock.
Listener email triage
Once your podcast hits a certain scale (~1,500 weekly listeners is the inflection point for most niches), the listener emails start arriving. Initially this feels great. By month 6 it feels like another inbox to manage. By month 12, important emails are being lost in the noise.
The Némos pattern for listener email: any email worth saving (a deep question, critical feedback, a guest suggestion, a sponsor inquiry) gets screenshotted into Némos with a voice-memo classification. Tag with "listener-question" or "guest-suggestion" or "feedback" or "sponsor-inquiry." Quarterly, you review the captures and identify patterns. The questions listeners keep asking? Those become future episodes. The guests they keep suggesting? Those become booking targets. The feedback themes? Those inform the show's direction.
This turns listener email from a triage burden into a *signal source*. The captures become an indexed audience-research database. Many indie podcasters report that their best episodes come from the question patterns they noticed across listener emails — patterns that would be invisible without the capture habit.
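The quarterly review is, at its core, a count of tags. Here is a minimal Python sketch, assuming each saved email is represented as a (tag, summary) pair. The sample data and the `tag_counts` helper are hypothetical.

```python
from collections import Counter


def tag_counts(captures: list[tuple[str, str]]) -> Counter:
    """Count how many saved emails fall under each tag."""
    return Counter(tag for tag, _summary in captures)


quarter = [
    ("listener-question", "How do you find guests?"),
    ("guest-suggestion", "Interview the host of a climate show"),
    ("listener-question", "What mic do you use?"),
    ("feedback", "Intro runs too long"),
    ("listener-question", "How do you prep for an interview?"),
]

# Most frequent tags first: these are the themes worth a closer read.
for tag, count in tag_counts(quarter).most_common():
    print(f"{tag}: {count}")
```

The count only tells you where to look. Reading the summaries under the busiest tag is still where the episode ideas come from.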
Conference networking capture
Conferences in the podcast world (Podcast Movement, On Air Fest, the various creator events) are essentially interview booking opportunities. Every meaningful conversation is a potential guest. Every potential guest is a person whose context you need to remember weeks later when you actually reach out.
The pattern: after every meaningful conversation, once the person walks away, dictate a 30-second voice memo. "Just met [name] at the Apple Podcasts booth. Works on a show about climate finance, mentioned wanting to talk about the 2022 IRA bill, asked me to reach out next month after their book launch in October." Tag with "conference-contact." When you sit down to do guest outreach two months later, you search the conference tag and the full conversation context surfaces. The cold email isn't cold. It's "we met at Podcast Movement, you mentioned your book launch was in October, I'd love to talk after that."
The booking rate on warm outreach is materially higher than the booking rate on cold outreach. Capture is what makes warm outreach possible at scale.
Equipment review references
This is a niche workflow but the right audience will recognize it. Most working podcasters end up obsessing over equipment at some point — microphones, interfaces, software, plugins, post-processing chains. The internet is full of equipment reviews. Most of them are useless. The few that are useful are scattered across blogs, YouTube videos, and Reddit threads that will be gone in three years.
The pattern: every useful equipment review you encounter, capture the relevant section. Voice memo the takeaway: "Justin Snell's review of the Shure SM7B vs SM7dB — concludes the dB is worth it if you're traveling and don't want to carry the Cloudlifter. Use this when I'm next thinking about the road kit." Tag with the equipment category. Six months later when you're upgrading, your personal equipment-research library surfaces with the takeaways already condensed. You don't redo the research — you trust your past self.
The Apple Watch driving capture (the unfair advantage)
This is worth its own section because it's the workflow that most distinguishes Némos from the alternatives. Podcasters drive. We drive to studios, to coffees with guests, to conferences, to recording locations. Driving is one of the highest-density idea-generation environments in any creative worker's life. Most of those ideas die because you can't safely break focus to write them down.
The pattern: raise wrist, hold side button, speak. "Episode idea: the rise of 'cozy podcasts' as a backlash to the algorithmic interview show. Three guests would be the host of [show], the editor of [magazine], the founder of [studio]." Release. The capture lands on the phone and is transcribed by the time you're parked. Tag it with "episode-pitch" when you get to your desk. Over a year, you generate 40-60 episode pitches that you would have lost otherwise. Some of them become real episodes. Some of them inform the show's overall direction. None of them are lost.
This is the Apple Watch capture flow that podcasters underuse. The watch is for the moments you can't reach the phone — driving, walking, exercising, falling asleep. Those are the moments when the best ideas arrive.
Try Némos
Free. TestFlight beta. Waitlist on the homepage. Tell us what show you make when you join — we love seeing podcasters get unstuck.
Other guides for your role
- Doctors & Clinicians: I'm an ER doc with 8,400 saved articles. Here's the iPhone setup that finally works.
- Lawyers: Why 73% of associate attorneys we surveyed use Apple Notes wrong (and what to do about it)
- Designers: I took 14,000 inspiration screenshots in 2 years. Here's the iPhone system that made them findable.
- Developers: I'm a senior dev. I lost 400 voice ideas to AirPods this year. Here's the iPhone fix.
- Parents: Two kids, one iPhone, zero brain cells. The capture system that saved my year.