Where Did All My Money Go?
Notes on intent-driven interfaces, voice-first finance, and building Moniewave, a conversational payments experiment powered by MCP, OpenAI voice agents, and Paystack.
The Idea
Lately I’ve been thinking about interfaces. Not in the “what color should this button be” way, but in a more fundamental sense. (On intent-driven UI: OpenAI ChatKit for building interactive experiences inside ChatGPT; CopilotKit Generative UI, covering AG-UI, A2UI, and MCP Apps patterns; Vercel AI SDK Generative UI for streaming, tool-based UI generation.)
A while back, I asked a designer on my team a simple question: what if the future of apps isn’t pre-designed screens at all, but AI generating the UI as you go?
We already have hints of this. If you’ve used a smartphone long enough, you’ve used Siri or Alexa. They’re voice-first interfaces. You say what you want, and something happens. But they’re limited. Mostly informational. Ask, respond, done.
That got me wondering: what if apps didn’t exist as static interfaces anymore? What if they only rendered when and where they were needed, based entirely on intent?
On paper it sounds simple. We’ve seen versions of it in movies for years. In practice, it’s not.
Voice design alone is hard. Add interface design. Then add the problem of describing UI to an AI, and asking that AI to decide which UI matters right now, especially when actions are destructive or critical. At that point you’re no longer just designing screens. You’re designing systems.
Then it clicked.
OpenAI announced that apps were coming to ChatGPT, and that announcement is what sparked the Moniewave idea. What stood out to me wasn’t “apps in chat.” It was that they had solved something much deeper: dynamic, decision-driven interfaces inside a conversational environment. You could browse, select, decide, even check out, all without traditional navigation. That felt like the missing piece. (See also the OpenAI Agentic Commerce Protocol, the open-sourced protocol powering instant checkout in ChatGPT, and proof that conversational commerce with real transactions is viable.)
So naturally, my next thought was: why don’t I try building with this?
Moniewave
I needed a use case that made the idea obvious. Something universal. Something conversational by nature.
Money is emotional. Messy. Conversational. It felt like the right playground.
I imagined it as a personal finance assistant you talk to, not tap through. One that could handle paying people, tracking expenses, managing invoices, splitting bills with friends, even handling those very human, very unstructured requests like: “Sis, abeg, help me with 2k.”
From there, things got real pretty quickly.
I needed a payments backend. I started with Flutterwave’s sandbox but ran into compliance walls, so I moved to Paystack: clean APIs, solid docs, Stripe-backed. (The Paystack Developer Docs cover cards, bank transfers, USSD, and virtual accounts in Nigeria; see also the full API reference.) Still, fintech isn’t something I like to build alone, so I called a friend who’s built core banking systems and asked him a very specific question:
“If I wanted to wire this into MCP in under a day, how would you do it?” (MCP is the Model Context Protocol, Anthropic’s open protocol for connecting AI assistants to external tools. It’s JSON-RPC 2.0 based, with SDKs on GitHub in Python, TypeScript, C#, and Java.)
He sent me an MCP server. I spun it up, connected it to Claude, and just started talking to it. “Create a transaction.” “Send money.” “Track this.” And it worked.
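Under the hood, MCP is JSON-RPC 2.0, so “create a transaction” ultimately travels as a `tools/call` message. Here is a minimal sketch of what the client frames and sends; the tool name and argument shape are hypothetical (they depend on how the server defines its tools), only the envelope follows the protocol:

```python
import json

def mcp_tool_call(call_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call request (MCP is JSON-RPC 2.0 on the wire)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": call_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# "Send money" as the client would frame it for a hypothetical server tool:
msg = mcp_tool_call(1, "create_transaction", {
    "amount": 2000,            # unit (kobo vs. naira) is server-defined
    "recipient": "Chika",
    "currency": "NGN",
})
print(msg)
```

Everything Claude did when I said “Create a transaction” reduces to messages of this shape flowing back and forth.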
But there was a problem.
What I had was a chat interface. And that wasn’t the goal.
The Voice Problem
The goal was voice-first. Fluid. The kind of experience where you talk naturally, and the UI only appears when it has to, for confirmations, reviews, and irreversible actions. Everything else stays out of the way.
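That gating rule is easy to sketch. Below is a purely illustrative decision layer (the intent names and risk table are made up, not from any SDK): informational intents stay voice-only, and only destructive or critical ones summon a confirmation surface.

```python
# Hypothetical sketch of intent-to-surface routing. The vocabulary here
# (RISK table, surface names) is invented for illustration.
RISK = {
    "check_balance": "low",      # informational: a voice reply is enough
    "list_expenses": "low",
    "send_money": "critical",    # irreversible: must render a confirm UI
    "cancel_invoice": "critical",
}

def decide_surface(intent: str) -> str:
    """Return which surface to use for a recognized intent."""
    risk = RISK.get(intent, "unknown")
    if risk == "low":
        return "voice_only"
    if risk == "critical":
        return "render_confirmation_ui"
    return "ask_for_clarification"  # unknown intents never act silently

print(decide_surface("send_money"))  # render_confirmation_ui
```

The important property is the default: anything the system doesn’t recognize falls through to clarification, never to silent execution.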
I explored a few platforms. ElevenLabs Conversational AI stood out: agent-based, and MCP-compatible via their official MCP server. In their test environment, it worked beautifully. I defined assets, connected the MCP server, everything flowed.
Until it didn’t.
As the MCP server grew more complex, things started breaking. Configurations wouldn’t save. Parameters disappeared. More importantly, I realized something structural: I didn’t actually have control on the client side. Intercepting intent or UI decisions would mean digging into SDK internals.
Worse, it was voice-to-voice. My voice to model to voice. No real middleware layer. No place to reason about intent, UI, or safety in between.
That’s when I stopped patching around it and decided to rewrite.
The Rewrite
OpenAI’s Agents SDK had just enough of what I needed: real-time audio, speech-to-text, text-to-speech, and crucially, tools. (The voice agent framework ships with a reference implementation, and the gpt-realtime model brings improved instruction following and more natural speech.) The tradeoff was obvious: I’d have to rewrite all my MCP handlers as agent tools.
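The rewrite is mostly mechanical: each MCP tool already carries a JSON Schema for its parameters, and that schema maps onto a function-tool definition the agent can call. A rough sketch of the translation follows; the handler and dispatch table are hypothetical glue, and the spec shape follows OpenAI’s function-calling conventions rather than any Moniewave-specific code:

```python
# Sketch: an MCP handler reborn as a plain function tool.
# Handler body and dispatch table are illustrative, not SDK code.
def create_transaction(amount: int, recipient: str) -> dict:
    """The old MCP handler body, now a plain function the agent can call."""
    return {"status": "pending_confirmation", "amount": amount, "to": recipient}

TOOL_SPEC = {
    "type": "function",
    "name": "create_transaction",
    "description": "Create a pending payment that the user must confirm.",
    "parameters": {
        "type": "object",
        "properties": {
            "amount": {"type": "integer", "description": "Amount in naira"},
            "recipient": {"type": "string"},
        },
        "required": ["amount", "recipient"],
    },
}

# Dispatch table: when the model emits a tool call, look up and run it.
TOOLS = {TOOL_SPEC["name"]: create_transaction}
result = TOOLS["create_transaction"](amount=2000, recipient="sis")
print(result["status"])  # pending_confirmation
```

Since both sides of the migration are just JSON Schema plus a Python function, most of it could be done by rote, which is exactly what I asked Claude to do.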
Conveniently, Claude emailed me saying I had 1,000 credits to burn.
So I grabbed a code snippet, fed Claude the MCP logic, and asked it to rewrite the whole thing as API endpoints. Suddenly I had a full backend, a frontend, and a working surface to play with.
The Intent Layer
During this whole process, I kept coming back to how ChatGPT’s “Create a GPT” framework works. Underneath the friendliness, it’s deeply structured. JSON Schema is doing a lot of heavy lifting, describing objects, actions, even UI affordances.
That was the breakthrough.
If AI understands structured text this well, then UI itself can be described, not hard-coded. JSON Schema becomes a bridge: intent to structure to interface. I experimented with an idea I called “optimizedNotation”, basically JSON streamlined for AI-first UI descriptions, though not all of it made it into the final build.
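A toy version of that bridge is sketched below. The component vocabulary (`confirmation_card`, `label`, `button`) is invented for illustration and the real optimizedNotation differs, but the shape of the idea holds: the agent emits a declarative spec, and a client that knows nothing about the intent walks the tree and renders it.

```python
# Toy intent-to-interface pipeline: the agent emits a declarative UI spec
# (plain JSON-shaped data); the client renders it without knowing the intent.
# Component names here are invented for illustration.
ui_spec = {
    "component": "confirmation_card",
    "title": "Send \u20a62,000 to Sis?",
    "children": [
        {"component": "label", "text": "Fee: \u20a610"},
        {"component": "button", "text": "Confirm", "action": "confirm_transfer"},
        {"component": "button", "text": "Cancel", "action": "dismiss"},
    ],
}

def render(node: dict, depth: int = 0) -> list[str]:
    """Flatten a spec into indented lines, standing in for a native renderer."""
    label = node.get("title", node.get("text", ""))
    lines = ["  " * depth + f"[{node['component']}] {label}"]
    for child in node.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines

print("\n".join(render(ui_spec)))
```

Because the spec is data, it can be validated against a schema before rendering, streamed incrementally, and diffed for updates, which is what makes it workable for an agent.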
This is the same territory Google’s A2UI protocol, Vercel’s json-render, and our own Hezra have been exploring: declarative, schema-driven UI that agents can generate, stream, and update incrementally. (A2UI: agents send declarative JSON specs and clients render them natively. json-render: schema-driven UI with React Native support. Hezra: our own early intent-to-interface engine.)
Where It Sits Now
Everything is in place: MCP server, voice agent frontend, Paystack backend, intent schema, UI generation schema. The only thing left is to deploy it and see if the idea holds.
I’m still sitting with the implications.
If this works, apps stop being destinations and start being behaviors. Interfaces stop being static artifacts and start becoming momentary tools, summoned only when needed, dismissed when they’re not.
If there’s interest, I can pull together the architecture diagram and code snippets: MCP server, voice agent, UI renderer, payment API. But for now, this is just a note. Something I’ve been thinking about.
Video Script: Moniewave
Two documentary-style story arcs. Both open with the same hook. Warm Lagos realism, cinematic B-roll, natural lighting, handheld feel. Lo-fi Afrobeats instrumental builds subtly with the narrative. AI voice for Moniewave: soft female Nigerian accent, clear diction.
Recipe 1: The Freelancer Story — “The Work Never Stops”
Scene 1 — The Hook: "Money Moves Fast"
VO: "You've got ten apps on your phone — one for bills, one for transfers, one for savings — and somehow… every month, you still ask yourself: 'What the hell happened to my money?'"
Quick shots: different finance apps open, overlapping notifications. A Lagos freelancer's desk — laptop, sketch pad, coffee cup. He sighs, checks multiple apps for one overdue payment.
Scene 2 — The Realization
VO: "See, money moves fast. Before you blink, rent is gone, data is gone, vibes are gone. And now you're out here playing catch-up, wondering where it all went."
Tracking shot: he walks through Yaba Tech corridor or a creative hub. Split-screen: invoices sent, pending, bank alerts flying in. Background: traffic, faint Afrobeats.
Scene 3 — The Pain Point
Interview cutaway: "I spend half my week chasing invoices and receipts. I just want to focus on my work, not my wallet."
B-roll: him editing a project, paying contractors manually, juggling spreadsheets. He mutters under his breath as a payment bounces.
Scene 4 — The Shift: Enter Moniewave
VO: "But that was before I found a voice that understands how I work."
He picks up his phone.
"Moniewave, send invoice to Chika for ₦250,000."
Moniewave: "Invoice sent. Would you like me to remind you when it's paid?"
VO: "Now I just talk — and things happen."
App UI animates on screen: invoice created, payment received. "Moniewave, pay contractors." Confirmation tones. He smiles, focused on design again.
Scene 5 — Empowerment
VO: "From bulk transfers to project cards, Moniewave keeps my money organized. Every expense has a reason, every transaction a story."
Interview cutaway: "It's not just about getting paid. It's about knowing where every naira goes."
Virtual debit card generated: "Design Project Budget ₦100,000." It expires after payment. Overlay: "Secure. Smart. Powered by Paystack."
Scene 6 — The Outro
VO: "Money moves fast. But now, it moves with me."
VO (closing): "Moniewave — your voice, your money, same lane, same speed."
Montage: happy client call, quiet evening reflection. Logo fade-in: Moniewave waveform glowing.
Recipe 2: The Family Finance Story — “Control in Every Conversation”
Scene 1 — Hook: "Money Moves Fast"
VO: "You've got ten apps on your phone — one for bills, one for transfers, one for savings — and somehow… every month, you still ask yourself: 'What the hell happened to my money?'"
Quick-cut montage: woman checking different banking apps, sighing. Kids in the background, generator hum, life happening.
Scene 2 — The Realization
VO: "See, money moves fast. Before you blink, rent is gone, data is gone, vibes are gone. And now you're out here playing catch-up, wondering where it all went."
She scrolls through endless transfers: mom, brother, cousin. WhatsApp voice note: "Sis abeg, I need small urgent 2k." She smiles, but sighs again.
Scene 3 — The Turning Point
Interview cutaway: "I help my family every month. But sometimes, I don't even realize how much I've sent out until my balance starts crying."
Transaction list overlay: ₦5,000 → ₦10,000 → ₦20,000. She looks frustrated but reflective.
Scene 4 — The Voice Solution
VO: "So, I started talking to my money."
"Moniewave, how much have I sent to my siblings this month?"
Moniewave: "₦42,500 so far. You've set a limit of ₦60,000."
"Okay, next month, make it ₦50,000. And remind me when I hit 80%."
VO: "With Moniewave, I don't just transfer — I track, set limits, and stay accountable."
Her smile returns. UI renders only what's needed: spending summary, limit bar, confirmation.
Scene 5 — Empowerment
"Moniewave, list my top three expenses this week."
Moniewave: "Food ₦18,000, family transfers ₦12,000, transport ₦9,000."
VO: "It's like having a financial therapist — who listens, answers, and helps me stay in control."
Visual montage: "Monthly Summary" with spending categories. Clean UI overlays on warm domestic scenes.
Scene 6 — Resolution
VO: "Money moves fast. But now, I move with it."
VO (closing): "Moniewave — Talk. Track. Take control."
Calm evening, family video call, relaxed and balanced. Logo glow-in, voicewave animation, Paystack badge.
Production Notes
| Element | Direction |
|---|---|
| Format | Mini-documentary with cinematic B-roll, natural lighting, handheld feel |
| Tone | Honest, relatable, aspirational — real people, real problems, smart solution |
| Camera | Shallow focus, Lagos realism + clean UI overlays |
| Color | Warm oranges + cool teals (contrast tech with humanity) |
| Music | Minimal lo-fi Afrobeats instrumental, builds subtly with narrative |
| Voice mix | Real human voiceover + natural AI voice for Moniewave (soft female Nigerian accent, clear diction) |
| Duration | 3 minutes per recipe |
References

- LinkedIn Video Format Reference: Brad Kowalk’s AI autocomplete announcement, a reference for LinkedIn video format and presentation style.
- GenUI: AI-Generated Interfaces (NN/g): Nielsen Norman Group research and patterns on generative UI.
- YC Application Video Parody: YC application video parody concept for Moniewave.
- Generative UX (Figma): Figma design file for the generative UX exploration.
- TensorKit AI (Notion): internal project notes and documentation for TensorKit AI.
- Hezra: schema-driven UI rendering engine, our early exploration into intent-to-interface pipelines.
- Narrative Storytelling Moodboard: visual storytelling reference for tone, pacing, and cinematic style of the Moniewave video scripts.