WonderCast
Engineering · System architecture

How WonderCast is assembled.

The runtime topology, domain services, data model, realtime channels, and the critical end-to-end flows that power personalized stories, videos, and audiobooks for families.

Private beta · Feb 23, 2026 | Official launch · Mar 23, 2026 | Last reviewed · Apr 2026
At a glance

Quick reference

The six things you need in your head to read the rest of this doc.

Frontend
React + Vite

Port 3000 · PM2 wonder-frontend · hash routing

Backend
Express + TS (ESM)

Port 3001 · PM2 wonder-server · /api/v1/*

Data
PostgreSQL + Prisma

~55 models · ioredis for session + rate

Media
AWS S3 + Fortify

Signed URLs · genai-*-worker PM2 forks

Realtime
Socket.IO + Redis

Shared HTTP listener · 2-min reconnect

Auth
JWT · argon2 · Passport

Google + Apple OAuth · LoginEvent audit


Section 01

System context

WonderCast is an AI-driven personalized children's media platform. A parent onboards characters (family members, pets, toys), and the system generates personalized stories as image storybooks, videos, and audiobooks.

Billing is based on monthly generation caps per modality, not raw credits (index.ts:1).
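As a rough illustration of cap-based billing, the check below compares a per-modality usage counter against a per-modality monthly cap. The modality names, type shapes, and function names are assumptions made for this sketch; they are not taken from the real schema or from index.ts.

```typescript
// Hypothetical modality names, inferred from the product description only.
type Modality = 'story' | 'video' | 'audiobook';

interface PlanCaps {
  // Maximum generations allowed per month, per modality.
  monthlyCap: Record<Modality, number>;
}

interface UsageCounters {
  // Generations already consumed in the current month, per modality.
  usedThisMonth: Record<Modality, number>;
}

// True when the account may start one more generation of this modality.
function canGenerate(plan: PlanCaps, usage: UsageCounters, modality: Modality): boolean {
  return usage.usedThisMonth[modality] < plan.monthlyCap[modality];
}

// Remaining allowance, clamped at zero so an over-cap account never shows a negative number.
function remainingGenerations(plan: PlanCaps, usage: UsageCounters, modality: Modality): number {
  return Math.max(0, plan.monthlyCap[modality] - usage.usedThisMonth[modality]);
}
```

Each modality is checked independently, so exhausting the video cap leaves story generation unaffected. That is the practical difference from a single shared credit balance.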

Persona | Entry point | Primary UI
End user (parent) | www.wondercast.app | SPA under /app/*
Admin / staff | Same SPA | Gated by user.isAdmin → 11 tabs
Marketing visitor | Same host | /pricing, /blog, /corp/*
Native shell | iOS / tvOS | Capacitor WebView wrapping the same web bundle

The native layer is thin (StoreKit 2, Sign-in-with-Apple); all business logic stays on the web.

Section 02

Runtime topology

A single EC2 host, nine PM2 processes, everything in one VPC. Stateless app tier; all ephemeral state lives in Redis.

Figure: runtime topology.
  • Client surfaces: Browser / SPA (www.wondercast.app); iOS / tvOS shell (Capacitor WebView); Stripe webhooks (billing events); Apple App Store (ASSNv2 notifications).
  • Edge: AWS ELB (*.wondercast.app, TLS termination), forwarding / and /api/v1/* + ws://.
  • Wonder services: Vite frontend (PM2, :3000, HMR); Express API + Socket.IO (PM2, :3001, single HTTP listener).
  • Data stores: PostgreSQL (Prisma, ~55 models); Redis (sessions, rate, pubsub); AWS S3 (media + website assets).
  • 3rd-party / external: Fortify controller-api (job queue + webhooks; calls back via webhook + WS); genai-image-worker (PM2 × N); genai-video-worker (PM2 × N); SES (email dispatch, SMTP).
All processes live on a single EC2 host under PM2. Dashed lines are outbound calls from the API; the purple arrow is the asynchronous Fortify callback that unlocks generated media.

PM2 reports nine live application processes: wonder-server, wonder-frontend, four genai-image-worker forks, and three genai-video-worker forks. The pm2-logrotate module runs alongside them. The Fortify workers are run by a sister service (controller-api) and communicate with wonder-server over HTTP webhooks and Socket.IO.

The Express process boots in server/src/index.ts:

  • Helmet / CORS / compression / body parsing (L72-130)
  • Passport init (L136)
  • Health check aggregating DB / Redis / S3 / Fortify (L152-217)
  • http.createServer(app), sharing one port with Socket.IO (L63)
  • Socket.IO init, then httpServer.listen(env.PORT) (L592-604)
Tech debt: Vite in production
In production the frontend is served by the Vite dev server; there is no static build step. Flagged in §12.
Section 03

HTTP API surface

All REST endpoints live under /api/v1/*. In maintenance mode, every route registered below maintenanceMiddleware short-circuits to 503 for non-admins; routes above it stay reachable.

Prefix | File / contents | Public?
/email | email.routes.ts | token
/features | features.routes.ts | optionalAuth
/public/* | 4 routes (assets, costs, founding-family, contact) | yes
/health | deep health | yes
/webhooks | Stripe / Fortify / Apple | signed
/auth | login, signup, OAuth, refresh | yes
maintenanceMiddleware | gates all below |
/users | profile, account members | auth
/characters, /voices, /stories | core domain CRUD | auth
/media/img, /media | S3 presigned URL proxy | auth
/billing | Stripe + IAP glue | auth
/admin | admin panel APIs | admin
/ai | safety, quiz, grammar | auth
/generate | heavy generation · 14 endpoints | auth + rate
/chat | dedicated chat (split out of /generate) | auth

Two cross-cutting middlewares worth calling out

  1. authenticate + lastSeen — auth chain on /users stamps User.lastSeenAt in a throttled fire-and-forget Redis+Postgres dance (lastSeen.middleware.ts:28-75) — at most one DB write per user per 60s, never awaited.
  2. maintenanceMiddleware — everything above remains reachable in maintenance mode; everything below short-circuits to 503 unless the caller is an admin.
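The throttled, never-awaited lastSeen write can be sketched as follows. This is a minimal illustration and not the contents of lastSeen.middleware.ts: the ThrottleStore interface stands in for a Redis `SET key value NX EX ttl` call, WriteLastSeen stands in for the Prisma update, and the key format and all function names are assumptions.

```typescript
// Stand-in for Redis: resolves true only if the key was newly set (SET ... NX EX ttl).
interface ThrottleStore {
  setIfAbsent(key: string, ttlSeconds: number): Promise<boolean>;
}

// Stand-in for the Postgres write that stamps User.lastSeenAt.
type WriteLastSeen = (userId: string, at: Date) => Promise<void>;

const THROTTLE_SECONDS = 60; // from the doc: at most one DB write per user per 60s

// Never rejects: any Redis or database failure is swallowed.
async function touchLastSeen(
  userId: string,
  store: ThrottleStore,
  writeLastSeen: WriteLastSeen,
  now: Date = new Date(),
): Promise<void> {
  try {
    const acquired = await store.setIfAbsent(`lastseen:${userId}`, THROTTLE_SECONDS);
    if (acquired) await writeLastSeen(userId, now);
  } catch {
    // Presence tracking must not fail a request, so errors are dropped here.
  }
}

// Express-style middleware shape, typed loosely to avoid importing express.
function lastSeenMiddleware(store: ThrottleStore, writeLastSeen: WriteLastSeen) {
  return (req: { user?: { id: string } }, _res: unknown, next: () => void): void => {
    if (req.user) void touchLastSeen(req.user.id, store, writeLastSeen);
    next(); // continues immediately; the write settles in the background
  };
}
```

Because the throttle key is claimed before the database write, a burst of requests from one user results in a single write for that 60-second window.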
Section 04

Backend services (domains)

Each server/src/services/<domain>/ folder owns a bounded context. The most load-bearing piece is the auth login pipeline.

auth

3 files

Password login, OAuth linking, refresh tokens, account-deletion grace, LoginEvent audit

auth.service.ts · oauth.service.ts · twoFactor.service.ts

admin/

2 files

30+ service files covering accounts, billing stats, campaigns, marketing, reports, team, webhooks, feature flags

accounts.service.ts · feature-flags.service.ts

admin-settings/

2 files

System-wide config: AI model selection, prompts, limits, maintenance flag

ai-models.service.ts · defaults.ts

generation/

2 files

Gemini-backed text + speech: story, scene-regen, safety, chat, per-item TTS

story.service.ts · speech.service.ts

fortify/

2 files

Image / video / batch-audio pipeline: batch-builder → HTTP client → WebSocket tracker

client.ts · websocket-handler.ts

ai/

2 files

Low-level Gemini wrappers: retry, JSON safety, text completion, structured schemas

text.service.ts · gemini.service.ts

billing/

2 files

Stripe + App Store IAP dual-path subscription management

subscription.service.ts · appStore.service.ts

voice + training/

2 files

ElevenLabs voice cloning, local bookkeeping, voice-training worker

voice.service.ts · voice-training.worker.ts
Load-bearing: auth.service.ts login pipeline
Writes a LoginEvent audit row at every decision point — rate-limit hit, bad password, suspended account, success — each guarded so no audit failure can break a real login (L191-238). On success it also bumps the denormalized Account.lastActivityAt (L361), the sort key used by the admin Accounts tab.
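The "guarded so no audit failure can break a real login" property can be illustrated with a small wrapper. This is a sketch under assumptions, not the code in auth.service.ts: the LoginEventInput shape, the failOpen helper, and the logger signature are all hypothetical.

```typescript
// Hypothetical event shape; the real LoginEvent row carries more fields.
interface LoginEventInput {
  email: string;
  success: boolean;
  reason?: string; // e.g. 'bad_password' or 'suspended'
}

type AuditWriter = (event: LoginEventInput) => Promise<void>;
type AuditLogger = (message: string, error: unknown) => void;

// Wraps an audit writer so it can never reject: failures are logged and dropped.
function failOpen(write: AuditWriter, log: AuditLogger): AuditWriter {
  return async (event: LoginEventInput): Promise<void> => {
    try {
      await write(event);
    } catch (error) {
      log('login audit write failed', error);
    }
  };
}

// Example decision point: the audit write is attempted, then the real outcome is returned
// regardless of whether the audit row was persisted.
async function rejectBadPassword(email: string, record: AuditWriter): Promise<{ status: number }> {
  await record({ email, success: false, reason: 'bad_password' });
  return { status: 401 };
}
```

Wrapping once at construction time keeps each decision point to a single call and removes the risk of a forgotten try/catch at one of them.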
Section 05

Data model highlights

The Prisma schema is 2,559 lines and spans ~55 PostgreSQL models, normalized apart from a few deliberate denormalizations such as Account.lastActivityAt. The core cluster every request touches:

Model | Role | Notable fields
Account | Billing unit + tenant boundary | plan, usage counters, stripeCustomerId, lastActivityAt
User | Family members within an account | accountId, googleId/appleId, isAdmin, lastActiveAt + lastSeenAt
Character | Child / parent / pet / toy with reference images | links to CharacterToy, VoiceProfile
VoiceProfile + VoiceSample | ElevenLabs clone + raw audio |
GeneratedStory | One story generation event | has many StoryScene
StoryScene + SceneDialogue | Scene text, image URL, dialogue lines | image + audio URLs point at S3 keys
Subscription + PricingPlan | Current plan state (Stripe or IAP) |
SystemConfig | Generic key/value config | individual feature.* keys live here

Recently-landed additions

  • LoginEvent (L1878-1899) — append-only auth audit. Indexed by userId, accountId, createdAt, email. onDelete: SetNull so audit survives user deletion.
  • Account.lastActivityAt and its descending index let the admin list ORDER BY an indexed column without joining User.
  • User.lastSeenAt vs lastActiveAt — two columns by design. lastActiveAt = successful login; lastSeenAt = any authenticated request, throttled. DAU / "online now" uses the latter.
  • AdminAuditLog — admin mutations, separate from user auth events.

Money-adjacent: CreditTransaction, StripeTransaction, AppStoreTransaction, Invoice, and RefundRequest are first-class; Stripe and App Store stay on separate tables so reconciliation runs independently.

Section 06

Frontend architecture

Two top-level React Router trees; hash-based routing; lazy-loaded admin panel with 11 tabs.

Routing & auth hydration

Two top-level router trees, split in App.tsx:

  • Website + marketing tree: /, /pricing, /blog, /corp/*, /landing/*. Uses WebsiteLayout, no auth context.
  • Application tree: /app/*, /login, /signup, /oauth-callback, /shared/:token, /onboarding. Wrapped in QueryClientProvider + AppProvider, lazy-imports every page.

Hash-based routing is used (#/app/...) because the iOS shell and email-link handling depend on it.

ProtectedRoute does three things on mount

  1. Waits for AuthProvider.hydrate (isLoading).
  2. Fetches /api/v1/maintenance/status — only for non-admin users; admins bypass.
  3. Fetches /api/v1/legal/terms to check whether the user needs to accept a new terms version.

Only after all three complete does it render <Layout>. Admins skip the maintenance check, so a global outage doesn't lock staff out.
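The three checks reduce to a small decision function. The sketch below is illustrative only: the GateInputs fields, the Gate states, and resolveGate are hypothetical names, and the real ProtectedRoute is a React component that may order or represent these checks differently.

```typescript
type Gate = 'loading' | 'maintenance' | 'accept-terms' | 'render';

interface GateInputs {
  authLoading: boolean;              // AuthProvider is still hydrating
  isAdmin: boolean;
  maintenanceActive: boolean | null; // null while the status request is in flight
  termsAccepted: boolean | null;     // null while the terms request is in flight
}

function resolveGate(s: GateInputs): Gate {
  if (s.authLoading) return 'loading';

  // Admins bypass the maintenance check entirely, so an outage cannot lock staff out.
  if (!s.isAdmin) {
    if (s.maintenanceActive === null) return 'loading';
    if (s.maintenanceActive) return 'maintenance';
  }

  if (s.termsAccepted === null) return 'loading';
  return s.termsAccepted ? 'render' : 'accept-terms';
}
```

For example, an admin with maintenanceActive still null and termsAccepted true resolves to 'render', while a non-admin in the same state stays on 'loading'.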

Self-healing media: SignedImage

Signed S3 URLs expire, but cached URLs float through WebSocket payloads and localStorage. Every generated <img> uses SignedImage.tsx with two layers:

Proactive

Parse X-Amz-Date / X-Amz-Expires from the URL and refresh once it is within a buffer (5 minutes by default) of expiry.

Reactive

On <img onError>, one-shot refresh via refreshSignedUrl(storageKey).

The app tolerates clock skew, long-lived pages, and stale payloads with no broken images visible to users.
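The proactive layer amounts to reading two query parameters from a presigned URL. The sketch below shows one way to compute the expiry and decide whether to refresh; the function names are hypothetical and this is not the code in SignedImage.tsx. X-Amz-Date uses the ISO 8601 basic format in UTC, and X-Amz-Expires is a lifetime in seconds.

```typescript
const DEFAULT_BUFFER_MS = 5 * 60 * 1000; // the 5-minute default buffer

// Reads one query parameter without relying on URL/URLSearchParams.
function readQueryParam(url: string, name: string): string | null {
  const withoutHash = url.split('#')[0] ?? '';
  const query = withoutHash.split('?')[1] ?? '';
  for (const pair of query.split('&')) {
    const [key = '', value = ''] = pair.split('=');
    if (key.toLowerCase() === name.toLowerCase()) return decodeURIComponent(value);
  }
  return null;
}

// Parses a stamp such as 20260401T120000Z into epoch milliseconds.
function parseAmzDate(stamp: string): number | null {
  const m = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(stamp);
  if (!m) return null;
  const [y = 0, mo = 1, d = 1, h = 0, mi = 0, s = 0] = m.slice(1).map(Number);
  return Date.UTC(y, mo - 1, d, h, mi, s);
}

function expiresAtMs(url: string): number | null {
  const date = readQueryParam(url, 'X-Amz-Date');
  const ttl = readQueryParam(url, 'X-Amz-Expires');
  if (date === null || ttl === null) return null;
  const signedAt = parseAmzDate(date);
  const seconds = Number(ttl);
  if (signedAt === null || !Number.isFinite(seconds)) return null;
  return signedAt + seconds * 1000;
}

// True once the URL is inside the buffer before expiry. URLs that cannot be
// parsed return false and are left to the reactive onError path.
function needsRefresh(url: string, nowMs: number, bufferMs: number = DEFAULT_BUFFER_MS): boolean {
  const expiry = expiresAtMs(url);
  return expiry !== null && nowMs >= expiry - bufferMs;
}
```

Passing nowMs in as a parameter keeps the check easy to test and makes the clock-skew tolerance explicit: a larger bufferMs trades extra refreshes for more slack.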

Section 07

Realtime (Socket.IO)

Single Socket.IO server attached to the same HTTP listener as Express. Redis-adapter-backed for horizontal scaling.

Configuration

  • Redis adapter: @socket.io/redis-adapter with dual pub/sub clients against REDIS_URL. On Redis failure the server logs and downgrades to single-node mode.
  • Connection-state recovery: 2-minute window for reconnecting clients to resume without missing events.
  • Per-socket middleware: authenticateSocket → rateLimitSocket → logSocketEvents.
  • Transports: ['websocket', 'polling'] with upgrades allowed.
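The settings above map onto Socket.IO server options roughly as shown here. This is a standalone options object kept free of imports so it can be read in isolation; the variable name is hypothetical and the actual values in the server's Socket.IO init may differ.

```typescript
// In a real server this object would be the second argument to
// `new Server(httpServer, socketOptions)` from the `socket.io` package.
const socketOptions = {
  // Both transports are enabled, and clients may upgrade from polling to WebSocket.
  transports: ['websocket', 'polling'],
  allowUpgrades: true,
  connectionStateRecovery: {
    // The 2-minute recovery window, expressed in milliseconds.
    maxDisconnectionDuration: 2 * 60 * 1000,
  },
};

// Adapter wiring, shown as comments because it needs ioredis and
// @socket.io/redis-adapter installed:
//   const pubClient = new Redis(process.env.REDIS_URL);
//   const subClient = pubClient.duplicate();
//   io.adapter(createAdapter(pubClient, subClient));
// If connecting to Redis fails, skipping io.adapter(...) leaves the default
// in-memory adapter in place, which is the single-node fallback.
```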

Emit pattern

Server-to-client emitters in emitters.ts are called from services — e.g. emitUserActivity({ action: 'login' }) from auth.service.ts:373 — so the admin dashboard sees events in real time.

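// server side (called from a service after the action succeeds)
emitUserActivity({ userId, action: 'login' });

// admin dashboard
socket.on('user:activity', (evt) => {
  // the payload carries an action field, so only toast for logins
  if (evt.action === 'login') {
    toast.show(`${evt.userId} just signed in`);
  }
});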
Section 08

Two settings systems

A leaky abstraction that causes real bugs. Writes fan out to both; reads merge the JSON blob with overrides from individual keys.

Admin settings

Individual SystemConfig rows

Keyed feature.share_story, etc. Written by admin panel toggles.

Feature flags

JSON blob in SystemConfig.feature_flags

Read by public /api/v1/features via features.routes.ts.

The gotcha is real
Writes must fan out to both; reads from /features merge the JSON blob with overrides from the feature.* keys — see ADMIN_FEATURE_OVERRIDES at features.routes.ts:30-38. Consolidating these two systems would eliminate recurring "toggle flipped but feature still off" bugs.
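The read-side merge can be sketched as a pure function. The override map below is a hypothetical stand-in for ADMIN_FEATURE_OVERRIDES, and the flag name shareStory, the 'true'/'false' string encoding, and the function name are all assumptions for illustration.

```typescript
type FlagBlob = Record<string, boolean>;   // parsed SystemConfig.feature_flags JSON
type ConfigRows = Record<string, string>;  // individual SystemConfig key -> stored value

// Hypothetical mapping from individual keys to flag names inside the blob.
const EXAMPLE_OVERRIDES: Record<string, string> = {
  'feature.share_story': 'shareStory',
};

// Starts from the JSON blob, then lets any present individual key win.
function mergeFlags(blob: FlagBlob, rows: ConfigRows): FlagBlob {
  const merged: FlagBlob = { ...blob };
  for (const configKey of Object.keys(EXAMPLE_OVERRIDES)) {
    const flagName = EXAMPLE_OVERRIDES[configKey];
    const raw = rows[configKey];
    if (flagName !== undefined && raw !== undefined) {
      merged[flagName] = raw === 'true';
    }
  }
  return merged;
}
```

The sketch also shows where the bug class comes from. If a write updates only the blob while a stale feature.* row still exists, the row wins on read and the toggle appears to have no effect.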
Section 09

External integrations

Seven vendors, one .env, every key indirected through admin settings.

Service | Use | Entry point
Google Gemini (text) | Story, scene, chat, safety, quiz, grammar. Model indirected through admin settings; default gemini-3-flash-preview | story.service.ts:7
Fortify Media (image, video, audio) | Batch submit → worker process → POST back via webhook + Socket.IO push | client.ts, websocket-handler.ts
ElevenLabs (audio) | TTS narration, voice cloning | voice.service.ts
Google Cloud TTS / OpenAI TTS | Alternative TTS providers, model-routed | speech.service.ts
AWS S3 | All generated media + uploaded assets. Website illustrations served unsigned; user media signed | s3.ts
AWS SES | Transactional email from website@wondercast.app (us-west-1) | services/email/
Stripe | Web subscription billing + customer portal | stripe.payment.service.ts
Apple StoreKit 2 | iOS + tvOS IAP. Dual-bundle verifier registry; one app server handles both apps | appStore.service.ts:127
Google OAuth (passport), Apple Sign In | Social login; Apple sub lookup works even without email thanks to provider-id lookup | oauth.service.ts:40-42

Secret configuration is loaded from server/.env via env.ts. This doc never hardcodes values.

Section 10

Critical end-to-end flows

Three flows worth knowing by heart. The OAuth signup is the onboarding rung; story generation is the product; the audit pipeline is how staff sees the world.

10.1 · Google OAuth signup → first login

  1. "Continue with Google" → AuthContext.socialLogin('google') splits native vs web on Capacitor.isNativePlatform().

  2. Web: redirect to /api/v1/auth/google → Passport → callback hits oauth.service.ts#findOrCreateOAuthUser, which does a provider-id lookup first and falls back to email. On first sign-in, an Account + owner User are created in one transaction.

  3. OAuthCallback.tsx stores tokens and re-hydrates.

  4. /app → ProtectedRoute sees user.onboardingComplete === false → redirects to /onboarding.

  5. First authenticated API call fires the lastSeen middleware (writes lastSeenAt); the login path already wrote lastActiveAt, LoginEvent, and Account.lastActivityAt.
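The lookup order in step 2 (provider id first, then email, then create) can be sketched against an in-memory list. This is not the code in oauth.service.ts: the record shape is simplified, linking the provider id on an email match is an assumption, and the real first-sign-in path creates the Account and owner User inside one database transaction.

```typescript
type Provider = 'google' | 'apple';

interface OAuthProfile {
  provider: Provider;
  providerId: string;   // the provider's stable subject identifier
  email: string | null; // may be absent, e.g. on later Apple sign-ins
}

interface UserRecord {
  id: number;
  accountId: number;
  email: string | null;
  googleId: string | null;
  appleId: string | null;
}

function providerField(provider: Provider): 'googleId' | 'appleId' {
  return provider === 'google' ? 'googleId' : 'appleId';
}

function findOrCreateOAuthUser(
  users: UserRecord[],
  profile: OAuthProfile,
): { user: UserRecord; created: boolean } {
  const field = providerField(profile.provider);

  // 1. Provider-id lookup: works even when the provider sends no email.
  const byProviderId = users.find((u) => u[field] === profile.providerId);
  if (byProviderId) return { user: byProviderId, created: false };

  // 2. Email fallback: attach the provider id to the existing user (assumed behavior).
  if (profile.email !== null) {
    const byEmail = users.find((u) => u.email === profile.email);
    if (byEmail) {
      byEmail[field] = profile.providerId;
      return { user: byEmail, created: false };
    }
  }

  // 3. First sign-in: new account plus owner user (one transaction in a real database).
  const id = users.length + 1;
  const user: UserRecord = { id, accountId: id, email: profile.email, googleId: null, appleId: null };
  user[field] = profile.providerId;
  users.push(user);
  return { user, created: true };
}
```

Checking the provider id before the email is what lets a returning Apple user, whose later callbacks may omit the email, resolve to the same record.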

10.2 · Story generation (Create Magic)

Figure: story generation flow.
  • Create Magic UI (StoryChatModal.tsx)
  • POST /ai/story-suggest (cheap Gemini titler)
  • POST /generate/check-safety (getSafetyCheckModel())
  • POST /generate/story → generateStory() in generation/story.service.ts:
      1. Build cast description (gender, pronouns, age)
      2. Load admin prompt (5-min cache)
      3. Gemini structured-JSON call
      4. Persist GeneratedStory + StoryScene
  • POST /generate/batch (fortify/batch-builder.ts) sends scene prompts + voices
  • Fortify controller-api (queue + dispatch)
  • image workers × N (Imagen / Gemini) → AWS S3 scene-*.png
  • audio workers (ElevenLabs / GCP / OpenAI) → AWS S3 narration.mp3
  • fortify/websocket-handler.ts (webhook → Socket.IO emit): job.complete over WS carries the S3 URL
  • SignedImage (components/SignedImage.tsx, proactive + reactive refresh) renders
Orange = client; violet = WonderCast API; gray = external workers; green = S3. Only steps 1–4 block the request; everything else is async via Socket.IO.

10.3 · Admin login-audit trail

Figure: login audit flow.
  • Main path: POST /auth/login (email + password) → Redis rate-limit check (auth.service.ts:246) → lookup user + verify argon2 (constant-time) → account suspension guard → 2FA challenge (opt-in) → issue JWT + refresh (return to client).
  • Failure writes: recordLoginEvent (success=false · reason); recordLoginEvent (success=false · bad_password); recordLoginEvent (success=false · suspended).
  • Success writes: prisma.user.update → lastActiveAt; recordLoginEvent → LoginEvent (success); touchAccountActivity; emitUserActivity (WS).
  • Fail-open audit: every audit write is try/catch; never blocks a real login.
Every decision point — rate limit, bad password, suspended account — writes its own LoginEvent. The admin dashboard sorts accounts by the denormalized Account.lastActivityAt.

Each successful password login runs auth.service.ts:348-378:

  1. prisma.user.update → bumps lastActiveAt.
  2. recordLoginEvent({ success: true, ... }) → LoginEvent row with method, provider, IP, user agent.
  3. touchAccountActivity(accountId) → bumps Account.lastActivityAt.
  4. emitUserActivity({ action: 'login' }) → Socket.IO broadcast to admin dashboard.

The Accounts tab sorts on the indexed lastActivityAt and aggregates loginCount via _count.loginEvents.

Section 11

Operational posture

Single-host, PM2-managed, Redis-backed. Built for beta; scaling out is a configuration change, not a rewrite.

Single-host deployment

All processes on one EC2 instance under PM2.

Reload cost

pm2 restart wonder-server produces ~300 ms of connection refusals. Acceptable at beta; revisit for GA.

Horizontal scaling readiness

Socket.IO Redis adapter wired; all ephemeral state (rate limits, throttles, pubsub) in Redis — app processes are stateless.

Observability

Structured logging via utils/logger with per-module children. PM2 captures stdout into /home/ubuntu/.pm2/logs/. No APM yet — known gap.

Rate limiting

Global rateLimiter; narrower aiRateLimiter, newsletterRateLimiter, publicChatRateLimiter on high-abuse surfaces. Per-email login throttle: Redis INCR + EXPIRE.
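The per-email throttle follows the common Redis counter pattern: INCR the key, and set EXPIRE when the counter is first created. The sketch below uses a tiny in-memory stand-in for those two commands so it runs without a Redis server. The attempt limit, window length, key format, and all names are assumptions, not values from the codebase.

```typescript
// In-memory stand-in for the two Redis commands used by the pattern.
class FakeRedis {
  private counters = new Map<string, { value: number; expiresAt: number | null }>();
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  // Like INCR: a missing or expired key starts again from 1.
  incr(key: string): number {
    const entry = this.counters.get(key);
    if (!entry || (entry.expiresAt !== null && entry.expiresAt <= this.now())) {
      this.counters.set(key, { value: 1, expiresAt: null });
      return 1;
    }
    entry.value += 1;
    return entry.value;
  }

  // Like EXPIRE: sets a time-to-live in seconds on an existing key.
  expire(key: string, seconds: number): void {
    const entry = this.counters.get(key);
    if (entry) entry.expiresAt = this.now() + seconds * 1000;
  }
}

const MAX_ATTEMPTS = 5;         // assumed limit
const WINDOW_SECONDS = 15 * 60; // assumed window

function loginAttemptAllowed(redis: FakeRedis, email: string): boolean {
  const key = `login:attempts:${email.toLowerCase()}`;
  const attempts = redis.incr(key);
  // Start the window only on the first attempt so later attempts do not extend it.
  if (attempts === 1) redis.expire(key, WINDOW_SECONDS);
  return attempts <= MAX_ATTEMPTS;
}
```

Against a real Redis, INCR followed by EXPIRE is two round trips and is not atomic. If the process dies between them the key never expires, so a production version would typically wrap both in a Lua script or a MULTI block.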

Fail-open audit

Both recordLoginEvent and touchAccountActivity catch and log. Audit infrastructure must never lock a legitimate user out.

Section 12

Known architectural debt

The things we know about and haven't fixed yet. If you touch one, update this list.

  • Vite in prod
    Frontend served by vite dev mode under PM2 instead of a built, CDN-fronted bundle.
  • Legacy credits field
    On Account (schema.prisma:344) — deprecated but still read in edge paths.
  • Two settings systems
    §8 is a leaky abstraction; consolidating would eliminate recurring "toggle flipped but feature still off" bugs.
  • 30+ files in services/admin/
    Becoming monolithic. Sub-folders (team/, email/, reports/) exist — extending that pattern would help.
  • No type-safe API client
    Hand-written wrappers in services/api/*.ts; OpenAPI or tRPC would remove a class of type-drift bugs.
  • Single-region
    SES us-west-1; S3 and Postgres likewise single-region. DR posture is basic.
Source of truth: docs/architecture.md · Last reviewed April 2026