Multi-tenant, talent-centric social-media orchestrator built for marketing agencies that manage many independent talents across IG · TikTok · LinkedIn (+ future). White-label persona agents act in each talent's voice, owner-approved content queues gate every post, and channel-isolated execution keeps accounts safe.
Status: V1 design phase, narrowed for fastest ship. Several capabilities are deliberately deferred to V2 to keep the first slice small — Browser channel, the iproxy fleet, fingerprints, audience identity resolution, account signup automation, email post-request path, per-talent tick cadence, Supabase Realtime, and a React frontend. See ARCHITECTURE_V2.html for the deferred set with the trigger conditions that bring each one back.
Live mockups: Dev — operator dashboard ↗ · Dev — talent portal ↗ · Prod — operator dashboard ↗ · Prod — talent portal ↗
A single system that can create, read, update, and delete on any major social platform — starting with Instagram, TikTok, and LinkedIn, and easily extending to Twitter / X, YouTube, Threads, and Facebook.
Dr.Social is built for marketing agencies. A single agency is one tenant and manages many independent talents (not a group — each talent is an autonomous, real human personality with their own goals, voice, look, and audience). Every talent runs multiple accounts across multiple platforms, each with their own following, and is represented end-to-end in the system by a rich talent profile stored in the database.
The profile is the source of truth for everything the system does on behalf of a talent:
Dr.Social is a multi-tenant SaaS: every record in the system is owned by a tenant (a client of ours — an agency, a creator team, a brand). Inside a tenant, the unit of "who" is a talent — a real human personality whose social presence we manage. One talent can run many accounts, across many platforms, on any channel. This shape is the spine of the data model and shows up in every agent, every query, every dashboard view.
- Every record carries tenant_id NOT NULL; a tenant_users table maps Supabase users to roles (owner / editor / viewer).
- Accounts carry FK → talents.id and FK → tenants.id — one row per (platform, talent, channel) — and channel is immutable (Device or MCP in V1).
- Audience identity resolution (audience_members + audience_identities + merge UI) is deferred. See V2 §6.
(1) RLS policies — USING (tenant_id = auth.jwt() ->> 'tenant_id') — for operator-facing queries.
(2) Service-role queries made by the worker always include WHERE tenant_id = $job.tenant_id — verified in code review and via composite FKs (tenant_id, id) so the database refuses cross-tenant attachments.
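A minimal sketch of that worker-side guard, assuming a hypothetical db.fetchrow helper (the real call goes through supabase_client.py):

# Sketch only — db.fetchrow is an assumed helper, not a final API.
async def load_account_for_job(db, job):
    row = await db.fetchrow(
        "select * from accounts where tenant_id = $1 and id = $2",  # tenant filter is mandatory
        job.tenant_id, job.account_id,
    )
    if row is None:
        # missing or owned by another tenant — either way, refuse to act
        raise RuntimeError(f"account {job.account_id} not visible in tenant {job.tenant_id}")
    return row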
The system stacks into six layers. Each upper layer only knows about the layer immediately below it through a stable interface, so the lower layers are individually replaceable.
Two server-rendered web UIs: operator dashboard (multi-talent supervision) and talent portal (/me, single-talent self-service with the approval queue). Python templates + htmx for partials. Polling for "live" panels at 5s.
4 LLM-driven agents (PersonaAgent, ContentAgent, InboxAgent, ModerationAgent) — all routing through the abfs.tech LLM router. Plus the cron Workflow handler and a handful of plain Python services (AccountImporter, RequestAgent for raw-media preprocessing, Analytics SQL).
One transparent SocialPlatform contract. IG, TikTok, LinkedIn implement it. Adding Twitter or YouTube means writing one more adapter — agents above don't change.
Device (XCUITest / W3C / ADB) and MCP (Composio.dev). Each account is permanently bound to exactly one backend — no cross-channel fallback, no migration. See channel isolation rule. Browser channel arrives in V2 §1.
Supabase Vault for per-account session cookies (Device) and OAuth refresh tokens (MCP). No fingerprint pool, no proxy registry, no signup automation, no 2FA relay — those are V2. V1 talents bring existing logged-in accounts.
Postgres for state (tenant-scoped via RLS), Supabase Storage for media, Supabase Vault for secrets, Auth for operator login, pg_cron for the workflow tick. No Redis, no Vaultwarden, no separate object store, no Realtime in V1 (polling is enough; Realtime arrives in V2 §9).
V1 has exactly four agents that need an LLM in the loop — anything reasoning, generating, or classifying. Everything else is a plain Python module (no agent ceremony). Don't commit to a heavyweight agent framework (e.g. Google ADK A2A) until V2 actually needs peer-to-peer agent topology; V1's orchestration is a star (Workflow → agent) and a simple async function-call graph is enough.
A stateless function: persona_decide(talent_id, context) → action[]. On every call it loads the talent's profile (personality, voice, stories, goals, media library, content pillars, "do not say" list, connected accounts) and decides what to post / reply to / wait on, in that talent's voice. No daemon, no long-lived instance — each tick fires a fresh call.
Owns: "what to post next," "reply to this DM in voice," "draft this brief."
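As a shape sketch — the Action dataclass and the context fields are illustrative, not final types:

from dataclasses import dataclass

@dataclass
class Action:
    kind: str                  # "post" | "reply" | "wait"
    account_id: str
    brief: dict | None = None  # present when kind == "post"

async def persona_decide(talent_id: str, context: dict) -> list[Action]:
    # context carries the freshly loaded profile, recent posts, inbox and queue —
    # nothing is cached between ticks; every call is a fresh decision via the LLM router.
    ...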
Generates videos, photos, audio, and text — always conditioned on the target talent's profile (face refs, voice clone, brand kit, style guide). Same talent → consistent face/voice/look across every account on every platform. Brief in, asset out. LLM for captions/hashtags/scripts; media models for image/video.
Reads DMs, comments, mentions, replies. Classifies (lead / fan / spam / escalation). Drafts in-character replies via PersonaAgent for the talent or operator to approve. Polls on a cadence; MCP adapters with webhook support skip polling.
Filters generated content and inbound messages for brand-rule / sensitivity / "do not say" violations before either is acted on. Rules-based first pass (literal "do not say" hits), LLM second pass for tone/brand judgement calls.
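A sketch of that two-pass check — the helper names (load_profile, llm.messages) and the model id are assumptions, not final APIs:

async def moderation_approve(draft, talent_id) -> bool:
    profile = await load_profile(talent_id)                    # assumed helper
    text = f"{draft.caption} {' '.join(draft.hashtags)}"
    # pass 1 — rules: any literal "do not say" hit is an immediate block
    if any(term.lower() in text.lower() for term in profile.do_not_say):
        return False
    # pass 2 — LLM judgement call on tone / brand fit (Haiku per the V1 defaults);
    # `llm` is the shared router client from §5
    verdict = await llm.messages(
        model="/haiku/i",                                      # regex form, see §5
        system=f"Brand rules for this talent:\n{profile.style_guide_md}",
        messages=[{"role": "user",
                   "content": f"Answer OK or BLOCK. Does this draft violate the rules?\n\n{text}"}],
    )
    return "OK" in verdict["content"][0]["text"]               # Anthropic Messages shape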
pg_cron-driven function. On each tick, iterates active talents and fans out work to the LLM agents. No reasoning of its own.
Imports existing accounts: paste in session cookies (Device channel) or complete an OAuth grant (MCP). Validates by issuing a health-check call. No signup automation, no 2FA relay — those are V2.
Picks an account, picks a piece of approved content, picks the bound backend, publishes. Records the post ID. Folded into the platform adapter — not its own LLM agent in V1.
Pre-processes raw media attached to a post request: scene detection, OCR, transcription. Then hands off to ContentAgent for per-platform variant generation. No LLM directly — it's a pipeline of media-analysis calls.
Aggregates post performance, account health, inbox SLAs. SQL queries, not an LLM agent. Feeds the dashboard and informs ContentAgent's next-brief decisions.
# workflow.tick() — one pass
for talent in tenants.active_talents(due_now):
    ctx = load_talent_context(talent)                        # profile, recent posts, inbox, queue
    for account in ctx.accounts:
        if persona_decide_should_post(talent, account, ctx):
            brief = persona_decide_brief(talent, account, ctx)
            draft = content_generate(talent, brief)          # ContentAgent
            if moderation_approve(draft):
                content_queue.enqueue(talent, account, draft,
                                      status='awaiting_owner_approval')

        # publish only what the owner has already approved
        for piece in content_queue.due_and_approved(talent, account):
            post_publish(account, piece)                     # PostAgent / adapter

        for msg in inbox_fetch(account):                     # InboxAgent
            if moderation_approve_inbound(msg):
                reply = persona_draft_reply(talent, msg)
                inbox_send(account, msg.thread, reply,
                           status='awaiting_owner_approval')
Two invariants to notice:
(1) no backend is selected at runtime — the dispatcher reads accounts.channel and refuses any other.
(2) nothing publishes without owner approval — every draft and reply lands in the queue and only the pieces the human owner has approved are eligible.
All four LLM agents call a single internal provider: the abfs.tech LLM router at
https://www.abfs.tech/v1/. The router is already operational and serves Anthropic Claude
models through Anthropic-compatible and OpenAI-compatible endpoints, with subscription rotation
and per-key logging. Dr.Social doesn't talk to Anthropic / OpenAI / any model provider directly — only
to this router.
- POST /v1/messages — Anthropic Messages API shape
- POST /v1/chat/completions — OpenAI Chat Completions shape
- GET /v1/models — list available model IDs
- GET /health — router health
One API key per Dr.Social environment (dev, prod). Sent as either
Authorization: Bearer <key> or x-api-key: <key>.
Keys are issued from the abfs.tech admin dashboard at /admin and stored in Supabase Vault on our side.
Server-Sent Events on both endpoints. Anthropic streams pass through verbatim (message_start, content_block_delta, message_stop). For ContentAgent caption work we use buffered responses; for InboxAgent reply drafting we stream so the operator sees the draft as it composes.
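A sketch of how InboxAgent could consume that stream, assuming httpx; the event names follow the Anthropic stream shape the router passes through:

import json
import os
import httpx

async def stream_reply_draft(payload: dict):
    # yields text deltas as they arrive so the dashboard can render the draft live
    headers = {"x-api-key": os.environ["ABFS_LLM_API_KEY"]}
    async with httpx.AsyncClient(base_url="https://www.abfs.tech", timeout=None) as client:
        async with client.stream("POST", "/v1/messages",
                                 headers=headers, json={**payload, "stream": True}) as resp:
            async for line in resp.aiter_lines():
                if not line.startswith("data: "):
                    continue
                event = json.loads(line[len("data: "):])
                if event.get("type") == "content_block_delta":
                    yield event["delta"].get("text", "")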
Literal model IDs pass through; /regex/flags syntax matches against the available model list and picks the best fit. V1 defaults: Sonnet for content generation + inbox drafting, Haiku for moderation classification, Opus for the rare expensive PersonaAgent calls (long-context, multi-account decisions).
# Pseudocode — every LLM call in Dr.Social goes through this client.
import os

class LlmRouterClient:
    def __init__(self, base_url="https://www.abfs.tech",
                 api_key=os.environ["ABFS_LLM_API_KEY"]):
        self.base_url = base_url
        self.api_key = api_key

    async def messages(self, *, model, system, messages, stream=False):
        # POST /v1/messages — Anthropic-shaped
        ...

    async def chat(self, *, model, messages, stream=False):
        # POST /v1/chat/completions — OpenAI-shaped
        ...
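A usage sketch for a buffered ContentAgent caption call — the model id uses the regex form described above, brief.summary is an illustrative field, and the response parsing assumes the Anthropic Messages shape:

llm = LlmRouterClient()

async def generate_caption(talent, brief):
    resp = await llm.messages(
        model="/sonnet/i",                      # "best available Sonnet" via the router's regex matching
        system=f"You write captions in {talent.display_name}'s voice.",
        messages=[{"role": "user", "content": brief.summary}],
        stream=False,
    )
    return resp["content"][0]["text"]           # Anthropic Messages response shape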
Every social network is reduced to one Python interface. Agents above call this interface — they never know if they're driving Instagram or LinkedIn.
from datetime import datetime
from typing import Protocol

class SocialPlatform(Protocol):
    name: str  # "instagram" | "tiktok" | "linkedin" | ...

    # account lifecycle (V1 = import-only)
    async def import_account(self, profile: ImportedProfile, backend: Backend) -> Account: ...
    async def login(self, account: Account, backend: Backend) -> Session: ...
    async def logout(self, account: Account) -> None: ...
    async def health_check(self, account: Account) -> AccountHealth: ...

    # create / update / delete
    async def publish_post(self, account: Account, content: Content) -> PostRef: ...
    async def edit_post(self, account: Account, post: PostRef, patch: ContentPatch) -> PostRef: ...
    async def delete_post(self, account: Account, post: PostRef) -> None: ...

    # read
    async def read_feed(self, account: Account, limit: int) -> list[Post]: ...
    async def read_post(self, account: Account, post: PostRef) -> Post: ...
    async def read_dms(self, account: Account, since: datetime) -> list[Message]: ...
    async def read_comments(self, account: Account, post: PostRef) -> list[Comment]: ...
    async def read_mentions(self, account: Account, since: datetime) -> list[Mention]: ...

    # respond
    async def send_dm(self, account: Account, thread: ThreadRef, body: str, media: list | None = None) -> None: ...
    async def reply_comment(self, account: Account, comment: CommentRef, body: str) -> None: ...
    async def react(self, account: Account, target: Ref, reaction: str) -> None: ...
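How the layer above resolves an adapter, as a sketch — the concrete adapter classes and the record_post_ref helper are illustrative:

PLATFORMS: dict[str, SocialPlatform] = {
    "instagram": InstagramAdapter(),   # platforms/instagram.py
    "tiktok": TikTokAdapter(),         # platforms/tiktok.py
    "linkedin": LinkedInAdapter(),     # platforms/linkedin.py
}

async def post_publish(account, piece):
    platform = PLATFORMS[account.platform]            # agents never import a concrete adapter
    ref = await platform.publish_post(account, piece.content)
    await record_post_ref(account, piece, ref)        # assumed persistence helper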
Posts, reels, stories, DMs, comments. V1: Device only (existing accounts imported via session-cookie paste). Browser-channel IG arrives in V2 §1.
Video upload, replies, DMs. V1: Device only. (MCP coverage is weak; Browser is V2.)
Articles, posts, DMs, comments. V1: MCP via Composio. Cleanest API surface — no Device needed.
Drop-in. Agents and dashboard pick them up automatically once registered.
V1 ships with two backends. The Browser channel is documented as V2 §1 — its absence is what makes V1 tractable.
| Operation | Device | MCP |
|---|---|---|
| Post text | ✓ | ✓ |
| Post video / reel | ✓ (best) | ~ |
| Read feed | ✓ | ✓ |
| Read DMs | ✓ | ~ |
| Send DM | ✓ | ~ |
| Reply to comment | ✓ | ✓ |
| Delete post | ✓ | ✓ |
| Account signup | → V2 §5 | → V2 §5 |
The matrix above is descriptive (what's possible). It is not a runtime fallback path — see the isolation rule below.
Every account is permanently bound to exactly one channel at enrollment (accounts.channel ∈ {device, mcp}; V2 expands to include browser). That binding is the account's identity. The orchestrator will refuse to act on an account through any other channel. There is no fallback, no failover, no "try the other backend." Ever.
- The channel column is set on enrollment and is immutable at the DB level (CHECK + trigger that rejects updates).
- The dispatcher reads account.channel at job dispatch and loads only that backend. No "required_channel" duplicate column on jobs — the account row is the source of truth.
- A device account has a device binding + cookies; an mcp account has an OAuth grant only. These never cross.

-- accounts.channel is set once and never changes
alter table accounts
add column channel text not null
check (channel in ('device', 'mcp')); -- V2: add 'browser'
create or replace function lock_channel() returns trigger as $$
begin
if NEW.channel is distinct from OLD.channel then
raise exception 'channel is immutable on account %', OLD.id;
end if;
return NEW;
end $$ language plpgsql;
create trigger accounts_channel_immutable
before update on accounts
for each row execute function lock_channel();
-- worker dispatch (no `required_channel` on jobs; the account row is truth)
job = claim_next_job()
account = load(job.account_id)
backend = load_backend(account.channel) -- ONLY this backend, no fallback
backend.execute(job)
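The load_backend step can stay a strict dict lookup — a sketch, assuming hypothetical DeviceBackend / McpBackend classes in backends/:

BACKENDS = {
    "device": DeviceBackend,   # Appium / XCUITest / ADB
    "mcp": McpBackend,         # Composio
}

def load_backend(channel: str):
    if channel not in BACKENDS:
        # unknown channel: hard error, never a fallback
        raise RuntimeError(f"no backend for channel {channel!r} — refusing to dispatch")
    return BACKENDS[channel]()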
V1 needs the absolute minimum to act on existing accounts. Anti-bot stealth, signup automation, 2FA relay, fingerprint pools, proxy fleets — all V2.
Supabase Vault — encrypted credentials stored in the same Postgres. Holds session cookies (Device channel) and OAuth refresh tokens (MCP channel). RLS restricts which service role can decrypt what; never decrypted into logs.
Talents (or operators with talent permission) paste in cookies / complete an OAuth grant once per account. Validated by a health-check call. No automated signup — that's V2.
Adapters report outcome (ok / soft-block / hard-block) on every call. V1 reaction: pause the account, alert the operator. No fingerprint rotation, no proxy rotation (no fingerprints or proxies in V1). Full burn-detection loop is V2 §4.
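The V1 reaction fits in a few lines — a sketch, assuming hypothetical pause_account / notify_operator helpers:

async def handle_adapter_outcome(account, outcome: str):
    if outcome == "ok":
        return
    # soft-block or hard-block: stop acting on this account and tell a human
    await pause_account(account.id, reason=outcome)
    await notify_operator(account.tenant_id,
                          f"{account.platform}/{account.handle} reported {outcome}; account paused")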
- Generated assets land in the content table.
- Each target account gets a content_queue row with appropriate aspect, length, hashtags.

ContentAgent decides what to make. The vendors below actually make it. V1 picks the smallest set that produces images, voiceover, subtitles, and edited video — without the V2-grade cost and quality penalty of text-to-video generation.
Image generation. Talent face consistency via per-talent LoRA fine-tunes seeded from talent_assets[kind='face']. One API key (REPLICATE_API_TOKEN) covers the FLUX family + Whisper + many other models, so swapping image generators is a config change, not a vendor change.
Cost shape: ~$0.03/image, pay-per-use. ~$1–3/mo per talent at V1 volumes.
Voice cloning (1–5 min of clean reference audio per talent → cloned voice profile) and streaming TTS. Opt-in per talent — Brand & Voice section in /me manages consent + reference uploads. Only vendor in the stack for which "the talent's own voice" is non-substitutable; the rest are commodity model markets.
Cost shape: $22–99/mo tiered subscription; minutes-of-output budgeted per tenant.
Voice memos in talent post requests get transcribed and used as the prompt. Long-form video uploads get a transcript that feeds caption generation + subtitle burn-in. Same Replicate key as image gen — no second vendor.
Cost shape: ~$0.005/min audio, pay-per-use.
Runs inside the Dr.Social worker process. No vendor, no fees. Handles cuts, aspect reformat, subtitle burn-in, BGM mixing, and still-image pan/zoom for composed reels.
Runway Gen-3 · Luma Dream Machine · Google Veo · Sora. All slow (30–90s per clip), expensive ($0.40–1.00 per 5s reel), and weak at talent-face consistency. V1 sidesteps the problem: talents either upload raw video that we edit (FFmpeg), or get "video reels" composed from FLUX-generated stills + ElevenLabs voiceover + FFmpeg pan/zoom. We add text-to-video when one of those falls short.
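A sketch of that stills-plus-voiceover path; generate_image / tts / compose_reel follow the stable media/ shape described later in this section, and the brief fields and keyword arguments are illustrative:

async def compose_reel_without_text_to_video(brief, talent):
    stills = [await generate_image(p, talent) for p in brief.scene_prompts]   # Replicate FLUX + face LoRA
    voiceover = await tts(brief.narration_text, talent)                       # ElevenLabs clone (opt-in)
    return await compose_reel(                                                # FFmpeg: pan/zoom, subtitles, BGM
        stills=stills,
        audio=voiceover,
        subtitles=brief.narration_text,
        aspect="9:16",
    )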
Suno and Udio sound great but don't have stable production APIs yet. V1 uses royalty-free libraries (Pixabay-style) or each talent's pre-uploaded brand-kit BGM in talent_assets[kind='bgm'].
When a talent submits a prompt-only request targeting TikTok, the pipeline composes a reel without text-to-video: FLUX-generated stills, ElevenLabs voiceover, subtitle burn-in, and FFmpeg pan/zoom. Vendor choices live in media/config.py.

The media/ module exposes a stable shape: generate_image(prompt, talent) → url, tts(text, talent) → url, transcribe(url) → text+timings, compose_reel(brief) → url. Swap vendors behind it without touching ContentAgent.

Content reaches the approval queue from two distinct sources — both treated equally downstream: AI-initiated drafts (PersonaAgent proposes a brief on the tick) and talent post requests (the talent asks for something specific via /me or email).
Post requests matter because most talents have specific things they want said or shown that the AI won't dream up on its own — a launch, a reaction, a behind-the-scenes moment. Media is optional: a request can be just a prompt + platforms (the system generates content from scratch in the talent's voice), just media + platforms (the system writes captions and cuts the assets for each platform), or both.
The talent's /me portal exposes a "Request a post" form with a prominent platform picker (chips per connected account), a prompt textarea, and an optional drag-and-drop area for media. Attachments upload directly to Supabase Storage (per-talent path, RLS-scoped). The submit creates one post_requests row.
Each talent gets a stable address <talent-slug>[email protected]
(e.g. [email protected]). DNS for dr-social.app is on Cloudflare; an Email Routing catch-all rule forwards every inbound message to one operator mailbox (currently [email protected] for the MVP). The Dr.Social worker IMAP-polls that mailbox every 30s with an App Password, parses the original To: header to identify the talent, drops attachments into Storage, transcribes any voice memos for the prompt, and creates the same post_requests row as the browser form. Processed messages get labeled drsocial/ingested and archived. Subject line is the prompt unless the body has one; platform tags #TT #IG #LI in the body override the talent's last-used platform selection.
- Cloudflare preserves the original To: header on forward — so we can still extract the talent slug from each message even though every email lands in one Gmail inbox.
- Anyone who emails the talent's intake address ([email protected]) can submit. That's fine when the address is shared only with the talent and their immediate collaborators (camera op, etc.). For tighter security, V2 adds a talent_request_senders allowlist (see V2 §7).
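A minimal sketch of that poll, using the stdlib imaplib/email modules; find_talent_by_slug and create_post_request are assumed helpers, and the dedup on gmail_message_id happens inside the latter:

import imaplib
import os
from email import message_from_bytes
from email.utils import parseaddr

def poll_intake_mailbox():
    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.login(os.environ["GMAIL_INTAKE_USER"], os.environ["GMAIL_INTAKE_APP_PASSWORD"])
    imap.select("INBOX")
    _, data = imap.search(None, "UNSEEN")
    for num in data[0].split():
        _, msg_data = imap.fetch(num, "(RFC822)")
        msg = message_from_bytes(msg_data[0][1])
        # the original To: survives the forward; its local part is the talent slug
        slug = parseaddr(msg["To"])[1].split("@")[0]
        talent = find_talent_by_slug(slug)            # assumed lookup; skip unknown slugs
        if talent is None:
            continue
        create_post_request(                          # assumed; dedups on gmail_message_id
            talent_id=talent.id,
            source="email",
            prompt_text=msg["Subject"],
            sender_email=parseaddr(msg["From"])[1],
            gmail_message_id=msg["Message-ID"],
        )
    imap.logout()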
- One content_queue row per target account is created, grouped under the same post_request_id in the talent's approval queue UI. The talent sees the request as a single "post idea" with stacked per-platform variants.
- Publishing only ever acts on approved rows.
- If the talent has opted into voice cloning (Brand & Voice in /me), transcribed voice memos can come back as polished narration in their voice.

post_requests (id, tenant_id, talent_id,
source CHECK in ('browser', 'email'),
prompt_text NULL, -- empty = let AI pick from pillars
target_account_ids[], -- one variant generated per
schedule_hint NULL,
sender_email NULL, -- populated for source='email'
gmail_message_id NULL UNIQUE, -- populated for source='email'; dedup key
status in ('received','analyzing','drafting','complete','failed'),
created_at, processed_at)
post_request_attachments (id, post_request_id, storage_path, mime, bytes, duration_s,
analysis_json NULL, -- scenes, faces, hooks, OCR …
transcript_text NULL)
-- zero rows if the request had no media
content_queue.origin enum: 'ai_initiated' | 'request_browser' | 'request_email'
content_queue.post_request_id NULL unless origin is 'request_*'
content_queue.rejection_reason captured at owner rejection — fed into talent style guide
Talents carry a slug column (e.g. aurora-lee). The catch-all email forwarding model means slugs collide across tenants, so uniqueness is enforced globally, not per tenant, for this one column.

The bare <slug>[email protected] address is used for post requests. Reserve +post, +settings, +inbox as future intents — the IMAP parser refuses unknown suffixes for now.

~14 tables. Every tenant-scoped table carries tenant_id NOT NULL and is gated by an RLS policy. Composite FKs (tenant_id, id) on every cross-table reference make the database refuse cross-tenant attachments.
tenants (id, name, plan, status,
default_tick_interval_seconds,
created_at)
-- one row per agency. Per-talent override → V2 §8
tenant_users (id, tenant_id, supabase_user_id,
role CHECK in ('owner','editor','viewer'))
talents (id, tenant_id, display_name, real_name,
bio_short, bio_long, timezone, language, status,
owner_user_id, -- the real human who must approve
primary_email, contact_emails[], phone,
created_at)
talent_profile (talent_id PK,
personality_md, stories_md,
goals_json,
face_ref_urls[], voice_clone_ref NULL,
brand_kit_json, style_guide_md,
content_pillars[], hashtag_preferences[], do_not_say[],
updated_at)
-- 1:1 with talents; split out only for size
talent_assets (id, talent_id, kind in ('face','voice','logo','wardrobe','bgm',
'picture','video','document'),
storage_url, hash, metadata_json, created_at)
accounts (id, tenant_id, talent_id, platform, handle, status,
channel CHECK (channel in ('device','mcp')) IMMUTABLE,
vault_ref, -- session cookies OR OAuth refresh
posting_schedule_json,
created_at)
-- one row per (platform, talent, channel)
Not in V1: fingerprints, iproxy_connections, devices, oauth_grants, sessions, ip_rotations. V1 stuffs both session cookies and OAuth refresh tokens into accounts.vault_ref (a single Vault entry per account; shape varies by channel).
content (id, tenant_id, talent_id, hash, type, asset_url,
caption, hashtags, generated_by, moderated, created_at)
content_queue (id, tenant_id, talent_id, account_id, content_id,
origin enum ('ai_initiated','request_browser','request_email'),
post_request_id NULL,
scheduled_for,
state CHECK in
('draft','awaiting_owner_approval','approved','rejected',
'queued','published','failed'),
approved_by_user_id, approved_at,
rejected_reason,
created_at, updated_at)
-- per-talent publish queue shown on /me
post_requests (id, tenant_id, talent_id,
prompt_text NULL,
target_account_ids[],
schedule_hint NULL,
status in ('received','analyzing','drafting','complete','failed'),
created_at, processed_at)
post_request_attachments (id, post_request_id, storage_path, mime, bytes, duration_s,
analysis_json NULL, transcript_text NULL)
threads (id, tenant_id, account_id, peer_handle, last_message_at)
-- V2 adds audience_identity_id (cross-platform identity resolution)
messages (id, tenant_id, thread_id, direction, body, media_urls,
classified_as, status, created_at)
jobs (id, tenant_id, account_id, kind, payload_json, status, run_after)
-- no required_channel column; account.channel is the source of truth
events (id, tenant_id, ts, kind, account_id, payload_json)
-- audit + activity stream
Invariants enforced in Postgres:
(1) accounts.channel is immutable (trigger blocks UPDATE).
(2) tenant_id consistency — every FK across tables uses composite (tenant_id, id), so the database refuses to attach an account in tenant A to a talent in tenant B.
(3) RLS policies on every tenant-scoped table.
audience_members, audience_identities, audience_interactions, identity_link_signals (V2 §6); fingerprints, iproxy_connections, devices, oauth_grants, sessions, ip_rotations (V2 §1–§4); talent_request_senders (V2 §7); talents.tick_interval_seconds (V2 §8).
Two server-rendered web UIs, two different audiences, one shared backend. Both are tenant-scoped — Supabase RLS enforces tenant isolation on every query. Python templates + htmx for partials; polling at ~5s for "live" panels. (React + Vite frontend → V2 §10. Supabase Realtime → V2 §9.)
Top-level navigation pivots around talents, not accounts. Agency staff see all talents under their tenant, run overrides, supervise inboxes, and watch system health.
Live activity stream, accounts grouped by talent, queue-next-24h snapshot, inbox-awaiting-operator, channel health, plus an analytics rollup panel (folded in — no standalone Analytics page in V1).
Every talent in the tenant, with status, accounts grouped, queue volume, profile completeness. Click into a talent to see their workspace.
Every managed account across every platform, with health, last-active, current backend, vault status. Per-talent filter.
All scheduled posts across the tenant: draft → moderated → awaiting owner approval → approved → queued → published. Moderation flags shown inline (no standalone Moderation page in V1). Operators can reorder, pull a piece, or force-approve (only when the talent has opted into operator-override).
In-character reply drafts from PersonaAgent pending approval, escalations, conversation timeline per peer, per-account inbox.
Integrations (Supabase, abfs.tech LLM router key, Composio), team & roles, defaults (tenant tick cadence, content language, posting cap), billing, danger zone.
Talent portal (/me)
A standalone single-page workspace for the talent themselves. Scoped to one talent — the one whose
owner_user_id matches the logged-in Supabase Auth user. Different chrome from the operator dashboard
(warmer accent, no tenant switcher, no cross-talent navigation). Mockup: dashboard/me.html.
Pending content piece-by-piece, grouped by post idea with per-platform variants stacked under each idea. Tagged AI-initiated or from your request. The talent can approve each platform independently, or click "Approve all N variants" on the parent. Reject with a reason (fed into the style guide), edit, or reschedule.
Form with a prominent platform picker (chips per connected account), a prompt textarea, and an optional drag-and-drop area for media. Feeds the Post Request Pipeline. A "recent requests" list shows each request's status (analyzing → drafting → in queue).
Display name, tagline, short bio, personality traits, goals this quarter, content pillars, target audience. These fields drive every PersonaAgent draft.
Brand colors, logo/wordmark, face reference images, voice samples (for opt-in audio cloning), tone-of-voice positive examples, never-say list. ModerationAgent hard-blocks anything in the never-say list.
Per-platform handles with health and channel binding visible. Re-authenticate on demand. Adding a new account in V1 = importing existing credentials (Device cookies OR MCP OAuth); automated signup is V2.
Metadata of every stored credential — secrets are never displayed, only "last verified" timestamps and expiry. A red "revoke everything & pause my persona" panic switch.
Daily posting cap, blackout windows, timezone, default content language, two-factor on the portal, and the critical operator-override toggle (off by default). Per-talent tick interval → V2 §8; V1 inherits from the tenant.
The talent's own analytics — total followers and 7d/30d growth, per-platform breakdown, top posts, audience composition + geographies, what content patterns are above/below their own benchmark. Aggregated only across their accounts.
Scheduling lives inside Supabase via pg_cron. A row in a jobs table is inserted on every tick;
the worker process polls jobs WHERE status='pending' using SELECT … FOR UPDATE SKIP LOCKED — a clean
Postgres-native queue with no Redis or Celery.
V1 has a single cadence per tenant — tenants.default_tick_interval_seconds — driving every talent in that tenant.
The Workflow handler iterates the tenant's active talents on each tick and fans out per talent.
Per-talent override and dynamic adjustment (warm-up, blackouts, burn back-off) → V2 §8.
- pg_cron inserts a tick job at the tenant's configured interval → Workflow picks it up and fans out.
- Per-account posting times live in accounts.posting_schedule_json. The tick checks "is it time?" and enqueues a publish job — but only for items already approved in content_queue.

-- pg_cron entry — tenant cadence configurable, not hard-coded
select cron.schedule(
'drsocial-baseline-tick',
'* * * * *', -- once per minute baseline
$$ insert into jobs (kind, payload) values ('tick', '{}') $$
);
-- Workflow handler (per tick, per tenant)
for tenant in tenants.active():
if not tenant.is_tick_due(now):
continue
for talent in tenant.active_talents():
fan_out(talent, now)
-- Atomic job pick
select id, kind, payload
from jobs
where status = 'pending' and run_after <= now()
order by run_after
limit 1
for update skip locked;
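A sketch of the worker loop around that query — asyncpg and the SUPABASE_DB_URL connection string are assumptions here, and handle_job is a hypothetical dispatcher:

import asyncio
import os
import asyncpg

CLAIM_SQL = """
update jobs set status = 'running'
 where id = (select id from jobs
              where status = 'pending' and run_after <= now()
              order by run_after
              limit 1
              for update skip locked)
returning id, kind, payload_json;
"""

async def worker_loop():
    conn = await asyncpg.connect(os.environ["SUPABASE_DB_URL"])   # assumed env var
    while True:
        job = await conn.fetchrow(CLAIM_SQL)
        if job is None:
            await asyncio.sleep(1)                                # nothing due — brief back-off
            continue
        await handle_job(job)                                     # hypothetical dispatcher
        await conn.execute("update jobs set status = 'done' where id = $1", job["id"])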
One Railway service, one managed backend (Supabase), one external LLM provider (abfs.tech router). That's the whole infrastructure for V1.
Python FastAPI serving REST + the htmx-rendered dashboards, plus the asyncio job loop running as a parallel task in the same process. One container, one port, one restart loop. Splitting into two services → V2 §12 when load demands it.
Postgres + Auth + Storage + Vault + pg_cron. The only database. The only secret store. The only object store. The only scheduler. Realtime + Edge Functions reserved for V2 features.
All LLM calls go to https://www.abfs.tech/v1/ — see §5. Bearer-auth API key per environment, stored in Supabase Vault. One billing surface for every model call Dr.Social makes.
Python 3.12 · asyncio · uv
FastAPI
Server-rendered Python templates + htmx. No Node, no Vite, no React. React + Vite → V2 §10.
abfs.tech LLM router — Anthropic + OpenAI-compatible endpoints
Appium / XCUITest / W3C / ADB (off-cloud, on a local Mac/PC; reaches API via outbound HTTPS)
Composio.dev MCP client
Replicate — FLUX 1.1 Pro (images), Whisper-large (transcription), face-LoRA fine-tunes per talent. One key, swappable models.
ElevenLabs — per-talent voice clone (opt-in) + streaming TTS for narration on generated reels.
FFmpeg in-process. Cuts, aspect reformat, subtitle burn-in, BGM. Text-to-video gen → V2.
Cloudflare Email Routing catch-all on dr-social.app → operator Gmail mailbox → IMAP poll by the worker every 30s. No SES/Postmark in V1.
Supabase Postgres (schema in plain SQL or one tiny migration tool)
Supabase Vault (no Vaultwarden)
Supabase Storage (no S3 / GCS / Drive)
Postgres jobs table + pg_cron (no Redis, no Celery)
htmx polling at 5s. Supabase Realtime → V2 §9.
# Auto-set by Railway
PORT
RAILWAY_ENVIRONMENT_NAME
# Supabase (one set per env)
SUPABASE_URL
SUPABASE_SERVICE_KEY
SUPABASE_ANON_KEY
# LLM provider (abfs.tech router)
ABFS_LLM_API_KEY # bearer for https://www.abfs.tech/v1/
# Media generation
REPLICATE_API_TOKEN # FLUX, Whisper, etc.
ELEVENLABS_API_KEY # voice clone + TTS
# MCP backend (LinkedIn, etc.)
COMPOSIO_API_KEY
# Device backend (on-prem hub)
DEVICE_HUB_URL # e.g. https://devicehub.yourbiz.com:4723
DEVICE_HUB_TOKEN # bearer for the device hub
# Email post-request intake (Gmail IMAP poll, V1 MVP shape)
GMAIL_INTAKE_USER # e.g. [email protected]
GMAIL_INTAKE_APP_PASSWORD # Gmail App Password (requires 2FA on the account)
Two Railway environments host the design-phase mockups via serve.py (a small stdlib HTTP server that templates the operator-dashboard fragments and serves the talent portal + this doc). The real FastAPI service arrives when implementation begins.
| Environment | URL | Deploys from | Gate |
|---|---|---|---|
| Production | drsocial-production.up.railway.app ↗ | master only | (planned: CI check once tests exist) |
| Dev | drsocial-dev.up.railway.app ↗ | any branch (auto on push) | None |
Dr.Social/
├── docs/
│ ├── ARCHITECTURE.html # this document (V1)
│ └── ARCHITECTURE_V2.html # deferred features
├── dashboard/
│ ├── _layout.html # operator-portal shell ({{TITLE}}, {{CONTENT}}, …)
│ ├── _overview.html # operator-portal page fragments (5)
│ ├── _talents.html
│ ├── _accounts.html
│ ├── _queue.html
│ ├── _inbox.html
│ ├── _settings.html
│ └── me.html # standalone talent self-service portal
├── serve.py # tiny stdlib HTTP server that renders the fragments
├── Procfile # web: python serve.py (Railway start command)
├── pyproject.toml # minimal — signals Python project to Railway nixpacks
└── README.md
Dr.Social/
├── src/drsocial/
│ ├── agents/ # the 4 LLM agents — each thin, all call llm_router
│ │ ├── persona.py # persona_decide_* — stateless functions
│ │ ├── content.py # content_generate
│ │ ├── inbox.py # inbox_fetch + classify + reply-draft
│ │ └── moderation.py # moderation_approve, moderation_approve_inbound
│ ├── services/ # plain Python — no agent ceremony
│ │ ├── workflow.py # pg_cron-driven tick handler
│ │ ├── account_importer.py # session-cookie / OAuth import + health check
│ │ ├── request_agent.py # raw-media preprocessing (scenes/OCR/transcript)
│ │ ├── post_publisher.py # publishes approved content via platform adapter
│ │ └── analytics.py # SQL rollups
│ ├── media/ # media generation + editing — stable API, swappable vendors
│ │ ├── images.py # generate_image(prompt, talent) — Replicate FLUX + face LoRA
│ │ ├── audio.py # tts(text, talent), clone_voice(refs) — ElevenLabs
│ │ ├── transcribe.py # Whisper via Replicate
│ │ └── ffmpeg.py # reel composer, aspect reformat, subtitle burn-in, BGM
│ ├── intake/ # post-request ingress
│ │ ├── browser.py # /me form handler — multipart upload to Storage
│ │ └── gmail_imap.py # IMAP poller; parses To: → talent_slug; dedup via gmail_message_id
│ ├── platforms/ # adapter layer
│ │ ├── base.py # SocialPlatform protocol
│ │ ├── instagram.py
│ │ ├── tiktok.py
│ │ └── linkedin.py
│ ├── backends/ # execution
│ │ ├── device.py # Appium / XCUITest / ADB client
│ │ └── mcp.py # Composio client
│ ├── llm_router.py # thin client for https://www.abfs.tech/v1/
│ ├── supabase_client.py # one client (DB + Storage + Vault + Auth)
│ ├── jobs.py # pg_cron-driven queue helpers
│ ├── api.py # FastAPI — REST + serves dashboard + serves /me
│ ├── settings.py
│ └── main.py # entrypoint: runs api + worker in one asyncio loop
├── dashboard/ # htmx + Jinja templates, served by api.py
├── supabase/
│ ├── migrations/ # plain .sql files
│ └── seed.sql
├── tests/
├── Dockerfile # single image — runs api + worker
├── railway.json # single-service Railway config
└── pyproject.toml
In V2: agents/ stays at 4; services/ grows with fingerprints.py, iproxy.py, burn_detection.py, signup.py, audience_resolver.py;
backends/browser.py appears;
supabase/functions/inbound_email/ appears;
dashboard/ swaps htmx for React + Vite if V2 §10 fires;
the single Dockerfile splits into Dockerfile.api + Dockerfile.worker if V2 §12 fires.