Automation · AI

Aptos Content Pipeline

Trending topic in, publish-ready short-form video out — fully automated, across a whole network of accounts.

2.1M Total views

8.2% Engagement rate

30 Connected accounts

60 Videos / day target

205+ Docs in the RAG base

4 Platforms, auto-published

The problem

Why this had to exist.

Producing a single short-form video the right way takes a content researcher, a scriptwriter, a video editor, a motion-graphics artist and a social-media manager. Multiply that by a network of 30 accounts and a 60-video-a-day target, and it simply doesn’t scale by hand. Trends move in hours; manual research, scripting, editing, captioning and cross-platform publishing move in days. The challenge was to compress that entire chain into an automated system that still looks hand-crafted, stays factually grounded in real Aptos research, and produces genuinely distinct content per account rather than the same clip reposted thirty times.

What we built

An end-to-end content pipeline

The Content Pipeline transforms trending social topics into publish-ready short-form video for the Aptos blockchain ecosystem. It monitors social platforms for momentum, generates research-backed scripts grounded in a curated knowledge base, personalizes them to distinct account voices, produces full videos with AI presenters and supporting visuals, layers on branded logo overlays and animated captions, and distributes across TikTok, YouTube Shorts, Instagram Reels and Facebook Reels. It is not a single-account tool — it runs a network of accounts, each with its own personality, voice, presenter and style, all coordinated by a resumable Python orchestration layer over 7+ external services.

From a trending topic to a published video — the seven automated stages at a glance.

How it works

Seven stages, fully automated

One trending topic flows through seven stages and comes out the other side as multiple unique, publish-ready videos — one per account, each authentically its own.

Topic Discovery

Scans TikTok for videos gaining traction across Aptos-relevant hashtags and keywords, applies engagement filters, then uses AI to extract and categorize the real topics — regulation, adoption, payments, institutional moves — with confidence scoring. One intentional human checkpoint keeps strategy aligned. Hours of trend monitoring compressed into minutes.
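For a feel of what an engagement filter can look like, here is a minimal sketch. The field names, the `min_views` and `min_rate` thresholds, and the interactions-per-view formula are illustrative assumptions, not the pipeline's actual values.

```python
from dataclasses import dataclass


@dataclass
class TrendingVideo:
    # Hypothetical record for one scanned video; field names are assumptions.
    video_id: str
    views: int
    likes: int
    comments: int
    shares: int


def engagement_rate(video: TrendingVideo) -> float:
    """Interactions per view; 0.0 when the video has no views yet."""
    if video.views == 0:
        return 0.0
    return (video.likes + video.comments + video.shares) / video.views


def passes_filter(video: TrendingVideo,
                  min_views: int = 10_000,
                  min_rate: float = 0.05) -> bool:
    """Keep only videos with enough reach and enough interaction (assumed thresholds)."""
    return video.views >= min_views and engagement_rate(video) >= min_rate
```

Videos that pass a filter like this would then go to the AI topic-extraction step described above.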

Script Generation

Produces a research-backed narration script tuned for 30–40s. A RAG architecture queries a knowledge base of 47 technical articles, 5 academic papers, 17 narrative documents and 136 Aptos Improvement Proposals, tracks previously written content to vary the narrative angle, and integrates live market data. Output is structured as hook / body / CTA.

Script Personalization

Rewrites the base script to match a specific account’s voice, tone and style — defined by a Personality Profile and a Video Style Guide. A full rewrite of sentence structure, word choice, energy and pacing, not token substitution. This is the core enabler for multi-account scaling: same facts, genuinely distinct delivery.

Video Production

The most complex stage — a 12-step sub-pipeline in four phases: script prep with AI placing 10–13 visual cut-away markers; per-account TTS narration with word-level transcription for frame-accurate timing; parallel asset generation of presenter footage and b-roll, where intelligent rendering only generates presenter footage for moments the presenter is actually visible; then final assembly compositing every asset with transitions. Coordinates 4+ services and dozens of intermediate files.
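Rendering presenter footage only where the presenter is visible comes down to interval arithmetic: take the full narration timeline and subtract the cut-away windows. The sketch below assumes cut-aways arrive as `(start, end)` pairs in seconds; the function name and data shape are hypothetical.

```python
from __future__ import annotations


def presenter_segments(duration: float,
                       cutaways: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Return the time ranges of the narration not covered by any cut-away."""
    segments = []
    cursor = 0.0  # end of the timeline already accounted for
    for start, end in sorted(cutaways):
        # Clamp each cut-away to the video's bounds.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > cursor:
            segments.append((cursor, start))  # presenter visible in this gap
        cursor = max(cursor, end)  # handles overlapping cut-aways
    if cursor < duration:
        segments.append((cursor, duration))
    return segments
```

For a 10-second video with cut-aways at 2–4s and 6–7.5s, this yields three presenter segments totalling 6.5 seconds, so only that much avatar footage would need to be generated.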

Visual Overlays

Detects mentions of known entities — cryptos, companies, platforms — and overlays animated branded logos at the exact moment each is named. Alias resolution maps “Ripple” / “XRP” / “$XRP” to one entity, word-level timestamps place the overlay, and AI selects the animation style (fade, swing, bounce, zoom, shake) by context. Makes automated content look hand-crafted.
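Alias resolution plus word-level timestamps can be sketched in a few lines. The alias table, the transcript shape (`word`, `start`, `end`) and the function names below are assumptions for illustration. Only the "Ripple" / "XRP" / "$XRP" grouping comes from the description above.

```python
# Hypothetical alias table: every surface form maps to one canonical entity id.
ALIASES = {
    "ripple": "xrp",
    "xrp": "xrp",
    "$xrp": "xrp",
    "aptos": "aptos",
}


def normalize(token: str) -> str:
    """Lower-case and strip surrounding punctuation, keeping a leading '$'."""
    return token.lower().strip(".,!?;:\"'()")


def find_overlays(words: list) -> list:
    """Return one overlay cue per recognized entity mention, timed to the spoken word."""
    overlays = []
    for w in words:
        entity = ALIASES.get(normalize(w["word"]))
        if entity is not None:
            overlays.append({"entity": entity, "start": w["start"]})
    return overlays
```

Choosing the animation style per cue (fade, swing, bounce, zoom, shake) is a separate AI step and is not modelled here.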

Captions

Animated word-by-word captions precisely synced to narration. Word-level transcription is grouped into 1–2 word display units matching speech rhythm, rendered in bold high-contrast fonts with a dark stroke, lower-third positioning, current-word highlighting and scale-in animation. Broadcast-quality captions in minutes instead of 30–60 minutes by hand.
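One plausible way to group a word-level transcript into 1–2 word display units: start a new unit when the current one is full, when the speaker pauses, or when the previous word ends a phrase. The `max_gap` pause threshold and the punctuation rule are assumptions rather than the pipeline's actual heuristics.

```python
def _flush(group: list) -> dict:
    """Collapse a group of timed words into one caption unit."""
    return {
        "text": " ".join(w["word"] for w in group),
        "start": group[0]["start"],
        "end": group[-1]["end"],
    }


def group_captions(words: list, max_words: int = 2, max_gap: float = 0.25) -> list:
    """Group timed words into display units of at most `max_words` words."""
    units = []
    current = []
    for w in words:
        if current:
            gap = w["start"] - current[-1]["end"]
            ends_phrase = current[-1]["word"].endswith((".", "!", "?", ","))
            # Close the unit if it is full, the speaker paused, or a phrase ended.
            if len(current) >= max_words or gap > max_gap or ends_phrase:
                units.append(_flush(current))
                current = []
        current.append(w)
    if current:
        units.append(_flush(current))
    return units
```

Each unit keeps its own start and end time, so the renderer can highlight the current word and time the scale-in animation against the narration.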

Publishing

Distributes across TikTok, YouTube Shorts, Instagram Reels and Facebook Reels through a unified integration, with one dashboard tracking followers, views and engagement across all 30 accounts. Tuned for 60 videos/day with staggered publish windows and rate-limit balancing.
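Staggered publish windows can be approximated by spreading the day's videos evenly across a posting window and rotating through accounts, so no single account posts back-to-back. The window times, account names and round-robin policy below are illustrative assumptions. Real platform rate limits would need their own handling.

```python
from datetime import datetime, timedelta


def stagger_schedule(accounts: list, videos_per_day: int,
                     window_start: datetime, window_end: datetime) -> list:
    """Return (account, publish_time) pairs spaced evenly across the window."""
    step = (window_end - window_start) / videos_per_day  # timedelta between posts
    slots = []
    for i in range(videos_per_day):
        account = accounts[i % len(accounts)]  # round-robin across the network
        slots.append((account, window_start + i * step))
    return slots
```

With 30 accounts, 60 videos and an assumed 08:00–20:00 window, this yields one post every 12 minutes and two posts per account, six hours apart.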

The knowledge base

Grounded in real research — not generic AI

What sets this apart from a generic AI video tool is what the scripts are built on. Every script is grounded in a curated, hand-assembled research corpus of 205+ indexed documents — not a web scrape — so the writing carries genuine technical depth instead of surface-level filler.

47 technical articles across the full Aptos stack — consensus, execution, the Move language, account innovations, developer tooling and infrastructure.

5 peer-reviewed academic papers — Zaptos, Raptr, Block-STM, Shoal and Narwhal-Tusk — behind the deep performance and protocol claims.

136 official Aptos Improvement Proposals, plus 17 narrative documents on policy, regulation and institutional context.

Category-aware retrieval: regulation topics pull from policy documents, performance topics from the research papers — the right sources for the right script, every time.
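Category-aware retrieval can be pictured as a routing table applied before ranking. In this sketch the `source_type` labels and the routing map are assumptions, and a simple keyword-overlap score stands in for whatever embedding search the real RAG layer uses.

```python
# Hypothetical routing table: topic category -> allowed source types.
CATEGORY_SOURCES = {
    "regulation": {"narrative"},
    "performance": {"paper"},
}


def retrieve(query: str, category: str, docs: list, top_k: int = 3) -> list:
    """Filter documents by the category's allowed sources, then rank by word overlap."""
    allowed = CATEGORY_SOURCES.get(category)  # None means: search every source
    pool = [d for d in docs if allowed is None or d["source_type"] in allowed]
    query_words = set(query.lower().split())
    ranked = sorted(
        pool,
        key=lambda d: len(query_words & set(d["text"].lower().split())),
        reverse=True,
    )
    return ranked[:top_k]
```

Filtering first means a performance script cannot be grounded in a policy document just because it happens to share vocabulary with the query.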

Under the hood

Built to run itself — and recover when it doesn't

By hand, this is five jobs: a researcher, a scriptwriter, a video editor, a motion-graphics artist and a social manager. Here it's a single Python orchestration layer coordinating 7+ external services, running jobs in parallel and tracking dozens of assets per video — with exactly one human checkpoint, for topic selection.

Fully resumable: if any stage fails, the run resumes exactly where it left off, never re-running the expensive earlier work.

Every run is observable end to end, stage by stage, so anything that breaks is obvious and quick to fix.

Intelligent rendering optimisation generates AI-presenter footage only for the moments the presenter is actually on screen — custom cost-engineering on the single most expensive operation.

Frame-accurate timing: word-level audio transcription drives the sync for cut-aways, logo overlays and captions alike.
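The resumability described above is commonly implemented with a checkpoint file written after every stage. The sketch below is one minimal way to do it, assuming each stage returns JSON-serializable output. The stage names and the `state.json` layout are hypothetical.

```python
import json
from pathlib import Path


def run_pipeline(stages: list, state_path: Path) -> dict:
    """Run (name, function) stages in order, skipping any already checkpointed."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    for name, fn in stages:
        if name in state:
            continue  # finished in an earlier run: do not redo expensive work
        state[name] = fn(state)  # each stage can read earlier stages' outputs
        state_path.write_text(json.dumps(state))  # checkpoint after every stage
    return state
```

If a stage raises, everything completed before it is already on disk, so calling `run_pipeline` again with the same `state_path` picks up at the failed stage.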

Every run, stage by stage — fully observable, and resumable from any point.

Why it scales

Multi-account by configuration, not by code

The whole network runs on configuration, not forks of the codebase. Each account is just three files — an account config (name, platform, voice, avatar), a personality profile and a video style guide — read at runtime and adapted automatically.

Adding a new account takes zero code changes — just those three config files.

Base research and the source script run once per topic; personalization, production and overlays then run independently per account.

Every account's outputs, intermediate files and run history stay fully isolated — one network, no cross-contamination.

The same trending topic becomes thirty genuinely distinct videos — not one clip reposted thirty times.
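To make "three files per account" concrete, here is a sketch of loading a network from disk. The file names (`account.json`, `personality.md`, `style_guide.md`), the JSON format and the one-folder-per-account layout are assumptions. Only the four config fields (name, platform, voice, avatar) come from the description above.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Account:
    name: str
    platform: str
    voice: str
    avatar: str
    personality: str  # contents of the personality profile
    style_guide: str  # contents of the video style guide


def load_account(folder: Path) -> Account:
    """Build one account from its three files (hypothetical file names)."""
    config = json.loads((folder / "account.json").read_text())
    return Account(
        name=config["name"],
        platform=config["platform"],
        voice=config["voice"],
        avatar=config["avatar"],
        personality=(folder / "personality.md").read_text(),
        style_guide=(folder / "style_guide.md").read_text(),
    )


def load_network(root: Path) -> list[Account]:
    """Every sub-folder of `root` is treated as one account."""
    return [load_account(p) for p in sorted(root.iterdir()) if p.is_dir()]
```

Under this layout, adding an account means dropping a new folder next to the others. The per-account stages simply iterate over whatever `load_network` returns.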

The network's real numbers in one dashboard — 2.1M views and an 8.2% average engagement rate across the account network.

Tech & architecture

What it's made of.

A Python orchestration layer coordinating 7+ external services — parallelized throughout and fully resumable from any stage.

Python orchestration · AI / LLMs · RAG retrieval · Text-to-speech · AI avatar video · Word-level transcription · Motion graphics · Video compositing · Multi-platform distribution · Parallel + resumable jobs

Your move

Want something like this?

We design, build and run custom software end-to-end. Tell us the problem — we'll build the system that solves it.

Start a project → · Book a call · (613) 696-9545