Built for Tenstorrent QB2 · Blackhole

Local AI video generation for builders.

Generate cinematic video, images, and animated characters on your own Tenstorrent hardware. No cloud. No rate limits. Full creative control.

tt-local-generator main interface — video generation, history gallery, and TT-TV mode

Generated on Tenstorrent QB2

A collection of videos generated with tt-local-generator on a QB2 (P300×2) system using Wan2.2-T2V-A14B — all running locally, no cloud.

Create. Curate. Watch.

A focused three-phase loop for building a local AI media library — from a blank prompt to a polished collection you can enjoy on any screen.

Phase 01
Create

Write a prompt — or let the three-tier generator do it for you. Submit to Wan2.2, Mochi, SkyReels, FLUX, or AnimateDiff. Server management, queuing, and live progress are built in.

Phase 02
Curate

Every generation lands in a persistent gallery with full metadata. Hover to preview. Star favorites. Export or share. A growing archive of everything your hardware has ever made.

Phase 03
Watch

Switch to TT-TV — a lean-back cinematic mode that plays your library as a continuous, looping experience. Your models. Your prompts. Your channel.

Your private AI channel.

A full-screen cinematic viewer for your generated library. No algorithm decides what you see — just your own creations, playing on loop.

  • Auto-advance through your entire generated library, newest first
  • Seamless looping — video ends, next begins, no interruption
  • Resizable sidebar — keep the prompt and metadata visible, or go pure full-bleed
  • Open in system player or export directly from the viewer
  • Works remotely — connect from a Mac or laptop while hardware runs on the rack

One app. Six models.

Switch between text-to-video, image generation, image-to-video, character animation, and generative art from a single interface. Each server-backed model has a dedicated server managed by the app; AnimateDiff runs without one.

Hardware note: Video and image inference on Tenstorrent hardware requires Blackhole (QB2, dual P300X2), the primary development and test target. Wormhole (N150/N300) is not currently supported for inference. Wan2.2 Animate currently runs on CPU or CUDA, and the Generative Art text and SVG generators run on any machine via CPU.

Text-to-Video
Wan2.2 T2V

14B parameter cinematic video model. The flagship experience for long, detailed prompts.

QB2 (P300X2) — Blackhole
Text-to-Video
Mochi-1

High-motion, expressive video generation. Great for character and action-heavy prompts.

QB2 (P300X2) — Blackhole
Image-to-Video
SkyReels V2

Fast diffusion transformer at 540P. Animates still images with physics-respecting motion.

QB2 (P300X2) — Blackhole
Image
FLUX.1-dev

State-of-the-art text-to-image. Rich detail, accurate text rendering, photorealistic output.

QB2 (4× P300c) — Blackhole
Character Animate
Wan2.2 Animate

Video-to-video character animation. Give any character a motion — or replace a person in a clip.

CPU / CUDA — TT hardware support pending
Generative Art · Animated GIF
AnimateDiff

TTNN UNet with cross-frame temporal attention. Generates animated GIFs directly on Blackhole — no Docker, no server warmup. Every generation driven by the prompt engine.

QB2 / Blackhole P300c — no server needed

Generative Art — entirely on-device.

A second creative mode alongside video: SVG landscapes, color palettes, verse, ANSI art, and — on Blackhole hardware — animated GIFs via the AnimateDiff TTNN pipeline. Everything driven by the same three-tier prompt engine: algorithmic word banks → Markov chain → LLM polish.

Works from day one. The Generative Art tab automatically uses the best LLM available — a dedicated Generative Art model when you have one running, or the lightweight Qwen3-0.6B prompt server otherwise. Verse and palette generate well at any model size. SVG generators shine with a larger model. No configuration needed: start a bigger LLM and every subsequent generation upgrades automatically.

Mountain landscape — deep purple to crimson to orange gradient with layered polygon ridges
Mountain Ridgeline · Crimson
City Night — deep indigo sky, gold-windowed towers
City Night · Indigo
Glitch landscape — acid green horizon, magenta and cyan distortion bands
Glitch Landscape · Acid
Deep navy starfield — scattered white stars on midnight blue
Starfield · Navy
City Day — pale blue sky, teal-accented towers
City Day · Teal

Animated GIFs generated directly on Blackhole — no Docker container, no server warmup. Nine starred outputs from a single session, all driven by the literary prompt engine. The model is SD 1.4 with a TTNN UNet and cross-frame temporal attention.

Purple phosphor glow across distant mountains at 2am
purple phosphor glow · 2am mountains
An aleph in a cellar — a point containing all other points
aleph in a cellar · Borges
A cherry orchard, a Theremin, a hand approaching but not touching
cherry orchard · Theremin · Chekhov
A man writes the same sentence in his sleep every night
same sentence every night · milonga
A man growing feathers in a Salford bedsit with no explanation
growing feathers · Salford bedsit
A man who lived to 132, clocks in a house stop at different times
man who lived to 132 · Thames at dawn
A petitioner who has filled out the same form for six years
petitioner · six years · Tamil Nadu
Kipple spreading across a kitchen table — old batteries, receipts, nothing useful
kipple · Escher staircase · Dick
A garden of forking paths where every choice happens simultaneously
garden of forking paths · Borges

Generating coherent ANSI pixel art in a single LLM pass fails at any canvas larger than a few dozen cells: the model can't simultaneously plan spatial composition, choose the right characters, and pick one of 256 colors for each of hundreds of cells. The solution is a three-pass pipeline that splits these concerns into tasks the model handles well:

Pass 1 — Structure

Draw the subject using plain ASCII characters such as X # . | / with no color and no block characters. With only spatial composition to decide, the model's layout judgment is at its strongest.

Pass 2 — Refinement

The ASCII sketch is redrawn with Unicode block characters (█ ▀ ▄ ▌ ▐ ░ ▒ ▓) for richer geometry. Layout is fixed — only visual quality improves.

Pass 3 — Colorization

Given the fixed character map, the LLM wraps every cell with one xterm-256 foreground color. Deciding a single color for an already-placed character is far simpler than planning position + color together.
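
As a rough illustration of the three passes, the sketch below chains three LLM calls and then applies the color step in code. It is a simplification under stated assumptions, not the project's implementation: `ask_llm` is a hypothetical callable, the prompt wording is invented, and here the model returns color indices that the code wraps in escapes, whereas the text above describes the LLM emitting the colored cells itself. The only fixed piece is the standard xterm-256 foreground escape, `ESC[38;5;Nm`.

```python
from typing import Callable, List

RESET = "\x1b[0m"


def colorize(rows: List[str], colors: List[List[int]]) -> str:
    """Wrap every cell in an xterm-256 foreground escape, one color per cell."""
    lines = []
    for row, row_colors in zip(rows, colors):
        if len(row) != len(row_colors):
            raise ValueError("color grid must match the character grid")
        cells = []
        for ch, c in zip(row, row_colors):
            if not 0 <= c <= 255:
                raise ValueError(f"not an xterm-256 index: {c}")
            cells.append(f"\x1b[38;5;{c}m{ch}")
        lines.append("".join(cells) + RESET)
    return "\n".join(lines)


def three_pass(subject: str, ask_llm: Callable[[str], str]) -> str:
    """Run structure, refinement, and colorization as separate requests.

    ask_llm is a hypothetical function: prompt text in, completion text out.
    """
    # Pass 1: plain ASCII layout only
    sketch = ask_llm(f"Draw {subject} using plain ASCII only, no color.")
    # Pass 2: same layout, richer block characters
    blocks = ask_llm(
        "Redraw with Unicode block characters, keeping the layout:\n" + sketch
    )
    # Pass 3: one color index per already-placed cell
    raw = ask_llm(
        "Give one xterm-256 color index per cell, space-separated, "
        "one line per row:\n" + blocks
    )
    colors = [[int(tok) for tok in line.split()] for line in raw.splitlines()]
    return colorize(blocks.splitlines(), colors)
```

Keeping the final wrap in code means a malformed color grid fails loudly with a `ValueError` instead of producing a broken canvas.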

BBS style applies neon-on-void rules with zone constraints (void rows top and bottom, neon subject in the center) and derives a board-identity color theme from the board name. Try it:
tt-ctl artgen ansi --ansi-style bbs --board-name "NEON VOID" --subject "glowing skull"

The palette generator produces named color sets with evocative prose lore — each one a mood, a texture, a smell. Rendered in the app exactly as shown below.

Flicker Veil

A palette of bruised lavender and fevered pearl, like the inside of a broken bulb — warm and trembling, with a metallic shimmer that won't quite fade.

#1A002C
background
#5A3F6B
accent
#C2A5D3
highlight
#F5E6F5
light
Trench Whispers

A palette of deep, damp stone and cold, glimmering water. The air smells of brine and mineral deposits, with a faint bioluminescent glow at the edges.

#0a0e18
background
#2b3a4e
midtone
#3c5e7a
accent
#a3c4d4
light
Cathedral Dust

A palette of aged stone and soft light, like the hush of a cathedral after a storm. Dust motes cling to the walls, and the air holds the particular stillness of a room that has witnessed centuries.

#2A1B2A
background
#7D6B87
accent
#D4C6D7
light
#F5E6F2
shine
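
A palette like the ones above can be thought of as a name, a lore paragraph, and a mapping from role to hex color. The sketch below is one possible representation, not the app's actual data model: the `Palette` class and helper names are hypothetical. It previews each swatch in a terminal using the 24-bit background escape `ESC[48;2;R;G;Bm`, which requires a truecolor-capable terminal.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Palette:
    """Hypothetical container for one generated palette."""
    name: str
    lore: str
    colors: Dict[str, str]  # role -> "#RRGGBB"


def hex_to_rgb(value: str) -> Tuple[int, int, int]:
    """Convert "#RRGGBB" to an (r, g, b) tuple of 0-255 integers."""
    digits = value.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected #RRGGBB, got {value!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def swatch_line(role: str, value: str) -> str:
    """Render a four-cell color block followed by the hex code and role."""
    r, g, b = hex_to_rgb(value)
    return f"\x1b[48;2;{r};{g};{b}m    \x1b[0m {value} {role}"


trench = Palette(
    name="Trench Whispers",
    lore="A palette of deep, damp stone and cold, glimmering water.",
    colors={
        "background": "#0a0e18",
        "midtone": "#2b3a4e",
        "accent": "#3c5e7a",
        "light": "#a3c4d4",
    },
)

print(trench.name)
for role, value in trench.colors.items():
    print(swatch_line(role, value))
```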

Three tiers. Always a great prompt.

A built-in prompt engine with no cloud dependency. Generates cinematic, specific, and evocative prompts for every model type — instantly.

1. Algorithmic

Samples from deep, curated word banks (subjects, settings, lighting, camera moves, mood, style) and assembles them into a structured slug. Six structural templates rotate the syntactic shape of each result; a 12% chance injects an unexpected juxtaposition element the LLM cannot neutralise. Fast, always available. No model required.

Always available
2. Markov

Trains on a seed corpus of tagged prompts and generates novel recombinations at the sentence level using state size 1 — maximising wild collisions over near-verbatim repetition. A 1970s Betamax in an Escher staircase. A Muppet at a Manhattan diner at 2am.

Always available
3. LLM Polish

Qwen3-0.6B on CPU (port 8001) takes the raw slug and makes it flow naturally, without re-selecting or hallucinating new elements. Temperature and token budget scale to model size; small models (<3B parameters) generate three candidates and the most specific one wins.

Qwen3-0.6B on CPU
🎲 Algo word banks → slug
🔀 Markov corpus → recombine
LLM Polish Qwen3 → final prompt
🎬 Submit to any model
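
The first two tiers can be sketched in a few lines of Python. This is an illustration of the general techniques, not the code in `app/generate_prompt.py`: the word banks are a tiny hypothetical sample, two templates stand in for the six described above, and the Markov chain here works on single words with state size 1, which is an assumption about how the recombination is done.

```python
import random
from collections import defaultdict

# Hypothetical, heavily abridged word banks
BANKS = {
    "subject": ["a red fox", "a lone astronaut", "a mechanical owl"],
    "setting": ["a Route 66 diner", "a Tokyo alley", "a lighthouse"],
    "lighting": ["golden hour", "sodium vapor orange", "chiaroscuro"],
    "camera": ["slow dolly in", "locked-off static", "handheld shaky"],
}
JUXTAPOSITIONS = ["a 1970s Betamax", "a Klein bottle", "a Theremin"]

# Two illustrative templates standing in for the six structural templates
TEMPLATES = [
    "{subject} in {setting}, {lighting}, {camera}",
    "{setting}, {lighting}: {subject}, {camera}",
]


def algorithmic_slug(rng, juxtapose_chance=0.12):
    """Tier 1: fill a randomly chosen template from the word banks."""
    picks = {slot: rng.choice(words) for slot, words in BANKS.items()}
    slug = rng.choice(TEMPLATES).format(**picks)
    # Occasionally append an unexpected element (12% by default)
    if rng.random() < juxtapose_chance:
        slug += ", with " + rng.choice(JUXTAPOSITIONS)
    return slug


def train_markov(corpus):
    """Tier 2: map each word to the list of words that followed it."""
    table = defaultdict(list)
    for line in corpus:
        words = line.split()
        for current, following in zip(words, words[1:]):
            table[current].append(following)
    return table


def markov_slug(table, start, rng, max_words=12):
    """Walk the table from `start` until a dead end or the word limit."""
    words = [start]
    while len(words) < max_words and table.get(words[-1]):
        words.append(rng.choice(table[words[-1]]))
    return " ".join(words)


rng = random.Random(7)
print(algorithmic_slug(rng))
corpus = [
    "a Muppet at a Manhattan diner at 2am",
    "a Betamax in an Escher staircase at dawn",
]
print(markov_slug(train_markov(corpus), "a", rng))
```

With state size 1 the next word depends only on the current one, so shared words such as "a" and "at" become junctions where two source prompts can splice together.
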
Input Steinbeck · migrant
video: Steinbeck, migrant family, Route 66
A jalopy overloaded with furniture and children crests a dusty hill on Route 66, flat gold light stretching ahead, locked-off wide shot, heartbreaking and determined
Input Philip K. Dick · suburban
video: Philip K. Dick suburb, TV on, nobody home
A 1960s California living room at midnight — the television plays a test pattern, every lamp lit, nobody in any of the chairs, handheld and uneasy, the specific dread of a familiar room at an unfamiliar hour
Input Jeff Noon · Manchester rave
video: Jeff Noon, rave at dawn, pollen
A young woman covered in yellow pollen walks out of a Manchester warehouse at dawn, a canal behind her reflecting orange sodium vapor lights, handheld, exhausted and luminous
Input Retro · Moog synthesizer
video: Moog on kitchen table, reel-to-reel, late night
A Moog Minimoog sits on a kitchen table at 2am, patch cables trailing off the edge, a reel-to-reel machine spinning in the background, single 60-watt bulb overhead, static wide shot, warm and obsessive
Input Stephen King · dread
video: Stephen King, hotel corridor, two girls
Two identical girls stand at the end of a long Overlook Hotel corridor, floral wallpaper, chandeliers lit, the far end dark, a slow dolly push toward them, eerie and still
Input Cartoon · Looney Tunes
video: coyote, cliff, cartoon logic, Looney Tunes
A coyote runs off a cliff edge and hangs in midair, looks down, flat painted desert behind, Saturday-morning cartoon palette and rubber-hose physics, locked-off wide, one frozen beat before the fall
Input Nature · redwood forest
skyreels: fog through redwood forest, morning light
FPS-24, morning fog rolls through a redwood forest in slow rolling waves, shafts of gold cutting between trunks a hundred feet tall, static locked-off, cathedral quiet
Input Cosmic · aurora
skyreels: aurora borealis, frozen lake
FPS-24, green and violet aurora ribbons ripple across a subarctic sky above a frozen lake, the ice below reflecting the light in long still pools, static wide, silent and immense
Input Urban · rain, Tokyo
skyreels: Tokyo alley, rain, figure walking away
FPS-24, a figure in a yellow raincoat walks away from camera down a wet Tokyo alley, neon signs stretching reflections across the cobblestones behind them, slow dolly forward, atmospheric and solitary
Input Sci-fi · ring station
skyreels: sci-fi, ring station, gas giant
FPS-24, a colossal ring station rotates slowly above a gas giant, the planet's amber cloud bands reflected in the hull, smooth orbital camera move, epic and weightless
Input Nature · wolves in snow
skyreels: wolves, winter, pine forest
FPS-24, a pack of grey wolves runs through deep snow at the edge of a pine forest, breath streaming behind them in the cold air, low-angle tracking shot, blue-hour light, wild and alive
Input Ocean · dawn
skyreels: ocean waves, sea cliff, golden morning
FPS-24, ocean waves break against a sea cliff in golden morning light, spray catching the sun in slow arcs, static wide locked-off, the horizon flat and endless beyond
Input Stephen King · hotel
image: Overlook Hotel corridor, two girls, dread
Two identical girls standing at the end of a long hotel corridor, floral wallpaper, single overhead bulb swinging, the far end dark and indistinct, photorealistic, 35mm film grain, ultra-detailed, shallow depth of field
Input Sci-fi · Jupiter surface
image: surface of Jupiter, cloudscape, epic scale
The turbulent cloudscape of Jupiter's upper atmosphere, swirling amber and cream storm bands stretching to the horizon, a vast vortex eye glowing from within, digital matte painting, ultra-detailed, 8K, masterpiece
Input Retro · 1984 living room
image: 1984 living room, Saturday morning cartoons, cereal
A living room in 1984 — a bowl of cereal going soggy on a carpet, a CRT television showing rubber-hose cartoons in flat bright colors, warm morning light through venetian blinds casting long stripes across the room, photorealistic, 35mm film grain, shallow depth of field, masterpiece
Input Kafka · impossible office
image: Kafka, office corridor, impossible length, dread
An office corridor that extends far beyond any building could contain, fluorescent lights in a grid above, identical doors on either side, a figure at the far end that never gets closer, ultra-detailed, 8K, sharp focus, photorealistic
Input Psychedelia · Peter Max
image: Peter Max psychedelia, Yellow Submarine style
A swirling psychedelic landscape in Peter Max colors — flat magenta, electric blue, lime green, faces fractured into stacked layers — the Beatles as cartoon silhouettes in a sea of flowers, Yellow Submarine flat color illustration style, masterpiece, ultra-detailed
Input Escher · geometry
image: Escher staircase, impossible geometry, figures
An Escher staircase looping forever, figures walking both up and down simultaneously, M.C. Escher lithograph style, tessellating shadows, impossible geometry rendered in precise ink line, ultra-detailed, masterpiece, sharp focus
Input Character · fisherman
animate: old fisherman, turns toward horizon
An elderly fisherman slowly turns his face toward the horizon, weathered expression softening into quiet recognition, harbor at dawn, soft diffuse morning light, nostalgic and still
Input Character · dancer
animate: dancer in silk, raises arms, serene
A woman in flowing white silk raises her arms slowly above her head with eyes closed, weight shifting to one foot, temple courtyard at dusk, warm golden backlight, serene and weightless
Input Character · child, rain
animate: child with umbrella, looks up at sky
A small child in a yellow raincoat tilts their face upward and opens their mouth to catch rain, puddles on the pavement reflecting grey clouds, diffuse overcast light, tender and slightly broken
Input Character · monk
animate: monk in orange robes, bows slowly
A monk in deep orange robes brings his palms together and bows from the waist with deliberate slowness, moss-covered temple courtyard, dappled morning light through leaves above, reverent and unhurried

AnimateDiff runs a TTNN UNet with cross-frame temporal attention directly on Blackhole — no Docker container, no server warmup. The prompt engine (algo → Markov → Qwen3 polish) drives every generation; SD 1.4 reads the visual scene. Literary register shapes the mood; the architecture handles temporal coherence across frames.

Input Dostoevsky · counting house
animatediff: ledger room, gaslight, complicit smile
a moneylender in spectacles tapping a ledger with one finger, smiling with his mouth only, descending stairs in a Saint Petersburg counting house at night, gaslight throwing long shadows, tallow candles, winter seeping through the glass, deep chiaroscuro, morally complicated
Input Chekhov · provincial idyll
animatediff: old man, pigeons, Tuesday, nothing to do
an old man feeding pigeons in a provincial square on a Tuesday with nothing else to do, moves through a crowded space without touching anyone, a Dutch merchant's warehouse seen through the window, cool blue of pre-dawn, half-awake in a good way
Input Chekhov · evening garden
animatediff: man who missed his whole life, garden, dusk
a man who has missed his whole life sitting in a garden in the evening light, stirs a cup of coffee until it is cold, a summer internment camp in the California desert seen in memory, amber from a kerosene lamp, very still and very full
Input Kafka · impossible office
animatediff: door opens with no one touching it, letter falling from sky
a door swings slowly open with no one touching it into a corridor that extends far beyond any building could contain, a giant letter falling from the sky into a quiet neighborhood, the pale blue of a phone screen in total darkness, feverishly alive for no reason

Real outputs — nine starred GIFs from a live QB2 session, each generated in under two minutes.

purple phosphor glow across distant mountains at 2am
purple phosphor glow · 2am
an aleph in a cellar — a point containing all other points
aleph in a cellar · Borges
a cherry orchard, a Theremin, a hand approaching but not touching
cherry orchard · Theremin
a man writes the same sentence in his sleep every night
same sentence every night
a man growing feathers in a Salford bedsit with no explanation
growing feathers · Salford
a man who lived to 132, clocks stop at different times
man who lived to 132
a petitioner who has filled out the same form for six years
petitioner · six years · Tamil Nadu
kipple spreading across a kitchen table — Escher staircase
kipple · Escher staircase · Dick
a garden of forking paths where every choice happens simultaneously
garden of forking paths

The Generative Art tab sends structured generation prompts to an LLM and renders the output — SVG art, ANSI pixel grids, or plain text — directly in the gallery. Six starred examples from a real session are shown below, one for each generator type.

The LLM is chosen automatically: dedicated Generative Art server first (Qwen3-8B, Llama-3.1-8B, etc.), then the always-on Qwen3-0.6B prompt server as a fallback. Verse and palette work well at either size. SVG generators give richer results with a larger model but function at 0.6B too.
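
The selection order described above amounts to "try the larger server, fall back to the small one". A minimal sketch of that idea follows, assuming the servers are reachable over TCP on the ports used by the commands on this page (8002 for a larger artgen LLM, 8001 for the prompt server). The function names are hypothetical and the app's real detection logic may differ, for example by querying a health endpoint instead of probing the port.

```python
import socket
from typing import Optional, Sequence, Tuple

# Preference order: larger artgen LLM first, then the always-on prompt server
CANDIDATES = [("artgen", 8002), ("prompt-server", 8001)]


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_llm(
    host: str = "127.0.0.1",
    candidates: Sequence[Tuple[str, int]] = CANDIDATES,
) -> Optional[Tuple[str, int]]:
    """Return the first (name, port) that accepts connections, else None."""
    for name, port in candidates:
        if is_listening(host, port):
            return name, port
    return None


print(pick_llm())
```

Because the probe runs on every call, starting a larger LLM later is picked up on the next generation without any configuration change, which matches the behavior described above.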

skyline noir · midnight teal
city skyline, noir silhouette, near-black teal sky (#0A2A25 → #001810), angular towers, no color in the buildings — just shape and negative space
Midnight noir skyline — angular towers against a near-black teal sky
landscape violet · deep space
layered landscape, deep violet cosmic mood, gradient sky (#120020 → #3B1060 → #7B2D8B), receding terrain silhouettes, purple atmosphere
Deep violet landscape — cosmic purple hills under an amethyst sky
geometric teal grid · deep field
abstract geometric, diagonal line grid across a deep blue-gray field (#0F2A35 background), teal strokes (#1A3C47), minimal — structure without colour noise
Geometric teal line grid — diagonal strokes on deep blue-gray field
constellation invented star chart · teal connectors
invented star chart, 800×500, 60–80 background stars, 20 named principal stars, teal connecting lines, dark navy field
Invented star chart SVG
freeform self-referential poem · plain text
write a short poem that references itself as an artifact — knows it is being generated, folds that awareness into its form
Generated verse

I am a poem, born of code and art,
A fleeting thought, a digital heart.
My words are woven, a tapestry so fine,
Generated lines, a mechanical design.

I know I'm artificial, a construct of mind,
A simulation of poetry, left behind.
My rhymes and rhythms, a calculated beat,
A manufactured muse, a synthetic treat.

I fold upon myself, a self-aware refrain,
A poem about poems, a meta-pain.
I reference my own, digital birth,
A creation of algorithms, a poetic mirth.

In this recursive loop, I find my form,
A poem that knows itself, a self-aware norm.
I am a poem, generated with ease,
A digital artifact, a poetic tease.

ansi Zork · xterm-256 pixel grid
pixel art of the Zork opening scene, 60 columns × 30 rows, ANSI 256-color background blocks, depth-layered composition
The model rendered the xterm-256 color cube rather than a Zork scene — an honest accident. The full color space exposed by a single misread prompt.
Built-in word banks — drawn from literary tradition
Literary Registers
Steinbeck · PKD · Brautigan · King · Jeff Noon · Tom Robbins · Octavia Butler · Kafka
Retro & Electronics
Moog · TR-808 · VHS · Betamax · Speak & Spell · Theremin · dot matrix · Walkman
Cartoon Registers
Looney Tunes · Sesame Street · Muppets · Harryhausen · Rankin/Bass · Yellow Submarine
Camera Moves
slow dolly in · overhead crane · locked-off static · handheld shaky · rack focus · orbit
Lighting
golden hour · sodium vapor orange · god rays · chiaroscuro · bioluminescent · TV light
Settings
Route 66 diner · Overlook Hotel · Manchester rave · Dust Bowl · Tokyo alley · lighthouse
Geometric / Impossible
Escher staircase · Klein bottle · tessellation · impossible corridor · grid to horizon
Subjects
red fox · lone astronaut · samurai · mechanical owl · ravegoer · crop picker · replicant
# Generate a video prompt (algo + LLM polish)
python3 app/generate_prompt.py

# Markov mode, SkyReels type, no LLM
python3 app/generate_prompt.py --type skyreels --mode markov --no-enhance

# Five prompts, raw text output
python3 app/generate_prompt.py --count 5 --raw

# Artgen works as soon as the prompt server is running (port 8001)
./tt-ctl start prompt-server

# Upgrade to a larger artgen LLM for better SVG quality (port 8002)
./tt-ctl start artgen-qwen3-8b

Install tt-local-generator

Ubuntu 24.04 with Tenstorrent hardware? Grab the .deb. Mac or any other Linux machine? Clone and run directly.

1. Download the latest release
# Using the GitHub CLI (gh)
gh release download --pattern "tt-local-generator_*.deb" --repo tenstorrent/tt-local-generator

Or download directly from the Releases page. Install gh with sudo apt install gh if needed.

2. Install the package
sudo apt install ./tt-local-generator_*.deb

Installs the app and launchers. Docker may also be installed via recommended packages; otherwise, install and start Docker manually before first use.

3. Launch
tt-local-gen
# or search "TT Generator" in your app launcher

Generated videos and images are saved to ~/.local/share/tt-video-gen/ and automatically linked into ~/Videos/tt-local-generator/ for easy browsing.

Model weights are not bundled. Download models separately after install (no apply_patches.sh needed — the package handles that automatically):

tt-local-gen-download-model --repo Wan-AI/Wan2.2-T2V-A14B-Diffusers  (~118 GB)
tt-local-gen-download-model --repo Qwen/Qwen3-0.6B  (~1.2 GB — prompt server)

Or use the model packages: sudo apt install tt-model-wan2-t2v tt-model-qwen3

Or clone and run from source:

1. Clone and install system dependencies
git clone https://github.com/tenstorrent/tt-local-generator.git ~/code/tt-local-generator
cd ~/code/tt-local-generator
# Run only the setup script that matches your OS
./bin/setup_ubuntu.sh  # Ubuntu: installs GTK4, GStreamer, Docker
./bin/setup_macos.sh   # macOS: installs via Homebrew (remote-client mode)

All scripts expect the clone to live at ~/code/tt-local-generator. macOS connects as a remote client to a Tenstorrent machine over the network.

2. Set up the vendored inference server
./bin/setup_vendor.sh  # shallow-clone tt-inference-server at the pinned SHA

This creates vendor/tt-inference-server/ — the server tree that all start_*.sh scripts use. Takes ~30 s on a fast connection.

3. Apply patches to the vendored server
./bin/apply_patches.sh  # HF_HOME mount, SkyReels model specs, device config

Run this once after each setup_vendor.sh run. The start_*.sh scripts check for it and print an error if it was skipped.

4. Launch the app
./tt-gen

Use Servers ▸ Start in the app to start an inference backend, or launch one directly: ./bin/start_wan_qb2.sh. Remote clients: ./tt-gen --server http://your-tt-machine:8000