assgen - User Experience Design
This document describes how assgen is meant to feel from the user's perspective across the three modes of operation and documents the design decisions behind the client/server architecture.
The Three Modes
Mode 1 - Solo (no server configured)
The most common starting point. The user has just installed assgen on their machine and wants to try it out without any configuration.
$ assgen gen audio sfx generate "laser gun firing"
⚡ No server detected - starting local assgen-server on http://127.0.0.1:8432
(It will stay running between commands. Stop with: assgen server stop)
✓ Server ready
Job enqueued id=a1b2c3d4 type=audio.sfx.generate
Track with: assgen jobs status a1b2c3d4
Or wait with: assgen jobs wait a1b2c3d4
Or with --wait:
$ assgen gen audio sfx generate "laser gun firing" --wait
⚡ No server detected - starting local assgen-server on http://127.0.0.1:8432
(It will stay running between commands. Stop with: assgen server stop)
✓ Server ready
⠸ Downloading facebook/audiogen-medium (4/12 files)… 8% 0:00:42
⠼ Model facebook/audiogen-medium already cached ✓ 15% 0:00:01
⠦ Inference running… 60% 0:01:12
⠧ Post-processing audio… 90% 0:01:31
✓ Job a1b2c3d4 completed in 1m 33s
Output file: ~/assgen-outputs/a1b2c3d4/laser_gun_firing.wav
Duration: 2.0s | Sample rate: 44100 Hz
To play: afplay ~/assgen-outputs/a1b2c3d4/laser_gun_firing.wav
Or open: assgen jobs open a1b2c3d4
Key behaviours in solo mode:
- The server starts on the first command and stays running between commands (tracked via a PID file). The second assgen gen … in the same terminal session is instant, with no restart.
- Models are downloaded to a shared OS-level location (~/.local/share/assgen/models on Linux, %LOCALAPPDATA%\assgen\assgen\models on Windows), so re-runs skip the download entirely.
- Output files are written to ~/assgen-outputs/<job-id>/ by default, or to the location given with --output.
- The server is not killed when the command exits. It stays alive so the next command is faster. Use assgen server stop to shut it down explicitly.
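The per-platform model cache location described above could be resolved roughly as follows. This is a minimal sketch, not assgen's actual code: the function name model_cache_dir is hypothetical, and honouring XDG_DATA_HOME, treating every non-Windows platform like Linux, and the fallback when LOCALAPPDATA is unset are all assumptions, since the document only names the two default paths.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Mapping


def model_cache_dir(platform: str, env: Mapping[str, str], home: str) -> str:
    """Return the assumed model cache directory for a sys.platform-style name."""
    if platform.startswith("win"):
        # Documented Windows location: %LOCALAPPDATA%\assgen\assgen\models
        # The AppData\Local fallback is an assumption for when the variable is unset.
        base = env.get("LOCALAPPDATA") or str(PureWindowsPath(home) / "AppData" / "Local")
        return str(PureWindowsPath(base) / "assgen" / "assgen" / "models")
    # Documented Linux location: ~/.local/share/assgen/models
    # Honouring XDG_DATA_HOME is an assumption; other platforms are not covered
    # by the document and simply fall through to this branch here.
    base = env.get("XDG_DATA_HOME") or str(PurePosixPath(home) / ".local" / "share")
    return str(PurePosixPath(base) / "assgen" / "models")
```

Passing the platform, environment and home directory in as arguments keeps the function easy to exercise on any machine; a real client would presumably feed it sys.platform, os.environ and the user's home directory.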
Mode 2 - Local server (explicit control)
For users who want the server running as a background service on their machine rather than relying on auto-start. Useful in development, or when you want to see server logs.
# Terminal A - start server with visible logs
$ assgen-server start
[2024-03-07 10:00:00] INFO Server listening on http://127.0.0.1:8432
[2024-03-07 10:00:00] INFO Worker thread started (device=cuda)
[2024-03-07 10:05:23] INFO Job a1b2c3d4 QUEUED type=visual.model.create
[2024-03-07 10:05:23] INFO Downloading TencentARC/InstantMesh...
[2024-03-07 10:06:18] INFO Job a1b2c3d4 RUNNING progress=0.22
[2024-03-07 10:07:45] INFO Job a1b2c3d4 COMPLETED
# Terminal B - use it normally
$ assgen gen visual model create --prompt "low-poly sword" --wait
⠸ Downloading TencentARC/InstantMesh (8/24 files)… 12% 0:00:55
...
Key difference from solo mode: The client detects the running server (via PID file / health check) and uses it directly. No new server is spawned.
Mode 3 - Remote server (laptop → desktop)
The primary production use case: your beefy desktop runs the server and your laptop acts as the client. The desktop has a 4070 GPU; the laptop has no GPU to run inference on.
One-time setup on the desktop (Windows 10):
# Install
pip install assgen
# Configure to accept connections from the network (not just localhost)
assgen-server config set host 0.0.0.0
assgen-server config set port 8432
# Start the server (keep this terminal open, or use --daemon)
assgen-server start
# Or as a background Windows service:
assgen-server start --daemon
One-time setup on the laptop:
# Point the client at the desktop
assgen client config set-server http://MY-DESKTOP-IP:8432
# Verify connection
assgen server status
# ✓ Connected to http://MY-DESKTOP-IP:8432 (assgen 0.0.1, device=cuda, model=...)
Then just use it normally:
$ assgen gen visual model create --prompt "low-poly sword" --wait
⠸ Model TencentARC/InstantMesh already cached ✓ 15% 0:00:01
⠼ Generating mesh from prompt… 40% 0:00:45
⠧ Exporting to GLB… 88% 0:01:12
✓ Job a1b2c3d4 completed in 1m 14s (on http://MY-DESKTOP-IP:8432)
Output file: ./sword_a1b2c3d4.glb (downloaded from server)
Format: GLB | Vertices: 12,340 | File size: 2.1 MB
What happens under the hood (remote mode):
- Client sends POST /jobs to the desktop server
- Desktop server runs inference on the 4070 (models cached on desktop disk)
- Desktop server writes output to its own outputs/ directory
- Client polls GET /jobs/{id} every 2 seconds, showing progress
- When complete, client calls GET /jobs/{id}/files to discover output filenames
- Client downloads each file from GET /jobs/{id}/files/{filename} and saves to local disk
The laptop user gets the generated file locally even though all computation happened on the desktop.
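The remote-mode steps above can be sketched as a small client loop. This is an illustrative sketch under stated assumptions, not the real client: the function run_remote_job and the injected http transport are hypothetical, the JSON field names (id, status), the FAILED state and the shape of the file listing (a plain list of filenames) are assumptions, and only the endpoint paths, the QUEUED/RUNNING/COMPLETED states and the 2-second poll interval come from this document.

```python
import time
from typing import Any, Callable, List, Optional

# Hypothetical transport signature: (method, path, json_body) -> decoded response.
# JSON endpoints are assumed to return dicts/lists; the file endpoint returns bytes.
Transport = Callable[[str, str, Optional[dict]], Any]


def run_remote_job(
    http: Transport,
    job_type: str,
    params: dict,
    save: Callable[[str, bytes], None],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit a job, poll until it completes, then fetch every output file."""
    # Step 1: enqueue the job. The "id" field name is an assumption.
    job_id = http("POST", "/jobs", {"type": job_type, "params": params})["id"]

    # Step 2: poll the job status until it reaches a terminal state.
    deadline = time.monotonic() + timeout
    while True:
        status = http("GET", f"/jobs/{job_id}", None)["status"]
        if status == "COMPLETED":
            break
        if status == "FAILED":  # assumed failure state, not named in the document
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        sleep(poll_interval)

    # Step 3: discover the output filenames (assumed to be a JSON list of names).
    names = list(http("GET", f"/jobs/{job_id}/files", None))

    # Step 4: download each file and hand the bytes to the caller.
    for name in names:
        save(name, http("GET", f"/jobs/{job_id}/files/{name}", None))
    return names
```

Injecting the transport and the sleep function keeps the loop independent of any particular HTTP library and makes it testable without a network.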
Server Lifecycle
| Situation | What happens |
|---|---|
| No server_url configured, no PID file | Auto-start server, write PID file, server stays up |
| No server_url configured, PID file exists, process alive, health OK | Reuse existing server (instant) |
| No server_url configured, PID file exists, process dead | Clean up stale PID, start fresh |
| server_url configured (remote or manual) | Connect directly, no auto-start logic |
| assgen server stop | Send SIGTERM to PID, remove PID file |
| Machine reboots | PID file becomes stale, auto-start cleans it up on next command |
Why does the server persist between commands? The biggest cost for most AI workflows is loading model weights into GPU VRAM, which can take 30-120 seconds for large models. Keeping the server alive means the second assgen gen ... call skips that cost entirely.
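The connection-time rows of the lifecycle table can be read as a single decision function. The sketch below is one possible encoding, with hypothetical names (Action, decide); the inputs are plain booleans so the probing of the PID file, the process and the health endpoint stays out of scope. Treating a live but unhealthy process the same as a dead one is an assumption, because the table does not cover that case.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    CONNECT_CONFIGURED = "connect to server_url, no auto-start logic"
    START_NEW = "auto-start server and write PID file"
    REUSE_LOCAL = "reuse existing local server"
    CLEAN_AND_START = "remove stale PID file, then start fresh"


def decide(
    server_url: Optional[str],
    pid_file_exists: bool,
    process_alive: bool,
    health_ok: bool,
) -> Action:
    """Map the situations in the lifecycle table to a client action."""
    if server_url is not None:
        # A configured server (remote or manual) bypasses auto-start entirely.
        return Action.CONNECT_CONFIGURED
    if not pid_file_exists:
        return Action.START_NEW
    if process_alive and health_ok:
        return Action.REUSE_LOCAL
    # Stale PID file, for example after a reboot. Handling "alive but
    # unhealthy" the same way is an assumption, not stated in the table.
    return Action.CLEAN_AND_START
```

Keeping the decision pure makes each row of the table a one-line check, independent of how the PID file or health check is actually implemented.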
File Flow
Server side                           Client side
──────────────────────────────        ──────────────────────────────
Job runs inference
  → writes output file(s) to          (polling GET /jobs/{id})
    ~/.local/share/assgen/
      outputs/{job_id}/
        output.glb
        output_preview.png
Job status → COMPLETED
  output: {files: ["output.glb"]}
                                      → client sees COMPLETED
                                      client calls GET /jobs/{id}/files
                                      client calls GET /jobs/{id}/files/output.glb
                                      → saved to ./output.glb (or --output path)
                                      client prints "Output saved to ./output.glb"
For local server use, the file path on the server IS the file path on the client (same machine). The client skips the download and just prints the local path.
For remote server use, the client always downloads the file.
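One way a client might decide between the two cases is to check whether the server address is a loopback address. This is a sketch under assumptions: the helper names is_local_server and needs_download are hypothetical, and the document does not say how assgen actually distinguishes local from remote. Note that a server on the same machine addressed by its LAN IP or hostname would be treated as remote here, which only costs an unnecessary download.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def is_local_server(server_url: Optional[str]) -> bool:
    """Return True when the server runs on the same machine as the client."""
    if server_url is None:
        # No server_url configured means auto-start mode, which is always local.
        return True
    host = urlsplit(server_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a DNS or machine name); assume it is remote.
        return False


def needs_download(server_url: Optional[str]) -> bool:
    """Remote servers require a download; local ones share the filesystem."""
    return not is_local_server(server_url)
```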
Configuration Reference
Client config (~/.config/assgen/client.yaml)
server_url: null # null = auto-start; or "http://MY-DESKTOP:8432"
default_wait: false # true = always block and show progress bar
default_timeout: 300 # seconds before --wait gives up
poll_interval: 2.0 # seconds between status polls
output_dir: null # null = current dir; or "/home/user/assgen-outputs"
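The defaults above could be modelled as a small dataclass that is filled from the parsed YAML mapping. This is a sketch, not assgen's actual loader: the class name ClientConfig and its from_mapping method are hypothetical, parsing the YAML file itself is left out, and rejecting unknown keys is an assumed policy.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ClientConfig:
    server_url: Optional[str] = None   # None = auto-start a local server
    default_wait: bool = False         # True = always block and show progress
    default_timeout: int = 300         # seconds before --wait gives up
    poll_interval: float = 2.0         # seconds between status polls
    output_dir: Optional[str] = None   # None = current directory

    @classmethod
    def from_mapping(cls, raw: Optional[dict]) -> "ClientConfig":
        """Overlay values from an already-parsed config file onto the defaults."""
        raw = raw or {}
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            # Failing loudly on typos is an assumption about desired behaviour.
            raise ValueError(f"unknown client config keys: {sorted(unknown)}")
        return cls(**raw)
```

A missing or empty file then yields exactly the documented defaults, and a file that only sets server_url leaves every other value untouched.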
Server config (~/.config/assgen/server.yaml)
host: "127.0.0.1" # change to "0.0.0.0" for network access
port: 8432
device: "auto" # "auto" | "cuda" | "cpu" | "mps"
log_level: "info"
model_load_timeout: 120 # seconds to wait for model to load
job_retention_days: 30 # days to keep completed jobs in DB
# Security: leave allow_list empty to allow all models
allow_list: []
skip_model_validation: false
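The allow_list comment above implies a simple rule: an empty list permits every model, a non-empty list permits only its members. A minimal sketch of that check follows; the function name is_model_allowed is hypothetical, exact string matching is an assumption, and skip_model_validation is deliberately not modelled because the document does not define its behaviour.

```python
from typing import Sequence


def is_model_allowed(model_id: str, allow_list: Sequence[str]) -> bool:
    """Apply the documented rule: an empty allow_list allows all models."""
    if not allow_list:
        return True
    # Exact, case-sensitive matching is an assumption made for this sketch.
    return model_id in allow_list
```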
Command Summary
assgen gen <domain> <subdomain> <action> [OPTIONS]
    Submit a job to the server (auto-starting it if needed)
    --wait / --no-wait    Block and show progress bar until done
    --output PATH         Where to save output file(s)
    --model-id TEXT       Override the catalog model (validated by server)
assgen jobs list [-s STATUS] [--limit N]
    Show recent jobs
assgen jobs status <id>
    Show full details for a single job
assgen jobs wait <id> [--timeout N]
    Block until a job completes (attach to a running job)
assgen jobs download <id> [--output DIR]
    Download output files for a completed job
assgen server status
    Check if a server is reachable and show its config
assgen server start
    Start a local server in the foreground (for development)
assgen server stop
    Stop the local auto-started server
assgen client config set-server <URL>
    Point this client at a specific server
assgen client config unset-server
    Revert to auto-start mode
assgen tasks [--domain DOMAIN]
    Browse all supported game-dev tasks and their assigned models
assgen config list
    Show all job-type → model mappings
assgen config set <job-type> <model-id>
    Override the model for a job type on this server
Design Principles
- Zero configuration required. pip install assgen && assgen gen audio sfx generate "explosion" works on first run. The server starts itself, downloads the model, and runs inference.
- The client is always talking to a server, even if it's a server the client started itself. This means all commands are thin HTTP wrappers; there is no inference code in the client package at all.
- Local and remote are identical from the CLI perspective. The only difference is whether server_url is configured. Every feature (progress bars, output downloads, job history) works identically.
- The server is stateless enough to restart cleanly. Jobs are persisted in SQLite. If the server crashes, jobs in RUNNING state are re-queued on next start.
- Model weights are cached, not re-downloaded. The first run downloads a model; every subsequent run uses the local cache. assgen models list shows what's cached.
- Output files belong to the user, not the server. Jobs complete with file paths; clients download them. Files are never ephemeral.
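The crash-recovery principle above (RUNNING jobs are re-queued on the next start) could be implemented as a single statement at server startup. The sketch below uses Python's standard sqlite3 module; the jobs table layout in JOBS_SCHEMA and the function name requeue_interrupted_jobs are assumptions made for illustration, since the document does not describe the real schema.

```python
import sqlite3

# Hypothetical minimal schema; the real assgen database layout is not documented here.
JOBS_SCHEMA = "CREATE TABLE IF NOT EXISTS jobs (id TEXT PRIMARY KEY, status TEXT NOT NULL)"


def requeue_interrupted_jobs(conn: sqlite3.Connection) -> int:
    """Move jobs left in RUNNING by a crash back to QUEUED; return how many."""
    # Using the connection as a context manager commits on success
    # and rolls back if the statement raises.
    with conn:
        cur = conn.execute(
            "UPDATE jobs SET status = 'QUEUED' WHERE status = 'RUNNING'"
        )
    return cur.rowcount
```

Running this once before the worker thread starts means a job interrupted mid-inference is picked up again, while completed and already queued jobs are left untouched.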