tomoul run
One-shot generation. No server, no daemon. Pipe-friendly. Good for scripts, smoke tests, and shell pipelines.
Usage
$ tomoul run phi-4 -i "Write a haiku about a small bird called tomoul."
A tomoul takes flight—
Cloud edges trace its shortcut
Sky-stitched in silence.
Streaming output
Output streams to stdout token by token. Pass --no-stream to buffer the full
output instead, for example when piping to a tool that expects complete input.
tomoul run phi-4 -i "Summarize this PR" --no-stream | jq -Rs .
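
When a script needs the whole reply in a variable rather than in a pipe, a small wrapper can help. The following is a minimal sketch: the function name `capture_reply` is hypothetical, and it assumes `tomoul run` exits non-zero on failure, which this page does not state.

```shell
# Hypothetical helper: run one prompt and print the complete reply.
# Assumption: `tomoul run` returns a non-zero exit status when generation fails.
capture_reply() {
  # Token-by-token streaming adds nothing when output is captured,
  # so request buffered output with --no-stream.
  reply=$(tomoul run phi-4 -i "$1" --no-stream) || {
    echo "tomoul run failed" >&2
    return 1
  }
  printf '%s\n' "$reply"
}
```

Calling `summary=$(capture_reply "Summarize this PR")` then leaves the full text in `summary` for later steps of the script.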
Reading from stdin
cat README.md | tomoul run phi-4 -i "Summarize in 3 bullets:"
git diff | tomoul run phi-4 -i "Suggest a commit message:"
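
The stdin pattern above can be wrapped in a reusable function that skips the model call when there is nothing to send. This is only a sketch: `suggest_commit_msg` is a hypothetical name, and the empty-diff guard is a design choice of this example, not documented behavior of tomoul.

```shell
# Hypothetical helper: ask the model for a commit message based on `git diff`.
suggest_commit_msg() {
  diff=$(git diff) || return 1
  # Avoid a pointless generation when the working tree has no changes.
  if [ -z "$diff" ]; then
    echo "no changes to describe" >&2
    return 1
  fi
  # The diff goes in on stdin; the instruction goes in via -i, as in the examples above.
  printf '%s\n' "$diff" | tomoul run phi-4 -i "Suggest a commit message:" --no-stream
}
```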
Flags
| Flag | Default | Notes |
|---|---|---|
| -i, --input | — | Prompt string (positional after the model also works). |
| --max-tokens | 512 | Generation cap. |
| --temperature | 0.7 | Sampling temperature. |
| --system | — | Optional system message. |
| --no-stream | off | Buffer full output instead of streaming. |
| --cloud | off | Run against api.tomoul.ai (requires auth). |
| --json | off | Emit OpenAI-shape JSON instead of plain text. |
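
Several of the flags in the table can be combined in one call. The sketch below wraps such a call in a hypothetical `summarize` function; the system message text and the specific values (128 tokens, temperature 0.2) are illustrative assumptions, and the accepted range for --temperature is not documented on this page.

```shell
# Hypothetical wrapper: short, low-variance summaries with a fixed system message.
# $1 is the prompt; the text to summarize is expected on stdin.
summarize() {
  tomoul run phi-4 \
    -i "$1" \
    --system "You are a terse technical summarizer." \
    --max-tokens 128 \
    --temperature 0.2 \
    --no-stream
}
```

A call such as `cat README.md | summarize "Summarize in 3 bullets:"` would then reuse the same settings every time.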
vs serve
run exits when generation finishes. Use for one-shots and shell pipelines.
serve stays up and serves the OpenAI-compat API. Use when an app or IDE is the consumer.
Internally they share the same engine — the only difference is the I/O surface.
Last updated 13 May 2026