Your LLM can't watch a screen recording. Screex turns one into text it can read.

License: MIT

Screex

Screen-recording understanding for agents. Screex turns a screencast into a queryable index of UI states — each with the on-screen text (OCR), what text changed since the previous state, a thumbnail, and a full-resolution keyframe — so an LLM/agent can produce an action transcript, answer questions, or generate a how-to guide / bug report from a recording.

Training-free & model-agnostic — no fine-tuned UI model; any LLM can read the index.
pip install-only — OCR via rapidocr-onnxruntime, no system binaries.
Server-friendly runtime — uses headless OpenCV, so CI and Linux servers do not need GUI libraries just to build indexes.
Cheap by design — the on-screen text is plain text (nearly free to read); the agent escalates to full-res keyframes only when the text is insufficient.
~70% lower token cost — in our GUI video-QA benchmarks, handing an agent the Screex index instead of raw video frames cut the input tokens sent to the model by around 70%, with little loss in answer accuracy.
Fast OCR — tuned onnxruntime threading makes text extraction ~3.85× faster than the default.
Narration-aware — with pip install 'screex[audio]', the index includes a timestamped transcript of the spoken audio, interleaved into the step transcript.
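
The text-first, escalate-on-demand idea in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not Screex's API: the `State` record, its field names, and the "too little OCR text" rule (`min_chars`) are hypothetical stand-ins for whatever sufficiency check an agent actually applies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class State:
    """Hypothetical per-state record; NOT the real index.json schema."""
    text: str      # OCR text captured for this UI state
    keyframe: str  # path to the full-resolution keyframe


@dataclass
class Evidence:
    text: str
    keyframe: Optional[str]  # None means the text alone was judged enough


def gather_evidence(states: List[State], min_chars: int = 10) -> List[Evidence]:
    """Send plain text by default; attach the keyframe only for sparse states.

    Assumed rule: a state whose OCR text is shorter than `min_chars`
    probably needs its image to be understood.
    """
    evidence = []
    for state in states:
        sufficient = len(state.text.strip()) >= min_chars
        evidence.append(Evidence(state.text, None if sufficient else state.keyframe))
    return evidence


states = [
    State("Settings > API Keys · New key: sk-live-9f2a · [ Save ]", "frames/00002.png"),
    State("", "frames/00003.png"),  # e.g. an image-only screen with no OCR text
]
for item in gather_evidence(states):
    print(item.keyframe)
```

With these two sample states, only the second one carries a keyframe path, so the model reads cheap text for the first state and receives an image only where the text gives it nothing to work with.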

Screex sends about 70% fewer input tokens than feeding raw video frames to the model

Good for: bug repros → reproduction reports · demos & Loom videos → how-to docs · tutorials → step lists · "what did the user do / what URL did they open?" Q&A over a recording.

Best on screen recordings. Screex is tuned for screencasts — mostly-static UI punctuated by discrete changes (clicks, typing, navigation). On that input it segments into a handful of meaningful states quickly. For general / continuous-motion video (camera footage, gameplay, talking-head clips) the change detector fires on nearly every frame, so prefer --fast with a higher --change-threshold (e.g. --fast --change-threshold 0.10) to avoid over-segmentation.
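
To see why continuous-motion video over-segments, here is a minimal sketch of threshold-based change detection. It is not Screex's actual detector: the mean-absolute-difference metric, the `segment_states` helper, and the toy frames are assumptions made for illustration. Only the idea of a change threshold (as in `--change-threshold 0.10`) comes from the paragraph above.

```python
from typing import List, Sequence

Frame = Sequence[float]  # flattened grayscale pixels in [0, 1] (illustrative)


def frame_change(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def segment_states(frames: List[Frame], threshold: float) -> List[int]:
    """Return the indices of frames that start a new state.

    A frame starts a new state when it differs from the most recent
    keyframe by more than `threshold`.
    """
    if not frames:
        return []
    keyframes = [0]
    for i in range(1, len(frames)):
        if frame_change(frames[keyframes[-1]], frames[i]) > threshold:
            keyframes.append(i)
    return keyframes


# Screencast-like input: a static UI with one discrete change at frame 3.
screencast = [[0.0] * 4] * 3 + [[1.0] * 4] * 3
# Continuous-motion input: every frame drifts a little from the last.
motion = [[0.06 * i] * 4 for i in range(6)]

print(segment_states(screencast, 0.05))  # [0, 3]
print(segment_states(motion, 0.05))      # [0, 1, 2, 3, 4, 5]
print(segment_states(motion, 0.10))      # [0, 2, 4]
```

In this toy setup the screencast yields two states regardless of small threshold changes, while the drifting video produces a new state on every frame until the threshold is raised.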

Example

Screex turning a screen recording into a markdown transcript

A short screen recording of a login → settings → error flow becomes a timestamped step list:

screex transcript bug-repro.mp4 -o steps.md

steps.md:

# Transcript — bug-repro.mp4  (0:06)

## 0:00–0:01  ·  State 1
Acme Console · Sign in · Email: [email protected]
**Appeared:** Acme Console, Sign in

## 0:01–0:02  ·  State 2
Dashboard · Welcome back, Rushi · Projects: 3
**Appeared:** Dashboard, Welcome back, Rushi
**Gone:** Acme Console, Sign in

## 0:03–0:04  ·  State 3
Settings > API Keys · New key: sk-live-9f2a · [ Save ]
**Appeared:** Settings > API Keys, New key: sk-live-9f2a

## 0:04–0:06  ·  State 4
Error: invalid API key format · Expected prefix 'sk_' not 'sk-'
**Appeared:** Error: invalid API key format
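
If you want to post-process a step list like the one above, a small parser is enough. The sketch below assumes the layout shown in this example (a `## start–end  ·  State N` heading, one or more text lines, then optional `**Appeared:**` / `**Gone:**` lines); `parse_transcript` is a hypothetical helper, not part of Screex, and the real output may contain lines this sketch does not handle.

```python
import re
from typing import Any, Dict, List

# Matches headings such as "## 0:01–0:02  ·  State 2".
HEADING = re.compile(r"^##\s+(\d+:\d{2})–(\d+:\d{2})\s+·\s+State\s+(\d+)\s*$")
# Matches "**Appeared:** ..." and "**Gone:** ..." lines.
DELTA = re.compile(r"^\*\*(Appeared|Gone):\*\*\s*(.*)$")


def parse_transcript(markdown: str) -> List[Dict[str, Any]]:
    """Split a transcript into one dict per state."""
    steps: List[Dict[str, Any]] = []
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        heading = HEADING.match(line)
        if heading:
            current = {
                "start": heading.group(1),
                "end": heading.group(2),
                "state": int(heading.group(3)),
                "text": "",
                "appeared": "",
                "gone": "",
            }
            steps.append(current)
            continue
        if current is None or not line:
            continue  # skip the title line and blank lines
        delta = DELTA.match(line)
        if delta:
            # Kept as a raw string: items can contain commas themselves
            # ("Welcome back, Rushi"), so splitting on "," would be lossy.
            current[delta.group(1).lower()] = delta.group(2)
        else:
            current["text"] = (current["text"] + " " + line).strip()
    return steps


sample = """\
## 0:00–0:01  ·  State 1
Acme Console · Sign in
**Appeared:** Acme Console, Sign in

## 0:01–0:02  ·  State 2
Dashboard · Welcome back, Rushi · Projects: 3
**Appeared:** Dashboard, Welcome back, Rushi
**Gone:** Acme Console, Sign in
"""

for step in parse_transcript(sample):
    print(step["state"], step["start"], step["end"], step["gone"])
```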

Prefer richer output? Hand the index.json to Claude via the bundled skill and ask for a bug report, a how-to guide, or answers to questions about the recording.

Install

From PyPI

pip install screex

For spoken-word narration in the index, also install the audio extra: pip install 'screex[audio]'.

From source

git clone https://github.com/blueprintparadise/Screex.git
cd Screex
pip install -e .          # add ".[test]" to also install pytest

Both give you a screex command (entry point screex.cli:main). Requires Python ≥ 3.9. The OCR models ship inside the rapidocr-onnxruntime dependency, so no separate download is needed.

Quickstart (CLI)

# Build the index for a screen recording
screex index path/to/recording.mp4 --fps 2
#   (or, without installing the package:)
python -m screex.cli index path/to/recording.mp4 --fps 2

This writes:

path/to/recording.screex/
  index.json               # the ScreenIndex (ordered UI states)
  frames/00000.png         # full-res keyframe per state
  frames/00000_thumb.png   # thumbnail per state
  ...
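
The layout above can be walked from Python with the standard library alone. The sketch below relies only on what the listing shows (a `.screex` directory next to the recording, `index.json`, and `frames/NNNNN.png` with a matching `_thumb.png`); `index_dir_for` and `load_index` are hypothetical helper names, and the contents of `index.json` are loaded as-is because its schema is not described here.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple


def index_dir_for(recording: Path) -> Path:
    """Map path/to/recording.mp4 to path/to/recording.screex."""
    return recording.with_suffix(".screex")


def load_index(recording: Path) -> Tuple[Any, List[Dict[str, Optional[Path]]]]:
    """Load index.json and pair each keyframe with its thumbnail.

    Returns the parsed JSON unchanged (no schema is assumed) plus a list of
    {"keyframe": Path, "thumb": Path or None} entries sorted by file name.
    """
    out_dir = index_dir_for(recording)
    with open(out_dir / "index.json", encoding="utf-8") as fh:
        index = json.load(fh)

    frames: List[Dict[str, Optional[Path]]] = []
    for keyframe in sorted((out_dir / "frames").glob("*.png")):
        if keyframe.stem.endswith("_thumb"):
            continue  # thumbnails are attached to their keyframe below
        thumb = keyframe.with_name(keyframe.stem + "_thumb.png")
        frames.append({"keyframe": keyframe, "thumb": thumb if thumb.exists() else None})
    return index, frames
```

After running `screex index path/to/recording.mp4 --fps 2`, calling `load_index(Path("path/to/recording.mp4"))` would return the parsed index together with one keyframe/thumbnail pair per state.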
