
[output] Add optional TwelveLabs scene labelling #555

Open
mohit-twelvelabs wants to merge 3 commits into Breakthrough:main from mohit-twelvelabs:feat/twelvelabs-integration


@mohit-twelvelabs


Hi! I'm Mohit, I work at TwelveLabs (@mohit-twelvelabs).

What this adds

An opt-in scenedetect.output.label_scenes helper that attaches a short natural-language description to each detected scene using the TwelveLabs Pegasus video-understanding model.

PySceneDetect's detectors find where the cuts are (pixel/histogram/hash based); this complements that by answering what is in each scene. Each detected scene's start/end timecode is forwarded to Pegasus via its start_time/end_time parameters, so each description covers only that portion of the video and the labels line up 1:1 with the scene list:

from scenedetect import detect, ContentDetector
from scenedetect.output import label_scenes

scenes = detect("my_video.mp4", ContentDetector())
for label in label_scenes(scenes, video_url="https://example.com/my_video.mp4"):
    print(label.index, label.label)
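The per-scene wiring described above can be pictured as one indexed time window per detected scene. This is a minimal sketch, not the PR's code: `SceneWindow` and `scene_windows` are hypothetical names, the index is assumed to be zero-based, and the input is plain `(start, end)` pairs in seconds so the example runs without PySceneDetect installed.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SceneWindow:
    """Hypothetical container: one analysis window per detected scene."""

    index: int         # position in the scene list (assumed zero-based)
    start_time: float  # seconds
    end_time: float    # seconds


def scene_windows(scenes: Iterable[Tuple[float, float]]) -> List[SceneWindow]:
    """Turn (start, end) pairs in seconds into windows that map 1:1 to scenes."""
    windows = []
    for index, (start, end) in enumerate(scenes):
        if end <= start:
            raise ValueError(f"scene {index} has a non-positive duration")
        windows.append(SceneWindow(index, float(start), float(end)))
    return windows


# With PySceneDetect, the pairs could be derived from the scene list, e.g.
#   pairs = [(s.get_seconds(), e.get_seconds()) for s, e in scenes]
print(scene_windows([(0.0, 5.2), (5.2, 12.0)]))
```

Keeping the index on each window is what lets a label be matched back to its scene even if some scenes end up without a label.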

Why it helps

A common ask after shot detection is a quick semantic summary per scene (for editing, search, dataset labelling, highlight reels). This wires that in without the user having to build their own pipeline, while keeping PySceneDetect's pixel-based detection as the source of truth for cut boundaries.

Opt-in / non-breaking

  • New code only; touches no detection paths and changes no defaults.
  • Gated behind a new optional extra: pip install scenedetect[twelvelabs]. The twelvelabs package is not a hard dependency; if it's missing, label_scenes raises a friendly ImportError with install instructions.
  • Exported alongside the other scenedetect.output helpers, following the existing re-export convention.
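One common way to keep an optional dependency out of the import path is to import it lazily and re-raise with install instructions. The sketch below illustrates that pattern under assumptions: `require_optional` is a hypothetical helper name and the message wording is invented, so it should not be read as the PR's actual implementation.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Quote the requirement so shells such as zsh do not glob the brackets.
        raise ImportError(
            f"The '{module_name}' package is required for this feature. "
            f"Install it with: pip install 'scenedetect[{extra}]'"
        ) from exc


# Called only when the labelling helper is actually used, never at import time:
#   twelvelabs = require_optional("twelvelabs", "twelvelabs")
```

Because the import happens inside the helper, users who never call the labelling function are unaffected by whether the package is installed.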

How it was tested

  • No-network unit tests (tests/test_labels.py) using a fake client assert the per-scene timecode wiring and source-argument validation; these run in CI without a key.
  • An opt-in integration test is skipped unless TWELVELABS_API_KEY (and a TWELVELABS_TEST_VIDEO_URL) are set.
  • Verified against the live API: a Marengo text embedding returns the expected 512-dim vector, and the Pegasus analyze request wiring (model pegasus1.5, video source, prompt, per-scene start_time/end_time, max_tokens) passes all server-side parameter validation. Full per-scene generation against a hosted sample is pending a clean public sample URL.
  • ruff check and ruff format --check pass on all changed files; pytest tests/test_labels.py passes (2 passed, 1 skipped).
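The opt-in gating of the integration test can be sketched as an environment check feeding a skip decorator. This example uses `unittest.skipUnless` from the standard library to stay self-contained, whereas the PR uses pytest (where `pytest.mark.skipif` plays the same role); `integration_configured` and `LiveLabelTest` are hypothetical names.

```python
import os
import unittest
from typing import Mapping

# Both variables named in the PR description must be set for the live test.
REQUIRED_VARS = ("TWELVELABS_API_KEY", "TWELVELABS_TEST_VIDEO_URL")


def integration_configured(env: Mapping[str, str]) -> bool:
    """Return True only if every required variable is present and non-empty."""
    return all(env.get(name) for name in REQUIRED_VARS)


@unittest.skipUnless(
    integration_configured(os.environ),
    "set TWELVELABS_API_KEY and TWELVELABS_TEST_VIDEO_URL to run",
)
class LiveLabelTest(unittest.TestCase):
    def test_video_url_is_http(self):
        # Placeholder assertion; a real test would call the labelling helper.
        self.assertTrue(os.environ["TWELVELABS_TEST_VIDEO_URL"].startswith("http"))
```

Passing the environment in as a mapping keeps the check itself testable without touching real credentials.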

You can grab a free API key at https://twelvelabs.io — there's a generous free tier.

Adds scenedetect.output.label_scenes, an opt-in helper that attaches a
short natural-language description to each detected scene using the
TwelveLabs Pegasus video-understanding model. Pixel-based detectors
locate the cuts; this forwards each scene's start/end timecode to
Pegasus so the description covers only that portion of the video.

The integration is gated behind the optional 'twelvelabs' extra and is
never invoked during normal detection.
@frueter

frueter commented Jun 25, 2026


That is exciting! The pip install isn't working for me, though. User error?

(.venv) (base) frank@MacBookPro cutdetector_test % pip install scenedetect[twelvelabs]
zsh: no matches found: scenedetect[twelvelabs]

@mohit-twelvelabs

Author

Not user error — that's zsh treating the square brackets as a glob pattern. Just quote the extra:

pip install 'scenedetect[twelvelabs]'

One caveat: since this is still an open PR (not yet on PyPI), pip won't find the twelvelabs extra from the released package. To try it out, install straight from the branch:

pip install 'scenedetect[twelvelabs] @ git+https://github.com/mohit-twelvelabs/PySceneDetect.git@feat/twelvelabs-integration'

or clone the branch and pip install '.[twelvelabs]'. Let me know if that gets you unblocked!

— Mohit (@mohit-twelvelabs, TwelveLabs)

@frueter

frueter commented Jun 27, 2026


Awesome, that worked, thanks a lot!

@frueter

frueter commented Jun 27, 2026


I got it working. Unfortunately, using a file ID instead of a URL doesn't seem to work (I got an error saying that this is not supported in Pegasus 1.5). Also, when running a test with a URL, I got a BadRequestError from Pegasus 1.5 because a detected scene was too short:
status_code: 400, body: {'code': 'parameter_invalid', 'message': 'The start_time parameter is invalid. the duration between start_time and end_time must be at least 4 seconds for pegasus1.5'}

Would it be possible to just ignore that error to make this more stable?

Pegasus 1.5 requires each analysed window to be at least 4s and rejects an
indexed video_id, so per-scene labelling could 400 and abort the whole pass.

- Add MIN_PEGASUS_SCENE_SECONDS (4.0) and skip shorter scenes with a logged
  note instead of calling analyze() on them.
- Catch a per-scene BadRequestError, log it, and continue rather than aborting
  the batch; auth/quota/server errors still propagate and fail fast.
- Treat an id source as an uploaded asset via VideoContext_AssetId, and reject
  the unsupported video_id up front with a clear, actionable error instead of a
  confusing raw 400. Update docstring/README accordingly.
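The skip-and-continue behaviour described in this commit message can be sketched as follows. The constant name and the 4-second value come from the thread; everything else is an assumption for illustration: `label_windows` and the `analyze` callable are hypothetical, and `BadRequestError` here is a local stand-in rather than the SDK's class.

```python
import logging
from typing import Callable, Dict, Iterable, Tuple

logger = logging.getLogger(__name__)

MIN_PEGASUS_SCENE_SECONDS = 4.0  # minimum window length reported in the thread


class BadRequestError(Exception):
    """Local stand-in for the SDK's HTTP 400 error (not the real class)."""


def label_windows(
    scenes: Iterable[Tuple[float, float]],
    analyze: Callable[[float, float], str],
) -> Dict[int, str]:
    """Label each (start, end) window, skipping scenes that cannot be analysed."""
    labels: Dict[int, str] = {}
    for index, (start, end) in enumerate(scenes):
        if end - start < MIN_PEGASUS_SCENE_SECONDS:
            logger.warning("scene %d is shorter than %.1fs; skipping",
                           index, MIN_PEGASUS_SCENE_SECONDS)
            continue
        try:
            labels[index] = analyze(start, end)
        except BadRequestError as exc:
            # One rejected scene should not abort the whole batch.
            logger.warning("scene %d rejected by the API: %s; skipping", index, exc)
    # Any other exception (auth, quota, server) propagates and fails fast.
    return labels
```

Keying the result by scene index means omitted scenes leave visible gaps rather than shifting later labels onto the wrong scenes.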
@mohit-twelvelabs

Author

Thanks for testing this, @frueter — both reports are fixed in 3e27d8c.

(a) Scenes shorter than 4s no longer abort the run. Pegasus 1.5 requires each analysed window to span at least 4 seconds, so a short detected scene was 400ing and breaking the whole pass. There's now a MIN_PEGASUS_SCENE_SECONDS = 4.0 constant: scenes shorter than that are skipped with a logged warning (and simply omitted from the results, identifiable by SceneLabel.index) and the pass continues. I also hardened the per-scene call so an unexpected BadRequestError from one scene is logged and skipped rather than aborting the batch — while auth/quota/server errors still propagate and fail fast.

(b) The id source no longer produces a confusing error. Confirmed against the SDK (twelvelabs 1.2.8) that pegasus1.5 analyze does not accept an indexed video_id — it needs a video_url or an uploaded asset. So an id source is now sent as an asset via VideoContext_AssetId (new asset_id= argument), and passing the old video_id= raises a clear, actionable ValueError up front (pointing you to video_url=/asset_id=) instead of letting the API return a raw 400. Docstring and README updated to match.

Unit tests cover both: a sub-4s scene is skipped without aborting, a per-scene API error is skipped, and the video_id path raises the up-front error.
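The up-front source validation from (b) might look roughly like the sketch below. The argument names `video_url`, `asset_id`, and `video_id` come from the comment above; `resolve_source`, the error wording, and the rule that exactly one source must be given are assumptions made for illustration.

```python
from typing import Optional, Tuple


def resolve_source(
    video_url: Optional[str] = None,
    asset_id: Optional[str] = None,
    video_id: Optional[str] = None,
) -> Tuple[str, str]:
    """Return a (kind, value) pair for the video source, or raise ValueError."""
    if video_id is not None:
        # Reject early with guidance instead of letting the API return a raw 400.
        raise ValueError(
            "video_id is not supported for per-scene labelling; "
            "pass video_url= or asset_id= instead."
        )
    candidates = (("video_url", video_url), ("asset_id", asset_id))
    provided = [(kind, value) for kind, value in candidates if value]
    if len(provided) != 1:  # assumed rule: exactly one source
        raise ValueError("Pass exactly one of video_url= or asset_id=.")
    return provided[0]
```

Validating before any network call keeps the failure local and the message specific to what the caller should change.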

— Mohit (@mohit-twelvelabs, TwelveLabs)

@frueter

frueter commented Jun 27, 2026 via email


Catch TooManyRequestsError per scene: log a clear warning (with Retry-After
when present), stop further per-scene calls, and return the labels gathered so
far instead of aborting the whole pass. Long videos on the free tier can hit
the quota mid-run; consistent with the existing skip-and-continue contract.
@mohit-twelvelabs

Author

@frueter No worries at all — thanks for putting it through a real 15-minute film, that's exactly the kind of run that surfaces this.

Fixed in commit 2688d10: label_scenes now catches the TwelveLabs rate-limit error (HTTP 429 / TooManyRequestsError). Instead of erroring out the whole run, it logs a clear warning (e.g. "TwelveLabs rate limit hit (free-tier quota?) — stopping scene labelling early; returning the labels generated so far"), stops making further per-scene Pegasus calls (once you're throttled, the rest would just 429 too), and returns the labels collected up to that point. If the API sends a Retry-After, it's included in the warning so you know roughly when the quota resets. I also added a unit test that mocks a 429 mid-pass and asserts the run stops gracefully with partial results rather than raising, plus a docs note that long videos on the free tier can hit the limit.

So a long video on the free tier now degrades gracefully — you get partial labels and a clear message rather than a crash, and can rerun once the quota resets to pick up the remainder. Really glad you're keen to bring it into CutDetector!
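The stop-early behaviour on a rate limit can be sketched as a loop that breaks on the first 429 and returns what it has. This is an illustration under assumptions: `TooManyRequestsError` is a local stand-in whose `headers` attribute is invented for the example, and `label_until_throttled` and the `analyze` callable are hypothetical names.

```python
import logging
from typing import Callable, Dict, Iterable, Mapping, Optional, Tuple

logger = logging.getLogger(__name__)


class TooManyRequestsError(Exception):
    """Local stand-in for the SDK's HTTP 429 error; `headers` is assumed."""

    def __init__(self, message: str = "rate limited",
                 headers: Optional[Mapping[str, str]] = None) -> None:
        super().__init__(message)
        self.headers = dict(headers or {})


def label_until_throttled(
    scenes: Iterable[Tuple[float, float]],
    analyze: Callable[[float, float], str],
) -> Dict[int, str]:
    """Label scenes in order; on a rate limit, stop and return partial results."""
    labels: Dict[int, str] = {}
    for index, (start, end) in enumerate(scenes):
        try:
            labels[index] = analyze(start, end)
        except TooManyRequestsError as exc:
            retry_after = exc.headers.get("Retry-After")
            hint = f" (Retry-After: {retry_after})" if retry_after else ""
            logger.warning(
                "rate limit hit at scene %d%s; returning %d labels gathered so far",
                index, hint, len(labels),
            )
            break  # further calls would most likely be throttled as well
    return labels
```

Since the result is keyed by scene index, a later rerun could in principle be limited to the indices that are still missing.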

— Mohit (@mohit-twelvelabs, TwelveLabs)

