Samples for VLM video input. #3050
base: master
Conversation
Pull Request Overview
This PR adds a Python sample demonstrating video-to-text functionality for Vision Language Models (VLMs). The sample lets users supply a video file and chat with a VLM about its content through an interactive interface.
- Adds a new video_to_text_chat.py sample for VLM video input processing
- Updates the test configuration to include a tiny random LLaVA-NeXT-Video model and a sample video file
- Updates the documentation to describe the new video-to-text sample and its usage
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/python_tests/samples/conftest.py | Adds test model configuration for LLaVA-NeXT-Video and sample video file resource |
| samples/python/visual_language_chat/video_to_text_chat.py | New sample implementing video-to-text chat functionality using VLM pipeline |
| samples/python/visual_language_chat/README.md | Updates documentation to describe the new video-to-text sample and its usage |
| samples/deployment-requirements.txt | Adds opencv-python dependency required for video processing |
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Vladimir Zlobin <[email protected]>
Description
Python and C++ samples for VLM video input.
CVS-175408
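The sample itself isn't reproduced in this thread, so here is a minimal sketch of the approach described above, assuming openvino_genai's VLMPipeline API and opencv-python (the dependency this PR adds to deployment-requirements.txt). The `read_video_frames` helper, its frame-sampling strategy, and passing sampled frames through the `images=` argument are illustrative assumptions; the actual video_to_text_chat.py may use a dedicated video input path instead.

```python
# Hypothetical sketch of a video-to-text chat sample; the merged
# video_to_text_chat.py in this PR may differ in detail.
import argparse

import cv2  # opencv-python, added to deployment-requirements.txt by this PR
import numpy as np
import openvino as ov
import openvino_genai


def read_video_frames(path: str, max_frames: int = 8) -> list[ov.Tensor]:
    """Sample up to max_frames evenly spaced RGB frames from a video file."""
    capture = cv2.VideoCapture(path)
    total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num=min(max_frames, total), dtype=int)
    frames = []
    for index in indices:
        capture.set(cv2.CAP_PROP_POS_FRAMES, int(index))
        ok, frame = capture.read()
        if not ok:
            break
        # OpenCV decodes to BGR; VLM preprocessing expects RGB.
        frames.append(ov.Tensor(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
    capture.release()
    return frames


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("model_dir")
    parser.add_argument("video_path")
    args = parser.parse_args()

    pipe = openvino_genai.VLMPipeline(args.model_dir, "CPU")
    frames = read_video_frames(args.video_path)

    pipe.start_chat()
    while True:
        try:
            prompt = input("question:\n")
        except EOFError:
            break
        # Sampled frames are passed as images here; max_new_tokens=100
        # is an arbitrary choice for the sketch.
        result = pipe.generate(prompt, images=frames, max_new_tokens=100)
        print(result.texts[0])
        frames = []  # send the video frames only with the first turn
    pipe.finish_chat()


if __name__ == "__main__":
    main()
```

Invocation would look like `python video_to_text_chat.py ./llava-next-video ./video.mp4` (paths hypothetical), mirroring the model-dir-plus-media-file convention of the existing visual_language_chat samples.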
Checklist: