Video QnA using VLM through VSS Agent

Use this skill when you need details about a video that require a VLM to look at the video frames, for example when the agent has no usable prior answer and needs a fresh look at the pixels for a specific clip.

When to Use

- The user asks what happens in the video, what objects, people, or actions appear, or about colors, timing, safety, or other visual facts that require watching the clip.
- The user asks for details that cannot be answered from existing messages, summaries, Elasticsearch/MCP results, or filenames alone; you need model inference on the video.
- The user asks follow-up questions about content details after a coarse summary or after report generation.

Do not use this skill when a database, MCP, or prior tool output already answers the question, unless the user explicitly wants verification against the video.

Deployment prerequisite

This skill requires a VSS profile that serves the video_understanding tool, typically base (recommended) or lvs. Before any request:

Repository: nvidia/skills. Installs: 550. GitHub stars: 1.3K. First seen: May 30, 2026. Security audits: Gen Agent Trust Hub (Pass), Socket (Pass), Snyk (Warn).
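The "When to Use" criteria above amount to a small decision rule, which can be sketched as follows. This is only an illustration: `QuestionContext`, its fields, and `should_ask_video` are hypothetical names invented here, not part of the skill or of any VSS API.

```python
from dataclasses import dataclass


@dataclass
class QuestionContext:
    """Hypothetical summary of what the agent knows about a user question."""
    needs_visual_facts: bool        # objects, people, actions, colors, timing, safety
    answered_by_prior_output: bool  # database / MCP / prior tool output already suffices
    wants_video_verification: bool  # user explicitly asks to verify against the video


def should_ask_video(ctx: QuestionContext) -> bool:
    """Return True when a fresh VLM pass over the video frames looks warranted."""
    # An explicit request to verify against the video overrides any prior answer.
    if ctx.wants_video_verification:
        return True
    # Otherwise, skip the skill when existing output already answers the question.
    if ctx.answered_by_prior_output:
        return False
    # With no usable prior answer, use the skill only for visual facts.
    return ctx.needs_visual_facts
```

For example, `should_ask_video(QuestionContext(True, True, False))` returns `False` because a prior tool output already covers the question, while setting `wants_video_verification=True` flips the result to `True`.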

vss-ask-video

Install