This skill explains how to use the multimodal image recognition API to analyze an image at a given URL and extract information that matches the user's intent.
Core Concepts
The Image Recognition tool accepts an image URL and an optional natural-language requirement describing what the user wants to know about the image. The backend uses a multimodal AI model to interpret the visual content and return a textual description or analysis.
Supported formats
JPG, JPEG, PNG, GIF, WebP, BMP.
How it works
You provide a publicly accessible image URL and a requirement (what you want to learn from the image). The service downloads the image, runs multimodal analysis, and returns a text-based result.
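The request flow described above can be sketched as follows. This is only an illustration: the source does not give the endpoint, transport, or authentication scheme, so `API_URL`, the JSON POST body, and the helper names `build_request` and `recognize` are all assumptions. Only the field names `imageUrl` and `requirement` come from the parameter guide.

```python
import json
import urllib.request

# Hypothetical endpoint: the source does not state the real URL.
API_URL = "https://api.example.com/image-recognition"


def build_request(image_url, requirement=None):
    """Build a POST request carrying imageUrl and, if given, requirement.

    Assumes the service accepts a JSON body; this is not confirmed by the source.
    """
    payload = {"imageUrl": image_url}
    if requirement is not None:
        # Omit the field entirely so the service can apply its own default.
        payload["requirement"] = requirement
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recognize(image_url, requirement=None, timeout=30):
    """Send the request and return the text-based result as a string."""
    request = build_request(image_url, requirement)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call might look like `recognize("https://example.com/cat.png", "What breed is this cat?")`. Any credentials the real service needs would have to be added to the headers.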
Parameter Guide
| Parameter | Required | Description |
| --- | --- | --- |
| imageUrl | Yes | A publicly accessible URL pointing to the image. Must be JPG, JPEG, PNG, GIF, WebP, or BMP. Maximum 1000 characters. |
| requirement | No | A natural-language description of what to identify or analyze in the image. Defaults to "Describe the content of this image" when omitted. Maximum 1000 characters. |
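The parameter limits can be checked on the client before a request is sent. The sketch below is one possible way to do that, not the service's own validation. The helper name `prepare_params` is hypothetical. Two parts are assumptions: checking the format by file extension in the URL path, and requiring an `http` or `https` scheme. The service may inspect the image content instead, and URLs without an extension could still be valid. The default text is filled in locally only to mirror the documented behavior.

```python
from urllib.parse import urlparse

# Formats and limits taken from the parameter guide.
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp")
MAX_LENGTH = 1000
DEFAULT_REQUIREMENT = "Describe the content of this image"


def prepare_params(image_url, requirement=None):
    """Validate the inputs and return the parameter dict, or raise ValueError."""
    if not image_url:
        raise ValueError("imageUrl is required")
    if len(image_url) > MAX_LENGTH:
        raise ValueError("imageUrl exceeds 1000 characters")

    parsed = urlparse(image_url)
    # Assumption: a publicly accessible URL means http or https with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("imageUrl must be an http(s) URL")
    # Assumption: the format is inferred from the extension of the URL path.
    if not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError("imageUrl must point to a JPG, JPEG, PNG, GIF, WebP, or BMP file")

    if requirement is None:
        requirement = DEFAULT_REQUIREMENT
    if len(requirement) > MAX_LENGTH:
        raise ValueError("requirement exceeds 1000 characters")

    return {"imageUrl": image_url, "requirement": requirement}
```

For example, `prepare_params("https://example.com/photo.PNG")` passes because the extension comparison ignores case, while a `.tiff` URL or a 1001-character requirement raises `ValueError`.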
Repository
linkfox-ai/link…x-skills
First Seen
Apr 8, 2026