openai-image-vision

Installs: 60
Rank: #12373

Install

npx skills add https://github.com/zhayujie/chatgpt-on-wechat --skill openai-image-vision
OpenAI Image Vision
Analyze images using OpenAI's GPT-4 Vision API. The model can understand visual elements including objects, shapes, colors, textures, and text within images.
Setup
This skill requires at least one of the following API keys (OpenAI is preferred when both are set):
OpenAI (preferred):
env_config(action="set", key="OPENAI_API_KEY", value="your-key")
LinkAI (fallback):
env_config(action="set", key="LINKAI_API_KEY", value="your-key")
Optional: Set custom API base URL:
env_config(action="set", key="OPENAI_API_BASE", value="your-base-url")
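The key precedence described in this section can be sketched in shell. This is a minimal illustration, not the skill's actual code: the function name pick_provider is hypothetical, and only the variable names OPENAI_API_KEY and LINKAI_API_KEY come from this page.

```shell
# Hypothetical helper: report which provider would be used,
# preferring OpenAI when both keys are set.
pick_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"
  elif [ -n "${LINKAI_API_KEY:-}" ]; then
    echo "linkai"
  else
    echo "error: set OPENAI_API_KEY or LINKAI_API_KEY" >&2
    return 1
  fi
}
```

Running pick_provider before calling the script is a quick way to confirm that at least one key is visible in the current environment.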
Usage
Important
Scripts are located relative to this skill's base directory.
When you see this skill listed, note its base directory path.
CRITICAL
Always use the bash command to execute the script:

General pattern (MUST start with bash):

bash "/scripts/vision.sh" "<image_path_or_url>" "<question>" [model]

DO NOT execute the script directly like this (WRONG):

"/scripts/vision.sh" ...

Parameters:

- image_path_or_url: Local image file path or HTTP(S) URL (required)

- question: Question to ask about the image (required)

- model: OpenAI model to use (default: gpt-4.1-mini)

Options: gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4-turbo
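The parameter rules above can be checked before the script is called. The sketch below is illustrative only: run_vision, is_known_model, and the SKILL_BASE_DIR variable are hypothetical names, and the script itself may accept models beyond the four listed.

```shell
# Hypothetical check against the model options listed above.
is_known_model() {
  case "$1" in
    gpt-4.1-mini|gpt-4.1|gpt-4o-mini|gpt-4-turbo) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: validate arguments, then invoke the script via bash.
# SKILL_BASE_DIR is an assumed variable holding this skill's base directory.
run_vision() {
  local model
  if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
    echo "usage: run_vision image_path_or_url question [model]" >&2
    return 2
  fi
  # Fall back to the documented default when no model is given.
  model=${3:-gpt-4.1-mini}
  if ! is_known_model "$model"; then
    echo "unknown model: $model" >&2
    return 2
  fi
  # Always start with bash, as required above.
  bash "${SKILL_BASE_DIR}/scripts/vision.sh" "$1" "$2" "$model"
}
```

Failing early with exit code 2 keeps a malformed call from reaching the API at all.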

Examples
Analyze a local image
bash "/scripts/vision.sh" "/path/to/image.jpg" "What's in this image?"
Analyze an image from URL
bash "/scripts/vision.sh" "https://example.com/image.jpg" "Describe this image in detail"
Use specific model
bash "/scripts/vision.sh" "/path/to/photo.png" "What colors are prominent?" "gpt-4o-mini"
Extract text from image
bash "/scripts/vision.sh" "/path/to/document.jpg" "Extract all text from this image"
Analyze multiple aspects
bash "/scripts/vision.sh" "image.jpg" "List all objects you can see and describe the overall scene"
Supported Image Formats
JPEG (.jpg, .jpeg)
PNG (.png)
GIF (.gif)
WebP (.webp)
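A file name can be checked against this list before any API call is made. The helper below is a hypothetical sketch (image_mime is not part of the skill), and it only inspects the extension, so a URL with a query string after the file name would not match.

```shell
# Hypothetical helper: print the MIME type for a supported extension,
# or return 1 when the extension is not in the list above.
image_mime() {
  local name
  # Compare case-insensitively so "photo.JPG" is accepted.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.jpg|*.jpeg) echo "image/jpeg" ;;
    *.png)        echo "image/png" ;;
    *.gif)        echo "image/gif" ;;
    *.webp)       echo "image/webp" ;;
    *)            return 1 ;;
  esac
}
```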
Performance Optimization
Files larger than 1 MB are automatically downscaled so that the longest side is 800px, which keeps the request within command-line parameter limits. This happens transparently without affecting analysis quality.
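The size rule can be illustrated with a little arithmetic. This sketch rests on two assumptions that this page does not confirm: that "1MB" means 1,048,576 bytes, and that the aspect ratio is preserved when the longest side is reduced to 800px. The function names are hypothetical, and no actual image resizing is performed here.

```shell
# Assumed threshold and target size (see the lead-in for caveats).
MAX_BYTES=1048576
MAX_SIDE=800

# Succeeds (exit 0) when the file is larger than MAX_BYTES.
needs_resize() {
  local size
  size=$(( $(wc -c < "$1") ))
  [ "$size" -gt "$MAX_BYTES" ]
}

# Print "width height" scaled so the longest side is MAX_SIDE,
# keeping the aspect ratio and rounding to the nearest pixel.
scaled_dims() {
  local w=$1 h=$2
  if [ "$w" -le "$MAX_SIDE" ] && [ "$h" -le "$MAX_SIDE" ]; then
    echo "$w $h"
  elif [ "$w" -ge "$h" ]; then
    echo "$MAX_SIDE $(( (h * MAX_SIDE + w / 2) / w ))"
  else
    echo "$(( (w * MAX_SIDE + h / 2) / h )) $MAX_SIDE"
  fi
}
```

For example, scaled_dims 4000 3000 prints 800 600, and scaled_dims 1200 1600 prints 600 800.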
Response Format
The script returns a JSON response:
{
  "model": "gpt-4.1-mini",
  "content": "The image shows...",
  "usage": {
    "prompt_tokens": 1234,
    "completion_tokens": 567,
    "total_tokens": 1801
  }
}
Or in case of error:
{
  "error": "Error description",
  "details": "Additional error information"
}
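Because both shapes are JSON objects, a caller can tell them apart by checking for the error key. The sketch below assumes jq is installed, which this skill does not state as a requirement, and vision_answer is a hypothetical helper name.

```shell
# Hypothetical helper: print the answer text on success,
# or print the error to stderr and return 1.
# Requires jq (an assumption, not a documented dependency of the skill).
vision_answer() {
  local response=$1
  # jq -e exits 0 only when the filter's result is true.
  if printf '%s' "$response" | jq -e 'has("error")' >/dev/null 2>&1; then
    printf '%s' "$response" \
      | jq -r '"\(.error): \(.details // "no details")"' >&2
    return 1
  fi
  # Success shape: the model's answer is in the "content" field.
  printf '%s' "$response" | jq -r '.content'
}
```

A typical call would capture the script's output in a variable and pass it in, for example vision_answer "$response".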
Notes
- Image size: Images are automatically resized if too large
- Timeout: 60 seconds for API calls
- Rate limits: Subject to your OpenAI API plan limits
- Privacy: Images are sent to OpenAI's servers for processing
- Local files: Automatically converted to base64 for API submission
- URLs: Can be passed directly to the API without downloading
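The difference between local files and URLs can be sketched as follows. This is an illustration under assumptions, not the script's actual code: image_ref is a hypothetical name, the MIME type is supplied by the caller, and the data URL form shown is the one OpenAI's API accepts for inline base64 images.

```shell
# Hypothetical helper: turn an image argument into the reference sent to the API.
# $1 = local path or HTTP(S) URL
# $2 = MIME type for local files (assumed default: image/jpeg)
image_ref() {
  local src=$1 mime=${2:-image/jpeg}
  case "$src" in
    http://*|https://*)
      # URLs are passed through unchanged; nothing is downloaded.
      printf '%s\n' "$src"
      ;;
    *)
      [ -f "$src" ] || { echo "file not found: $src" >&2; return 1; }
      # Encode the file and strip line wraps so the data URL is one line.
      printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$src" | tr -d '\n')"
      ;;
  esac
}
```

Inline base64 data grows with file size, which is consistent with large local files being resized before submission.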