SAM Cell Segmentation Skill This skill provides guidance for tasks involving SAM (Segment Anything Model) based cell segmentation, mask processing, and polygon conversion pipelines. Task Characteristics This skill applies to tasks that involve: Using MobileSAM or SAM models for image segmentation Converting binary masks to polygon/polyline representations Processing CSV files containing coordinate data Building command-line tools for deep learning inference pipelines Cell or object segmentation in microscopy images Critical Pre-Implementation Steps 1. Interface Requirements Discovery Before writing any code, verify the exact interface requirements: Check how the script will be invoked : Read test files or evaluation harnesses to understand expected argument format Determine if arguments should be positional or keyword ( --arg_name ) Verify exact argument names expected by the test framework Verify output format specifications : Check input file format and match output format exactly Pay attention to data types: lists vs tuples, strings vs numbers Verify column names, ordering, and delimiters in CSV outputs Understand the evaluation criteria : Identify metrics used (IoU, accuracy, etc.) and their thresholds Understand what constitutes pass/fail conditions 2. Environment Assumptions Trust that specified packages will be available in the test environment Do not spend excessive time on environment setup or package installation If package installation fails or times out, proceed with code development assuming packages exist Focus on code correctness over environment debugging Implementation Approach Argument Parsing Pattern When building CLI tools for ML pipelines, use keyword arguments with explicit flags: import argparse def parse_args ( ) : parser = argparse . ArgumentParser ( description = 'Process masks with SAM' )

Use keyword arguments (--flag format), not positional

parser . add_argument ( '--weights_path' , type = str , required = True , help = 'Path to model weights' ) parser . add_argument ( '--csv_path' , type = str , required = True , help = 'Path to input CSV' ) parser . add_argument ( '--rgb_path' , type = str , required = True , help = 'Path to RGB image' ) parser . add_argument ( '--output_path' , type = str , required = True , help = 'Path for output CSV' ) return parser . parse_args ( ) Data Type Consistency Ensure consistent data types throughout the pipeline:

When processing coordinates, maintain list format (not tuples)

def mask_to_polygon ( mask ) :

... processing logic ...

Return coordinates as lists, not tuples

return [ [ int ( x ) , int ( y ) ] for x , y in coordinates ]

When saving to CSV, verify format

def save_coordinates ( coords , output_path ) :

Ensure coordinates are stored as lists

formatted_coords

[ list ( c ) if isinstance ( c , tuple ) else c for c in coords ]

... save logic ...

SAM/MobileSAM Integration Pattern def load_sam_model ( weights_path , device = 'cuda' ) : """Load SAM model with proper device handling."""

Check device availability

if device == 'cuda' and not torch . cuda . is_available ( ) : device = 'cpu'

Load model

model

sam_model_registry [ model_type ] ( checkpoint = weights_path ) model . to ( device ) model . eval ( ) return model , device def refine_mask_with_sam ( sam_model , image , initial_mask , device ) : """Use SAM to refine an initial mask.""" predictor = SamPredictor ( sam_model ) predictor . set_image ( image )

Get bounding box or point prompts from initial mask

... prompt extraction logic ...

masks , scores , _ = predictor . predict ( point_coords = point_coords , point_labels = point_labels , box = box , multimask_output = True )

Select best mask based on IoU with initial mask or score

best_mask

select_best_mask ( masks , scores , initial_mask ) return best_mask Mask-to-Polygon Conversion import cv2 import numpy as np def mask_to_polygon ( binary_mask , simplify_tolerance = 1.0 ) : """Convert binary mask to polygon coordinates."""

Ensure mask is binary uint8

mask

( binary_mask

0 ) . astype ( np . uint8 ) * 255

Find contours

contours , _ = cv2 . findContours ( mask , cv2 . RETR_EXTERNAL , cv2 . CHAIN_APPROX_SIMPLE ) if not contours : return [ ]

Get largest contour

largest_contour

max ( contours , key = cv2 . contourArea )

Simplify contour

epsilon

simplify_tolerance * cv2 . arcLength ( largest_contour , True ) simplified = cv2 . approxPolyDP ( largest_contour , epsilon , True )

Convert to list format (not tuples)

polygon

[

int

(

pt

[

0

]

[

0

]

)

,

int

(

pt

[

0

]

[

1

]

)

]

for

pt

in

simplified

]

return

polygon

Verification Checklist

Before Submission

Interface Verification

:

Run

python script.py --help

to verify argument parsing works

Verify argument names match test expectations exactly

Confirm keyword vs positional argument format

Output Format Verification

:

Compare output CSV structure with input CSV structure

Verify coordinate data types are lists, not tuples

Check all expected rows are present in output

Validate column names and ordering

End-to-End Testing

:

Run the complete pipeline with sample data

Verify output file is created and properly formatted

Check that all input rows produce corresponding outputs

Quality Metrics

:

Calculate IoU between refined masks and expected outputs

Verify metrics meet threshold requirements (e.g., IoU > 0.5)

Code Review Points

Argument Parser

Uses

--flag

format, not positional arguments

Data Types

Coordinates stored as lists, not tuples

Error Handling

Graceful handling of edge cases (empty masks, missing files)

Device Handling

Proper CUDA/CPU fallback logic

Common Pitfalls

1. Argument Format Mismatch

Problem

Using positional arguments when tests expect keyword arguments.

Detection

Script fails with "unrecognized arguments" or similar errors.

Solution

Always check test invocation format before implementing argparse.

2. Data Type Inconsistency

Problem

Storing coordinates as tuples when lists are expected.

Detection

Test failures related to coordinate format or JSON serialization issues.

Solution

Explicitly convert to lists before saving:

[list(coord) for coord in coords]

3. Incomplete Processing

Problem

Not all input rows appear in output.

Detection

Row count mismatch between input and output.

Solution

Verify loop processes all rows; add logging to track progress.

4. Environment Debugging Trap

Problem

Spending excessive time on package installation when it times out.

Detection

Multiple failed installation attempts.

Solution

Trust the test environment; focus on code correctness. If packages fail to install locally, proceed assuming they exist.

5. Premature Completion Declaration

Problem

Declaring task complete without end-to-end verification.

Detection

Fundamental errors discovered only during test evaluation.

Solution

Always run the actual command with test arguments before declaring completion.

6. Truncated File Reading

Problem

Not reading entire file contents, missing critical code sections.

Detection

Code review misses obvious errors in unread sections.

Solution

When file output is truncated, read in chunks or use offset/limit parameters.

Testing Strategy

Unit Test Priority Order

Argument parsing

Verify CLI interface matches expectations

Input/Output format

Verify data flows correctly through pipeline

Core functionality

Test mask processing and polygon conversion
Integration: End-to-end pipeline test Minimum Viable Test

Quick sanity check before full test suite

import subprocess import sys def test_cli_interface ( ) : """Verify script accepts expected arguments.""" result = subprocess . run ( [ sys . executable , 'script.py' , '--help' ] , capture_output = True , text = True ) assert result . returncode == 0 assert '--weights_path' in result . stdout assert '--csv_path' in result . stdout assert '--rgb_path' in result . stdout assert '--output_path' in result . stdout Task Execution Order Read test files or evaluation harness to understand interface requirements Identify exact argument format and output format specifications Implement argument parsing matching discovered requirements Implement core functionality (SAM loading, mask processing, polygon conversion) Verify interface with --help flag Run end-to-end test with sample data Verify output format matches specifications exactly Check all quality metrics meet thresholds

sam-cell-seg

安装

Use keyword arguments (--flag format), not positional

When processing coordinates, maintain list format (not tuples)

... processing logic ...

Return coordinates as lists, not tuples

When saving to CSV, verify format

Ensure coordinates are stored as lists

formatted_coords

... save logic ...

Check device availability

Load model

model

Get bounding box or point prompts from initial mask

... prompt extraction logic ...

Select best mask based on IoU with initial mask or score

best_mask

Ensure mask is binary uint8

mask

Find contours

Get largest contour

largest_contour

Simplify contour

epsilon

Convert to list format (not tuples)

polygon

Quick sanity check before full test suite