extract-elf

安装量: 35
排名: #19847

安装

npx skills add https://github.com/letta-ai/skills --skill extract-elf
ELF Binary Data Extraction
This skill provides guidance for tasks involving extraction of data from ELF binary files, including reading headers, parsing segments, and converting binary content to structured output formats.
Approach Overview
ELF extraction tasks typically require:
Parsing the ELF header to understand file structure
Reading program headers to identify LOAD segments
Extracting data from segments at correct virtual addresses
Converting binary data to the required output format
Implementation Steps
Step 1: Validate ELF Header
Before processing, verify the file is a valid ELF binary:
Check magic bytes at offset 0:
0x7F 'E' 'L' 'F'
(hex:
7f 45 4c 46
)
Identify ELF class (32-bit vs 64-bit) at offset 4
Identify endianness at offset 5 (1 = little-endian, 2 = big-endian)
Step 2: Parse ELF Header Fields
Extract key header fields based on ELF class:
For 32-bit ELF:
Program header offset: bytes 28-31
Program header entry size: bytes 42-43
Number of program headers: bytes 44-45
For 64-bit ELF:
Program header offset: bytes 32-39
Program header entry size: bytes 54-55
Number of program headers: bytes 56-57
Step 3: Process Program Headers
Iterate through program headers and identify LOAD segments (type = 1):
Extract virtual address (p_vaddr)
Extract file offset (p_offset)
Extract file size (p_filesz)
Extract memory size (p_memsz)
Step 4: Extract Segment Data
For each LOAD segment:
Read data from file at p_offset
Map data to virtual addresses starting at p_vaddr
Handle alignment and padding as specified
Critical Data Type Considerations
Signed vs Unsigned Integers
This is the most common source of errors in binary extraction tasks.
When reading multi-byte integer values from binary data:
Memory addresses are
always unsigned
Size fields are
always unsigned
Data values should typically be read as
unsigned
unless the task explicitly requires signed interpretation
Common API distinctions:
Node.js Buffer:
readUInt32LE
vs
readInt32LE
Python struct:
'I'
(unsigned) vs
'i'
(signed)
C/C++:
uint32_t
vs
int32_t
Verification
If output contains negative numbers but the expected output shows only positive integers, the wrong signedness was used.
Endianness
Match the endianness specified in the ELF header:
Little-endian (most common on x86/x64): Use
LE
variants
Big-endian: Use
BE
variants
Integer Sizes
ELF fields vary by class:
32-bit ELF: addresses and offsets are 4 bytes
64-bit ELF: addresses and offsets are 8 bytes
Verification Strategies
Before Declaring Success
Validate output format
Ensure JSON is well-formed, keys are correct types
Check address ranges
Verify addresses fall within expected segment boundaries
Sample value verification
Manually compute expected values for a few addresses using hex inspection tools Manual Verification Commands Use these tools to verify extracted values:

View ELF header information

readelf -h < binary

View program headers (segments)

readelf -l < binary

Dump section contents in hex

objdump -s < binary

View raw hex bytes at specific offset

xxd -s < offset

-l < length

< binary

Calculate expected value from hex bytes (little-endian example)

For bytes: 41 42 43 44 -> value = 0x44434241 = 1145258561

Value Sanity Checks If the example output shows only positive integers, verify output contains no negative values Compare a few computed values against manual hex calculation Verify address coverage matches expected segment ranges Common Pitfalls Using signed integer reads for unsigned data - Results in negative numbers for values with high bit set (e.g., -98693133 instead of 4196274163) Incorrect endianness handling - Produces completely wrong values; verify against ELF header byte 5 Off-by-one errors in segment boundaries - Carefully track whether sizes are inclusive/exclusive Assuming 4-byte alignment - Check if segment sizes are multiples of the read size; handle partial reads at boundaries Mixing 32-bit and 64-bit field sizes - Always check ELF class and use appropriate field sizes Overconfidence without verification - Never assume "values are read directly from binary, so they should match" - always verify sample values manually Output Format Considerations When producing structured output (e.g., JSON): Use string keys for addresses if they need to be JSON object keys (JSON requires string keys) Ensure integer values are within JavaScript/JSON safe integer range (2^53 - 1 for full precision) Consider whether addresses should be decimal or hexadecimal strings based on task requirements

返回排行榜