Test and debug regular expressions with detailed match visualization, plain-English explanations, and pattern generation from examples.
Quick Start
from scripts.regex_tester import RegexTester
# Test pattern
tester = RegexTester()
result = tester.test(r"\d{3}-\d{4}", "Call 555-1234 today")
print(result['matches']) # ['555-1234']
# Explain pattern
explanation = tester.explain(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
print(explanation)
# Generate pattern from examples
pattern = tester.generate_pattern(["555-1234", "123-4567", "999-0000"])
print(pattern) # r"\d{3}-\d{4}"
Features
- Pattern Testing: Test against text with detailed results
- Match Visualization: See exactly what matched and where
- Pattern Explanation: Plain-English breakdown of regex
- Pattern Generation: Create patterns from example matches
- Find & Replace: Test substitution patterns
- Common Patterns: Library of pre-built patterns
- Validation: Check pattern syntax before use
API Reference
Testing
tester = RegexTester()
# Basic test
result = tester.test(r"\d+", "There are 42 items")
# {
# 'pattern': r'\d+',
# 'text': 'There are 42 items',
# 'matches': ['42'],
# 'match_count': 1,
# 'positions': [(10, 12)], # (start, end)
# 'groups': []
# }
# With flags
result = tester.test(r"hello", "Hello World", ignore_case=True)
# All matches with groups
result = tester.test(r"(\d{3})-(\d{4})", "555-1234 and 999-0000")
# matches: ['555-1234', '999-0000']
# groups: [('555', '1234'), ('999', '0000')]
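The result shape shown above can be approximated with the standard `re` module alone. The following is a minimal sketch, not the actual RegexTester implementation; the helper name `run_test` is hypothetical and only the `ignore_case` flag is modeled.

```python
import re

def run_test(pattern, text, ignore_case=False):
    """Hypothetical stand-in for RegexTester.test(), built on re.finditer."""
    flags = re.IGNORECASE if ignore_case else 0
    found = list(re.finditer(pattern, text, flags))
    return {
        "pattern": pattern,
        "text": text,
        "matches": [m.group(0) for m in found],
        "match_count": len(found),
        "positions": [m.span() for m in found],  # (start, end) offsets
        # One tuple per match; empty when the pattern has no capturing groups
        "groups": [m.groups() for m in found if m.groups()],
    }

result = run_test(r"(\d{3})-(\d{4})", "555-1234 and 999-0000")
print(result["matches"])
print(result["groups"])
```

Using `re.finditer` rather than `re.findall` keeps the full match, its span, and its groups available together for every hit.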
Explanation
explanation = tester.explain(r"\b\w+@\w+\.\w{2,}\b")
# Returns:
# \b - Word boundary
# \w+ - One or more word characters
# @ - Literal '@'
# \w+ - One or more word characters
# \. - Literal '.'
# \w{2,} - Two or more word characters
# \b - Word boundary
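To cross-check an explanation against how Python itself parses a pattern, the standard library's `re.DEBUG` flag prints the parse tree when a pattern is compiled. The sketch below captures that output as a string; the helper name `dump_parse_tree` is hypothetical and is not part of RegexTester, and the exact output layout may vary between Python versions.

```python
import contextlib
import io
import re

def dump_parse_tree(pattern):
    """Return the parse tree that re.DEBUG prints while compiling a pattern."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()

# Each element of the pattern appears as one node in the printed tree.
print(dump_parse_tree(r"\b\w+@\w+\.\w{2,}\b"))
```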
Pattern Generation
# Generate pattern from examples
examples = ["user@test.com", "admin@site.org", "info@company.net"]
pattern = tester.generate_pattern(examples)
# Suggests: r"[a-z]+@[a-z]+\.(com|org|net)"
# With negative examples (what NOT to match)
pattern = tester.generate_pattern(
    positive=["555-1234", "999-0000"],
    negative=["555-12", "12345"]
)
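One way such a heuristic could work is to map each character to a character class, collapse runs into counted repetitions, and reject the candidate if it also matches a negative example. This is an illustrative sketch under those assumptions, not the tool's actual algorithm; all function names here are hypothetical.

```python
import re
import string
from itertools import groupby

def char_class(ch):
    """Map one character to a regex fragment describing its class."""
    if ch in string.digits:
        return r"\d"
    if ch in string.ascii_letters:
        return "[A-Za-z]"
    return re.escape(ch)  # punctuation is kept as an escaped literal

def generalize(example):
    """Collapse runs of the same class into counted repetitions."""
    parts = []
    for cls, run in groupby(example, key=char_class):
        n = len(list(run))
        parts.append(cls if n == 1 else f"{cls}{{{n}}}")
    return "".join(parts)

def generate_pattern(positive, negative=()):
    """Return a pattern shared by all positives, or None if none fits."""
    candidates = {generalize(e) for e in positive}
    if len(candidates) != 1:
        return None  # examples have different shapes
    pattern = candidates.pop()
    if any(re.fullmatch(pattern, n) for n in negative):
        return None  # too permissive: matches a negative example
    return pattern

print(generate_pattern(["555-1234", "999-0000"], negative=["555-12", "12345"]))
```

A sketch like this only handles examples that share one exact shape; variable-length fields such as email addresses would need a looser generalization step.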
Find & Replace
result = tester.replace(
    pattern=r"(\d{3})-(\d{4})",
    replacement=r"(\1) \2",
    text="Call 555-1234"
)
# "Call (555) 1234"
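For comparison, the same substitution can be written directly against the standard `re` module. This sketch assumes nothing about how `tester.replace` is implemented; `replace_all` is a hypothetical helper that also reports how many substitutions were made.

```python
import re

def replace_all(pattern, replacement, text):
    """Hypothetical helper: return (new_text, substitution_count) via re.subn."""
    return re.subn(pattern, replacement, text)

print(replace_all(r"(\d{3})-(\d{4})", r"(\1) \2", "Call 555-1234"))

# \g<1> keeps the group reference unambiguous when a digit follows it;
# r"\10" would instead be read as a reference to group 10.
print(replace_all(r"(\d+)", r"\g<1>0", "Item 42"))
```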
Validation
# Check if pattern is valid
is_valid, error = tester.validate(r"[invalid")
# is_valid: False
# error: "unterminated character set"
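Syntax checking of this kind is typically done by attempting to compile the pattern and catching `re.error`. The sketch below shows that approach with the standard library only; it is an assumption about how validation could work, not the tool's confirmed implementation.

```python
import re

def validate(pattern):
    """Return (is_valid, error_message) for a regex pattern string."""
    try:
        re.compile(pattern)
    except re.error as exc:
        # str(exc) includes the problem and, when known, its position
        return False, str(exc)
    return True, None

print(validate(r"[invalid"))
print(validate(r"\d{3}-\d{4}"))
```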
Common Patterns
# Get pre-built patterns
email = tester.patterns['email']
phone = tester.patterns['phone_us']
url = tester.patterns['url']
ip = tester.patterns['ipv4']
CLI Usage
# Test pattern
python regex_tester.py --pattern "\d+" --text "There are 42 items"
# Test from file
python regex_tester.py --pattern "\w+@\w+\.\w+" --file emails.txt
# Explain pattern
python regex_tester.py --explain "\b[A-Z][a-z]+\b"
# Find and replace
python regex_tester.py --pattern '(\d+)' --replace '[\1]' --text "Item 42"
# Generate pattern
python regex_tester.py --generate "555-1234,999-0000,123-4567"
# Interactive mode
python regex_tester.py --interactive
CLI Arguments
| Argument | Description | Default |
|----------|-------------|---------|
| --pattern | Regex pattern | - |
| --text | Text to test against | - |
| --file | File to test against | - |
| --explain | Explain pattern | False |
| --replace | Replacement pattern | - |
| --generate | Generate from examples | - |
| --ignore-case | Case insensitive | False |
| --multiline | Multiline mode | False |
| --interactive | Interactive mode | False |
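The arguments listed above could be declared with `argparse` roughly as follows. This is a hypothetical sketch, not the script's actual parser: it follows the usage examples in treating `--explain` and `--generate` as options that take a value, and it shows one way the boolean switches might map onto `re` flags.

```python
import argparse
import re

def build_parser():
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(prog="regex_tester.py")
    parser.add_argument("--pattern", help="Regex pattern")
    parser.add_argument("--text", help="Text to test against")
    parser.add_argument("--file", help="File to test against")
    parser.add_argument("--explain", metavar="PATTERN", help="Explain pattern")
    parser.add_argument("--replace", help="Replacement pattern")
    parser.add_argument("--generate", help="Comma-separated examples")
    parser.add_argument("--ignore-case", action="store_true", help="Case insensitive")
    parser.add_argument("--multiline", action="store_true", help="Multiline mode")
    parser.add_argument("--interactive", action="store_true", help="Interactive mode")
    return parser

def to_re_flags(args):
    """Combine the boolean switches into a single re flags value."""
    flags = 0
    if args.ignore_case:
        flags |= re.IGNORECASE
    if args.multiline:
        flags |= re.MULTILINE
    return flags

args = build_parser().parse_args(["--pattern", "hello", "--text", "Hello World", "--ignore-case"])
print(re.findall(args.pattern, args.text, to_re_flags(args)))
```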
Examples
Email Validation
tester = RegexTester()
email_pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
test_emails = [
    "user@example.com",
    "invalid.email",
    "user@domain",
    "test.user+tag@sub.domain.org"
]
for email in test_emails:
    result = tester.test(email_pattern, email)
    status = "Valid" if result['matches'] else "Invalid"
    print(f"{email}: {status}")
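When validating whole strings with plain `re`, note that `$` also matches just before a trailing newline, so an anchored search can accept input such as "user@example.com\n". The sketch below uses `re.fullmatch` instead, which requires the entire string to match; `is_valid_email` is a hypothetical helper, and the pattern is the one from the example above without its anchors.

```python
import re

EMAIL = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

def is_valid_email(candidate):
    """True only if the whole string matches the email pattern."""
    return re.fullmatch(EMAIL, candidate) is not None

for email in ["user@example.com", "invalid.email", "user@example.com\n"]:
    print(repr(email), is_valid_email(email))
```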
Extract Data from Log
tester = RegexTester()
log_line = '[2024-01-15 14:30:22] ERROR: Connection timeout (192.168.1.100)'
pattern = r'\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+): (.+) \((\d+\.\d+\.\d+\.\d+)\)'
result = tester.test(pattern, log_line)
if result['groups']:
    timestamp, level, message, ip = result['groups'][0]
    print(f"Time: {timestamp}")
    print(f"Level: {level}")
    print(f"Message: {message}")
    print(f"IP: {ip}")
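Positional unpacking breaks silently if a group is added or reordered. A variant with named groups, sketched here with the standard `re` module only, returns the fields as a dictionary; the names `LOG_RE` and `parse_log_line` are hypothetical.

```python
import re

LOG_RE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>\w+): (?P<message>.+) "
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\)"
)

def parse_log_line(line):
    """Return the named fields of a log line, or None if it does not match."""
    match = LOG_RE.search(line)
    return match.groupdict() if match else None

fields = parse_log_line("[2024-01-15 14:30:22] ERROR: Connection timeout (192.168.1.100)")
print(fields)
```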
Phone Number Formatter
tester = RegexTester()
# Normalize various phone formats
phones = ["5551234567", "(555) 123-4567", "555.123.4567"]
for phone in phones:
    # Remove all formatting characters
    clean = tester.replace(r"[\(\)\.\s-]", "", phone)
    # Format consistently
    formatted = tester.replace(
        r"(\d{3})(\d{3})(\d{4})",
        r"(\1) \2-\3",
        clean
    )
    print(f"{phone} -> {formatted}")
Understand Complex Pattern
tester = RegexTester()
# Explain a complex pattern
pattern = r"(?:https?://)?(?:www\.)?([a-zA-Z0-9-]+)\.([a-zA-Z]{2,})(?:/\S*)?"
explanation = tester.explain(pattern)
print(explanation)
# (?:https?://)? - Optional non-capturing group: 'http://' or 'https://'
# (?:www\.)? - Optional non-capturing group: 'www.'
# ([a-zA-Z0-9-]+) - Capturing group 1: domain name
# \. - Literal '.'
# ([a-zA-Z]{2,}) - Capturing group 2: TLD
# (?:/\S*)? - Optional non-capturing group: path
Common Patterns Library
| Name | Pattern | Example |
|------|---------|---------|
| email | Complex | user@domain.com |
| phone_us | \d{3}[-.]?\d{3}[-.]?\d{4} | 555-123-4567 |
| url | Complex | https://example.com |
| ipv4 | \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} | 192.168.1.1 |
| date_iso | \d{4}-\d{2}-\d{2} | 2024-01-15 |
| time_24h | \d{2}:\d{2}(:\d{2})? | 14:30:00 |
| hex_color | #[0-9A-Fa-f]{6} | #FF5733 |
| zipcode_us | \d{5}(-\d{4})? | 12345-6789 |
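Several of these patterns check shape rather than value. The ipv4 pattern, for instance, also matches 999.999.999.999 because each octet is only constrained to one to three digits. Where the value matters, a second check can follow the regex; the sketch below uses the standard library's `ipaddress` module, and `is_ipv4` is a hypothetical helper.

```python
import ipaddress
import re

IPV4_SHAPE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

def is_ipv4(candidate):
    """Shape check with the regex, then range check with ipaddress."""
    if not IPV4_SHAPE.fullmatch(candidate):
        return False
    try:
        ipaddress.IPv4Address(candidate)
    except ipaddress.AddressValueError:
        return False  # e.g. an octet above 255
    return True

print(is_ipv4("192.168.1.1"))
print(is_ipv4("999.999.999.999"))
```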
Dependencies
No external dependencies. The tool uses only the re module from the Python standard library.
Limitations
- Pattern generation is heuristic (may not be optimal)
- Some advanced regex features are Python-specific
- Explanation works best with common patterns
- Very long patterns may have simplified explanations