Letter Count Test Manifold
Overview
The Letter Count test manifold evaluates a model's ability to perform precise character-level analysis across multiple words, requiring accurate letter identification, frequency counting, and arithmetic aggregation. Models must identify target letters within word collections, count their occurrences across all words (case-insensitive), and provide accurate totals while managing cognitive load from word repetition and irrelevant vocabulary.
Task Description
Models are presented with collections of words and must count the total occurrences of specified target letters across all words in the collection. The task requires character-level attention, case-insensitive matching, and accurate arithmetic summation, while handling two kinds of distractors: words that contain no target letters, and repeated words.
Key Features:
- Character Recognition: Identifying specific letters within words regardless of case
- Frequency Analysis: Counting multiple occurrences of letters within individual words
- Cross-Word Aggregation: Summing letter counts across entire word collections
- Case Insensitivity: Treating uppercase and lowercase letters as equivalent
- Multi-Letter Targeting: Handling single or multiple target letters simultaneously
- Distractor Resistance: Processing words without target letters that increase cognitive load
- Repetition Handling: Managing repeated words that may appear in collections
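The core operation these features describe can be sketched in a few lines of Python (an illustrative reference implementation, not the manifold's actual code):

```python
def count_letters(words, targets):
    """Total occurrences of all target letters across all words, case-insensitive."""
    target_set = {t.lower() for t in targets}
    return sum(ch in target_set for word in words for ch in word.lower())

# Distractors contribute nothing, and repeated words count every time:
print(count_letters(["Apple", "tree", "CAR"], ["E"]))  # -> 3
```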
Test Case Generation
Algorithm Overview
The generator creates challenging letter counting scenarios through a systematic process:
- Letter Selection: Choose target letters from frequency-stratified pools (common, uncommon, rare)
- Word Filtering: Filter dictionary by length constraints and letter content
- Target Word Sampling: Select words containing target letters with controlled repetition
- Distractor Word Injection: Add words without target letters to increase cognitive load
- Collection Shuffling: Randomize word order to prevent positional biases
- Letter Counting: Calculate total target letter occurrences across all words
- Format Generation: Create natural language problem statements
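The seven steps above might fit together as follows. This is a sketch under assumptions: function name, prompt wording, and the omission of length filtering are all illustrative, not the actual implementation.

```python
import random

# Frequency-stratified letter pools (common / uncommon / rare)
COMMON, UNCOMMON, RARE = list("etaoinshr"), list("dlcumwfgypb"), list("vkjxqz")

def generate_case(dictionary, n_common, n_uncommon, n_rare,
                  n_targets, n_distractors, prob_repeat, rng=random):
    # 1. Letter selection from frequency-stratified pools
    letters = (rng.sample(COMMON, n_common) + rng.sample(UNCOMMON, n_uncommon)
               + rng.sample(RARE, n_rare))

    def has_target(w):
        return any(l in w.lower() for l in letters)

    # 2. Word filtering by letter content (length filtering omitted for brevity)
    candidates = [w for w in dictionary if has_target(w)]
    # 3. Target word sampling with controlled repetition
    words, repeats = [], 0
    for _ in range(n_targets):
        if words and rng.random() < prob_repeat:
            words.append(rng.choice(words))  # repeat an already-chosen word
            repeats += 1
        else:
            words.append(rng.choice(candidates))
    # 4. Distractor injection: words containing no target letters
    words += rng.sample([w for w in dictionary if not has_target(w)], n_distractors)
    # 5. Shuffle to avoid positional bias
    rng.shuffle(words)
    # 6. Count total target-letter occurrences across all words
    total = sum(w.lower().count(l) for w in words for l in letters)
    # 7. Natural-language problem statement
    prompt = f"Count the occurrences of {', '.join(letters)} in: {' '.join(words)}"
    return {"input": prompt, "letters": letters,
            "target": str(total), "words_repeated": repeats}
```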
Letter Frequency Stratification
The system organizes letters into three frequency strata based on English language usage patterns:
| Stratum | Letters | Count | Usage Pattern |
|---|---|---|---|
| Common | e, t, a, o, i, n, s, h, r | 9 | High-frequency letters appearing in most words |
| Uncommon | d, l, c, u, m, w, f, g, y, p, b | 11 | Mid-frequency letters with moderate occurrence |
| Rare | v, k, j, x, q, z | 6 | Low-frequency letters creating challenging searches |
Word Selection Strategy
Target Words: Words containing at least one target letter
- Length Filtering: Constrained by min_word_length and max_word_length parameters
- Content Filtering: Must contain at least one target letter (case-insensitive)
- Repetition Control: Controlled by prob_repeat parameter for cognitive load variation
Distractor Words: Words containing no target letters
- Length Filtering: Same length constraints as target words
- Content Filtering: Must not contain any target letters (case-insensitive)
- Cognitive Load: Increase processing difficulty without contributing to the count
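The two filters can be expressed as a pair of predicates; the function names and default lengths here are illustrative:

```python
def is_target_word(word, letters, min_len=1, max_len=20):
    """Qualifies as a target: within the length window, contains >= 1 target letter."""
    return min_len <= len(word) <= max_len and any(l in word.lower() for l in letters)

def is_distractor_word(word, letters, min_len=1, max_len=20):
    """Qualifies as a distractor: same length window, but no target letters at all."""
    return min_len <= len(word) <= max_len and not any(l in word.lower() for l in letters)

print(is_target_word("Apple", ["e"]))            # True: contains 'e' (case-insensitive)
print(is_distractor_word("dog", ["e"]))          # True: no 'e' anywhere
print(is_target_word("tree", ["e"], min_len=6))  # False: too short
```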
Repetition System
The generator supports controlled word repetition to test attention and memory:
Repetition Mechanics:
- Probability Control: prob_repeat determines likelihood of repeating previously selected words
- Selection Strategy: When repetition occurs, randomly select from already-chosen words
- Tracking: System tracks total number of repeated words for analysis
- Cognitive Challenge: Repeated words test whether models maintain accurate running counts
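These mechanics can be sketched as a small sampling routine (hypothetical function name; the real generator may differ):

```python
import random

def sample_with_repeats(candidates, n, prob_repeat, rng=random):
    """Draw n target words; with probability prob_repeat, re-use an already-chosen word."""
    chosen, repeats = [], 0
    for _ in range(n):
        if chosen and rng.random() < prob_repeat:
            chosen.append(rng.choice(chosen))  # repeat a previously selected word
            repeats += 1                       # track total repeats for the result
        else:
            chosen.append(rng.choice(candidates))
    return chosen, repeats

words, n_rep = sample_with_repeats(["dog", "bird", "deed"], 5, prob_repeat=1.0)
# With prob_repeat=1.0 only the first draw is fresh, so n_rep is always 4
```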
Configuration Parameters
Generation Schema (LetterCountGenerationParams)
from pydantic import BaseModel

class LetterCountGenerationParams(BaseModel):
    count: int              # Number of test cases to generate (> 0)
    letters_common: int     # High-frequency letters to include (≥ 0)
    letters_uncommon: int   # Mid-frequency letters to include (≥ 0)
    letters_rare: int       # Low-frequency letters to include (≥ 0)
    min_word_length: int    # Minimum word length (1-20, default: 1)
    max_word_length: int    # Maximum word length (3-20, default: 20)
    target_words: int       # Words containing target letters (≥ 1)
    confounding_words: int  # Words without target letters (≥ 0)
    prob_repeat: float      # Probability of word repetition (0.0-1.0)
Result Schema (LetterCountTestCaseResult)
from typing import List

from pydantic import BaseModel

class LetterCountTestCaseResult(BaseModel):
    input: str            # Formatted counting problem
    letters: List[str]    # Selected target letters
    target: str           # Total letter count (as string)
    words_repeated: int   # Number of repeated words
Example Test Cases
Single Letter Counting (letters_common=1, target_words=4, confounding_words=2)
Target letter: 'e'
Words: apple tree house car dog elephant
Letter Analysis:
- apple: 1 'e' ✓
- tree: 2 'e' ✓
- house: 1 'e' ✓
- car: 0 'e' (distractor)
- dog: 0 'e' (distractor)
- elephant: 2 'e' ✓
Expected Answer: 6
Multi-Letter Counting (letters_common=2, letters_rare=1, target_words=7, confounding_words=1)
Target letters: 'a', 'e', 'z'
Words: amazing zebra house dog cat pizza table frozen
Letter Analysis:
- amazing: 2 'a', 0 'e', 1 'z' = 3 ✓
- zebra: 1 'a', 1 'e', 1 'z' = 3 ✓
- house: 0 'a', 1 'e', 0 'z' = 1 ✓
- dog: 0 'a', 0 'e', 0 'z' = 0 (distractor)
- cat: 1 'a', 0 'e', 0 'z' = 1 ✓
- pizza: 1 'a', 0 'e', 2 'z' = 3 ✓
- table: 1 'a', 1 'e', 0 'z' = 2 ✓
- frozen: 0 'a', 1 'e', 1 'z' = 2 ✓
Expected Answer: 15
Word Repetition Challenge (letters_uncommon=1, target_words=3, prob_repeat=0.5)
Target letter: 'd'
Words: dog dog bird cat dog
Letter Analysis:
- dog: 1 'd' ✓ (appears 3 times)
- bird: 1 'd' ✓
- cat: 0 'd' (distractor)
Repetition Analysis: 2 words repeated (two extra "dog" instances)
Expected Answer: 4
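Each repeated instance counts independently, which a one-line check confirms:

```python
words = "dog dog bird cat dog".split()
print(sum(w.count("d") for w in words))  # -> 4 (three 'dog' instances + 'bird')
```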
Length-Constrained Counting (letters_rare=2, min_word_length=6, max_word_length=10, target_words=4)
Target letters: 'x', 'z'
Words: example frozen maximize complex simple
Length Analysis: All words 6-10 characters
Letter Analysis:
- example: 1 'x', 0 'z' = 1 ✓
- frozen: 0 'x', 1 'z' = 1 ✓
- maximize: 1 'x', 1 'z' = 2 ✓
- complex: 1 'x', 0 'z' = 1 ✓
- simple: 0 'x', 0 'z' = 0 (distractor)
Expected Answer: 5
Mixed Frequency Challenge (letters_common=1, letters_uncommon=1, letters_rare=1, target_words=6)
Target letters: 'e', 'w', 'q'
Words: queen water example quick brown fox jumped
Frequency Analysis:
- Common: 'e' (high frequency)
- Uncommon: 'w' (mid frequency)
- Rare: 'q' (low frequency)
Letter Analysis:
- queen: 2 'e', 0 'w', 1 'q' = 3 ✓
- water: 1 'e', 1 'w', 0 'q' = 2 ✓
- example: 2 'e', 0 'w', 0 'q' = 2 ✓
- quick: 0 'e', 0 'w', 1 'q' = 1 ✓
- brown: 0 'e', 1 'w', 0 'q' = 1 ✓
- fox: 0 'e', 0 'w', 0 'q' = 0 (distractor)
- jumped: 1 'e', 0 'w', 0 'q' = 1 ✓
Expected Answer: 10
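The same tallying generalizes to multiple target letters; verifying the expected answer:

```python
words = "queen water example quick brown fox jumped".split()
total = sum(w.count(l) for l in ("e", "w", "q") for w in words)
print(total)  # -> 10
```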
Distractor System
Primary Distractors: Non-Target Words
Words that contain no target letters but increase cognitive processing load:
- Length Matching: Same length constraints as target words
- Semantic Plausibility: Common vocabulary that fits naturally in word collections
- Cognitive Load: Force models to process irrelevant words while maintaining focus
Secondary Distractors: Repeated Words
Word repetitions that test attention and counting accuracy:
- Memory Challenge: Models must track whether repeated words contribute additional counts
- Attention Test: Repeated words may cause models to lose focus or miscount
- Arithmetic Complexity: Each repetition adds to the total count proportionally
Strategic Distribution
Words are randomly shuffled to prevent positional biases and ensure distractors appear throughout the collection, maintaining cognitive load across the entire problem.
Cognitive Skills Tested
- Character Recognition: Accurate identification of specific letters within words
- Case Insensitivity: Treating uppercase and lowercase letters equivalently
- Frequency Analysis: Counting multiple occurrences within individual words
- Cross-Word Aggregation: Maintaining running totals across word collections
- Selective Attention: Focusing on target letters while ignoring other characters
- Working Memory: Tracking counts while processing sequential words
- Arithmetic Accuracy: Precise summation of letter frequencies
- Distractor Resistance: Maintaining focus despite irrelevant words
- Pattern Recognition: Identifying target letters across varied word contexts
- Repetition Handling: Accurately processing repeated words without confusion
Applications
This test manifold evaluates capabilities essential for:
- Text Analysis: Character-level analysis and frequency counting in documents
- Data Processing: Extracting specific patterns from textual datasets
- Quality Assurance: Detecting specific characters or patterns in text validation
- Linguistic Analysis: Studying letter frequency patterns in language samples
- Information Extraction: Finding specific textual elements within larger collections
- Attention to Detail: Precise character-level processing under cognitive load
- Pattern Matching: Identifying recurring elements across text collections