Letter Count Test Manifold
Overview
The Letter Count test manifold evaluates a model's ability to perform precise character-level analysis across multiple words, requiring accurate letter identification, frequency counting, and arithmetic aggregation. Models must identify target letters within word collections, count their occurrences across all words (case-insensitive), and provide accurate totals while managing cognitive load from word repetition and irrelevant vocabulary.
Task Description
Models are presented with collections of words and must count the total occurrences of specified target letters across all words in the collection. The task requires character-level attention, case-insensitive matching, and accurate arithmetic summation, while handling two kinds of distractors: words that contain no target letters, and repeated words.
Key Features:
- Character Recognition: Identifying specific letters within words regardless of case
- Frequency Analysis: Counting multiple occurrences of letters within individual words
- Cross-Word Aggregation: Summing letter counts across entire word collections
- Case Insensitivity: Treating uppercase and lowercase letters as equivalent
- Multi-Letter Targeting: Handling single or multiple target letters simultaneously
- Distractor Resistance: Processing words without target letters that increase cognitive load
- Repetition Handling: Managing repeated words that may appear in collections
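The core operation these features describe can be sketched in a few lines of Python (an illustrative reference implementation, not the manifold's actual code):

```python
def count_letters(words, targets):
    """Total occurrences of all target letters across all words, case-insensitive."""
    target_set = {t.lower() for t in targets}
    return sum(ch in target_set for word in words for ch in word.lower())

# Distractors contribute nothing, and repeated words count every time:
print(count_letters(["Apple", "tree", "CAR"], ["E"]))  # -> 3
```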
Test Case Generation
Algorithm Overview
The generator creates challenging letter counting scenarios through a systematic process:
- Letter Selection: Choose target letters from frequency-stratified pools (common, uncommon, rare)
- Word Filtering: Filter dictionary by length constraints and letter content
- Target Word Sampling: Select words containing target letters with controlled repetition
- Distractor Word Injection: Add words without target letters to increase cognitive load
- Collection Shuffling: Randomize word order to prevent positional biases
- Letter Counting: Calculate total target letter occurrences across all words
- Format Generation: Create natural language problem statements
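The seven steps above might fit together as follows. This is a sketch under assumptions: function name, prompt wording, and the omission of length filtering are all illustrative, not the actual implementation.

```python
import random

# Frequency-stratified letter pools (common / uncommon / rare)
COMMON, UNCOMMON, RARE = list("etaoinshr"), list("dlcumwfgypb"), list("vkjxqz")

def generate_case(dictionary, n_common, n_uncommon, n_rare,
                  n_targets, n_distractors, prob_repeat, rng=random):
    # 1. Letter selection from frequency-stratified pools
    letters = (rng.sample(COMMON, n_common) + rng.sample(UNCOMMON, n_uncommon)
               + rng.sample(RARE, n_rare))

    def has_target(w):
        return any(l in w.lower() for l in letters)

    # 2. Word filtering by letter content (length filtering omitted for brevity)
    candidates = [w for w in dictionary if has_target(w)]
    # 3. Target word sampling with controlled repetition
    words, repeats = [], 0
    for _ in range(n_targets):
        if words and rng.random() < prob_repeat:
            words.append(rng.choice(words))  # repeat an already-chosen word
            repeats += 1
        else:
            words.append(rng.choice(candidates))
    # 4. Distractor injection: words containing no target letters
    words += rng.sample([w for w in dictionary if not has_target(w)], n_distractors)
    # 5. Shuffle to avoid positional bias
    rng.shuffle(words)
    # 6. Count total target-letter occurrences across all words
    total = sum(w.lower().count(l) for w in words for l in letters)
    # 7. Natural-language problem statement
    prompt = f"Count the occurrences of {', '.join(letters)} in: {' '.join(words)}"
    return {"input": prompt, "letters": letters,
            "target": str(total), "words_repeated": repeats}
```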
Letter Frequency Stratification
The system organizes letters into three frequency strata based on English language usage patterns:
| Stratum | Letters | Count | Usage Pattern |
|---|---|---|---|
| Common | e, t, a, o, i, n, s, h, r | 9 | High-frequency letters appearing in most words |
| Uncommon | d, l, c, u, m, w, f, g, y, p, b | 11 | Mid-frequency letters with moderate occurrence |
| Rare | v, k, j, x, q, z | 6 | Low-frequency letters creating challenging searches |
Word Selection Strategy
Target Words: Words containing at least one target letter
- Length Filtering: Constrained by min_word_length and max_word_length parameters
- Content Filtering: Must contain at least one target letter (case-insensitive)
- Repetition Control: Controlled by prob_repeat parameter for cognitive load variation
Distractor Words: Words containing no target letters
- Length Filtering: Same length constraints as target words
- Content Filtering: Must not contain any target letters (case-insensitive)
- Cognitive Load: Increase processing difficulty without contributing to the count
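The two filters can be expressed as a pair of predicates; the function names and default lengths here are illustrative:

```python
def is_target_word(word, letters, min_len=1, max_len=20):
    """Qualifies as a target: within the length window, contains >= 1 target letter."""
    return min_len <= len(word) <= max_len and any(l in word.lower() for l in letters)

def is_distractor_word(word, letters, min_len=1, max_len=20):
    """Qualifies as a distractor: same length window, but no target letters at all."""
    return min_len <= len(word) <= max_len and not any(l in word.lower() for l in letters)

print(is_target_word("Apple", ["e"]))            # True: contains 'e' (case-insensitive)
print(is_distractor_word("dog", ["e"]))          # True: no 'e' anywhere
print(is_target_word("tree", ["e"], min_len=6))  # False: too short
```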
Repetition System
The generator supports controlled word repetition to test attention and memory:
Repetition Mechanics:
- Probability Control: prob_repeat determines likelihood of repeating previously selected words
- Selection Strategy: When repetition occurs, randomly select from already-chosen words
- Tracking: System tracks total number of repeated words for analysis
- Cognitive Challenge: Repeated words test whether models maintain accurate running counts
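These mechanics can be sketched as a small sampling routine (hypothetical function name; the real generator may differ):

```python
import random

def sample_with_repeats(candidates, n, prob_repeat, rng=random):
    """Draw n target words; with probability prob_repeat, re-use an already-chosen word."""
    chosen, repeats = [], 0
    for _ in range(n):
        if chosen and rng.random() < prob_repeat:
            chosen.append(rng.choice(chosen))  # repeat a previously selected word
            repeats += 1                       # track total repeats for the result
        else:
            chosen.append(rng.choice(candidates))
    return chosen, repeats

words, n_rep = sample_with_repeats(["dog", "bird", "deed"], 5, prob_repeat=1.0)
# With prob_repeat=1.0 only the first draw is fresh, so n_rep is always 4
```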
Configuration Parameters
Generation Schema (LetterCountGenerationParams)
from pydantic import BaseModel

class LetterCountGenerationParams(BaseModel):
    count: int              # Number of test cases to generate (> 0)
    letters_common: int     # High-frequency letters to include (≥ 0)
    letters_uncommon: int   # Mid-frequency letters to include (≥ 0)
    letters_rare: int       # Low-frequency letters to include (≥ 0)
    min_word_length: int    # Minimum word length (1-20, default: 1)
    max_word_length: int    # Maximum word length (3-20, default: 20)
    target_words: int       # Words containing target letters (≥ 1)
    confounding_words: int  # Words without target letters (≥ 0)
    prob_repeat: float      # Probability of word repetition (0.0-1.0)
Result Schema (LetterCountTestCaseResult)
from typing import List

from pydantic import BaseModel

class LetterCountTestCaseResult(BaseModel):
    input: str            # Formatted counting problem
    letters: List[str]    # Selected target letters
    target: str           # Total letter count (as string)
    words_repeated: int   # Number of repeated words
Example Test Cases
Single Letter Counting (letters_common=1, target_words=4, confounding_words=2)
Target letter: 'e'
Words: apple tree house car dog elephant
Letter Analysis:
- apple: 1 'e' ✓
- tree: 2 'e' ✓
- house: 1 'e' ✓
- car: 0 'e' (distractor)
- dog: 0 'e' (distractor)
- elephant: 2 'e' ✓
Expected Answer: 6
Multi-Letter Counting (letters_common=2, letters_rare=1, target_words=7, confounding_words=1)
Target letters: 'a', 'e', 'z'
Words: amazing zebra house dog cat pizza table frozen
Letter Analysis:
- amazing: 2 'a', 0 'e', 1 'z' = 3 ✓
- zebra: 1 'a', 1 'e', 1 'z' = 3 ✓
- house: 0 'a', 1 'e', 0 'z' = 1 ✓
- dog: 0 'a', 0 'e', 0 'z' = 0 (distractor)
- cat: 1 'a', 0 'e', 0 'z' = 1 ✓
- pizza: 1 'a', 0 'e', 2 'z' = 3 ✓
- table: 1 'a', 1 'e', 0 'z' = 2 ✓
- frozen: 0 'a', 1 'e', 1 'z' = 2 ✓
Expected Answer: 15
Word Repetition Challenge (letters_uncommon=1, target_words=3, prob_repeat=0.5)
Target letter: 'd'
Words: dog dog bird cat dog
Letter Analysis:
- dog: 1 'd' ✓ (appears 3 times)
- bird: 1 'd' ✓
- cat: 0 'd' (distractor)
Repetition Analysis: 2 words repeated (two extra "dog" instances)
Expected Answer: 4
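Each repeated instance counts independently, which a one-line check confirms:

```python
words = "dog dog bird cat dog".split()
print(sum(w.count("d") for w in words))  # -> 4 (three 'dog' instances + 'bird')
```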
Length-Constrained Counting (letters_rare=2, min_word_length=6, max_word_length=10, target_words=4)
Target letters: 'x', 'z'
Words: example frozen maximize complex simple
Length Analysis: All words 6-10 characters
Letter Analysis:
- example: 1 'x', 0 'z' = 1 ✓
- frozen: 0 'x', 1 'z' = 1 ✓
- maximize: 1 'x', 1 'z' = 2 ✓
- complex: 1 'x', 0 'z' = 1 ✓
- simple: 0 'x', 0 'z' = 0 (distractor)
Expected Answer: 5
Mixed Frequency Challenge (letters_common=1, letters_uncommon=1, letters_rare=1, target_words=6)
Target letters: 'e', 'w', 'q'
Words: queen water example quick brown fox jumped
Frequency Analysis:
- Common: 'e' (high frequency)
- Uncommon: 'w' (mid frequency)
- Rare: 'q' (low frequency)
Letter Analysis:
- queen: 2 'e', 0 'w', 1 'q' = 3 ✓
- water: 1 'e', 1 'w', 0 'q' = 2 ✓
- example: 2 'e', 0 'w', 0 'q' = 2 ✓
- quick: 0 'e', 0 'w', 1 'q' = 1 ✓
- brown: 0 'e', 1 'w', 0 'q' = 1 ✓
- fox: 0 'e', 0 'w', 0 'q' = 0 (distractor)
- jumped: 1 'e', 0 'w', 0 'q' = 1 ✓
Expected Answer: 10
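The same tallying generalizes to multiple target letters; verifying the expected answer:

```python
words = "queen water example quick brown fox jumped".split()
total = sum(w.count(l) for l in ("e", "w", "q") for w in words)
print(total)  # -> 10
```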
Distractor System
Primary Distractors: Non-Target Words
Words that contain no target letters but increase cognitive processing load:
- Length Matching: Same length constraints as target words
- Semantic Plausibility: Common vocabulary that fits naturally in word collections
- Cognitive Load: Force models to process irrelevant words while maintaining focus
Secondary Distractors: Repeated Words
Word repetitions that test attention and counting accuracy:
- Memory Challenge: Models must track whether repeated words contribute additional counts
- Attention Test: Repeated words may cause models to lose focus or miscount
- Arithmetic Complexity: Each repetition adds to the total count proportionally
Strategic Distribution
Words are randomly shuffled to prevent positional biases and ensure distractors appear throughout the collection, maintaining cognitive load across the entire problem.
Cognitive Skills Tested
- Character Recognition: Accurate identification of specific letters within words
- Case Insensitivity: Treating uppercase and lowercase letters equivalently
- Frequency Analysis: Counting multiple occurrences within individual words
- Cross-Word Aggregation: Maintaining running totals across word collections
- Selective Attention: Focusing on target letters while ignoring other characters
- Working Memory: Tracking counts while processing sequential words
- Arithmetic Accuracy: Precise summation of letter frequencies
- Distractor Resistance: Maintaining focus despite irrelevant words
- Pattern Recognition: Identifying target letters across varied word contexts
- Repetition Handling: Accurately processing repeated words without confusion
Applications
This test manifold evaluates capabilities essential for:
- Text Analysis: Character-level analysis and frequency counting in documents
- Data Processing: Extracting specific patterns from textual datasets
- Quality Assurance: Detecting specific characters or patterns in text validation
- Linguistic Analysis: Studying letter frequency patterns in language samples
- Information Extraction: Finding specific textual elements within larger collections
- Attention to Detail: Precise character-level processing under cognitive load
- Pattern Matching: Identifying recurring elements across text collections