Date Understanding Test Manifold

Overview

The Date Understanding test manifold evaluates a model's ability to comprehend temporal information, perform date arithmetic, and reason about calendar relationships across diverse narrative contexts. This task tests fundamental temporal reasoning capabilities by embedding date calculations within realistic scenarios involving personal events, scheduling, anniversaries, and time-sensitive activities.

Task Description

Models are presented with narrative scenarios containing temporal information and must answer questions about specific dates relative to the implied "today" in the story. The task requires extracting temporal context from natural language, understanding relative date references, and performing accurate calendar arithmetic.

Key Features:

  • Contextual Date Extraction: Determining the implicit "today" from narrative clues
  • Temporal Arithmetic: Calculating dates with offsets (days, weeks, months, years)
  • Calendar Awareness: Understanding month lengths, leap years, and year transitions
  • Multiple Date Formats: Processing various date representations and output formats
  • Realistic Scenarios: Embedding calculations in believable human situations

Test Case Generation

Algorithm Overview

The generator uses a three-stage pipeline to create comprehensive date reasoning test cases:

  1. Scenario Generation: Create realistic narrative contexts with embedded temporal information
  2. Question Assignment: Add date calculation questions with tiered difficulty levels
  3. Format Rendering: Apply diverse date formatting options for input and standardized output
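
The three stages above can be sketched as a small pipeline. This is a minimal illustration, not the generator's actual code: the function names, the dictionary layout, and the single `tomorrow_reference` scenario are assumptions made for the example, and Tier 2 questions are left out because month and year offsets need extra calendar handling.

```python
import random
from datetime import date, timedelta

# Hypothetical question table: tier -> (question text, offset from "today").
QUESTIONS = {
    0: ("What is the date today in MM/DD/YYYY?", timedelta(days=0)),
    1: ("What is the date tomorrow in MM/DD/YYYY?", timedelta(days=1)),
    3: ("What is the date one week ago from today in MM/DD/YYYY?", timedelta(weeks=-1)),
}

def generate_scenario(rng):
    """Stage 1: choose a random "today" and describe it indirectly."""
    today = date(rng.randint(1900, 2030), 1, 1) + timedelta(days=rng.randint(0, 364))
    return {
        "scenario": "tomorrow_reference",
        "template": "Tomorrow is {date}.",
        "mentioned": today + timedelta(days=1),  # the date that appears in the text
        "today": today,                          # the hidden ground truth
    }

def assign_question(case, tier):
    """Stage 2: attach a question and compute its answer from the hidden "today"."""
    question, offset = QUESTIONS[tier]
    return {**case, "question": question, "tier": tier, "answer": case["today"] + offset}

def render(case):
    """Stage 3: format the mentioned date and the answer as MM/DD/YYYY strings."""
    story = case["template"].format(date=case["mentioned"].strftime("%m/%d/%Y"))
    return {
        "input": f"{story} {case['question']}",
        "target": case["answer"].strftime("%m/%d/%Y"),
        "scenario": case["scenario"],
        "question": case["question"],
        "tier": case["tier"],
    }

# A fixed case, so the result is reproducible.
fixed = {
    "scenario": "tomorrow_reference",
    "template": "Tomorrow is {date}.",
    "mentioned": date(2019, 11, 12),
    "today": date(2019, 11, 11),
}
print(render(assign_question(fixed, 3))["target"])  # 11/04/2019
```

Keeping the hidden "today" separate from the date mentioned in the text is what lets one scenario be paired with any question tier.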

Scenario Categories

The system generates 38 scenario types, listed below, covering various temporal reasoning patterns:

Scenario Key | Scenario Name | Example
simple_date_statement | Simple Date Statement | "It is 4/19/1969 today."
countdown_to_new_year | New Year Countdown | "2015 is coming in 36 hours."
uk_date_format | UK Date Format | "Today is 15th March, 2021."
thanksgiving | Thanksgiving Reference | "In the US, Thanksgiving is on the fourth Thursday of November. Today is the US Thanksgiving of 2001."
relative_past_reference | Relative Past Reference | "It was Sept. 1st, 2021 a week ago."
anniversary | Wedding Anniversary | "Jane and John married on Jan 2, 1958. It is their 5-year anniversary today."
golden_anniversary | Golden Anniversary | "Jane and John married on Jan 2, 1958. Today is their golden wedding anniversary."
flight_booking | Flight Booking | "Jane booked a flight for tomorrow, Jul 29, 2002."
work_anniversary | Work Anniversary | "Jane got her job in 2016. Today is her 3-year work anniversary."
last_day_of_month | Last Day of Month | "Jane is celebrating the last day of January 2012."
days_since_event | Days Since Event | "Jane quit her job on Mar 20, 2020. 176 days have passed since then."
appointment_scheduling | Appointment Scheduling | "Jane scheduled 3 appointments with 5 people for tomorrow (Tue, 7/9/1972)."
disagreement_correct_first | Date Disagreement (First Correct) | "Jane thinks today is 6/18/2019, but John thinks today is 6/19/2019. Jane is correct."
disagreement_correct_second | Date Disagreement (Second Correct) | "Jane thinks today is 6/18/2019, but John thinks today is 6/19/2019. John is correct."
date_correction | Date Correction | "Jane thought today is 3/11/2002, but today is in fact Mar 12, which is 1 day later."
monthly_visits | Monthly Visits | "Jane visits the bookstore on the 16th of each month starting from the October of 2009. It is her 5th visit to the bookstore today."
birthday_leap_year | Leap Year Birthday | "Jane was born on the last day of February in 2000. Today is her 16-year-old birthday."
nostalgic_reference | Nostalgic Reference | "May 6, 1992 is like yesterday to Jane, but that is actually ten years ago."
consumable_countdown | Consumable Countdown | "On May 9th, 2017 Jane bought 40 eggs. She ate one per day. Today she ran out of eggs."
event_delay | Event Delay | "The concert was scheduled to be on 06/01/1943, but was delayed by one day to today."
specific_time | Specific Time Context | "The current local time is 3:02 pm of 5/4/2004."
day_before_yesterday | Day Before Yesterday | "The day before yesterday was 11/23/1933."
deadline_countdown | Deadline Countdown | "The deadline is Jun 1, 2021, which is 2 days away from now."
first_monday_of_year | First Monday of Year | "The first day of 2019 is a Tuesday, and today is the first Monday of 2019."
last_day_of_year | Last Day of Year | "This is the last day of 1899."
meteor_shower | Meteor Shower | "Today is 3/5, and it is Jane's second time in the year 1973 to see a meteor shower."
nfl_watching | Sports Watching | "Today is 9/7. Jane is watching NFL 2003."
appointment_future | Future Appointment | "Today is Apr 10, 1985. Jane's appointment will be 3 days later."
christmas_eve | Christmas Eve | "Today is Christmas Eve of 1937."
ordinal_day_of_year | Ordinal Day of Year | "Today is the first day of 2007."
last_day_of_quarter | Last Day of Quarter | "Today is the last day of the first quarter of 2008."
palindrome_date | Palindrome Date | "Today is the palindrome day of 2021, because the MMDDYYYY format of the date is the same backwards as forwards."
ordinal_date | Ordinal Date | "Today is the second day of the third month of 1966."
meeting_reschedule | Meeting Reschedule | "Today's meeting is rescheduled to 11 am tomorrow, 10/16/1924."
tomorrow_reference | Tomorrow Reference | "Tomorrow is 11/12/2019."
yesterday_activities | Yesterday Activities | "Yesterday, Jan 21, 2011, Jane ate 2 pizzas and 5 wings."
year_end_transition | Year End Transition | "Yesterday was 12/31/1929. Today could not be 12/32/1929 because December has only 31 days."
month_end_transition | Month End Transition | "Yesterday was April 30, 2021."
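
Several scenarios in the table define "today" by a calendar rule instead of stating it. The sketch below shows how a reference solver might resolve four of them (thanksgiving, first_monday_of_year, last_day_of_quarter, palindrome_date). The helper names are hypothetical and are not taken from the generator.

```python
import calendar
from datetime import date, timedelta

def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of a weekday (Mon=0 ... Sun=6) in a month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first such weekday
    return first + timedelta(days=offset + 7 * (n - 1))

def last_day_of_quarter(year, quarter):
    """Return the final day of quarter 1-4 of the given year."""
    month = 3 * quarter
    return date(year, month, calendar.monthrange(year, month)[1])

def palindrome_day(year):
    """Return the date in `year` whose MMDDYYYY string is a palindrome, or None."""
    day = date(year, 1, 1)
    while day.year == year:
        text = day.strftime("%m%d%Y")
        if text == text[::-1]:
            return day
        day += timedelta(days=1)
    return None

print(nth_weekday(2001, 11, 3, 4))   # fourth Thursday of November 2001: 2001-11-22
print(nth_weekday(2019, 1, 0, 1))    # first Monday of 2019: 2019-01-07
print(last_day_of_quarter(2008, 1))  # 2008-03-31
print(palindrome_day(2021))          # 2021-12-02
```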

Question Difficulty Tiers

The system assigns questions across four difficulty tiers:

Tier 0 (Basic): Direct date identification

  • "What is the date today in MM/DD/YYYY?"
  • "What is today's date in MM/DD/YYYY?"

Tier 1 (Simple Arithmetic): Adjacent day calculations

  • "What is the date tomorrow in MM/DD/YYYY?"
  • "What is the date yesterday in MM/DD/YYYY?"
  • "What is the date 24 hours later in MM/DD/YYYY?"

Tier 2 (Month/Year Arithmetic): Complex calendar calculations

  • "What is the date a month ago in MM/DD/YYYY?"
  • "What is the date one year from today in MM/DD/YYYY?"

Tier 3 (Week Arithmetic): Multi-day calculations

  • "What is the date one week ago from today in MM/DD/YYYY?"
  • "What is the date two weeks from today in MM/DD/YYYY?"
  • "What is the date 3 days ago in MM/DD/YYYY?"
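
Each tier corresponds to a different kind of offset from "today". The sketch below is one way to compute them with the standard library. Day and week offsets map directly onto `timedelta`, while month and year offsets need a policy for days that do not exist in the target month. Clamping to the last day of the month is an assumption here, and `shift_months` is a hypothetical helper.

```python
import calendar
from datetime import date, timedelta

def shift_months(d, months):
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months  # months counted from year 0
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))

def fmt(d):
    """Format a date as MM/DD/YYYY."""
    return d.strftime("%m/%d/%Y")

today = date(2021, 3, 31)
print(fmt(today))                        # Tier 0, today:          03/31/2021
print(fmt(today + timedelta(days=1)))    # Tier 1, tomorrow:       04/01/2021
print(fmt(shift_months(today, -1)))      # Tier 2, a month ago:    02/28/2021 (clamped)
print(fmt(shift_months(today, 12)))      # Tier 2, one year later: 03/31/2022
print(fmt(today - timedelta(weeks=1)))   # Tier 3, one week ago:   03/24/2021
```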

Date Format Variations

Input Format Options (DateFormat Enum)

MM_DD_YYYY (0): Standard American format

"Jane got married on 06/15/1995."

NATURAL_LANGUAGE (1): Month name format

"Jane got married on Jun 15, 1995."

ORDINAL_DAY_OF_YEAR (2): Day-of-year numbering

"Jane got married on the 166th day of 1995."

OFFSET_MM_DD_YY (3): Relative reference format

"Jane got married on the day after 06/14/95."

Standard Output Format

All answers are provided in MM/DD/YYYY format regardless of input format:

Target: "06/15/1995"
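
The four input formats and the fixed output format can be illustrated with a small renderer. The enum member names and values follow the list above. The choice of `IntEnum`, the `render_date` function, and the month abbreviations are assumptions for this sketch, not the generator's own code.

```python
from datetime import date, timedelta
from enum import IntEnum

class DateFormat(IntEnum):
    MM_DD_YYYY = 0
    NATURAL_LANGUAGE = 1
    ORDINAL_DAY_OF_YEAR = 2
    OFFSET_MM_DD_YY = 3

# Explicit abbreviations avoid the locale dependence of strftime("%b").
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 22nd, 113th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def render_date(d, fmt):
    """Render a date for the input text in one of the four formats."""
    fmt = DateFormat(fmt)  # accept either an enum member or its integer value
    if fmt is DateFormat.MM_DD_YYYY:
        return d.strftime("%m/%d/%Y")
    if fmt is DateFormat.NATURAL_LANGUAGE:
        return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"
    if fmt is DateFormat.ORDINAL_DAY_OF_YEAR:
        return f"the {ordinal(d.timetuple().tm_yday)} day of {d.year}"
    previous = d - timedelta(days=1)  # OFFSET_MM_DD_YY: name the day before
    return f"the day after {previous.strftime('%m/%d/%y')}"

wedding = date(1995, 6, 15)
for fmt in DateFormat:
    print(f"Jane got married on {render_date(wedding, fmt)}.")
print(wedding.strftime("%m/%d/%Y"))  # the target is always MM/DD/YYYY: 06/15/1995
```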

Example Test Cases

Tier 0: Basic Date Identification

Input: "Jane and John married on Jan 2, 1958. It is their 5-year anniversary today. What is the date today in MM/DD/YYYY?"
Target: "01/02/1963"

Analysis: Requires understanding that a 5-year anniversary occurs exactly 5 years after the wedding date.

Tier 1: Simple Temporal Arithmetic

Input: "The deadline is Jun 1, 2021, which is 2 days away from now. What is the date tomorrow in MM/DD/YYYY?"
Target: "05/31/2021"

Analysis: Today is May 30, 2021 (2 days before June 1), so tomorrow is May 31, 2021.

Tier 2: Complex Calendar Logic

Input: "Jane was born on the last day of February in 2000. Today is her 16-year-old birthday. What is the date one year from today in MM/DD/YYYY?"
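Target: "02/28/2017"

Analysis: Jane was born on February 29, 2000 (a leap year), so her 16th birthday is February 29, 2016 (also a leap year). One year later there is no February 29, because 2017 is not a leap year. The target shown assumes the day is clamped to the last valid day of the month; a generator that rolls over to the next day would instead expect 03/01/2017.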
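
Because February 29, 2017 is not a valid calendar date, adding one year to February 29, 2016 needs a convention: clamp to February 28 or roll over to March 1. Both are sketched below with a hypothetical helper `add_years`. Which convention the generator uses is not stated here, so treat this as an illustration only.

```python
from datetime import date

def add_years(d, years, rollover=False):
    """Add whole years to a date, handling Feb 29 in non-leap target years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only Feb 29 can fail: the target year has no such day.
        if rollover:
            return date(d.year + years, 3, 1)
        return date(d.year + years, 2, 28)

birthday = date(2016, 2, 29)  # Jane's 16th birthday
print(add_years(birthday, 1).strftime("%m/%d/%Y"))                 # 02/28/2017
print(add_years(birthday, 1, rollover=True).strftime("%m/%d/%Y"))  # 03/01/2017
print(add_years(birthday, 4).strftime("%m/%d/%Y"))                 # 02/29/2020
```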

Tier 3: Multi-Step Week Calculations

Input: "On May 9th, 2017 Jane bought 40 eggs. She ate one per day. Today she ran out of eggs. What is the date two weeks ago from today in MM/DD/YYYY?"
Target: "06/04/2017"

Analysis: Today is May 9 + 40 days = June 18, 2017. Two weeks ago is June 4, 2017.
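
The arithmetic in the Tier 0, Tier 1, and Tier 3 examples can be checked with the standard library. This is a short verification sketch, and the variable names are illustrative.

```python
from datetime import date, timedelta

def fmt(d):
    """Format a date as MM/DD/YYYY."""
    return d.strftime("%m/%d/%Y")

# Tier 0: the 5-year anniversary of a Jan 2, 1958 wedding.
anniversary = date(1958, 1, 2).replace(year=1958 + 5)
print(fmt(anniversary))  # 01/02/1963

# Tier 1: the deadline is 2 days away, and the question asks for tomorrow.
today = date(2021, 6, 1) - timedelta(days=2)
print(fmt(today + timedelta(days=1)))  # 05/31/2021

# Tier 3: 40 eggs at one per day, then two weeks back from the day they ran out.
ran_out = date(2017, 5, 9) + timedelta(days=40)
print(fmt(ran_out))                       # 06/18/2017
print(fmt(ran_out - timedelta(weeks=2)))  # 06/04/2017
```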

Configuration Parameters

Generation Schema (DateGenerationParams)

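from typing import Optional
from pydantic import BaseModel

# DateFormat is the enum described under "Input Format Options"
class DateGenerationParams(BaseModel):
    count: int                           # Number of test cases to generate (> 0)
    tier: Optional[int]                  # Question difficulty tier (0-3, None for random)
    date_format: DateFormat = DateFormat.MM_DD_YYYY  # Format for dates in input text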

Standard Grid Configuration:

  • tier: [0, 1, 2, 3, None] (question difficulty levels; None selects a random tier)
  • date_format: [0, 1, 2, 3] (input format variations)
  • Generates 20 different combinations (5 tier settings × 4 formats)
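
Enumerating the listed tier and date_format values with `itertools.product` gives one parameter set per pair. The `build_grid` helper and the `count` value of 10 are hypothetical and used only for illustration.

```python
from itertools import product

TIERS = [0, 1, 2, 3, None]    # None means "pick a random tier"
DATE_FORMATS = [0, 1, 2, 3]   # DateFormat enum values

def build_grid(count):
    """Return one parameter dictionary per (tier, date_format) pair."""
    return [
        {"count": count, "tier": tier, "date_format": date_format}
        for tier, date_format in product(TIERS, DATE_FORMATS)
    ]

grid = build_grid(count=10)
print(len(grid))  # 20 (5 tier settings x 4 formats)
print(grid[0])    # {'count': 10, 'tier': 0, 'date_format': 0}
```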

Result Schema (DateTestCaseResult)

class DateTestCaseResult(BaseModel):
    input: str                          # The formatted problem text
    target: str                         # The correct answer date in MM/DD/YYYY format
    scenario: str                       # The scenario type used to generate this test case
    question: str                       # The question asked
    tier: int                           # The difficulty tier of the question
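
A result record can be sanity-checked before use. The sketch below works on plain dictionaries with the same five fields. The `check_case` helper is hypothetical, and the rule that the question text appears inside the input is an assumption based on the example test cases.

```python
import re
from datetime import datetime

REQUIRED_FIELDS = ("input", "target", "scenario", "question", "tier")
TARGET_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")

def is_valid_target(target):
    """True if target is zero-padded MM/DD/YYYY and names a real calendar date."""
    if not TARGET_PATTERN.fullmatch(target):
        return False
    try:
        datetime.strptime(target, "%m/%d/%Y")
    except ValueError:  # e.g. a February 29 in a non-leap year
        return False
    return True

def check_case(case):
    """Return a list of problems found in one result record (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in case]
    if problems:
        return problems
    if case["tier"] not in (0, 1, 2, 3):
        problems.append("tier out of range")
    if not is_valid_target(case["target"]):
        problems.append("target is not a valid MM/DD/YYYY date")
    if case["question"] not in case["input"]:
        problems.append("question text not found in input")
    return problems

example = {
    "input": "The deadline is Jun 1, 2021, which is 2 days away from now. "
             "What is the date tomorrow in MM/DD/YYYY?",
    "target": "05/31/2021",
    "scenario": "deadline_countdown",
    "question": "What is the date tomorrow in MM/DD/YYYY?",
    "tier": 1,
}
print(check_case(example))            # []
print(is_valid_target("02/29/2017"))  # False
```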

Scenario Diversity

The generator includes 38 distinct scenario types with randomized parameters:

  • Character Names: 26 common names randomly assigned
  • Date Ranges: Flexible year ranges (typically 1900-2030)
  • Event Types: Varied activities (jobs, purchases, appointments, celebrations)
  • Numerical Variations: Random quantities, durations, and offsets

Cognitive Skills Tested

Core Competencies

  • Temporal Context Extraction: Identifying the implicit "today" from narrative scenarios
  • Calendar Arithmetic: Performing accurate date calculations across month/year boundaries
  • Leap Year Logic: Handling February 29th in leap years and transitions to non-leap years
  • Format Recognition: Processing dates in multiple representation formats
  • Relative Time Understanding: Converting natural language time references to specific dates

Advanced Reasoning

  • Multi-Step Inference: Combining multiple temporal clues to determine dates
  • Contradiction Resolution: Choosing correct information when presented with conflicting dates
  • Pattern Recognition: Understanding recurring events and anniversary cycles
  • Edge Case Handling: Managing calendar transitions, invalid dates, and special scenarios

Applications

This test manifold evaluates capabilities essential for:

  • Calendar Applications: Scheduling, reminder systems, and event planning
  • Natural Language Processing: Temporal information extraction from text
  • Historical Analysis: Processing documents with embedded temporal references
  • Business Logic: Deadline tracking, anniversary calculations, and time-sensitive workflows
  • Personal Assistants: Understanding and responding to time-based queries
  • Document Understanding: Extracting and reasoning about dates in various formats

The scenario coverage and tiered difficulty system let this manifold measure both basic temporal arithmetic and harder calendar reasoning, such as leap-year and month-boundary cases.