Project: PathFindN

A mathematical expression game that doubles as a dataset generator. Born from AI training and reviewing work, PathFindN asks players to find valid mathematical "paths" to a target integer N, with mandatory reasoning for each submission. Every accepted path and its reasoning are stored for potential use in anonymized research datasets.


PathValidator: The Core System

The entire game hinges on two questions: is this expression safe to evaluate, and does it actually equal N? The PathValidator answers both through a multi-stage pipeline that combines Python AST analysis with SymPy computer algebra verification.

PathValidator pipeline: expression input through normalize, AST parse, whitelist check, constraints, safe_eval, SymPy CAS verify, canonical hash, and storage.

Validation Pipeline

  1. Normalize: Unicode and notation cleanup (π to pi, ^ to **, |x| to abs(x))
  2. AST Parse: Python ast.parse in eval mode
  3. Whitelist Check: Only whitelisted operators (+, -, *, /, //, **, %), functions (sqrt, log, sin, cos, floor, ceil, gcd, etc.), and constants (pi, e, tau) are allowed. Attribute access, method calls, dunders, and imports are all blocked.
  4. Constraint Check: Optional daily mutators (forbidden digits, allowed operator subsets, minimum expression length)
  5. safe_eval: Numeric evaluation via a whitelisted AST walker using cmath/math
  6. SymPy CAS Verify: simplify(expr - N) == 0 catches symbolic identities that numeric checks miss
  7. Canonical Hash: SHA256(sp.srepr(parse_expr(normalized))) for expression uniqueness. 1+2 and 2+1 canonicalize to the same hash.
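
As an illustration of steps 1, 2, 3, and 5, here is a minimal sketch of a whitelisting AST walker. It is not the project's actual code: the names normalize, safe_eval, BIN_OPS, FUNCS, and CONSTS are hypothetical, the whitelist covers only the functions named above, and it uses math alone rather than cmath/math. A production validator would also need limits this sketch omits, for example a bound on exponents so that something like 9**9**9 cannot stall the server.

```python
import ast
import math
import operator
import re

# Hypothetical whitelists for illustration; the project's real tables are not shown.
BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
FUNCS = {"sqrt": math.sqrt, "log": math.log, "sin": math.sin, "cos": math.cos,
         "floor": math.floor, "ceil": math.ceil, "gcd": math.gcd, "abs": abs}
CONSTS = {"pi": math.pi, "e": math.e, "tau": math.tau}


def normalize(text: str) -> str:
    """Rewrite display notation as Python syntax (non-nested |x| only)."""
    text = text.replace("π", "pi").replace("^", "**")
    return re.sub(r"\|([^|]+)\|", r"abs(\1)", text)


def safe_eval(text: str):
    """Parse in eval mode, then evaluate only whitelisted node types."""
    tree = ast.parse(normalize(text), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Plain int/float literals only (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
            return BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](walk(node.operand))
        if isinstance(node, ast.Name) and node.id in CONSTS:
            return CONSTS[node.id]
        # Calls must target a bare whitelisted name, never an attribute.
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FUNCS and not node.keywords):
            return FUNCS[node.func.id](*[walk(arg) for arg in node.args])
        # Attributes, subscripts, lambdas, comprehensions, etc. end up here.
        raise ValueError(f"disallowed syntax: {type(node).__name__}")

    return walk(tree)
```

Because the walker rejects every node type it does not explicitly recognize, attribute access such as (1).__class__ and calls such as __import__('os') fail with ValueError before anything is evaluated. Checking and evaluating in one pass is a simplification; the pipeline above lists them as separate stages.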

CAS verification is the primary acceptance path. A numeric fallback (accepting values within ±0.0001 of N) catches expressions SymPy cannot simplify. Expressions accepted by either path go through the same canonical hash for deduplication.
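
The acceptance logic and the hash can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: equals_target, canonical_hash, and TOLERANCE are hypothetical names, and the input string is assumed to have already passed the whitelist check, since SymPy's parse_expr relies on eval internally and should never see unvetted input.

```python
import hashlib

import sympy as sp
from sympy.parsing.sympy_parser import parse_expr

TOLERANCE = 1e-4  # the ±0.0001 numeric fallback window


def equals_target(normalized: str, n: int) -> bool:
    """Return True if the (already whitelisted) expression equals n."""
    expr = parse_expr(normalized)
    # Primary path: symbolic simplification of the difference.
    if sp.simplify(expr - n) == 0:
        return True
    # Fallback: numeric comparison for cases simplify() leaves unresolved.
    return abs(complex(sp.N(expr - n))) <= TOLERANCE


def canonical_hash(normalized: str) -> str:
    """SHA-256 hex digest of SymPy's srepr of the parsed expression."""
    canonical = sp.srepr(parse_expr(normalized))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Note that with parse_expr's default settings SymPy evaluates plain arithmetic while parsing, so 1+2 and 2+1 both become Integer(3) and hash identically, but so would 4-1. Whether the project parses with evaluate=False to keep such paths distinct is not stated in the source.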

Integer Regime System

Target numbers are drawn from difficulty tiers based on the formula IR_n = 5 × 11^(n-1) (Principle of Self-Interaction):

Regime   Range                  Scale
d0       0 – 4                  Axiomatic
d1       5 – 55                 Tens
d2       56 – 605               Hundreds
d3       606 – 6,655            Thousands
d4       6,656 – 73,205         Ten-thousands
d5       73,206 – 805,255       Hundred-thousands
d6       805,256 – 8,857,805    Millions
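
The table boundaries follow directly from the formula, which a short sketch can confirm. The function names here (threshold, regime_range, regime_of) are hypothetical and chosen for illustration only.

```python
def threshold(n: int) -> int:
    """IR_n = 5 * 11**(n - 1), defined for n >= 1."""
    return 5 * 11 ** (n - 1)


def regime_range(k: int) -> tuple[int, int]:
    """Inclusive (low, high) bounds of regime d<k>, matching the table."""
    if k == 0:
        return (0, threshold(1) - 1)           # d0 covers 0..4
    low = threshold(k) if k == 1 else threshold(k) + 1
    return (low, threshold(k + 1))             # upper bound is IR_(k+1)


def regime_of(n: int) -> int:
    """Index k of the regime d<k> that contains a non-negative target n."""
    k = 0
    while n > regime_range(k)[1]:
        k += 1
    return k


for k in range(7):
    low, high = regime_range(k)
    print(f"d{k}: {low:,} to {high:,}")
```

Each upper bound is 11 times the previous one from d1 onward (55, 605, 6,655, and so on), which is why the regimes track powers of ten only loosely.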

Challenge modes: Standard (stay in your regime) and Ladder (start at d0 and advance on each accepted path).

Stack

Python (Flask), SymPy, PostgreSQL (Supabase), Supabase Auth (JWT), Stripe for billing. FREE tier gets 3 attempts/day; SUPPORTER gets 5.

Dataset Generation Angle

Every submission is stored with its expression, canonical hash, reasoning text, regime, and challenge mode. The game is the data collection mechanism. Players generate novel mathematical paths that no scraper or synthetic generator would produce, because the reasoning requirement forces genuine mathematical thinking rather than brute-force enumeration.
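
A stored submission might look roughly like the record below. The Submission class, its field names, the to_jsonl helper, and the JSON Lines export are assumptions made for illustration; the source only says which pieces of information are kept, not how they are laid out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Submission:
    """One accepted path; field names are hypothetical."""
    expression: str       # normalized expression text
    canonical_hash: str   # SHA-256 hex digest used for deduplication
    reasoning: str        # the player's mandatory explanation
    regime: str           # e.g. "d2"
    challenge_mode: str   # "standard" or "ladder"


def to_jsonl(record: Submission) -> str:
    """Serialize one record as a single JSON line for dataset export."""
    return json.dumps(asdict(record), ensure_ascii=False)


example = Submission(
    expression="7*8",
    canonical_hash="0" * 64,  # placeholder digest, not a real hash
    reasoning="56 factors as 7 times 8.",
    regime="d2",
    challenge_mode="standard",
)
print(to_jsonl(example))
```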