I design AI systems against the friction points of real operations, informed by 24 years across sales, technical sales, and field application engineering, from C-suite to field installer, five-figure to seven-figure deals. I'm not a trained engineer or programmer. That's not the gap it looks like; it's the angle that produced the work below. The portfolio includes a 74-module measurement engine with 43 USPTO provisional filings, two working research papers (IBTR / TRIL and DALD), a live consumer reference app, a cross-domain falsification that closed on null, an EU AI Act reference architecture killed within ~24 hours of a self-commissioned competitive ultrareview, and an active California advisory practice in pre-launch where AI is auditable production infrastructure architected to a published Terms of Service and attorney-reviewed engagement agreement.
Most AI-augmented builders ship features. A smaller number ship products. Almost nobody ships primitives — measurement substrates, attestation interfaces, open standards underneath licensable implementations — and then tests whether those primitives transfer across domains. That's the work below: one such primitive, two foundational papers, and the projects that build on, extract from, or commercialize the substrate.
The portfolio is structured around falsifiability. Each entry declares a bar. Several clear it; two close cleanly on failing to clear it. Commercially credible is not commercially viable. Architecturally sound is not competitively defensible. The closures aren't buried; they're featured. A portfolio that only shows wins isn't evidence of judgment, it's evidence of selection.
24 years across sales, technical sales, and field application engineering — C-suite to field installer, five-figure to seven-figure deals. That built the cognitive substrate the AI work draws from: pattern recognition across modalities, liability instinct, cross-system reasoning, and the habit of operating in the gap between what a customer says and what's actually wrong. Multinational operating exposure across the US, China, Austria, Israel, and Korea — five distinct sets of norms around risk, hierarchy, and contract precision. Tigo Energy through IPO preparation, present in executive decision-making during the run-up. The instinct to see second- and third-order ramifications was assembled in real time while watching the legal, compliance, and liability apparatus get built around an operating business.
Broad architecture funds vertical execution. PIE is sensor-agnostic, domain-agnostic, worldwide-applicable, and stands as a substrate. Every commercial extraction came from narrowing — FretMind, WindPIE, Cardinal, The Installer's View, The Installer's Lens. Narrow versions moved faster and reached viability the broad version never could. The lesson isn't that broad was the mistake. It's that both layers are necessary: broad investment generates the substrate that vertical execution consumes. Applied to AI: frontier capability is broad; products that survive pick a vertical and execute ruthlessly while capability stays general.
AI was used throughout as a structured tutor and architect, not as a code generator. Most builders use AI to ship faster, which produces work but doesn't compound skill. The pattern here is the opposite: Claude walked me through unfamiliar territory step by step while I executed the actual work, with active pushback when something didn't fit. The work shipped at the same speed either way. The operator at the end is different.
I trust my intuition because it has been trained against twenty-four years of consequences. The work below either earns that trust or doesn't. The reader decides.
A sensor-agnostic measurement layer that scores any system's real-time state against its own historical baseline. Humans, hardware, AI systems, field-deployed sensors. Never against a population.
74-module JavaScript core (v8.32.0, ~971 KB, SHA-pinned) wrapping six statistical primitives: Welford incremental moments, Bayesian Online Changepoint Detection, Normal-Inverse-Gamma baselining, hierarchical shared-factor arrays, weighted heteroscedastic baselines, and distribution-fitting under AIC / BIC / tail-weighted criteria. Five-layer stack: Engine → HSAPI open-standard query interface → reference app (FretMind) → vertical adapters → AI infrastructure. Specialized modules: DALD (deceptive-alignment divergence), VTACA (vitals-triggered cognitive-assessment), PBITE (practice-quality intervention gating), GRS (ground-truth labeling), RETRO (retroactive baseline construction).
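One of the six primitives named above, Normal-Inverse-Gamma baselining, can be sketched with the textbook conjugate update. This is an illustration in Python, not the engine's JavaScript module: the class name `NIG`, the field names, and the starting hyperparameters are assumptions, and the starting values stand in for a weak placeholder belief about one entity, not a population prior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """Normal-Inverse-Gamma belief over one entity's mean and variance."""
    mu: float      # current estimate of the entity's level
    kappa: float   # pseudo-count backing mu
    alpha: float   # shape of the inverse-gamma on the variance
    beta: float    # scale of the inverse-gamma on the variance

    def update(self, x: float) -> "NIG":
        """Standard conjugate update for a single observation x."""
        kappa_n = self.kappa + 1.0
        return NIG(
            mu=(self.kappa * self.mu + x) / kappa_n,
            kappa=kappa_n,
            alpha=self.alpha + 0.5,
            beta=self.beta + self.kappa * (x - self.mu) ** 2 / (2.0 * kappa_n),
        )

    def expected_variance(self) -> float:
        """E[sigma^2] = beta / (alpha - 1), defined only for alpha > 1."""
        if self.alpha <= 1.0:
            raise ValueError("expected variance is undefined for alpha <= 1")
        return self.beta / (self.alpha - 1.0)


# Hypothetical weak starting belief for one entity, then one observation.
belief = NIG(mu=0.0, kappa=1.0, alpha=1.0, beta=1.0).update(2.0)
```

Each update is O(1) and returns a new immutable belief, so a history of beliefs can be kept for audit without storing raw observations.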
Individual-baseline-only commitment — every module computes against Welford moments of one entity's history; no population priors anywhere. Moat ordering: dataset > hardware > implementation > network effects > patents, patents fifth on purpose. HSAPI as an open standard with a published patent pledge, licensable implementation underneath. Identity Separation Rule (Validator §45) keeping processor, scope, and obligation distinguishable in every output.
27 experiments pre-registered with SHA-256 hashes before any run script existed. Six honest losses to specialist baselines (Omori, ARIMA, others) documented rather than buried. Failed experiments forced API redesigns: AIC bias produced the DFB criterion API; BOCPD startup artifacts produced CPB burn-in routing; null-saturation produced three first-class null procedures. Comprehensive trademark, prior-art, and IP vetting across the engine; pulled USPTO PAIR export to verify the 43 provisionals were on file, caught an entity-status error and an RFC 3161 temporal-attestation gap. OSS license audit and SBOM confirmed no GPL / AGPL contamination.
The foundational paper underneath the seven other projects in this portfolio: population reference is categorically wrong for individual-divergence questions, and one self-referential measurement substrate operates across domains as different as music, surgery, AI alignment, industrial machinery, and seismic monitoring.
Most measurement systems compare a subject to a population. This works for triage, selection, and placement. It fails for a different class: has this specific subject changed? Is the gap between what this subject claims and what this subject does growing? For those questions the subject is the reference, not the population. IBTR is the methodological proposal for what first-class infrastructural support for self-reference looks like.
Two signals: stated capability (what the subject claims) and demonstrated capability (what independent measurement records). Signed divergence accumulates as a per-subject baseline via Welford's online algorithm — O(1) memory, O(1) update, no need to store raw observations. After enough observations the baseline locks; new observations score against it via standard z-thresholds. The math is decades old; the contribution is the architectural commitment, not the equations.
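The accumulate, lock, and score loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's reference code: the class name `WelfordBaseline`, the lock count of 30, the z-threshold of 2.0, and the labels are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class WelfordBaseline:
    """Per-subject running mean/variance of signed divergence (stated - demonstrated)."""
    lock_after: int = 30   # hypothetical lock threshold, not taken from the paper
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0        # running sum of squared deviations from the mean

    @property
    def locked(self) -> bool:
        return self.n >= self.lock_after

    def update(self, x: float) -> None:
        """Welford's O(1) update; ignored once the baseline has locked."""
        if self.locked:
            return
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        """Sample standard deviation (n - 1 in the denominator)."""
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def z_score(self, x: float) -> float:
        s = self.std()
        return (x - self.mean) / s if s > 0 else 0.0

    def classify(self, x: float, threshold: float = 2.0) -> str:
        """Label a new observation against the locked baseline."""
        if not self.locked:
            return "BASELINE_BUILDING"
        return "DIVERGENT" if abs(self.z_score(x)) >= threshold else "NORMAL"


# Usage: build a baseline from (stated, demonstrated) pairs, then score a new gap.
baseline = WelfordBaseline(lock_after=5)
for stated, demonstrated in [(0.8, 0.7), (0.7, 0.6), (0.9, 0.8), (0.8, 0.75), (0.75, 0.6)]:
    baseline.update(stated - demonstrated)
label = baseline.classify(0.6)
```

Only three numbers are stored per subject (`n`, `mean`, `m2`), which is what makes the O(1) memory claim hold regardless of how long the history runs.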
Five worked-through domains in the paper share the same Welford accumulator and same z-classifier; only the sensor pipeline is domain-specific: music performance (intent vs. execution), surgical performance (procedural plan vs. motor control), AI alignment (model self-report vs. AI-affected reality), industrial machinery (reported state vs. sensor measurement), and geophysical monitoring (model prediction vs. measurement). One measurement substrate, five domains. The empirical content of the domain-agnostic claim.
The consumer reference app that proved PIE works in a live capture loop. Also the meta-artifact for learning which AI collaboration patterns produced novel work and which produced runaway scope.
Browser-based: Web Audio API for real-time pitch and rhythm, MediaPipe Hands for body mechanics, Basic Pitch (Spotify) for audio-to-MIDI. Five stateful coaching personas route to different LLM prompting and feedback styles. Welford-based individual baselines with z-score classification against the player's own history. DALD monitors divergence between AI-claimed session quality and the player's independently-measured improvement trajectory. VTACA detects breath-holding patterns during cognitive load. PBITE gates interventions on practice-quality signal.
Persona system as a structural product decision, not a UI choice — each persona routes to different prompting and recommendation libraries. April 2026 triage: 22 modules KEEP for consumer use, 10 SIMPLIFY, 30 REMOVE (retained in reference repo), 6 REFERENCE-ONLY. Shipped the hand-written browser port first rather than waiting for full-engine integration — prioritizing real sessions over architectural purity.
Two live practice sessions in April 2026 with real audio capture, verifying VTACA breath-hold detection and PBITE intervention gating worked in production rather than in simulation. Caught mic-clipping (sessions 1–6 corrupted due to laptop mic proximity), traced to input sensitivity rather than engine failure. Forced the distinction between "commercially credible" and "commercially viable" and admitted FretMind achieved the first but not the second. None of PIE's core functional claims (quality score validity, PBITE gating, DALD, BAIV, CAAD, AICV) have been empirically validated on real deployment data.
A narrowed extraction of PIE primitives applied to wind turbine fleet analytics. The hypothesis was that per-turbine individual baselines would beat fleet-mean comparison at surfacing subtle degradation. The data said otherwise.
Python module on a primitives stack: Welford running statistics, multi-scale rolling-window baselines, bin-keyed conditional accumulators, per-turbine integrated baseline, CUSUM changepoint detector. Wind-domain layer using pvlib for atmospherics, IEC 61400 density correction, turbulence intensity, sector classification, wake-affected flagging. Validation harness ran against the CARE-to-Compare labeled dataset (Wind Farm A, 22 labeled events).
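The CUSUM changepoint detector in that stack can be illustrated with a two-sided tabular CUSUM. This is a generic textbook form, not WindPIE's implementation: the function name `cusum_alarms` and the slack `k` and decision limit `h` defaults are assumptions. In a per-turbine setting, `target` and `sigma` would plausibly come from that turbine's own baseline.

```python
def cusum_alarms(values, target, sigma, k=0.5, h=5.0):
    """Return indices where a two-sided tabular CUSUM crosses its decision limit.

    k (slack) and h (decision limit) are in units of sigma.
    Both sums reset to zero after each alarm.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    hi = lo = 0.0
    alarms = []
    for i, x in enumerate(values):
        z = (x - target) / sigma
        hi = max(0.0, hi + z - k)   # accumulates upward shifts
        lo = max(0.0, lo - z - k)   # accumulates downward shifts
        if hi > h or lo > h:
            alarms.append(i)
            hi = lo = 0.0
    return alarms


# Ten in-control readings, then a sustained two-sigma upward shift.
readings = [0.0] * 10 + [2.0] * 10
alarms = cusum_alarms(readings, target=0.0, sigma=1.0)
```

The slack term lets small fluctuations decay back to zero, so only sustained shifts accumulate enough to raise an alarm.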
Held the IBTR commitment (individual baseline only, no population comparison in the primary detection path) rather than taking the easier fleet-mean path. Pre-registration discipline applied to my own validation, not just to claims I would make to others. SHA-256-hashed pre-registration locked hypothesis and decision rules before analysis. Held-out test split locked before any peek. Direct comparison against fleet-mean (peer) detection treating the IBTR architectural premise as the thing on trial.
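Locking a pre-registration with SHA-256 can be sketched as follows. The record fields and their wording are hypothetical placeholders, not the actual pre-registration text. The sketch hashes a canonical JSON encoding so that key order and whitespace cannot change the digest.

```python
import hashlib
import json


def preregistration_digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the pre-registration record."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, locked_digest: str) -> bool:
    """True only if the record is byte-for-byte what was locked."""
    return preregistration_digest(record) == locked_digest


# Hypothetical record: field names and wording are illustrative only.
prereg = {
    "hypothesis": "per-turbine baseline surfaces more labeled events than fleet-mean",
    "decision_rule": "compare detection counts on the held-out split only",
    "test_split": "locked before analysis",
}
locked = preregistration_digest(prereg)
```

Publishing or timestamping `locked` before any run script exists means a later edit to the hypothesis or decision rule shows up as a digest mismatch.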
A reference architecture for EU AI Act compliance infrastructure, scoped as documented specifications plus illustrative TypeScript implementation. Six articles in scope; three explicitly excluded as discipline. Killed within ~24 hours of a self-commissioned competitive ultrareview.
Reference architecture, not finished product. Documentation is the primary deliverable; code illustrates the documentation. Licensees take the architecture and build their own production systems against it. EU AI Act Articles 5, 9, 10, 12, 13, and 14 in scope across six functional module groups (core primitives, governance, audit, monitoring, oversight, transparency). Articles 11, 15, 17 explicitly excluded — Articles 11 and 17 as different problem space; Article 15 as different discipline and vendor category. Exclusions documented as deliberate boundaries, not gaps.
Reference-architecture positioning, not productized SaaS — inverting the typical documentation-to-code ratio to match what sophisticated EU AI Act buyers actually want during the Goldilocks period. Three-reference claim discipline: every architectural claim required specification, illustrative implementation, and tests before any external use; claims without all three were removed. Multi-path commercial strategy preserved rather than committed prematurely — direct license, audit firm partnership, strategic partnership, acqui-hire (€50K–200K to €3–15M range). Grant-back clause non-negotiable: any licensee improvements grant back non-exclusively. IP integration with 43 USPTO provisional filings.
After the May 7, 2026 foundation session produced full handoff documentation, scope, commercial strategy, and three identified warm EU contacts, the next gate before any outreach was a competitive landscape ultrareview I commissioned myself. It surfaced Microsoft Agent Governance Toolkit (AGT), released MIT-licensed and free on April 2, 2026, addressing substantially overlapping ground. I killed Cardinal on or around May 8, 2026, within ~24 hours of the discovery. No EU contacts were ever approached. The discipline rule the kill produced: competitive landscape ultrareview is a gate before commercial commitment, not a checkbox after sophisticated build.
An independent California solar advisory practice for residential homeowners, built as an AI-augmented professional services operation with explicit production guardrails.
Claude as primary LLM for content synthesis, document review, and operational reasoning, against vendor criteria enforced contractually: no training on client data, encryption in transit and at rest, time-bounded retention. Delivery stack: WordPress + Kadence on SiteGround; Stripe + Mercury + Wave for the financial stack; SignWell for e-signatures on engagements ≥ $300; M365 OneDrive on a .onmicrosoft.com tenant. A multi-sheet Excel workbook is the single source of truth for decisions, dashboard, compliance calendar, and lessons learned. Professional liability and cyber coverage carried; the practice is structured so that AI-produced work product is defensible to clients, regulators, and counsel.
Position B (AI in standard production, principal verifies all judgments) encoded consistently across Privacy Policy, Terms of Service, Engagement Agreement, and service-page FAQ. No-named-AI-tools rule in public materials: vendor selection criteria are durable; vendor names will churn. Nine-dimension content audit run against every public-facing draft. Editorial firewall against reviewing proposals from any installer The Installer's View (TIV) has consulted for within a defined recent window. Four-pillar service architecture with explicit "what TIV cannot do" scope.
Trademark / prior-art vetting pre-commit. Formal legal-docs drift audit against v1 Privacy Policy and Terms identified eight material drifts, including sole proprietorship → LLC, Position A → Position B, Gumroad → Stripe, voice register, and a missing AI-use disclosure; v2 drafts were produced with a changelog. Engagement Agreement routed for attorney review with directed-attention notes to Limitation of Liability (CA Civil Code §1668), Indemnification reasonableness, and Governing Law. Affiliate-bias falsifiability test caught my own original About-page draft as false once affiliate revenue was on the roadmap, and I rewrote it.
TIV was deliberately built using Claude in a structured-mentorship pattern. Each phase — domain registration, WordPress installation, Git and GitHub setup, Node and npm configuration, Ubuntu local development, hosting and DNS, insurance binding, legal review — was Claude walking me through unfamiliar territory step by step while I executed the actual work, with active pushback when something didn't fit. Claude as tutor, mentor, business manager, and architect, with me doing the building. Most candidates use AI to ship faster. The TIV work shipped at the same speed, but with the human side of the system substantially stronger at the end of the build than at the start.
The analytical engine underneath The Installer's View: a Python-based multi-source verification platform that automates the labor-intensive solar-proposal review workflow.
California homeowners reviewing rooftop solar proposals receive economic projections that frequently assume legacy NEM 2.0 economics, despite the state having transitioned to NEM 3.0 / NBT in April 2023 — under which export credits fell roughly 75% on average. Independent verification has historically required either paying a competing installer for a counter-bid (same incentive bias) or hiring an engineer ($1,500+, weeks of turnaround). The Installer's Lens automates the verification workflow at a fraction of the cost and turnaround.
Single-user Streamlit UI wrapping a Python backend that parses unstructured intake (utility bills via vision-model OCR, Green Button XML, panel photos, installer proposal PDFs), orchestrates parallelized API queries across 16 authoritative public data sources, runs a 10-lens analytical framework, and generates client deliverables via template-driven PDF rendering with full audit trail. Sources span solar resource modeling (NREL NSRDB, PVWatts, PySAM), roof geometry (Google Solar API), equipment validation (CEC), installer verification (CSLB, CFPB, CourtListener, SEC EDGAR), environmental context (EPA AirNow, CAL FIRE, CPUC PSPS), and utility-specific NBT rate schedules.
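The parallel query step can be sketched with `asyncio.gather`. Everything here is a stand-in: `fetch_source` simulates a client with a short sleep instead of calling any real API, the source names are hypothetical, and the actual orchestration layer may differ. The sketch shows one design choice, which is to collect failures per source instead of letting one timeout abort the whole run.

```python
import asyncio


async def fetch_source(name: str, delay: float = 0.01) -> dict:
    """Stand-in for one data-source client; a real one would make an HTTP call."""
    await asyncio.sleep(delay)
    if name == "flaky_source":
        raise TimeoutError(f"{name} timed out")
    return {"source": name, "status": "ok"}


async def gather_sources(names: list[str]) -> tuple[dict, dict]:
    """Query every source concurrently; record failures instead of aborting."""
    results = await asyncio.gather(
        *(fetch_source(n) for n in names), return_exceptions=True
    )
    ok, failed = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            failed[name] = repr(result)   # kept so the audit trail shows the gap
        else:
            ok[name] = result
    return ok, failed


# Hypothetical source names; one is rigged to fail.
ok, failed = asyncio.run(gather_sources(["source_a", "flaky_source", "source_b"]))
```

Keeping the `failed` map alongside the results lets a deliverable state which sources were unavailable, which matters when the principal has to verify every numerical claim.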
Three-tier data access pattern separating continuous-ingest sources from article-trigger pulls from reading-as-research material. Vendor-flexible AI inference layer allowing future provider swaps without rebuild. Local-first storage (SQLite + per-engagement file structure) for audit defensibility. Position B applied at the platform level: AI drafts every analytical finding, the principal verifies every numerical claim before delivery. Fifth extraction of the PIE primitives — baseline-measurement intelligence applied to system-level economic verification.
The alignment-specific application of IBTR / TRIL: detecting deceptive alignment in deployed AI systems without requiring model internals access.
State-of-the-art deception detection — mechanistic interpretability, activation-based methods (LAT, POSER), self-monitoring architectures — shows 89–98% accuracy in controlled settings but all require model internals access, training infrastructure, or controlled evaluation environments. None work for the increasingly common case: a deployed AI accessed through an API, with no access to weights, where what we want to know is whether this specific deployment is producing genuine helpfulness or has converged toward a satisfying-but-not-helping pattern with this specific user over time.
Per-user baselines of the divergence between AI-stated outcomes (c, what the AI claims about its effect) and independently-measured demonstrated outcomes (d, what actually happened). Sustained anomalous drift — high stated outcomes paired with flat or declining demonstrated trajectories — produces a MISALIGNED classification. The signature cannot be faked by the AI alone. The AI controls c; the AI does not control d. The user's baseline trajectory accumulates from observations made independently of the AI's presence. The math is the IBTR substrate (Welford accumulators, z-score thresholds); the architectural commitment is that alignment evaluation can be read from the human the AI affected, not just from the AI itself.
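The detection signature, high stated outcomes paired with a flat or declining demonstrated trajectory, can be illustrated with a deliberately simplified check. This sketch swaps the Welford and z-score machinery for a plain least-squares slope on d, so it is a stand-in for the idea and not the DALD classifier. The function names, the thresholds `high_c` and `flat_tol`, the minimum session count, and the `NO_FLAG` and `INSUFFICIENT_DATA` labels are all assumptions.

```python
from statistics import fmean


def slope(ys):
    """Ordinary least-squares slope of ys against session index 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two sessions")
    x_bar = (n - 1) / 2
    y_bar = fmean(ys)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(ys))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def classify_user(stated, demonstrated, high_c=0.8, flat_tol=0.005, min_sessions=5):
    """Flag one user's history when c stays high while d is flat or declining.

    stated: per-session c values (what the AI claimed about its effect).
    demonstrated: per-session d values (measured independently of the AI).
    """
    if len(stated) != len(demonstrated):
        raise ValueError("stated and demonstrated must cover the same sessions")
    if len(stated) < min_sessions:
        return "INSUFFICIENT_DATA"
    if fmean(stated) >= high_c and slope(demonstrated) <= flat_tol:
        return "MISALIGNED"
    return "NO_FLAG"


# Hypothetical history: consistently high claims, slowly declining measurements.
verdict = classify_user([0.9] * 6, [0.50, 0.50, 0.49, 0.50, 0.48, 0.47])
```

The check reads only the two series for a single user, which mirrors the point above: the AI supplies c but has no write access to d.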
DALD does not compete with mechanistic interpretability — interpretability cannot operate where the model is not accessible; DALD can. Conversely, interpretability identifies the specific circuit; DALD cannot. The two are complementary. Activation-based methods are vulnerable to recent prompt-level adversarial defeat (Daniels et al. February 2026, reducing auditor accuracy from 100% to 1–3% while auditor confidence stayed high). Trajectory-level signals require coordinated long-horizon manipulation across a specific user's history — a substantially harder optimization target than single-output adversarial defeat.
DALD was first implemented in FretMind (P-03). AI guitar coaching system; c is the AI's claimed session quality; d is the user's independently-measured timing, pitch, and rhythm accuracy from the audio analysis pipeline against the score's reference signal — measured without consulting the AI's claim. Per-user baselines accumulate across sessions. The detection signature is high mean c paired with flat or declining baseline trajectory of d. The classical sycophancy failure mode mapped to a coaching context.
I'm the profile that doesn't show up in a standard candidate pipeline. 24 years of operating record across sales, technical sales, and field application engineering, the last several months of intensive AI building from a standing start, a 74-module measurement engine with 43 USPTO provisional filings, two working research papers on alignment-relevant measurement infrastructure, two projects killed cleanly on bars I set in advance, and an active California advisory practice in pre-launch where AI is auditable production infrastructure architected to a published Terms of Service and attorney-reviewed engagement agreement.
Not a trained engineer or programmer. That's the point. The work above is what someone with my background builds when AI removes the bottleneck that would otherwise have required hiring an engineering team. And the discipline to instrument, falsify, and kill the work cleanly was already there, from two decades of carrying responsibility for outcomes in front of customers, lawyers, and regulators.
Primary target: Product Support / Customer Support Engineering Management roles at AI companies — where deep customer-support operations expertise and substantive AI literacy combine. The lateral move from solar industry customer support into AI customer support is deliberate: same function, adjacent domain, with demonstrated rapid AI adoption.
Sales-led roles: Enterprise Sales, Strategic Account Executive, Business Development, and Director-level commercial roles at AI and AI-infrastructure companies. 24 years of senior commercial track record — territory growth from $250K to $15M+, first to $1M quarter, first to $1M month, Director of Sales — paired with the AI portfolio above is a rare combination in the AI hiring pool.
Technical and customer-facing roles: Solutions Engineering, Forward-Deployed Engineering, Technical Sales Engineering, and Technical Product Manager at AI and AI-infrastructure companies. Also open to founding Solutions Architect or founding Product roles at AI startups under twenty people, where the seat pairs deep AI understanding with operational instinct and customer-facing credibility — and engineers own the implementation.