No. 20 · AI · Browser ML · Sentio

Train an enterprise classifier.
In your browser tab.

Most chatbot platforms charge six figures and train in someone else's cloud. The data leaves the building. Compliance teams panic. NLU Bot Trainer ships a 5-classifier stacking ensemble of 171,772 parameters that runs entirely in the browser. Zero data egress. No GPU. No SaaS.


Training · Local browser · 94%

Logistic Reg. · 12K
Complement NB
Linear SVM · 12K
MLP · 128h · 133K
Gradient Boost

171,772 params · 2.0 MB · Inference 1-6 ms · Egress 0 bytes

5 · Classifiers stacking

171K · Total parameters

30 s · Full ensemble training

1-6 ms · Inference latency

0 · Bytes egress

Act I · The Problem

The chatbot platform owns the data.

Enterprise NLU is a six-figure subscription that ships customer transcripts to a third-party cloud. For regulated industries, that is the whole problem.

i.

Six figures a year, plus per-call.

The big NLU platforms bill flat fees and per-message. Pricing scales with success. You succeed harder, you pay harder.

ii.

Your customer data lives elsewhere.

Training requires upload. Inference requires upload. Compliance teams working under DPDP, GDPR or HIPAA struggle to sign that off.

iii.

One model, one bias.

Linear models miss feature overlap. Naive Bayes struggles with correlations. SVMs overfit tight margins. A single algorithm is a single weakness.

iv.

You cannot vendor-out.

Train on Lex, leave Lex, retrain on Dialogflow. Most teams stop trying. The ground truth gets stuck in someone else's format.

Act II · The Stacking Ensemble

Five algorithms.
One vote.

Each classifier fails differently. Because their errors rarely coincide, the stacked ensemble usually beats any single member. Cross-validated meta-weights decide who to trust on what.

Input · "where is my package?" · tokenised → MurmurHash3 → 1024-dim sparse vector
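The input step above can be sketched as a hashing vectoriser. This is a minimal illustration, not the project's actual code: the function names `murmur3_32` and `hashFeatures`, the lowercase alphanumeric tokeniser, and the choice to hash only the low byte of each character (fine for ASCII, lossy otherwise) are all assumptions.

```typescript
// MurmurHash3 (x86, 32-bit variant). Assumption: hashes the low byte of each
// UTF-16 code unit, so it is only byte-accurate for ASCII tokens.
function murmur3_32(key: string, seed: number = 0): number {
  const c1 = 0xcc9e2d51;
  const c2 = 0x1b873593;
  const len = key.length;
  const nblocks = len >>> 2;
  let h = seed >>> 0;

  // Body: mix four bytes at a time, little-endian.
  for (let i = 0; i < nblocks; i++) {
    const p = i * 4;
    let k =
      (key.charCodeAt(p) & 0xff) |
      ((key.charCodeAt(p + 1) & 0xff) << 8) |
      ((key.charCodeAt(p + 2) & 0xff) << 16) |
      ((key.charCodeAt(p + 3) & 0xff) << 24);
    k = Math.imul(k, c1);
    k = (k << 15) | (k >>> 17);
    k = Math.imul(k, c2);
    h ^= k;
    h = (h << 13) | (h >>> 19);
    h = (Math.imul(h, 5) + 0xe6546b64) | 0;
  }

  // Tail: one to three leftover bytes.
  const rem = len & 3;
  if (rem > 0) {
    const tail = nblocks * 4;
    let k = 0;
    if (rem === 3) k ^= (key.charCodeAt(tail + 2) & 0xff) << 16;
    if (rem >= 2) k ^= (key.charCodeAt(tail + 1) & 0xff) << 8;
    k ^= key.charCodeAt(tail) & 0xff;
    k = Math.imul(k, c1);
    k = (k << 15) | (k >>> 17);
    k = Math.imul(k, c2);
    h ^= k;
  }

  // Finalisation: avalanche the bits.
  h ^= len;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Maps text to a sparse count vector: bucket index -> token count.
// The tokeniser (lowercase, split on non-alphanumerics) is a hypothetical choice.
function hashFeatures(text: string, dim: number = 1024): Map<number, number> {
  const vec = new Map<number, number>();
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  for (const tok of tokens) {
    const idx = murmur3_32(tok) % dim;
    vec.set(idx, (vec.get(idx) ?? 0) + 1);
  }
  return vec;
}
```

Calling `hashFeatures("where is my package?")` yields a map with at most four non-zero buckets out of 1024. No vocabulary has to be stored, which is what keeps the feature step small enough to ship as static files.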

Classifier 01

Logistic Regression

12K parameters

Strong on linear boundaries. Misses overlapping features.

Classifier 02

Complement Naive Bayes v2

7K parameters

Strong on small data. Struggles with correlations.

Classifier 03

Pegasos Linear SVM

12K parameters

Sharp margins. Overfits tight clusters.

Classifier 04

MLP · 128 hidden

133K parameters

Catches non-linear patterns. Hungry for data.

Classifier 05

Gradient Boosted Stumps

7K parameters

Great on sharp splits. Misses smooth boundaries.

Meta-weights · cross-validated · learned per intent

Predicted intent · "order_status" · confidence 0.94 · top-5 ranked
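One way to read the final vote is as a per-intent weighted sum of the five classifiers' probabilities. The sketch below assumes that form; `stackPredict`, `RankedIntent`, and the weight layout are hypothetical names, and it takes the meta-weights as given rather than showing how cross-validation learns them.

```typescript
interface RankedIntent {
  intent: string;
  confidence: number;
}

// baseProbs[k][c]   = probability classifier k assigns to intent c
// metaWeights[c][k] = how much intent c trusts classifier k (assumed already learned)
function stackPredict(
  intents: string[],
  baseProbs: number[][],
  metaWeights: number[][],
  topK: number = 5
): RankedIntent[] {
  // Weighted vote per intent.
  const scores = intents.map((_, c) =>
    baseProbs.reduce((sum, probs, k) => sum + metaWeights[c][k] * probs[c], 0)
  );
  // Normalise so confidences sum to 1.
  const total = scores.reduce((a, b) => a + b, 0);
  const ranked = intents.map((intent, c) => ({
    intent,
    confidence: total > 0 ? scores[c] / total : 1 / intents.length,
  }));
  ranked.sort((a, b) => b.confidence - a.confidence);
  return ranked.slice(0, topK);
}
```

Because the weights are indexed by intent, a classifier that is reliable on `order_status` but weak elsewhere can dominate that one vote without dragging down the rest.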

Every algorithm (MurmurHash3, Pegasos SVM, Complement NB, backprop MLP, gradient boosted stumps) is implemented from scratch in TypeScript. Zero ML dependencies. Ships as static files.
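To give a feel for what "from scratch" means here, this is a minimal sketch of the Pegasos update for a binary linear SVM. It is not the project's code: `pegasosTrain` and its defaults are assumptions, it uses dense vectors, and it cycles through the data in order instead of sampling at random as the original algorithm does. A multi-intent model would typically train one such weight vector per intent.

```typescript
// Dot product of two dense vectors of equal length.
function dot(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Pegasos: stochastic sub-gradient descent on the hinge loss with L2 penalty.
// Labels ys must be +1 or -1. lambda and epochs are illustrative defaults.
function pegasosTrain(
  xs: number[][],
  ys: number[],
  lambda: number = 0.01,
  epochs: number = 20
): number[] {
  const dim = xs[0].length;
  const w = new Array<number>(dim).fill(0);
  let t = 0;
  for (let e = 0; e < epochs; e++) {
    for (let i = 0; i < xs.length; i++) {
      t += 1;
      const eta = 1 / (lambda * t); // step size decays as 1/(lambda * t)
      const shrink = 1 - 1 / t;     // equals 1 - eta * lambda
      const violated = ys[i] * dot(w, xs[i]) < 1;
      for (let j = 0; j < dim; j++) {
        // Always shrink; add the example only when its margin is violated.
        w[j] = shrink * w[j] + (violated ? eta * ys[i] * xs[i][j] : 0);
      }
    }
  }
  return w;
}
```

Prediction is the sign of `dot(w, x)`. The whole trainer is a double loop over plain arrays, which is why it runs comfortably without a GPU.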

Act III · Built for production

Self-learning, drift-aware,
seven export formats.

Self-learning loop

Evaluates, diagnoses weak intents, augments data, pseudo-labels high-confidence predictions, curriculum-orders, retrains, validates. Accepts only if accuracy does not regress. Fully autonomous.
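Two steps of that loop are easy to pin down in code: pseudo-labelling and the acceptance gate. The sketch below is illustrative only; the names, the types, and the 0.9 confidence threshold are assumptions, not values from the product.

```typescript
interface Prediction {
  text: string;
  intent: string;
  confidence: number;
}

interface PseudoLabel {
  text: string;
  intent: string;
}

// Keep only predictions the model is confident about and treat them as labels.
// The 0.9 default is a hypothetical threshold.
function pseudoLabel(preds: Prediction[], threshold: number = 0.9): PseudoLabel[] {
  return preds
    .filter((p) => p.confidence >= threshold)
    .map((p) => ({ text: p.text, intent: p.intent }));
}

// Acceptance gate: the retrained model replaces the current one only if
// validation accuracy does not regress.
function acceptCandidate(currentAccuracy: number, candidateAccuracy: number): boolean {
  return candidateAccuracy >= currentAccuracy;
}
```

The gate is what makes autonomy safe to leave on: a bad augmentation round costs a retrain, not a deployed regression.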

Drift detection

Page-Hinkley for concept drift. DDM for error-rate drift. Vocabulary distribution monitored in real time. Dashboard shows you the moment behaviour shifts.
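For readers unfamiliar with Page-Hinkley, here is a minimal sketch of the test in its textbook form, watching for an upward shift in a stream such as per-prediction error (0 or 1). The class name, the `delta` tolerance, and the `threshold` default are assumptions chosen for illustration.

```typescript
// Page-Hinkley test for an upward shift in the mean of a stream.
class PageHinkley {
  private n = 0;
  private mean = 0;
  private cumulative = 0;    // running sum of deviations from the mean
  private minCumulative = 0; // lowest value the running sum has reached
  private readonly delta: number;
  private readonly threshold: number;

  // delta: tolerated drift magnitude; threshold: alarm level. Both hypothetical defaults.
  constructor(delta: number = 0.005, threshold: number = 5) {
    this.delta = delta;
    this.threshold = threshold;
  }

  // Feed one observation; returns true when drift is signalled.
  update(x: number): boolean {
    this.n += 1;
    this.mean += (x - this.mean) / this.n;
    this.cumulative += x - this.mean - this.delta;
    this.minCumulative = Math.min(this.minCumulative, this.cumulative);
    return this.cumulative - this.minCumulative > this.threshold;
  }
}
```

The detector keeps four numbers of state, so it can run on every prediction in the tab at negligible cost.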

Model registry

Semantic versioning. Champion / challenger lifecycle. A/B testing with configurable traffic splits. Rollback in one click.

Seven-platform export

Rasa YAML 3.1, Dialogflow ES, Lex V2, LUIS, Wit.ai, CSV, JSON. Vendor-out is built into the product.
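As an illustration of what one exporter involves, the sketch below writes labelled examples in the Rasa NLU training-data layout. The function and type names are hypothetical, and it assumes utterances contain no characters that need YAML escaping.

```typescript
interface TrainingExample {
  text: string;
  intent: string;
}

// Groups examples by intent and renders them as Rasa NLU YAML (version "3.1").
// Assumption: texts are single-line and need no YAML escaping.
function toRasaNlu(examples: TrainingExample[]): string {
  const byIntent = new Map<string, string[]>();
  for (const ex of examples) {
    const list = byIntent.get(ex.intent) ?? [];
    list.push(ex.text);
    byIntent.set(ex.intent, list);
  }
  const lines: string[] = ['version: "3.1"', "nlu:"];
  byIntent.forEach((texts, intent) => {
    lines.push(`- intent: ${intent}`, "  examples: |");
    texts.forEach((t) => lines.push(`    - ${t}`));
  });
  return lines.join("\n") + "\n";
}
```

The other formats follow the same pattern: one pure function from the in-browser training set to a string the target platform can import.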

Scales for free

Every user brings their own compute: the browser. Ten users or ten million, the server's job is the same: it serves static files.

WCAG 2.2 AA

Full keyboard navigation. Alt+1 through Alt+8 page switching. ARIA labels, screen reader support, reduced motion, skip navigation.

The Stack

Pure TypeScript math.
No Python runtime. No GPU. No API keys.

TypeScript 5.5
Next.js 14.2
React 18.3
Pure-JS MurmurHash3
Pegasos Linear SVM
Complement Naive Bayes
Backprop MLP
Gradient Boosted Stumps
localStorage model registry
Page-Hinkley drift
DDM error-rate drift
Docker · Vercel · Any VM
AGPL 3.0

If your data cannot leave the building, train it where it lives.

I build privacy-first ML for regulated industries. From-scratch algorithms, no SaaS dependency, drop into any browser. Vendor-out is the product.
