Work / Bharat-First / NFPC Mule Detection

No. 15 · Bharat-First · Finance
0.985

OOF AUC-ROC · LightGBM × XGBoost ensemble · Phase 1

The mule moves
the money. The bank
finds out
too late.

Indians lost ₹11,000 crore to digital fraud in 2024. Most of it does not leave through one big account. It flows through thousands of "mule" accounts: real names, fake intent, opened just to wash money. NFPC Mule Detection scores those accounts before the money lands. Built for the RBI Innovation Hub × IIT Delhi TRYST 2025 challenge.

0.985 · Phase 1 OOF AUC-ROC
208 · Engineered features (Phase 2)
12 · Mule behaviour patterns covered
15,848 · Test accounts scored
3 · Models in the final ensemble

Act I · The Anatomy

A mule is a real account
doing somebody else's work.

A mule account belongs to a real person, KYC and all. They were paid, threatened or tricked into letting it be used. Money lands, and leaves again in seconds. By the time the fraud is reported, the money has crossed five accounts and the trail is cold.

[Flow diagram: victims → mule account → exfiltration]

Fan-in. Pass-through. Fan-out. Five seconds.

Signal 01 · Velocity
Burstiness

Inactive account, then a sudden surge. A salaried account does not suddenly start transacting at 30-second intervals at midnight.

Signal 02 · Topology
Fan-in × Fan-out

Many counterparties one way, many the other. Real account holders do not transact with thirty new strangers in an hour.

Signal 03 · Structuring
The 50,000 line

Transactions just below the reporting threshold. Round amounts, repeated. Evasion hiding in plain sight.
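The three signals above translate directly into per-account features. A minimal sketch with pandas; the column names (account_id, counterparty_id, amount, ts, direction) are illustrative stand-ins, not the challenge schema:

```python
import numpy as np
import pandas as pd

def signal_features(txns: pd.DataFrame) -> pd.DataFrame:
    """Per-account features for the velocity, topology and
    structuring signals. Column names are assumptions."""
    out = {}
    for acct, g in txns.sort_values("ts").groupby("account_id"):
        gaps = g["ts"].diff().dt.total_seconds().dropna() / 3600.0  # hours
        mu, sigma = gaps.mean(), gaps.std()
        out[acct] = {
            # Signal 01: burstiness of inter-transaction gaps, in [-1, 1]
            "burstiness": (sigma - mu) / (sigma + mu) if len(gaps) > 1 else 0.0,
            "min_gap_hrs": gaps.min() if len(gaps) else np.nan,
            # Signal 02: distinct counterparties on each side of the flow
            "fan_in": g.loc[g["direction"] == "C", "counterparty_id"].nunique(),
            "fan_out": g.loc[g["direction"] == "D", "counterparty_id"].nunique(),
            # Signal 03: share of amounts parked just under the 50,000 line
            "near_50k_rate": g["amount"].between(45_000, 49_999).mean(),
            "round_10k_rate": (g["amount"] % 10_000 == 0).mean(),
        }
    return pd.DataFrame.from_dict(out, orient="index")
```

A burstiness near +1 means long silences punctuated by rapid bursts; a steady salaried account sits near 0 or below.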

Act II · The Patterns

Twelve patterns.
Every one covered.

The challenge specification names twelve known mule behaviours. The pipeline identifies all twelve, with statistical evidence and SHAP-explainable feature contributions for every flagged account.

  1. Dormant activation · Inactive accounts that suddenly handle high-value bursts.
  2. Structuring · Transactions parked just below the 50,000 INR reporting threshold.
  3. Rapid pass-through · Near-unity credit-to-debit ratio inside the same hour.
  4. Fan-in / fan-out · Many-to-one or one-to-many fund flows around a single hub account.
  5. Geographic anomaly · PIN code mismatches across customer, branch and address records.
  6. New account high value · Young accounts handling volume disproportionate to their tenure.
  7. Income mismatch · Transaction values that do not line up with the declared balance and income profile.
  8. Post-mobile-change spike · Activity surge in the days immediately after a mobile number update.
  9. Round-amount patterns · Overuse of exact round transfer amounts beyond the population baseline.
  10. Layered / subtle · Weak signals across many low-confidence dimensions that combine into a clear flag.
  11. Salary cycle exploitation · Laundering windows aligned with the salary credit cycle to hide inside expected flow.
  12. Branch-level collusion · Suspicious account clusters traced back to one branch or one onboarding agent.

Act III · The Ensemble

No single model
catches every mule.

LightGBM is fast on tabular data. XGBoost catches what LightGBM misses. CatBoost handles high-cardinality categoricals without one-hot blowup. The final submission averages ranks across three seeds and five folds, so a borderline account does not flip on a single fold's noise.
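The rank-averaging step fits in a few lines. This is the generic technique, not the repository's exact code:

```python
import numpy as np

def rank_average(pred_matrix: np.ndarray) -> np.ndarray:
    """Rank-average an (n_models, n_accounts) score matrix.

    Each row is converted to ranks before averaging, so no single
    model's score scale dominates, and small calibration shifts in
    one seed or fold cannot flip a borderline account."""
    ranks = pred_matrix.argsort(axis=1).argsort(axis=1).astype(float)
    ranks /= ranks.shape[1] - 1          # normalise ranks to [0, 1]
    return ranks.mean(axis=0)
```

With 3 seeds × 5 folds per model, the matrix simply grows more rows; the averaging is unchanged.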

Model 01

LightGBM

0.9834

Fast on the 125-feature Phase 1 set. Strong on velocity, structuring and salary-cycle signals. The base learner of the ensemble.

Model 02

XGBoost

0.9789

Catches the layered, multi-signal mules where LightGBM is less confident. Tuned with explicit class-imbalance weighting.

Model 03 · Winner

Ensemble

0.9851

Rank-averaged across LightGBM and XGBoost. Phase 2 adds CatBoost into the same pattern, taking the ensemble to a 3-model stack.

Act IV · The Features

125 features.
13 categories.

Every feature is named, sourced and ranked by SHAP importance. Phase 2 expands this to 208 features with graph signals: PageRank, HITS, betweenness centrality and Louvain communities on the counterparty graph.
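A minimal sketch of those Phase 2 graph signals with NetworkX, on a toy edge list standing in for the real counterparty graph:

```python
import networkx as nx

def graph_features(edges):
    """Per-node counterparty-graph signals: PageRank, HITS,
    betweenness and Louvain community id. `edges` is an iterable
    of (payer, payee) pairs, a stand-in for the transaction list."""
    G = nx.DiGraph(edges)
    pr = nx.pagerank(G)                       # who accumulates flow
    hubs, auths = nx.hits(G)                  # fan-out hubs vs fan-in sinks
    btw = nx.betweenness_centrality(G)        # pass-through brokers
    comms = nx.community.louvain_communities(G.to_undirected(), seed=0)
    comm_of = {n: i for i, c in enumerate(comms) for n in c}
    return {n: {"pagerank": pr[n], "hub": hubs[n], "authority": auths[n],
                "betweenness": btw[n], "community": comm_of[n]}
            for n in G}
```

On a fan-in / fan-out motif, the hub mule lights up on betweenness while the victim and exfiltration accounts stay near zero.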

Category | Count | Example features
Transaction aggregation | 7 | txn_count · mean_amount · std_amount
Structuring detection | 7 | near_50k_rate · round_10k_rate
Velocity & burstiness | 10 | min_gap_hrs · med_gap_hrs · burstiness
Graph & network | 8 | n_unique_counterparties · cp_per_txn
Channel usage | 12 | ch_UPD_rate · ch_CHQ_rate · ch_ATW_rate
Unsupervised / anomaly | 18 | digital_score · kyc_score
Demographics & product | 37 | account_age · product_holdings · branch_class
Six other categories | 26 | temporal · geo · MCC · scheme · post-event · cohort

Act V · Proof

Held against the private set.

Phase 1 · OOF AUC-ROC

0.9851

Out-of-fold score on the 5-fold cross-validation. LightGBM × XGBoost rank-averaged ensemble on 125 engineered features.

Phase 2 · Public AUC-ROC

0.968136

Final RBIHub leaderboard, public set. 3-model ensemble: LightGBM + XGBoost + CatBoost. 208 features, 3-seed × 5-fold CV, rank averaging.

Phase 2 · Private AUC-ROC

0.955815

The held-out private leaderboard score. The drop from public to private is small: the model generalises, and the private score is the one that actually matters in production.

Reproducible from one command

run_v3.py

Pipeline orchestrator runs end-to-end: features, label cleaning, ensemble, submission. Skip-Optuna flag reproduces the exact submitted score from a clean clone.
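A hypothetical shape for that entry point. The real flag names in run_v3.py are not documented here, so everything below is an assumption that only illustrates the skip-Optuna idea:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI for the orchestrator; flag names are
    assumptions, not the repository's actual interface."""
    p = argparse.ArgumentParser(description="NFPC pipeline orchestrator")
    p.add_argument("--skip-optuna", action="store_true",
                   help="reuse committed best params instead of re-searching, "
                        "so a clean clone reproduces the submitted score")
    p.add_argument("--seeds", type=int, nargs="+", default=[0, 1, 2])
    p.add_argument("--folds", type=int, default=5)
    return p
```

The point of the flag: Optuna search is the only non-deterministic, hours-long step, so bypassing it with frozen parameters makes the exact score reproducible.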

SHAP-explainable

125 → 1

Every flagged account ships with a per-feature SHAP contribution. The fraud team sees why the model said yes, not just that it did.
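SHAP's local-accuracy property, the reason per-feature contributions add up to the model's verdict, is easiest to see in the linear case, where the values have a closed form. Tree ensembles get the same per-feature breakdown from shap.TreeExplainer; this sketch only shows the shape of the explanation:

```python
import numpy as np

def linear_shap(w: np.ndarray, X: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Exact SHAP values for a linear model f(x) = w·x + b:
    phi_i = w_i * (x_i - mean_i). Local accuracy guarantees
    phi.sum() == f(x) - E[f(X)], i.e. the contributions explain
    exactly how far this account sits from the average score."""
    return w * (x - X.mean(axis=0))

# Toy model and data (assumptions, for illustration only).
w = np.array([2.0, -1.0, 0.5])
X = np.random.default_rng(0).normal(size=(100, 3))
x = np.array([1.0, 0.0, -2.0])
phi = linear_shap(w, X, x)
```

The fraud team reads phi sorted by magnitude: the top entries are the features that pushed this account's score away from the baseline.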

RBIH × IIT Delhi TRYST 2025

Team dmj.one

Built for the National Fraud Prevention Challenge hosted by the Reserve Bank Innovation Hub in association with IIT Delhi TRYST.

The Stack

Tabular ML, done seriously.

  • Python
  • LightGBM
  • XGBoost
  • CatBoost
  • SHAP
  • Optuna
  • Confident Learning
  • Pandas / Polars
  • Parquet
  • NetworkX (graph features)
  • Louvain communities
  • PageRank / HITS
  • Next.js (showcase)
  • Vercel

If a hackathon model can score 0.985,
your production fraud team can do better.

I build explainable, ensemble-based fraud and risk models on real Indian financial data. Banks, payment processors and central-bank innovation hubs: this is the engineer to call.