RULEST v2.0 · GPU‑ACCELERATED RULE EXTRACTION

Rulest

Extract high‑probability Hashcat rules by analyzing transformation patterns between base and target wordlists. Rulest combines four‑phase GPU extraction, built‑in seed families (A–M), an optional Token‑Strip pre‑pass (Phase 0), an optional Genetic Algorithm (Phase 3), and functional minimization, making it the most advanced rule extractor in the suite.

OpenCL GPU · Four‑Phase ruleflow · Seed Families A–M · Functional Minimization · Bloom Filter VRAM · 5,600+ Validated Ops · Token‑Strip Phase 0 · Genetic Algorithm Phase 3

Overview

Rulest v2 replaces naive BFS chaining with a GPU‑first four‑phase extraction engine. Given a base wordlist (source) and a target wordlist (dictionary), it reverse‑engineers the Hashcat rules that transform base words into target words — using OpenCL parallelism, a VRAM Bloom filter, four distinct extraction phases, and an optional Genetic Algorithm for deep‑chain discovery.

The result is a production‑ready .rule file, ordered by hit count and minimized via signature‑based functional deduplication. The output is compatible with Hashcat’s GPU engine: chains are capped at 31 ops and rejection rules are excluded. An optional Phase 0 Token‑Strip CPU pre‑pass reverse‑engineers exact chains from target passwords using 14 extraction modes before any GPU work begins.
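
As a minimal illustration of the task described above, the following Python sketch applies a handful of Hashcat rule functions on the CPU and counts which candidate rules map base words onto target words. The helper names `apply_rule` and `rank_rules` are hypothetical and not part of Rulest. The sketch assumes space‑separated ops and covers only `l`, `u`, `c`, `r`, `$X`, `^X`, and `sXY`.

```python
def apply_rule(word: str, rule: str) -> str:
    """Apply a space-separated chain of a few Hashcat rule functions to one word."""
    for op in rule.split():
        name, args = op[0], op[1:]
        if name == "l":            # lowercase all
            word = word.lower()
        elif name == "u":          # uppercase all
            word = word.upper()
        elif name == "c":          # capitalize first, lowercase the rest
            word = word[:1].upper() + word[1:].lower()
        elif name == "r":          # reverse
            word = word[::-1]
        elif name == "$":          # append one character
            word = word + args
        elif name == "^":          # prepend one character
            word = args + word
        elif name == "s":          # replace every X with Y
            word = word.replace(args[0], args[1])
        else:
            raise ValueError(f"unsupported op in this sketch: {op}")
    return word


def rank_rules(base_words, targets, candidates):
    """Count, per candidate rule, how many base words land in the target set."""
    target_set = set(targets)
    hits = {}
    for rule in candidates:
        count = sum(1 for w in base_words if apply_rule(w, rule) in target_set)
        if count:
            hits[rule] = count
    # Most effective rules first (stable for ties).
    return sorted(hits.items(), key=lambda item: -item[1])


base = ["password", "dragon", "monkey"]
targets = ["Password1", "Dragon1", "p@ssw0rd", "yeknom"]
ranked = rank_rules(base, targets, ["c $1", "sa@ so0", "r", "u"])
print(ranked)  # [('c $1', 2), ('sa@ so0', 1), ('r', 1)]
```

Rulest performs this search on the GPU over thousands of ops and multi‑op chains; the brute‑force loop here only shows what "a rule that hits" means.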

Core Features
Four‑phase extraction ruleflow
Phase 0 Token‑Strip (optional CPU) → Phase 1 (single‑rule sweep) → Phase S (built‑in seed families) → Phase 2 (hot‑rule biased chains) → Phase 3 GA (optional).
🌱 Built‑in seed families (A–M)
Thirteen deterministic families: digits, date patterns, special chars, leet substitutions (J), double‑transform chains (K), special‑before‑digit (L), leet+transform combos (M).
🧠 Bloom filter GPU lookup
FNV‑1a hash‑based membership test in VRAM (16–256 MB) for fast target validation on the GPU.
🔬 Functional minimization
Signature‑based deduplication using probe words — removes 20–60% of equivalent rules without losing coverage.
🗜 Phase 0: Token‑Strip (14 modes)
Optional CPU pre‑pass: reverse‑engineers exact rule chains from target passwords using 14 extraction modes with multiprocessing. Singles → Phase 1; chains → Phase S + Phase 2.
🧬 Phase 3: Genetic Algorithm
Optional evolutionary search (--genetic) with 2× novelty bonus for new chains, 20% time reservation, stagnation guard, and tournament selection + crossover + mutation.
📊 Hit counting & ranking
Per‑rule hit frequency from GPU validation; final output sorted by real coverage.
Extraction ruleflow (v2)
Phase 0
Token‑Strip pre‑pass (optional)
CPU‑only pre‑pass (--token-strip) with 14 extraction modes and multiprocessing. Reverse‑engineers exact rule chains from target passwords; singles → Phase 1; chains → Phase S + Phase 2.
Phase 1
Single‑rule sweep
5,600+ validated single rules applied to every base word; Bloom filter checks membership in target. Collects effective single rules and hit counts.
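
The Bloom‑filter membership check used in Phase 1 can be sketched on the CPU as follows. This is an illustration only: the `BloomFilter` class, the number of hash functions, and the seeded second hash used for double hashing are assumptions, not Rulest's actual OpenCL kernel. Only the 32‑bit FNV‑1a constants are standard.

```python
FNV_OFFSET = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193   # standard 32-bit FNV prime


def fnv1a_32(data: bytes, seed: int = 0) -> int:
    """32-bit FNV-1a; the optional seed is an assumption used to derive a second hash."""
    h = (FNV_OFFSET ^ seed) & 0xFFFFFFFF
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


class BloomFilter:
    def __init__(self, n_bits: int, n_hashes: int = 4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, word: str):
        data = word.encode("utf-8")
        h1 = fnv1a_32(data)
        h2 = fnv1a_32(data, seed=0x9E3779B9) | 1  # odd step for double hashing
        return [(h1 + i * h2) % self.n_bits for i in range(self.n_hashes)]

    def add(self, word: str) -> None:
        for p in self._positions(word):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, word: str) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(word))


bloom = BloomFilter(n_bits=1 << 20)
for target in ["Password1", "Dragon1", "p@ssw0rd"]:
    bloom.add(target)

print("Password1" in bloom)  # True: a Bloom filter never yields false negatives
```

A Bloom filter can report false positives, so a positive lookup means "probably in the target list"; how Rulest handles that case is not covered here.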
Phase S
Built‑in seed families (A–M)
Thirteen predefined families covering digits, date patterns, special chars, leet substitutions (J), double‑transform chains (K), special‑before‑digit (L), and leet+transform combos (M). Depths 2–9 are tested in a dedicated GPU pass.
Phase 2
Hot‑biased chain generation
Chain builder with 60% hot‑rule bias (using Phase 1 winners), 30% seed extension, 10% random exploration. Respects per‑depth budgets and max 31 ops.
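
The 60/30/10 split can be pictured as a weighted choice between three proposal strategies. The sketch below is a hedged illustration: `propose_chain`, its arguments, and the way a seed chain is padded with hot rules are assumptions, and it treats each rule as a single op so that depth equals chain length.

```python
import random
from collections import Counter

MAX_OPS = 31  # chain length cap stated in the text


def propose_chain(hot_rules, seed_chains, all_ops, depth, rng):
    """Return (strategy, chain) using a 60% hot / 30% seed / 10% random split."""
    depth = min(depth, MAX_OPS)
    roll = rng.random()
    if roll < 0.60:
        # Hot bias: build the chain from Phase 1 winners only.
        return "hot", [rng.choice(hot_rules) for _ in range(depth)]
    if roll < 0.90:
        # Seed extension: start from a seed chain and pad it with hot rules.
        chain = list(rng.choice(seed_chains))[:depth]
        while len(chain) < depth:
            chain.append(rng.choice(hot_rules))
        return "seed", chain
    # Random exploration over the full op set.
    return "random", [rng.choice(all_ops) for _ in range(depth)]


hot = ["c", "$1", "$!"]
seeds = [["c", "$1"], ["sa@", "u"]]
ops = ["l", "u", "c", "r", "d", "$1", "$2", "$!", "^1"]

rng = random.Random(0)
counts = Counter(propose_chain(hot, seeds, ops, 3, rng)[0] for _ in range(20000))
print({k: round(v / 20000, 2) for k, v in counts.items()})
```

The per‑depth budgets mentioned above would bound how many times such a proposal function is called for each depth.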
Phase 3
Genetic Algorithm (optional)
Evolutionary search (--genetic) with 2× novelty bonus, dedicated 20% time reservation (min 120s), stagnation guard, tournament selection + crossover + mutation.
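
The ingredients named for Phase 3 (novelty bonus, tournament selection, crossover, mutation, stagnation guard) fit together roughly as in the toy loop below. Everything here is a sketch under assumptions: the function names, population handling, mutation rate, and patience value are hypothetical, and the `count_hits` callback stands in for GPU validation.

```python
import random

MAX_OPS = 31          # chain length cap stated in the text
NOVELTY_BONUS = 2.0   # 2x score for chains not seen before


def fitness(chain, count_hits, known):
    score = float(count_hits(chain))
    return score * NOVELTY_BONUS if tuple(chain) not in known else score


def tournament(population, scores, rng, k=3):
    """Pick the best of k randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), min(k, len(population)))
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Join a non-empty prefix of a with a (possibly empty) suffix of b."""
    child = a[: rng.randint(1, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_OPS]


def mutate(chain, ops, rng, rate=0.2):
    return [rng.choice(ops) if rng.random() < rate else op for op in chain]


def evolve(population, ops, count_hits, rng, generations=30, patience=5):
    known, best, stale = set(), 0.0, 0
    for _ in range(generations):
        scores = [fitness(c, count_hits, known) for c in population]
        known.update(tuple(c) for c, s in zip(population, scores) if s > 0)
        if max(scores) > best:
            best, stale = max(scores), 0
        else:
            stale += 1
            if stale >= patience:  # stagnation guard
                break
        population = [
            mutate(crossover(tournament(population, scores, rng),
                             tournament(population, scores, rng), rng), ops, rng)
            for _ in population
        ]
    return known


ops = ["l", "u", "c", "$1", "$2"]
start = [["l"], ["$1"], ["c", "$2"], ["u", "$1"]]
# Toy fitness: pretend every "$1" in a chain produces one hit.
found = evolve(start, ops, lambda chain: chain.count("$1"), random.Random(1))
print(len(found), "effective chains discovered")
```

Because a chain loses its bonus once it is recorded in `known`, the loop is pushed toward chains it has not scored before, which is the intent of a novelty weighting.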
Post‑processing
Minimization & export
Functional signatures are computed by applying each rule to a set of probe words; candidate sets above 500k use SQLite‑backed dedup; the highest‑hit rule in each equivalence class is kept. The result is exported as a frequency‑sorted .rule file.
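
Signature‑based minimization can be sketched as follows: two rules that produce identical output on every probe word are treated as equivalent, and only the one with the most hits survives. The names `signature` and `minimize`, the probe words, and the tiny op subset are assumptions for illustration; the SQLite‑backed path for large candidate sets is not shown.

```python
def apply_rule(word: str, rule: str) -> str:
    """Tiny interpreter for a few Hashcat ops (l, u, c, $X, ^X), space-separated."""
    for op in rule.split():
        if op == "l":
            word = word.lower()
        elif op == "u":
            word = word.upper()
        elif op == "c":
            word = word[:1].upper() + word[1:].lower()
        elif op[0] == "$":
            word = word + op[1:]
        elif op[0] == "^":
            word = op[1:] + word
        else:
            raise ValueError(f"unsupported op in this sketch: {op}")
    return word


def signature(rule: str, probes) -> tuple:
    """A rule's functional signature: its outputs on every probe word."""
    return tuple(apply_rule(p, rule) for p in probes)


def minimize(hit_counts: dict, probes) -> list:
    """Keep the highest-hit rule per signature, sorted by hits descending."""
    best = {}
    for rule, hits in hit_counts.items():
        sig = signature(rule, probes)
        if sig not in best or hits > best[sig][1]:
            best[sig] = (rule, hits)
    return sorted(best.values(), key=lambda rh: -rh[1])


probes = ["Password", "abc123", "HELLO"]
candidates = {"c": 10, "l c": 4, "u": 7, "l u": 2, "$1": 5}
print(minimize(candidates, probes))
# [('c', 10), ('u', 7), ('$1', 5)]
```

Here `l c` collapses into `c` and `l u` into `u` because they behave identically on all probes. Probe‑based equivalence is only as strong as the probe set, which is why a varied set of probe words matters.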
Built‑in Seed Families (Phase S) — A through M
A – Pure Prepend
Digits prepended (^digit) depths 1–4
B – Pure Append
Digits appended ($digit) depths 1–4
C – Mixed Prepend/Append
All combos of ^d and $d depths 1–4
D – Transform + Digit/Bracket
l,u,c,C,t,… + 1‑4 digits or [ ]
E – Date Patterns
DDMM, MMDD, YYYY, DDMMYY, DDMMYYYY + transforms (up to depth 9)
F – Append Special Chars
Top‑15 specials appended ($char) depths 1‑3
G – Prepend Special Chars
Top‑15 specials prepended (^char) depths 1‑3
H – Transform + Special Char
Transform + 1‑2 specials (append/prepend), depths 2‑3
I – Digit(s) + Special Char
Digits + core specials (!@#$%*?) depths 2–4
J – Leet Substitutions
Top‑10 leet pairs (sa@, se3, so0, si1, sl1, ss5, ss$, st7, sa4, si!) — pure + digit/special combos
K – Double‑Transform Chains
All 225 ordered pairs of structural transforms (l, u, c, C, t, r, d, f, E, k, K, {, }, [, ])
L – Special‑before‑Digit
Special char first, then digits — covers word!12 and !12word patterns (depths 2‑3)
M – Leet + Transform
Every leet op paired with every structural transform in both orderings (≈300 chains)

Built‑in seeds can be disabled with --no-builtin-seeds. When enabled, they run as a dedicated phase and are also forwarded to Phase 2 as scaffolding for deeper chains.
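
The chain counts quoted for families K and M follow directly from the listed ops, as this short Python sketch shows. The variable names and the space‑separated chain format are assumptions; the op lists are copied from the family descriptions above.

```python
from itertools import product

# The 15 structural transforms listed for family K.
TRANSFORMS = ["l", "u", "c", "C", "t", "r", "d", "f", "E", "k", "K", "{", "}", "[", "]"]
# The 10 leet substitutions listed for family J.
LEET = ["sa@", "se3", "so0", "si1", "sl1", "ss5", "ss$", "st7", "sa4", "si!"]

# Family K: every ordered pair of transforms, including a transform paired with itself.
family_k = [f"{a} {b}" for a, b in product(TRANSFORMS, repeat=2)]

# Family M: each leet op with each transform, in both orderings.
family_m = [f"{s} {t}" for s, t in product(LEET, TRANSFORMS)]
family_m += [f"{t} {s}" for s, t in product(LEET, TRANSFORMS)]

print(len(family_k))  # 225
print(len(family_m))  # 300
```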

Toolchain Integration & Workflow

Rulest v2 integrates with the rest of the A113L ecosystem:

Rulest (v2)
Produces a ranked, minimized .rule file with hit counts and functional dedup.
Minimizer (optional)
For cross‑corpus signature minimization across multiple rule sets.
Ranker / Aether
Re‑benchmark or visualize rule performance on different validation sets.
💡 Tip: Rulest v2 already removes functionally equivalent rules via signature minimization (21+ probe words). To further deduplicate across multiple runs, pipe the output through Minimizer with a unified probe set.
# Typical workflow
python rulest_v2.py rockyou.txt targets.txt -d 3 -o extracted.rule
# Optional cross‑minimization
python minimizer.py --probe-words probes.txt extracted.rule final.rule
⚙️ GPU & Performance
  • VRAM‑aware batch sizing (baseline 8 GB, scales down to 4 GB)
  • Dynamic Bloom filter: 16–256 MB based on target size
  • Multi‑device support: --list-devices, --device index/name
  • Per‑depth chain budgets (--depth2-chains through --depth10-chains)
  • Max rule ops: 31 (Hashcat GPU limit), rejection rules automatically excluded
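
One plausible way to size a Bloom filter "based on target size" within the 16–256 MB range is the standard optimal‑size formula, clamped to those bounds. This is a sketch under assumptions, not Rulest's actual sizing logic: the function name `bloom_size_mb` and the 0.1% false‑positive target are hypothetical.

```python
import math

MIN_MB, MAX_MB = 16, 256  # bounds stated in the list above


def bloom_size_mb(n_targets: int, fp_rate: float = 0.001) -> int:
    """Optimal Bloom size m = -n * ln(p) / (ln 2)^2 bits, clamped to 16-256 MB."""
    bits = -n_targets * math.log(fp_rate) / (math.log(2) ** 2)
    megabytes = math.ceil(bits / 8 / (1024 * 1024))
    return max(MIN_MB, min(MAX_MB, megabytes))


for n in (1_000, 14_344_391, 10**10):
    print(n, "targets ->", bloom_size_mb(n), "MB")
```

Small target lists fall back to the 16 MB floor, and very large ones are capped at 256 MB, at the cost of a higher false‑positive rate than the one requested.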
📦 Requirements & Installation
Python ≥3.8 · numpy · pyopencl · tqdm
# Clone & install
git clone https://github.com/A113L/rulest.git
pip install numpy pyopencl tqdm

OpenCL 1.2+ GPU (NVIDIA, AMD, Intel). CPU fallback is supported but slow.

Quick Examples
Basic extraction — depth 2
python rulest_v2.py rockyou.txt target.txt -d 2 -o myrules.rule
Specific GPU device + custom chain budget + time limit
python rulest_v2.py base.txt dict.txt --device 0 --depth3-chains 80000 --target-hours 1.5
External seed rules, skip built-in families, depth 4
python rulest_v2.py base.txt target.txt --seed-rules previous_seeds.txt --no-builtin-seeds -d 4
Phase 0 — Token-Strip pre-pass (14 modes, multiprocessing)
python rulest_v2.py rockyou.txt target.txt --token-strip --max-depth 4 --target-hours 1.0 -o with_ts.rule
Phase 3 — Genetic Algorithm (novelty-weighted, 20% time reservation)
python rulest_v2.py base.txt target.txt --max-depth 3 --target-hours 1.5 --genetic -o evolved.rule

The output file includes a header with total candidates, minimization stats, and per‑depth rule counts. Rules are sorted by GPU hit frequency.

Get the Source

Rulest is open source, MIT licensed, and hosted on GitHub.

View rulest on GitHub