RULEST v2.0 · GPU‑ACCELERATED RULE EXTRACTION

Rulest

Extract high‑probability Hashcat rules by analyzing transformation patterns between base and target wordlists. Three‑phase GPU extraction, built‑in seed families (A–I), and functional minimization — no BFS, no blind chaining.

OpenCL GPU · Three‑Phase · Seed Families A–I · Functional Minimization · Bloom Filter VRAM · 5,600+ Validated Ops

Overview

Rulest v2 replaces naive BFS chaining with a GPU‑first extraction engine. Given a base wordlist (source) and a target wordlist (dictionary), it reverse‑engineers the Hashcat rules that transform base words into target words — using OpenCL parallelism, a VRAM Bloom filter, and three distinct extraction phases.

The result is a production‑ready .rule file, ordered by hit count and minimized via signature‑based functional deduplication. The output is compatible with Hashcat’s GPU engine: every rule has at most 31 operations, and rejection rules are excluded.

Core Features

⚡

Three‑phase extraction

Phase 1 (single‑rule sweep) → Phase S (built‑in seed families) → Phase 2 (hot‑rule biased chains).

🌱

Built‑in seed families (A–I)

Nine deterministic families: digits, date patterns (DDMM, YYYY, DDMMYYYY), special chars, transforms + brackets.

🧠

Bloom filter GPU lookup

FNV‑1a hash‑based membership test in VRAM (16–256 MB) for instant target validation.
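As a rough illustration of the membership test, the sketch below builds a small CPU-side Bloom filter on top of 32-bit FNV‑1a. It is not Rulest's OpenCL kernel: the number of hash functions, the double-hashing scheme, and the second hash basis are assumptions chosen for the example.

```python
FNV_OFFSET_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes, basis: int = FNV_OFFSET_32) -> int:
    """32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = basis
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


class BloomFilter:
    """Bit-array Bloom filter; probe positions come from double hashing."""

    def __init__(self, size_bits: int, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes  # assumed value, not taken from Rulest
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, word: bytes):
        h1 = fnv1a_32(word)
        # Second hash uses a different basis (an assumption); forcing it odd
        # keeps the stride between probe positions non-zero.
        h2 = fnv1a_32(word, basis=0x9E3779B9) | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, word: bytes) -> None:
        for pos in self._positions(word):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, word: bytes) -> bool:
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(word)
        )


# Load a toy target list into a 1 Mbit filter.
targets = BloomFilter(size_bits=1 << 20)
for target_word in (b"Password1", b"dragon123"):
    targets.add(target_word)
```

A word that was added always tests as present. A word that was never added usually tests as absent, but a Bloom filter can return false positives, so a hit is a strong hint rather than proof of membership.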

🔬

Functional minimization

Signature‑based deduplication using probe words: removes 20–60% of rules as functional duplicates without losing coverage.

📊

Hit counting & ranking

Per‑rule hit frequency from GPU validation; final output sorted by real coverage.

Extraction Pipeline (v2)

Phase 1

Single‑rule sweep

5,600+ validated single rules applied to every base word; Bloom filter checks membership in target. Collects effective single rules and hit counts.
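The sweep can be pictured with a small CPU-only sketch. It assumes a tiny interpreter covering only a handful of Hashcat functions (`:`, `l`, `u`, `c`, `r`, `$X`, `^X`) and uses a Python set in place of the VRAM Bloom filter; the helper names `apply_rule` and `single_rule_sweep` are hypothetical and do not come from Rulest.

```python
from collections import Counter


def apply_rule(rule: str, word: str) -> str:
    """Apply a rule string using a small subset of Hashcat functions."""
    i = 0
    while i < len(rule):
        op = rule[i]
        if op == " " or op == ":":
            pass                                   # separator / no-op
        elif op == "l":
            word = word.lower()
        elif op == "u":
            word = word.upper()
        elif op == "c":
            word = word[:1].upper() + word[1:].lower()
        elif op == "r":
            word = word[::-1]
        elif op in "$^":
            ch = rule[i + 1]                       # the function's argument
            word = word + ch if op == "$" else ch + word
            i += 1
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
        i += 1
    return word


def single_rule_sweep(base_words, targets, rules) -> Counter:
    """Count, per rule, how many base words land in the target set."""
    hits = Counter()
    for rule in rules:
        for word in base_words:
            if apply_rule(rule, word) in targets:
                hits[rule] += 1
    return hits


base_words = ["password", "dragon"]
target_set = {"Password", "password1", "dragon1", "DRAGON"}
hit_counts = single_rule_sweep(base_words, target_set, ["c", "u", "$1", "$2", "r"])
```

Here `$1` scores two hits, `c` and `u` score one each, and `$2` and `r` score none, so only the first three would carry forward as effective single rules.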

Phase S

Built‑in seed families (A–I)

Injects nine predefined families covering digits, date combos, special characters, and transform+digit patterns. Seeds of depth 2–9 are tested in a dedicated pass.

Phase 2

Hot‑biased chain generation

Chain builder with 60% hot‑rule bias (using Phase 1 winners), 30% seed extension, 10% random exploration. Respects per‑depth budgets and max 31 ops.
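One way to read the 60/30/10 split is as a weighted choice between three rule pools for each position in the chain. The sketch below follows that reading, which is an assumption: Rulest may instead extend whole seed chains, and the function name `build_chain` and the pool contents are hypothetical.

```python
import random

MAX_OPS = 31  # Hashcat GPU limit on operations per rule


def build_chain(hot_rules, seed_rules, all_rules, depth, rng):
    """Draw `depth` single-op rules: 60% hot, 30% seed, 10% random pool.

    All three pools are assumed to be non-empty lists of single-op rules.
    """
    pools = [hot_rules, seed_rules, all_rules]
    weights = [0.6, 0.3, 0.1]
    chain = []
    for _ in range(min(depth, MAX_OPS)):
        pool = rng.choices(pools, weights=weights, k=1)[0]
        chain.append(rng.choice(pool))
    return " ".join(chain)


rng = random.Random(0)            # fixed seed so a run is repeatable
hot = ["c", "$1", "u"]            # e.g. Phase 1 winners
seeds = ["$2", "$0", "^1"]        # e.g. ops drawn from seed families
everything = ["l", "r", "t", "d"] # e.g. the full validated op list
chains = [build_chain(hot, seeds, everything, depth=3, rng=rng) for _ in range(5)]
```

Each generated chain is a space-separated rule of exactly three operations here. A per-depth budget would simply cap how many chains of each depth are drawn before validation.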

Validation

GPU + Bloom filter

Every candidate rule validated on‑GPU for Hashcat compatibility; hit counters updated.

Minimization & export

Signature deduplication

Probe words from base list compute functional signatures; keep highest‑hit rule per equivalence class. Export frequency‑sorted .rule file.
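The idea behind signature deduplication can be sketched as follows: two rules that produce identical outputs on every probe word fall into the same equivalence class, and only the rule with the most hits survives. The interpreter below supports just a few Hashcat functions, and the names `signature` and `minimize` are hypothetical rather than Rulest's own.

```python
def apply_rule(rule: str, word: str) -> str:
    """Apply space-separated ops from a small Hashcat subset (l, u, c, $X, ^X)."""
    for op in rule.split():
        if op == "l":
            word = word.lower()
        elif op == "u":
            word = word.upper()
        elif op == "c":
            word = word[:1].upper() + word[1:].lower()
        elif len(op) == 2 and op[0] == "$":
            word = word + op[1]
        elif len(op) == 2 and op[0] == "^":
            word = op[1] + word
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
    return word


def signature(rule: str, probes) -> tuple:
    """A rule's functional signature: its outputs on every probe word."""
    return tuple(apply_rule(rule, probe) for probe in probes)


def minimize(hit_counts: dict, probes) -> list:
    """Keep the highest-hit rule per signature, sorted by hits descending."""
    best = {}
    for rule, hits in hit_counts.items():
        sig = signature(rule, probes)
        if sig not in best or hits > best[sig][1]:
            best[sig] = (rule, hits)
    return sorted(best.values(), key=lambda pair: -pair[1])


probes = ["Password", "dragon", "ABC123"]
hit_counts = {"c": 10, "l c": 4, "$1": 7, "u": 2, "u u": 1}
kept = minimize(hit_counts, probes)
```

In this toy input, `l c` behaves exactly like `c` and `u u` behaves exactly like `u` on all probes, so both are dropped. The probe set matters: too few or too uniform probes can merge rules that differ on real words.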

Built‑in Seed Families (Phase S) — A through I

A – Pure Prepend
Digits prepended (^digit) depths 1–4

B – Pure Append
Digits appended ($digit) depths 1–4

C – Mixed Prepend/Append
All combos of ^d and $d depths 1–4

D – Transform + Digit/Bracket
l,u,c,C,t,… + 1‑4 digits or [ ]

E – Date Patterns
DDMM, MMDD, YYYY, DDMMYY, DDMMYYYY + transforms

F – Append Special Chars
Top‑15 specials appended ($char) depths 1‑2

G – Prepend Special Chars
Top‑15 specials prepended (^char) depths 1‑2

H – Transform + Special Char
Transform + 1‑2 specials (append/prepend)

I – Digit(s) + Special Char
Digits + core specials (!@#$%*?) depths 2‑4

Built‑in seeds can be disabled with --no-builtin-seeds. When enabled, they run as a dedicated phase and are also forwarded to Phase 2 as scaffolding for deeper chains.
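To make the families concrete, here is a sketch that enumerates Families A and B and the DDMM part of Family E as Hashcat rule strings. Exhaustive enumeration, the absence of calendar validation, and the function names are assumptions for illustration. One Hashcat detail is certain: each `^X` prepends a single character, so the characters must be emitted in reverse to prepend a string in reading order.

```python
from itertools import product

DIGITS = "0123456789"


def append_rule(text: str) -> str:
    """Rule that appends `text`, one $X per character."""
    return " ".join(f"${ch}" for ch in text)


def prepend_rule(text: str) -> str:
    """Rule that prepends `text`; ^X ops are emitted in reverse order."""
    return " ".join(f"^{ch}" for ch in reversed(text))


def family_a(max_depth: int = 4):
    """Family A: pure digit prepends, depths 1..max_depth."""
    for depth in range(1, max_depth + 1):
        for combo in product(DIGITS, repeat=depth):
            yield prepend_rule("".join(combo))


def family_b(max_depth: int = 4):
    """Family B: pure digit appends, depths 1..max_depth."""
    for depth in range(1, max_depth + 1):
        for combo in product(DIGITS, repeat=depth):
            yield append_rule("".join(combo))


def family_e_ddmm():
    """Family E (DDMM only): every day 01-31 paired with every month 01-12."""
    for day in range(1, 32):
        for month in range(1, 13):
            yield append_rule(f"{day:02d}{month:02d}")


year_rule = append_rule("2024")      # "$2 $0 $2 $4"
prefix_rule = prepend_rule("123")    # "^3 ^2 ^1"
```

Under these assumptions Families A and B each contain 10 + 100 + 1,000 + 10,000 = 11,110 rules, and the DDMM pattern adds 372.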

Toolchain Integration & Workflow

Rulest v2 integrates with the rest of the A113L ecosystem:

Rulest (v2)

Produces a ranked, minimized .rule file with hit counts and functional dedup.

Minimizer (optional)

For cross‑corpus signature minimization across multiple rule sets.

Ranker / Aether

Re‑benchmark or visualize rule performance on different validation sets.

💡 Tip: Rulest v2 already removes functionally equivalent rules via signature minimization (21+ probe words). To further deduplicate across multiple runs, pipe the output through Minimizer with a unified probe set.

# Typical workflow
python rulest_v2.py rockyou.txt targets.txt -d 3 -o extracted.rule
# Optional cross‑minimization
python minimizer.py --probe-words probes.txt extracted.rule final.rule

⚙️ GPU & Performance

VRAM‑aware batch sizing (baseline 8 GB, scales down to 4 GB)
Dynamic Bloom filter: 16–256 MB based on target size
Multi‑device support: --list-devices, --device index/name
Per‑depth chain budgets (--depth2-chains … --depth10-chains)
Max rule ops: 31 (Hashcat GPU limit), rejection rules automatically excluded
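The dynamic Bloom filter size could be derived from the standard sizing formula, m = −n·ln(p) / (ln 2)² bits for n entries at false-positive rate p, clamped to the 16–256 MB range. The sketch below does that; the 0.1% target rate and the function name are assumptions, and Rulest's actual sizing logic may differ.

```python
import math

MB = 1024 * 1024


def bloom_size_bytes(n_targets: int, fp_rate: float = 0.001,
                     min_mb: int = 16, max_mb: int = 256) -> int:
    """Optimal Bloom filter size for n_targets entries, clamped to a range."""
    # Standard formula: bits = -n * ln(p) / (ln 2)^2
    bits = -n_targets * math.log(fp_rate) / (math.log(2) ** 2)
    size = math.ceil(bits / 8)
    return max(min_mb * MB, min(max_mb * MB, size))


small = bloom_size_bytes(1_000)           # clamps up to the 16 MB floor
medium = bloom_size_bytes(14_000_000)     # falls between the two bounds
huge = bloom_size_bytes(10_000_000_000)   # clamps down to the 256 MB ceiling
```

Once the ceiling is reached, adding more target words raises the false-positive rate instead of the memory footprint.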

📦 Requirements & Installation

Python ≥3.8 · numpy · pyopencl · tqdm
# clone & install
git clone https://github.com/A113L/rulest.git
pip install numpy pyopencl tqdm

OpenCL 1.2+ GPU (NVIDIA, AMD, Intel). CPU fallback supported but slow.

Quick Examples

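# Basic extraction with -d 2
python rulest_v2.py rockyou.txt target.txt -d 2 -o myrules.rule

# Select a GPU by name and set the depth-3 chain budget
python rulest_v2.py base.txt dict.txt --device "RTX 4090" --depth3-chains 80000 --target-hours 1.5

# Use custom seed rules in place of the built-in families
python rulest_v2.py base.txt target.txt --seed-rules previous_seeds.txt --no-builtin-seeds -d 4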

The output file includes a header with total candidates, minimization statistics, and per‑depth rule counts. Rules are sorted by GPU hit frequency.