Roadmap
Done
- Reads & rewrites real obfuscated JS
  Parses full modern JavaScript and repeats rewrite rounds until the code stops changing (a fixed point).
- Control-flow recovery
  Un-flattens the main dispatcher styles (state-machine while/switch loops) back into normal linear code.
- Decoder lifting
  Runs string/decoder functions in a safe sandbox and bakes the real values back into the source.
- Constant & expression cleanup
  Folds constants, inlines throwaway variables, and collapses operator wrappers and obfuscated math.
- Dead code removal
  Strips unused variables, dead stores, and uncalled methods.
- Readable naming
  Gives surviving variables meaningful names based on how they are used.
- Soundness guarantees
  Every pass bails out rather than guess; output is verified to behave identically to the input on the real-sample corpus.
- Robustness
  Adversarial input can’t crash the host or emit half-rewritten code; bad input degrades gracefully.
- Test coverage
  Per-feature unit tests, byte-exact regression snapshots, a readability scoreboard, and whole-pipeline behavior-equivalence on 15 real anti-bot samples (all converge cleanly).
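
To make the control-flow recovery and behavior-equivalence items concrete, here is a small hand-written sketch of a state-machine while/switch dispatcher next to the linear code it hides. It is an illustration only, not output from this tool; the function names (`flattened`, `recovered`, `sameOnInputs`) are hypothetical, and comparing results on a handful of inputs is a spot check, not a proof of equivalence.

```javascript
// Hand-written illustration; not produced by the tool.
// A flattened function: the real statement order is hidden in the "2|0|1" string.
function flattened(x) {
  const order = "2|0|1".split("|");
  let i = 0;
  let y, z;
  while (true) {
    switch (order[i++]) {
      case "0":
        z = y * 2;
        continue; // jump back to the dispatcher
      case "1":
        return z - x;
      case "2":
        y = x + 3;
        continue;
    }
    break; // no case matched: leave the loop
  }
}

// The same logic after un-flattening: cases replayed in dispatch order 2, 0, 1.
function recovered(x) {
  const y = x + 3;
  const z = y * 2;
  return z - x;
}

// Hypothetical spot check: both versions must agree on every sample input.
function sameOnInputs(f, g, inputs) {
  return inputs.every((v) => Object.is(f(v), g(v)));
}

const equivalent = sameOnInputs(flattened, recovered, [-2, 0, 1, 7, 1.5]);
console.log(equivalent); // true
```

The check at the end mirrors, in miniature, the idea of comparing input and output behavior on real samples.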
In progress
- Unified control-flow recovery engine
  A single generic model that can absorb every flattening style instead of one recognizer per dialect. The substrate is built and tested but not yet wired into the main pipeline; it will be switched on once its output matches the current pipeline’s output exactly.
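
One way to picture a generic model is a transition table: every flattening dialect is first normalized into "state, body, next state", and a single routine then replays the states in order. The sketch below assumes that shape purely for illustration; the `linearize` function and the table layout are hypothetical and are not taken from the engine's actual design. It also shows the bail-out idea: an unknown state or a revisited state (a loop) returns `null` instead of a guess.

```javascript
// Hypothetical intermediate form: Map of state -> { body, next }.
// `body` is the source text of one dispatcher case; `next` is the state
// it transfers to, or null when the function exits.
function linearize(table, entry) {
  const seen = new Set();
  const out = [];
  let state = entry;
  while (state !== null) {
    const block = table.get(state);
    // Unknown target or a revisit (a cycle): decline rather than guess.
    if (block === undefined || seen.has(state)) return null;
    seen.add(state);
    out.push(block.body);
    state = block.next;
  }
  return out;
}

// A straight-line dispatcher: states run in the order 2, 0, 1.
const straight = new Map([
  ["2", { body: "y = x + 3;", next: "0" }],
  ["0", { body: "z = y * 2;", next: "1" }],
  ["1", { body: "return z - x;", next: null }],
]);

// A looping dispatcher: state 0 jumps back to itself.
const looping = new Map([["0", { body: "i++;", next: "0" }]]);

console.log(linearize(straight, "2")); // [ 'y = x + 3;', 'z = y * 2;', 'return z - x;' ]
console.log(linearize(looping, "0")); // null
```

With a shared table like this, supporting a new dialect would mean writing only the normalization step, while the replay logic stays the same.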
Next up
- Nested decoder lifting
  Handle decoders wrapped inside other decoders (multi-level indirection).
- More obfuscator dialects
  Onboard new flattening styles through the unified engine, with no per-sample rules.
- Harder loop dispatchers
  Cover the looping and partial dispatcher cases that the current recovery declines to handle.
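
For the nested decoder item above, the following sketch shows what two levels of indirection can look like and what "lifting" would produce. Everything here is made up for illustration: the `strings` table, the `inner` and `outer` decoders, and the regex-based replacement are assumptions, and the decoders are called directly rather than inside a sandbox. A real implementation would work on the syntax tree, not on source text.

```javascript
// Hypothetical obfuscated string table: each character is shifted up by one.
const strings = ["ifmmp", "xpsme"];

// Level 1: index lookup with a constant offset.
function inner(i) {
  return strings[i - 0x10];
}

// Level 2: wraps the first decoder and undoes the character shift.
function outer(i) {
  return inner(i + 0x10)
    .split("")
    .map((c) => String.fromCharCode(c.charCodeAt(0) - 1))
    .join("");
}

// Lifting: evaluate each outer(<number>) call and splice in the literal result.
const obfuscated = 'console.log(outer(0) + " " + outer(1));';
const lifted = obfuscated.replace(/outer\((\d+)\)/g, (_, n) =>
  JSON.stringify(outer(Number(n)))
);

console.log(lifted); // console.log("hello" + " " + "world");
```

Lifting only the inner decoder would leave the shifted strings in place, which is why both levels have to be resolved together.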
Out of scope (for now)
- Cross-function analysis (the engine works one function at a time).
- Truly irreducible control flow (handled safely by leaving it alone).