Done

  • Reads & rewrites real obfuscated JS

    Parses full modern JavaScript and repeats rewrite rounds until the code stops changing (a fixed point).

  • Control-flow recovery

    Un-flattens the main dispatcher styles (state-machine while/switch loops) back into normal linear code.

  • Decoder lifting

    Runs string/decoder functions in a safe sandbox and bakes the real values back into the source.

  • Constant & expression cleanup

    Folds constants, inlines throwaway variables, and collapses operator wrappers and obfuscated math.

  • Dead code removal

    Strips unused variables, dead stores, and uncalled methods.

  • Readable naming

    Gives surviving variables meaningful names based on how they are used.

  • Soundness guarantees

    Every pass bails out rather than guessing when a rewrite is not clearly safe; output is verified to behave identically to the input on the real-sample corpus.

  • Robustness

    Adversarial input can’t crash the host or emit half-rewritten code; bad input degrades gracefully.

  • Test coverage

    Per-feature unit tests, byte-exact regression snapshots, a readability scoreboard, and whole-pipeline behavior-equivalence on 15 real anti-bot samples (all converge cleanly).
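
As an illustration of the control-flow recovery and behavior-equivalence items above, here is a minimal sketch of a flattened function and the linear code it should recover to. The function names, the order string "2|0|1", and the arithmetic are hypothetical examples, not taken from the sample corpus or from the tool's own output.

```javascript
// Flattened form: a state-machine while/switch dispatcher, in the shape an
// obfuscator might emit. The order string "2|0|1" is a made-up example.
function flattened(x) {
  const order = "2|0|1".split("|");
  let i = 0;
  let result;
  while (true) {
    switch (order[i++]) {
      case "0":
        result = result * 3;
        continue; // back to the dispatcher loop
      case "1":
        return result - 1;
      case "2":
        result = x + 2;
        continue;
    }
    break; // unknown state: leave the loop
  }
}

// Recovered form: the same three blocks laid out in execution order (2, 0, 1).
function recovered(x) {
  let result = x + 2;
  result = result * 3;
  return result - 1;
}

// Spot check that both forms agree on a set of inputs.
// Object.is is used so that NaN results compare as equal.
function agreeOn(inputs) {
  return inputs.every((x) => Object.is(flattened(x), recovered(x)));
}
```

A check like `agreeOn([-5, 0, 1, 2.5])` only samples behavior on a handful of inputs; it shows the idea behind equivalence testing and is not a proof of soundness.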

In progress

  • Unified control-flow recovery engine

    A single generic model that can absorb every flattening style instead of one recognizer per dialect. The substrate is built and tested but not yet wired into the main pipeline; it will be switched on once its output matches the current pipeline's output exactly.
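
The switch-on condition can be pictured as a parity gate. The sketch below assumes both engines can be called as plain functions from source text to rewritten source text. `parityReport`, `legacyRecover`, and `unifiedRecover` are hypothetical names for illustration, not the project's real API.

```javascript
// Hypothetical parity gate: compares the new engine's output with the
// current engine's output, sample by sample, using exact string equality.
function parityReport(samples, legacyRecover, unifiedRecover) {
  const mismatches = [];
  for (const [name, source] of Object.entries(samples)) {
    let candidate;
    try {
      candidate = unifiedRecover(source);
    } catch (err) {
      // A crash counts as a mismatch rather than aborting the whole run.
      mismatches.push({ name, reason: "threw: " + err.message });
      continue;
    }
    if (candidate !== legacyRecover(source)) {
      mismatches.push({ name, reason: "output differs" });
    }
  }
  // The gate opens only when every sample matches exactly.
  return { ready: mismatches.length === 0, mismatches };
}
```

Passing the engines in as arguments keeps the gate independent of how either one is implemented, so the same check could run unchanged while the new engine evolves.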

Next up

  • Nested decoder lifting

    Handle decoders wrapped inside other decoders (multi-level indirection).

  • More obfuscator dialects

    Onboard new flattening styles through the unified engine, with no per-sample rules.

  • Harder loop dispatchers

    Cover the looping and partial dispatcher cases that the current recovery declines to rewrite.
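
To make the nested-decoder item concrete, here is a small sketch of two-level indirection and the output that lifting would aim for. The string table, the 0x10 index offset, and the character shift are invented for illustration and do not come from a real sample.

```javascript
// Hypothetical string table: "hello" and "world" with every char code + 1.
const table = ["ifmmp", "xpsme"];

// Level 1: table lookup behind an index offset.
function inner(index) {
  return table[index - 0x10];
}

// Level 2: wraps the inner decoder and undoes the character shift.
function outer(index, shift) {
  return inner(index).replace(/./g, (c) =>
    String.fromCharCode(c.charCodeAt(0) - shift)
  );
}

// Obfuscated call sites: each string needs both decoder levels to resolve.
function obfuscated() {
  return outer(0x10, 1) + " " + outer(0x11, 1);
}

// Desired result after lifting: both levels evaluated, literal baked in.
function lifted() {
  return "hello world";
}
```

Resolving `outer` here requires evaluating `inner` as well, which is the multi-level case this roadmap item describes.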

Out of scope (for now)

  • Cross-function analysis (the engine works one function at a time).
  • Truly irreducible control flow (handled safely by leaving it alone).