@dhalf dhalf commented Nov 25, 2025

Implement EXT1, EXT2, and EXT4 Pickle Opcode Support

Problem

Fickling raises "NotImplementedError: TODO: Add support for Opcode EXT1" (and the equivalent error for EXT2 and EXT4) when analyzing pickle files that use the extension registry opcodes available in pickle protocol 2 and later.

What Are EXT Opcodes?

EXT opcodes allow pickles to reference pre-registered objects from the global extension registry using integer codes instead of full module/name paths:

  • EXT1 (0x82): 1-byte unsigned integer (0-255)
  • EXT2 (0x83): 2-byte unsigned integer (0-65,535)
  • EXT4 (0x84): 4-byte signed little-endian integer (valid codes are positive, up to 2,147,483,647)

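At load time the unpickler resolves a code through copyreg._inverted_registry, which maps integer codes to (module, name) tuples. copyreg._extension_registry is the forward mapping, from (module, name) to code, and is populated by copyreg.add_extension.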
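To make the three argument widths concrete, the sketch below hand-assembles tiny protocol-2 pickles and decodes them with the standard library's pickletools.genops, which parses opcodes without resolving them against any registry. The sample byte strings and the helper name ext_opcodes are illustrative assumptions, not part of this PR.

```python
import pickletools

# Hand-assembled protocol-2 pickles: PROTO 2, one EXT opcode, STOP.
SAMPLES = {
    "EXT1": b"\x80\x02" + b"\x82" + b"\x2a" + b".",              # 1-byte code: 42
    "EXT2": b"\x80\x02" + b"\x83" + b"\x2a\x01" + b".",          # 2-byte little-endian code: 298
    "EXT4": b"\x80\x02" + b"\x84" + b"\x2a\x00\x01\x00" + b".",  # 4-byte little-endian code: 65578
}


def ext_opcodes(data: bytes) -> list[tuple[str, int]]:
    """Return (opcode name, integer argument) for every EXT opcode in data."""
    return [
        (op.name, arg)
        for op, arg, _pos in pickletools.genops(data)
        if op.name.startswith("EXT")
    ]


decoded = {name: ext_opcodes(data) for name, data in SAMPLES.items()}
for name, found in decoded.items():
    print(name, found)
```

Because genops only parses the stream, this works even when no extension has been registered for the codes involved.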

Solution

Implemented three new opcode classes:

  1. Ext1: Base implementation that generates AST code showing the registry lookup
  2. Ext2: Inherits from Ext1 (same logic, 2-byte arg)
  3. Ext4: Inherits from Ext1 (same logic, 4-byte arg)
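
The inheritance relationship can be pictured with a minimal sketch. The class names Ext1Sketch, Ext2Sketch, and Ext4Sketch and the decode helper are hypothetical and are not fickling's actual opcode classes. Only the opcode bytes and argument widths are taken from the pickle format.

```python
import struct


class Ext1Sketch:
    """Hypothetical stand-in for an opcode class (not fickling's real base class)."""

    opcode = 0x82
    arg_format = "<B"  # 1-byte unsigned

    @classmethod
    def decode(cls, payload: bytes) -> int:
        """Decode the extension code that follows the opcode byte."""
        (code,) = struct.unpack(cls.arg_format, payload)
        return code


class Ext2Sketch(Ext1Sketch):
    # Same decoding logic; only the opcode byte and argument width change.
    opcode = 0x83
    arg_format = "<H"  # 2-byte unsigned, little-endian


class Ext4Sketch(Ext1Sketch):
    opcode = 0x84
    arg_format = "<i"  # 4-byte signed, little-endian


print(Ext1Sketch.decode(b"\x2a"))
print(Ext2Sketch.decode(b"\x2a\x01"))
print(Ext4Sketch.decode(b"\xff\xff\xff\x7f"))
```

Keeping the shared behavior in one class and overriding only the format string mirrors the "same logic, different arg size" description above.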

The implementation uses the "Middle Ground" approach:

  • Generates code showing copyreg._extension_registry.get(code, (None, None))
  • Uses .get() with a safe default so the generated code won't raise KeyError if the registry isn't populated
  • Provides informative output for security analysis
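
One way such a lookup could be emitted is with the standard ast module. This is a sketch under assumptions, not the code from this PR: the helper name registry_lookup_assignment and the variable name _var0 are illustrative, and ast.unparse requires Python 3.9 or later.

```python
import ast


def registry_lookup_assignment(target: str, code: int) -> ast.Assign:
    """Build `<target> = copyreg._extension_registry.get(<code>, (None, None))`."""
    registry = ast.Attribute(
        value=ast.Name(id="copyreg", ctx=ast.Load()),
        attr="_extension_registry",
        ctx=ast.Load(),
    )
    default = ast.Tuple(
        elts=[ast.Constant(value=None), ast.Constant(value=None)],
        ctx=ast.Load(),
    )
    lookup = ast.Call(
        func=ast.Attribute(value=registry, attr="get", ctx=ast.Load()),
        args=[ast.Constant(value=code), default],
        keywords=[],
    )
    return ast.Assign(targets=[ast.Name(id=target, ctx=ast.Store())], value=lookup)


module = ast.Module(
    body=[
        ast.Import(names=[ast.alias(name="copyreg", asname=None)]),
        registry_lookup_assignment("_var0", 42),
    ],
    type_ignores=[],
)
ast.fix_missing_locations(module)
source = ast.unparse(module)
print(source)

# Executing the generated source shows the effect of the .get() default.
namespace = {}
exec(compile(module, "<generated>", "exec"), namespace)
print(namespace["_var0"])
```

Building the node tree rather than formatting a string keeps the output valid Python by construction, which matters when the result is meant to be read by an auditor.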

Example Output

Before: NotImplementedError: TODO: Add support for Opcode EXT1

After:
```python
import copyreg
_var0 = copyreg._extension_registry.get(42, (None, None))
_var1 = _var0()
_var1.__setstate__({'value': 42})
result0 = _var1
```

Testing

  • ✅ Created test pickle with EXT1 opcode (using copyreg.add_extension)
  • ✅ Verified fickling successfully analyzes it without errors
  • ✅ All existing tests pass (20/20)
  • ✅ No regressions
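
A test pickle of the kind described above can be produced with the standard library alone. The sketch below is an assumption about how such a fixture might be built, not the test from this PR: the choice of collections.OrderedDict and code 42 is arbitrary, and the step that feeds the bytes to fickling is omitted. In CPython's copyreg, _extension_registry is keyed by (module, name) and _inverted_registry by code; both are private attributes.

```python
import collections
import copyreg
import pickle
import pickletools

MODULE, NAME, CODE = "collections", "OrderedDict", 42  # arbitrary illustrative choice

copyreg.add_extension(MODULE, NAME, CODE)
try:
    # Forward mapping: (module, name) -> code.
    forward = copyreg._extension_registry[(MODULE, NAME)]
    # Inverse mapping: code -> (module, name), used when unpickling.
    inverse = copyreg._inverted_registry[CODE]
    # With protocol 2 or later, a registered global is written as an EXT opcode.
    data = pickle.dumps(collections.OrderedDict, protocol=2)
    restored = pickle.loads(data)
finally:
    # Always undo the registration so other tests see a clean registry.
    copyreg.remove_extension(MODULE, NAME, CODE)

opcode_names = [op.name for op, _arg, _pos in pickletools.genops(data)]
print(opcode_names)
```

Removing the extension in a finally block keeps the process-wide registry from leaking state between tests.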

Benefits

  • Fixes the NotImplementedError: Pickles with EXT opcodes now analyze successfully
  • Security analysis: Shows what extension codes are being used
  • Safe implementation: Won't crash if the extension isn't registered
  • Educational: Generated code shows the registry lookup mechanism

Add support for extension registry opcodes (EXT1, EXT2, EXT4) which are
part of pickle protocol 2+. These opcodes allow pickles to reference
pre-registered objects from copyreg._extension_registry using integer codes
instead of full module/name paths.

Implementation:
- Added Ext1 opcode class that generates AST code showing registry lookup
- Added Ext2 and Ext4 as subclasses (inherit same logic, different arg sizes)
- Uses copyreg._extension_registry.get(code, (None, None)) for safe lookup
- Generates informative code for security analysis

The fix uses the Middle Ground approach:
- Shows what extension code is being used (valuable for auditing)
- Won't crash if the registry isn't populated (uses .get() with default)
- Generates readable AST output

Example generated code:
```python
import copyreg
_var0 = copyreg._extension_registry.get(42, (None, None))
```

Fixes NotImplementedError when analyzing pickles with EXT opcodes.
All existing tests pass (20/20).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@dhalf dhalf requested a review from ESultanik as a code owner November 25, 2025 16:39
dhalf commented Nov 25, 2025

#114
