"bad marshal data (unknown type code)" when invoking on pytest .pyc #25

Open
Lucas-C opened this issue Feb 14, 2015 · 4 comments

Lucas-C commented Feb 14, 2015

Here is how to reproduce the issue, using the latest version from the repository:

$ cat stupid_test.py 
def test_dummy():
    assert True
$ py.test -q stupid_test.py                                                                                                      
.
1 passed in 0.01 seconds
$ python2.7 /opt/uncompyle2/scripts/uncompyle2 __pycache__/stupid_test.cpython-27-PYTEST.pyc                                 
#2015.02.14 10:47:50 CET
### Can't uncompyle __pycache__/stupid_test.cpython-27-PYTEST.pyc
Traceback (most recent call last):
  File "/home/lucas/.local/lib/python2.7/site-packages/uncompyle2/__init__.py", line 197, in main
    uncompyle_file(infile, outstream, showasm, showast, deob)
  File "/home/lucas/.local/lib/python2.7/site-packages/uncompyle2/__init__.py", line 129, in uncompyle_file
    version, co = _load_module(filename)
  File "/home/lucas/.local/lib/python2.7/site-packages/uncompyle2/__init__.py", line 77, in _load_module
    co = marshal.load(fp)
ValueError: bad marshal data (unknown type code)
# decompiled 0 files: 0 okay, 1 failed, 0 verify failed
#2015.02.14 10:47:50 CET

Any idea what the root cause is and whether it could be fixed?

rocky commented Dec 18, 2015

In debugging https://github.com/rocky/python-uncompyle6 I've come across this a bit.

At this point in the code, uncompyle2 is trying to extract a Python code object from the byte-compiled file. It can't. This will most definitely happen if you try to use marshal.load on bytecode written by a Python version other than the one you are running.

But you might say: but I am running the same python version!

Maybe, maybe not. The magic number changed several times in Python 2.7. Here are the changes as best as I know:

magic  release and description
-----  -----------------------
62171: 2.7a0 (optimize list comprehensions/change LIST_APPEND)
62181: 2.7a0 (optimize conditional branches:
       introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE)
62191: 2.7a0 (introduce SETUP_WITH)
62201: 2.7a0 (introduce BUILD_SET)
62211: 2.7a0 (introduce MAP_ADD and SET_ADD)

As far as I know, nothing in the list above changes the data characteristics that a marshal load depends on, so this remains a mystery.
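To check which magic number a given .pyc actually carries, the first four bytes of the file can be decoded and compared with the running interpreter's own magic. The following is only a minimal sketch for Python 3.4+ (under Python 2.7 the analogue of `importlib.util.MAGIC_NUMBER` would be `imp.get_magic()`). The helper names `pyc_magic_int`, `running_magic_int` and `same_magic` are made up for illustration and are not part of uncompyle2 or uncompyle6.

```python
import importlib.util
import struct


def pyc_magic_int(header):
    """Decode the integer magic from the first 4 bytes of a .pyc file.

    A CPython magic is a 2-byte little-endian number followed by b"\r\n".
    """
    if len(header) < 4 or header[2:4] != b"\r\n":
        raise ValueError("not a CPython .pyc header")
    return struct.unpack("<H", header[:2])[0]


def running_magic_int():
    """Integer magic of the interpreter executing this code."""
    return pyc_magic_int(importlib.util.MAGIC_NUMBER)


def same_magic(header):
    """True only when the file's magic exactly matches the running one."""
    return pyc_magic_int(header) == running_magic_int()


# Usage: read only the header of a file (path is a placeholder).
# with open("some_module.pyc", "rb") as fp:
#     print(pyc_magic_int(fp.read(4)), running_magic_int())
```

Printing both numbers for the failing file would show directly whether it matches one of the values in the list above.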

However, in uncompyle6 I now use marshal.loads only when the magic number of the bytecode file is exactly the same as the magic number of the running interpreter. (Previously I was just comparing Python major/minor version numbers.) To be neither too strict nor too lenient, one would have to test the various magic values to see which combinations work and which don't.

Lucas-C commented Dec 18, 2015

Nice explanation, thanks!
Is your python-uncompyle6 project usable already?

rocky commented Dec 18, 2015

Is your python-uncompyle6 project usable already?

Perhaps for Python 2 bytecode. You can run it from CPython2 (2.6 or 2.7) or CPython3. For Python3 bytecode, it still needs work.

It is easy to come up with lots of tests that cause a failure. One project is organizing the tests better and fixing some of the failures that occur there. But a large number of those also fail in uncompyle2. (#14 has been fixed, though.)

This and/or the other uncompyle projects all could use help in fixing bugs.

rocky commented Dec 19, 2015

One other clarification regarding this:

However, in uncompyle6 I now use marshal.loads only when the magic number of the bytecode file is exactly the same as the magic number of the running interpreter.

uncompyle2 uses marshal.loads() unconditionally, and when this works, the result is most likely correct. This change in behavior was made in commit 09b2adb.

The limitation with this is that you can only disassemble Python bytecode that has a compatible bytecode format. So although this project still has opcodes around for Python 2.3-2.6, it is possible that some bytecode from those versions won't survive a marshal.loads() after the commit. That said, I trust Mysterie to have tested what is provided here, so I'll assume that they are data compatible.

The older code (which is used in the PyPI version of uncompyle2) provides its own marshal load routine written in Python and uncompyle6 uses that as well. As I have recently found, that has problems too when using different versions of Python, especially when going between Python 3 and Python 2. See https://github.com/rocky/python-uncompyle6/blob/master/uncompyle6/marsh.py#L46-L150 and compare with https://github.com/Mysterie/uncompyle2/blob/master/uncompyle2/disas.py#L195-L270

Sorry for the long-winded clarification. What I meant was: uncompyle6 uses marshal.loads when the magics are the same; when the magics are different, it uses the all-version Python code that is supposed to be equivalent (and probably still has bugs).
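The dispatch described above could be sketched roughly as follows. This is an illustration for Python 3.4+, not uncompyle6's actual code. `load_code`, `build_pyc` and the `fallback_loader` argument are hypothetical names; `fallback_loader` stands in for a pure-Python unmarshaller such as the one in marsh.py. The header sizes assumed here are 16 bytes for CPython 3.7+ and 12 bytes for 3.3-3.6.

```python
import importlib.util
import marshal
import sys

# Assumed header length of a .pyc written by the *running* interpreter.
HEADER_SIZE = 16 if sys.version_info >= (3, 7) else 12


def load_code(pyc_bytes, fallback_loader):
    """Return the code object stored in pyc_bytes.

    The built-in marshal module is used only when the file's magic is
    byte-for-byte identical to the running interpreter's magic; anything
    else is handed to fallback_loader (a hypothetical callable).
    """
    if pyc_bytes[:4] == importlib.util.MAGIC_NUMBER:
        return marshal.loads(pyc_bytes[HEADER_SIZE:])
    return fallback_loader(pyc_bytes)


def build_pyc(source, name="<demo>"):
    """Build in-memory .pyc-like bytes for the running interpreter.

    The header fields after the magic are zero-filled; that is enough
    for this demonstration because load_code skips them.
    """
    code = compile(source, name, "exec")
    padding = b"\0" * (HEADER_SIZE - 4)
    return importlib.util.MAGIC_NUMBER + padding + marshal.dumps(code)


# Same magic: the native marshal path is taken.
namespace = {}
exec(load_code(build_pyc("answer = 6 * 7"), None), namespace)
print(namespace["answer"])
```

With a foreign magic (for example the Python 2.7 bytes `b"\x03\xf3\r\n"`), `marshal.loads` is never attempted, and the result depends entirely on what the fallback returns.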
