gh-122071: Fixed traceback leaks global code when file does not exist #122126

lhish · 2024-07-22T13:54:17Z

Issue: Traceback leaks global code when exec:ed code raises #122071

cpython-cla-bot · 2024-07-22T13:54:21Z

All commit authors signed the Contributor License Agreement.

bedevere-app · 2024-07-22T13:54:24Z

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

devdanzin · 2024-07-22T14:24:16Z

I got the same test failures without is_main, so I added it. But if you can write a test that fails due to checking is_main, that'll lead to a better solution.

lhish · 2024-07-22T17:57:36Z

I got the same test failures without is_main, so I added it. But if you can write a test that fails due to checking is_main, that'll lead to a better solution.

t.py

import rely
import traceback
try:
    rely.fun()
except:
    traceback.print_exc()

rely.py

def fun():
    exec(compile("tuple()[1]","s","exec"))
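
The two files above can be folded into a single script to see why the traceback machinery is tempted to look up the wrong source. This is only a sketch, not the PR's test: `innermost_frame_info` is a hypothetical helper that inspects the frame instead of printing a traceback, so it should behave the same with or without the fix.

```python
import sys


def fun():
    # Same call as in rely.py: the code object's filename is "s",
    # which does not correspond to any file on disk.
    exec(compile("tuple()[1]", "s", "exec"))


def innermost_frame_info():
    """Hypothetical helper: describe the frame where the exception was raised."""
    try:
        fun()
    except IndexError:
        tb = sys.exc_info()[2]
        # Walk to the last traceback entry, i.e. the exec'ed code.
        while tb.tb_next is not None:
            tb = tb.tb_next
        frame = tb.tb_frame
        return frame.f_code.co_filename, frame.f_globals.get("__name__"), tb.tb_lineno


filename, owner, lineno = innermost_frame_info()
print(filename, owner, lineno)
```

The frame reports the filename "s", but its globals are those of the module that called exec. As far as I understand, a source lookup keyed on that filename together with those globals is what can end up showing a line from the enclosing module.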

lhish · 2024-07-22T18:00:09Z

By the way, based on preliminary testing, the zipimporter module does not appear to trigger issues related to this bug.

lhish · 2024-07-22T18:06:09Z

Regarding the test changes: the previous test relied on the fact that, when the target file did not exist, the contents of the current file were used in its place. I have changed the test to use the current file directly.

lhish · 2024-08-24T17:39:41Z

After reviewing the discussion in GH-122145, I found it very insightful. The logic presented there is more general and avoids the need for a specific check for zipimporter, as I had implemented. Therefore, I have revised my code accordingly.

Lib/test/test_traceback.py

Lib/linecache.py

Misc/NEWS.d/next/Core and Builtins/2024-07-22-13-58-16.gh-issue-122071.c5EQZL.rst

picnixz · 2024-08-25T11:36:31Z

EDIT: It appears that test failures are important. So maybe the previous logic was fine. By the way, #122161 (comment) suggested returning False instead of True for lazycaching.

Co-authored-by: Bénédikt Tran <[email protected]>

This reverts commit 0fab629.

lhish · 2024-08-27T17:01:00Z

EDIT: It appears that test failures are important. So maybe the previous logic was fine. By the way, #122161 (comment) suggested returning False instead of True for lazycaching.

After some investigation, I found that the get_lines function seems to be unique to linecache, so it shouldn't cause any issues. However, I noticed that if we return False directly, accessing linecache.cache["nonexistent_filename"] results in an error, whereas it didn't before the modification. Therefore, I'm uncertain whether returning False is the correct approach.

lhish · 2024-08-28T04:39:00Z

To handle FakeLoader and zip files, I've added additional conditions. However, I don't believe this is the best solution. If anyone has suggestions or alternative ideas, please let me know.

picnixz

When you say that FakeLoader and ZipImporter do not work well, what are the issues with them? How do they fail? (Sorry if you already answered these questions, but I need someone to refresh my memory :'))

Lib/linecache.py

picnixz · 2024-10-05T09:11:46Z

Lib/linecache.py

+        mod_file = module_globals.get('__file__')
+        if isinstance(loader, importlib._bootstrap_external.SourceFileLoader) and (not mod_file or (not mod_file.endswith(filename) and not mod_file.endswith('.pyc'))):


You said that returning False makes linecache["non-existent"] fail, but how exactly does it fail? Namely, what operations do you perform (please provide a reproducer)? Is it possible to fix this case? Technically, if the file does not exist, linecache should just return an empty list.

If the issue is with an existing test, then the test might also need to be updated (since the logic has been changed here).

If we directly return False, then when attempting to lazycache something that meets the conditions in the newly added if statement, it won’t add a new element to linecache.cache. As a result, accessing this key later will raise a KeyError. However, if we return an empty getline instead and return True, this issue doesn’t occur. While I doubt anyone would use code like the following, this code did not produce an error before the change:

import linecache
linecache.lazycache("11", globals())
print(linecache.cache["11"])

The output is:

    print(linecache.cache["11"])
          ~~~~~~~~~~~~~~~^^^^^^
KeyError: '11'

Additionally, regarding the loader, if we don’t add this sourceloader restriction, the test_loader in test_linecache will fail:

def test_loader(self):
    filename = 'scheme://path'
    for loader in (None, object(), NoSourceLoader()):
        linecache.clearcache()
        module_globals = {'__name__': 'a.b.c', '__loader__': loader}
        self.assertEqual(linecache.getlines(filename, module_globals), [])
    linecache.clearcache()
    module_globals = {'__name__': 'a.b.c', '__loader__': FakeLoader()}
    self.assertEqual(linecache.getlines(filename, module_globals),
                     ['source for a.b.c\n'])

Since module_globals does not have __file__ set, mod_file = module_globals.get('__file__') sets mod_file to None. Normally the source would then be retrieved through the loader. Without the SourceFileLoader restriction, however, the new if condition would cover this case, which means almost all non-standard loaders might return empty results (though I haven’t exhaustively tested this).

For ZipImporter, if inspect.getsource() is used to obtain the source of a module imported via zipimporter, an error occurs because inspect.getsource relies on linecache.getline. When zipimporter goes through lazycache, it enters the newly added if statement, resulting in an empty getline. This is because the if checks whether mod_file ends with filename. In the case of zipimporter, mod_file ends with .pyc, while filename ends with .py. This ultimately causes a failure in test.test_zipimport.UncompressedZipImportTestCase.testGetCompiledSource.
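
To make the cases above easier to compare, the condition from the diff can be isolated as a pure function. This is an illustrative sketch only: `source_should_be_hidden` and its `is_source_file_loader` flag are hypothetical names, and the flag stands in for the isinstance check against SourceFileLoader.

```python
def source_should_be_hidden(mod_file, filename, is_source_file_loader=True):
    """Hypothetical helper mirroring the condition in the diff above."""
    if not is_source_file_loader:
        # Non-standard loaders (e.g. the FakeLoader used in test_loader)
        # are left alone so they can still provide source.
        return False
    if not mod_file:
        # No __file__ in the module globals: nothing to match against.
        return True
    # Hide the source unless the module file matches the requested filename,
    # with an exemption for compiled (.pyc) module files as used by zipimport.
    return not mod_file.endswith(filename) and not mod_file.endswith(".pyc")


# All paths below are made up for illustration.
cases = {
    "exec'ed code with filename 's'": ("/proj/rely.py", "s", True),
    "module's own file": ("/proj/rely.py", "/proj/rely.py", True),
    "zipimport, compiled module": ("/proj/lib.zip/mod.pyc", "/proj/lib.zip/mod.py", True),
    "no __file__, source file loader": (None, "scheme://path", True),
    "no __file__, other loader": (None, "scheme://path", False),
}
for label, args in cases.items():
    print(f"{label}: {source_should_be_hidden(*args)}")
```

One thing the sketch makes visible is that the match is a plain suffix test, so a short filename that happens to be a suffix of the module path would also count as a match.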

You're not meant to access the cache using .cache directly. The cache should be accessed using linecache.getlines(). So we should perhaps see whether other path need to be updated in updatecache().
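
The difference between the two access paths can be seen in a small sketch. It assumes that no file named `no_such_file_gh122071_demo.py` (a made-up name) exists on sys.path, and it passes None for the module globals so that no loader is involved.

```python
import linecache

MISSING = "no_such_file_gh122071_demo.py"  # hypothetical name, assumed absent

linecache.clearcache()

# Without module globals there is no loader to defer to, so nothing is registered.
registered = linecache.lazycache(MISSING, None)

# Reading the cache dict directly raises when no entry was registered.
try:
    linecache.cache[MISSING]
    direct_access = "found"
except KeyError:
    direct_access = "KeyError"

# The public API reports a missing file as an empty list instead of raising.
lines = linecache.getlines(MISSING)
print(registered, direct_access, lines)
```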

I have two questions regarding your reply.

First, since my English is not very good, I struggled to understand your final suggestion about updating other paths in updatecache(). If you meant that all direct accesses to the cache in updatecache should be replaced with linecache.getlines, I don't think that's feasible, because getlines itself calls updatecache, which would create a circular dependency if updatecache were modified this way.

Second, if lazycache simply returns False, two existing linecache-related tests would fail:

Test Case 1: test_lazycache_already_cached:
This test directly accesses the cache and expects it to have content. Here’s a snippet of the test:

def test_lazycache_already_cached(self):
    linecache.clearcache()
    lines = linecache.getlines(NONEXISTENT_FILENAME, globals())
    self.assertEqual(
        False,
        linecache.lazycache(NONEXISTENT_FILENAME, globals()))
    self.assertEqual(4, len(linecache.cache[NONEXISTENT_FILENAME]))

In this case, returning False would result in a failure because the test expects the cache to contain data.

Test Case 2: test_lazycache_smoke:
This test expects lazycache to return True when called with a nonexistent filename. Here’s the relevant snippet:

def test_lazycache_smoke(self):
    lines = linecache.getlines(NONEXISTENT_FILENAME, globals())
    linecache.clearcache()
    self.assertEqual(
        True, linecache.lazycache(NONEXISTENT_FILENAME, globals()))
    self.assertEqual(1, len(linecache.cache[NONEXISTENT_FILENAME]))
    # Note here that we're looking up a nonexistent filename with no
    # globals: this would error if the lazy value wasn't resolved.
    self.assertEqual(lines, linecache.getlines(NONEXISTENT_FILENAME))

First, Since my English is not very good

Don't worry, I replied a bit too fast. My English is not perfect either, so don't hesitate to ask if I was unclear!

If you meant that all direct accesses to the cache in updatecache should be replaced with linecache.getlines, I don't think that's feasible

Sorry, I meant to see whether we correctly covered the cases (namely, to see if the algorithm needs to be updated because of this new logic).

For the tests, the logic could be changed. Leaking the global code is probably worse than not leaking it IMO. I'll have a better look at the lazy-caching interface and the question on non-existent filenames (maybe we could make it work).

Alright, it does seem that returning False is appropriate in this case. However, I'm not entirely sure whether other direct usages of linecache.cache, beyond the linecache and test modules, need to be modified. My understanding is that no further changes are necessary, because the other two usages occur in pyshell and timeit, and in both cases the cache is assigned to directly rather than read from.

Additionally, I've adjusted two tests that were failing in my latest commit. However, I'm not completely certain if these adjustments are logically sound.
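
For context on why write-only usages are unaffected, here is a sketch of what a direct assignment to linecache.cache looks like. It is not copied from pyshell or timeit; the pseudo-filename `<demo-src>` and the source string are made up for illustration.

```python
import linecache

src = "x = 1\ny = 2\n"
name = "<demo-src>"  # hypothetical pseudo-filename, not a real file

# A fully resolved cache entry is a 4-tuple: (size, mtime, lines, fullname).
# With mtime set to None, linecache.checkcache() leaves the entry alone
# because there is no file on disk to compare against.
linecache.cache[name] = (len(src), None, src.splitlines(keepends=True), name)

# The entry is then served through the public API without touching the filesystem.
second_line = linecache.getline(name, 2)
print(repr(second_line))
```

Because such callers populate the entry themselves, they never depend on what lazycache returns for a filename that does not exist.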

Fixed traceback leaks global code when exec

f1e1189

bedevere-app bot mentioned this pull request Jul 22, 2024

Traceback leaks global code when exec:ed code raises #122071

Open

bedevere-app bot added the awaiting review label Jul 22, 2024

📜🤖 Added by blurb_it.

f510454

lhish and others added 5 commits July 23, 2024 00:01

Merge branch 'main' into fix-issue-122071

dcc5718

adapt for zipimporter

b40bc28

Fixed missing file's lazycache

a9f8458

Update trace_back test

0fab629

Merge branch 'main' into fix-issue-122071

f3a3f97

lhish changed the title ~~gh-122071: Fixed traceback leaks global code when exec~~ gh-122071: Fixed traceback leaks global code when file does not exist Jul 22, 2024

lhish added 2 commits July 23, 2024 02:42

Merge branch 'main' into fix-issue-122071

9d7594a

Merge branch 'main' into fix-issue-122071

c28b698

lhish force-pushed the fix-issue-122071 branch from 3baaba0 to c28b698 Compare August 24, 2024 17:30

lhish and others added 2 commits August 25, 2024 01:31

Merge branch 'main' into fix-issue-122071

17e3545

Update the method to check the existence of the file

44715f2

picnixz reviewed Aug 25, 2024

lhish and others added 5 commits August 28, 2024 00:33

Merge branch 'main' into fix-issue-122071

1940da7

change description

78d1a14

Co-authored-by: Bénédikt Tran <[email protected]>

Apply suggestions from code review

363758a

Co-authored-by: Bénédikt Tran <[email protected]>

Revert "Update trace_back test"

fa2d76b

This reverts commit 0fab629.

change the implementation of get_lines

7f7308a

make restriction stronger

250f026

picnixz reviewed Oct 5, 2024

lhish and others added 4 commits October 6, 2024 19:54

Merge branch 'main' into fix-issue-122071

b2407ff

Apply suggestions from code review

e3796bb

change return value from true to false

3005636

change tests to make it correct

787ea13

picnixz Oct 5, 2024

lhish Oct 6, 2024

picnixz Oct 6, 2024 •

edited

Loading

lhish Oct 6, 2024

picnixz Oct 6, 2024

lhish Oct 8, 2024
