-
-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-122071: Fixed traceback leaks global code when file does not exist #122126
base: main
Are you sure you want to change the base?
Conversation
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
I got the same test failures without |
t.py import rely
import traceback
try:
rely.fun()
except:
traceback.print_exc() rely.py def fun():
exec(compile("tuple()[1]","s","exec")) |
By the way, based on preliminary testing, the |
Regarding the modifications to the testing section: The previous testing method utilized the characteristic that when the target file didn't exist, the current file would be used as the content of the non-existent file for testing purposes. I have now changed this method to directly use the current file for testing. |
3baaba0
to
c28b698
Compare
After reviewing the discussion in GH-122145, I found it very insightful. The logic presented there is more general and avoids the need for a specific check for zipimporter, as I had implemented. Therefore, I have revised my code accordingly. |
Misc/NEWS.d/next/Core and Builtins/2024-07-22-13-58-16.gh-issue-122071.c5EQZL.rst
Outdated
Show resolved
Hide resolved
EDIT: It appears that test failures are important. So maybe the previous logic was fine. By the way, #122161 (comment) suggested returning False instead of True for lazycaching. |
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
This reverts commit 0fab629.
After some investigation, I found that the |
To handle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say that FakeLoader
and ZipImporter
do not work well, what are the issues with them? how do they fail? (sorry if you already replied to those questions but I need someone to refresh my memory :'))
Lib/linecache.py
Outdated
mod_file = module_globals.get('__file__') | ||
if isinstance(loader, importlib._bootstrap_external.SourceFileLoader) and (not mod_file or (not mod_file.endswith(filename) and not mod_file.endswith('.pyc'))): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You said that returning False makes linecache["non-existant"]
fail, but how does it fail exactly (namely what are the operations you do (i.e. please provide a reproducer)? is it possible to fix this case? technically, if the file does not exist linecache should just return an empty list.
If the issue is with an existing test, then the test might also need to be updated (since the logic has been changed here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we directly return False
, then when attempting to lazycache
something that meets the conditions in the newly added if
statement, it won’t add a new element to linecache.cache
. As a result, accessing this key later will raise a KeyError
. However, if we return an empty getline instead and return True
, this issue doesn’t occur. While I doubt anyone would use code like the following, this code did not produce an error before the change:
import linecache
linecache.lazycache("11", globals())
print(linecache.cache["11"])
The output is:
print(linecache.cache["11"])
~~~~~~~~~~~~~~~^^^^^^
KeyError: '11'
Additionally, regarding the loader
, if we don’t add this sourceloader
restriction, the test_loader
in test_linecache
will fail:
def test_loader(self):
filename = 'scheme://path'
for loader in (None, object(), NoSourceLoader()):
linecache.clearcache()
module_globals = {'__name__': 'a.b.c', '__loader__': loader}
self.assertEqual(linecache.getlines(filename, module_globals), [])
linecache.clearcache()
module_globals = {'__name__': 'a.b.c', '__loader__': FakeLoader()}
self.assertEqual(linecache.getlines(filename, module_globals),
['source for a.b.c\n'])
Since module_globals
does not have filename
set, mod_file = module_globals.get('__file__')
definitely set mod_file
to None
. Without this if
statement, it would normally retrieve source through the loader
. However, the new if
condition now covers this case, which means almost all non-standard loaders
might return empty results (though I haven’t exhaustively tested this).
For ZipImporter
, if inspect.getsource()
is used to obtain the contents of a module
imported via zipimporter
, an error occurs because inspect.getsource
relies on linecache.getline
. When zipimporter
goes through lazycache
, it enters the newly added if
statement, resulting in an empty getline. This is because the if
checks whether mod_file
and file_name
have the same suffix. In the case of zipimporter
, mod_file
ends with .pyc
, while file_name
ends with .py
. This ultimately causes failure in test.test_zipimport.UncompressedZipImportTestCase.testGetCompiledSource
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're not meant to access the cache using .cache
directly. The cache should be accessed using linecache.getlines()
. So we should perhaps see whether other path need to be updated in updatecache()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two questions regarding your reply.
First, Since my English is not very good, I struggled to understand your final suggestion about updating other paths in updatecache()
. If you meant that all direct accesses to the cache
in updatecache
should be replaced with linecache.getlines
, I don't think that's feasible. This is because getlines
itself calls updatecache
, which would create a circular dependency if updatecache
were modified this way.
Second, If simply returns False, it would cause two existing Linecache-related tests to fail:
Test Case 1: test_lazycache_already_cached
:
This test directly accesses the cache
and expects it to have content. Here’s a snippet of the test:
def test_lazycache_already_cached(self):
linecache.clearcache()
lines = linecache.getlines(NONEXISTENT_FILENAME, globals())
self.assertEqual(
False,
linecache.lazycache(NONEXISTENT_FILENAME, globals()))
self.assertEqual(4, len(linecache.cache[NONEXISTENT_FILENAME]))
In this case, returning False would result in a failure because the test expects the cache to contain data.
Test Case 2: test_lazycache_smoke
:
This test expects that lazycache
should return True when called with a nonexistent filename. Here’s the relevant snippet:
def test_lazycache_smoke(self):
lines = linecache.getlines(NONEXISTENT_FILENAME, globals())
linecache.clearcache()
self.assertEqual(
True, linecache.lazycache(NONEXISTENT_FILENAME, globals()))
self.assertEqual(1, len(linecache.cache[NONEXISTENT_FILENAME]))
# Note here that we're looking up a nonexistent filename with no
# globals: this would error if the lazy value wasn't resolved.
self.assertEqual(lines, linecache.getlines(NONEXISTENT_FILENAME))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, Since my English is not very good
Don't worry, I replied a bit too fast. My English is not perfect either so don't hesistate to ask if I was unclear!
If you meant that all direct accesses to the cache in updatecache should be replaced with linecache.getlines, I don't think that's feasible
Sorry, I meant to see whether we correctly covered the cases (namely, to see if the algorithm needs to be updated because of this new logic).
For the tests, the logic could be changed. Leaking the global code is probably worse than not leaking it IMO. I'll have a better look at the lazy-caching interface and the question on non-existent filenames (maybe we could make it work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, it does seem that returning false
is appropriate in this case. However, I'm not entirely sure if other direct usages of linecache.cache
need to be modified beyond the linecache
and test
modules. My understanding is that no further changes are necessary because the other two usages occur in pyshell
and timeit
. In both of these cases, the cache is directly assigned to, rather than read from.
Additionally, I've adjusted two tests that were failing in my latest commit. However, I'm not completely certain if these adjustments are logically sound.
exec
:ed code raises #122071