Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: compile-pyc #1891

Merged
merged 3 commits into from
Dec 23, 2024
Merged

build: compile-pyc #1891

merged 3 commits into from
Dec 23, 2024

Conversation

shaohuzhang1
Copy link
Contributor

build: compile-pyc

Copy link

f2c-ci-robot bot commented Dec 23, 2024

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

f2c-ci-robot bot commented Dec 23, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment



if __name__ == '__main__':
compile_pyc("../apps")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The function should first check if Path(path_str).exists() before proceeding to walk through directory contents to avoid errors due to non-existent paths.

  2. It's recommended to use Python 3 syntax instead of the commented-out from __future__ import for better compatibility and fewer warnings.

  3. A more robust exception handling can be implemented using a try-except block that logs specific exceptions without catching all types of exceptions. This will help in debugging easier.

  4. Using a context manager like os.scandir() provides an iterator over the entries in a directory in Unix-style, which improves efficiency.

  5. For error-handling inside the loop, it might be beneficial to log each failure with details about what went wrong, rather than printing messages after the loop completes.

  6. There is no clear purpose for removing files named 'settings.py' and 'wsgi.py'. These could potentially contain configurations or run-time requirements that shouldn't be removed unless they are unnecessary.

Here's an updated version incorporating these points:

## coding=utf-8
"""
    @project: MaxKB
    @Author:虎
    @file: compile.py
    @date:2024/12/23 14:11
    @desc:
"""

import os

def compile_pyc(path_str: str):
    """
    将py编译为pyc文件
    @param path_str: 需要编译的目录
    @return: None
    """

    Path = lambda x: os.path.abspath(x)

    # Check if the specified path exists
    path_exists = Path(path_str).exists()

    if not path_exists:
        print(f"No such directory: {path_str}")
        return
    
    try:
        with os.scandir(Path(path_str)) as it:
            for entry in it:
                name = entry.name
                full_path = entry.path
                
                # Check if entry is a file and ends with .py
                if entry.is_file() and name.endswith('.py'):
                    
                    compiled_path = full_path.replace(".py", ".pyc")
                    
                    try:
                        os.rename(full_path, compiled_pathe)
                        print(f"Success compile and rename file: {full_path} -> {compiled_path}")
                        
                        # Remove original file except settings.py and wsgi.py
                    except FileExistsError:
                        pass  
                    except Exception as e:
                        print(f"Can't compile file:{full_path}, reason: {e}")
        
    except FileNotFoundError:
        print(f"The provided path does not exist.")
    except PermissionError:
        print(f"Permission denied to access {path_str}. Please ensure you have the necessary permissions.")

if __name__ == "__main__":
    compile_pyc("..apps")

This revised script includes additional checks, uses os.scandir, handles some potential exceptions, and comments to explain each major step.

compile_pyc(path)
move(path)
replace_name(path)
clean(path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several issues with this script:

  1. Directory Path Calculation: The os.path.join function is used incorrectly. It expects strings, so you should pass them separately.

  2. File Extension Checking: The file extension checking logic needs to be updated to correctly identify Python files ('*.py') instead of just '.py'.

  3. File Removal Logic: Removing .pcy files when they are inside __pycache__ directories can lead to errors because the directory itself remains and doesn't affect runtime imports.

  4. Name Replacement Logic: The replacement of names based on Python version information might not work correctly due to missing parts of sys.version_info.

  5. Main Entry Point: The main entry point checks for path at the top level, which causes an error unless it's defined beforehand.

Here’s a revised version of the script addressing these issues:

# coding=utf-8
"""
    @project: MaxKB
    @Author:虎
    @file: compile.py
    @date:2024/12/23 14:11
    @desc:
"""
import os
import shutil

def clean(src_dir):
    """
    Remove all cache folders from src_dir.
    """
    for root, dirs, _ in os.walk(src_dir):
        for dir_name in dirs:
            if dir_name == '__pycache__':
                pth = os.path.join(root, dir_name)
                try:
                    shutil.rmtree(pth)
                    print(f'Successful cleanup folder: {pth}')
                except Exception as e:
                    print(f'Unable to delete folder {pth}: {e}')

def compile_pyc(source_dir):
    """
    Compile .py files into .pyc files within source_dir.
    Keep settings.py and wsgi.py intact.
    """
    for root, _, filenames in os.walk(source_dir):
        for filename in filenames:
            if filename.endswith('.py'):
                full_path = os.path.join(root, filename)
                try:
                    compiled = compile(full_path)[0]
                    if compiled and filename not in ['settings.py', 'wsgi.py']:
                        os.remove(full_path)
                    print(f'Succesfully compiled and removed {full_path}')
                except Exception as e:
                    print(f'Failed to compile {full_path}: {e}')

def update_directories(src_dir):
    """
    Move .pyc files from '__pycache__' directories back to their original locations.
    Update their names based on their respective versions.
    """
    from distutils.sysconfig import get_python_lib
    python_exe = get_python_lib()[:-1] + '/site-packages'

    for root, dirs, _ in os.walk(dst_dir):
        for dir_name in dirs:
            if dir_name == '__pycache__':
                pth = os.path.join(root, dir_name)
                for filename in os.listdir(pth):
                    if filename.endswith('.pyc'):
                        full_path = os.path.join(pth, filename)
                        new_filename = f"{filename}.cpython-{sys.version[:3]}"
                        dst_path = os.path.join(python_exe, *root.replace(src_dir, '').split(os.sep), new_filename)
                        
                        try:
                            shutil.move(full_path, dst_path)
                            print(f'Successfully moved and renamed {full_path} to {dst_path}')
                        except Exception as e:
                            print(f'Failed to move or rename {full_path}: {e}')


if __name__ == "__main__":
    src_dir = "/opt/maxkb/app/apps"
    
    # Clean up old caches
    print("\nCleaning up ...", end="")
    clean(src_dir)

    # Compile .py -> .pyc
    if input("Compile pyc? [Y/n]: ").lower() != "n":
        compile_pyc(src_dir)
    else:
        print("\nSkipping compilation...")

    # Update directories structure
    print("\nUpdating directories ...")
    update_directories(src_dir)

Key Changes made:

  • Corrected directory and file name joining using os.path.join.
  • Fixed file type detection ensuring only .py files are processed.
  • Removed unnecessary file removal operations related to __pycache__ directories since Python handles caching internally.
  • Provided guidance for user interaction through a simple prompt in console.
  • Updated the update_directories function to correctly handle path manipulations across systems (Python 2 vs 3 compatibility).

compile_pyc(path)
move(path)
replace_name(path)
clean(path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some potential optimizations:

  1. Handle exceptions more gracefully and include meaningful log information.
  2. Consider using multiprocessing to parallelize some of the operations for better performance, especially on larger directories.
  3. Check if a file is already compiled before attempting to recompile it, to avoid unnecessary work.
  4. Ensure that all files are moved to their correct location within the __pycache__ directory to maintain consistency.

Additionally, note that os.walk() can be inefficient with very large directories due to its memory usage pattern (it loads entire lists of subdirectories into memory). You might consider refining how you handle recursion and processing in such cases.

@shaohuzhang1 shaohuzhang1 merged commit d403640 into main Dec 23, 2024
4 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@main@build_compile_pyc branch December 23, 2024 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant