feat: Implement cleanup of old AppMap binaries #1009

Closed
wants to merge 2 commits into from

zermelo-wisen
Collaborator

Fixes #1007.

  • Added 'cleanUpOldBinaries' and 'cleanUpOldBinariesFromDir' functions to handle the removal of older binaries.
  • Integrated 'cleanUpOldBinaries' to run after downloading a binary.
  • Added a unit test to verify the cleanup process.

This change addresses storage bloat by keeping only the latest version of each binary.

Problem

The AppMap plugin currently does not clean up older versions of AppMap binaries located in the $HOME/.appmap (Linux/macOS) or %HOME%/.appmap (Windows) directory. This can lead to unnecessary storage usage as multiple old versions accumulate over time. The goal is to remove all older versions of the binaries and ensure that only the most recent version is retained moving forward.

Analysis

To resolve this issue, we need to implement a cleanup function that scans the AppMap binaries directory and deletes all but the most recent binary. The most recent binary can be determined by the semver version number in the file name.
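As a rough illustration of the selection step described above, the sketch below decides which file names in a directory listing would be deleted. It is not the PR's code: `parseVersion`, `compareVersions`, and `filesToDelete` are hypothetical names, and the comparison is a simplified stand-in for the `semver` package. It assumes versions of the form `major.minor.patch` with an optional dot-free pre-release tag, treats a release as newer than its pre-releases, and compares pre-release tags as plain strings.

```typescript
// Hypothetical names throughout; a simplified stand-in for the semver package.
interface Versioned {
  name: string; // full file name, e.g. "scanner-v1.88.0"
  group: string; // file name with the version removed, e.g. "scanner-v"
  parts: [number, number, number]; // major, minor, patch
  prerelease: string; // "" for a release
}

// Parse a trailing "-1.2.3" or "-v1.2.3[-tag]" (optionally followed by an extension).
function parseVersion(name: string): Versioned | null {
  const m = name.match(/-v?((\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z-]+))?)(?:\.[A-Za-z]+)?$/);
  if (!m) return null;
  return {
    name,
    group: name.replace(m[1], ''),
    parts: [Number(m[2]), Number(m[3]), Number(m[4])],
    prerelease: m[5] ?? '',
  };
}

// Negative if a is older than b, positive if newer, 0 if equal.
function compareVersions(a: Versioned, b: Versioned): number {
  for (let i = 0; i < 3; i++) {
    if (a.parts[i] !== b.parts[i]) return a.parts[i] - b.parts[i];
  }
  if (a.prerelease === b.prerelease) return 0;
  if (a.prerelease === '') return 1; // a release outranks its pre-releases
  if (b.prerelease === '') return -1;
  return a.prerelease < b.prerelease ? -1 : 1; // simplification: plain string order
}

// Group files by name-without-version and return everything except the newest of each group.
function filesToDelete(fileNames: string[]): string[] {
  const groups: Record<string, Versioned[]> = {};
  for (const name of fileNames) {
    const parsed = parseVersion(name);
    if (!parsed) continue; // unversioned files such as "appmap.jar" are left alone
    (groups[parsed.group] = groups[parsed.group] || []).push(parsed);
  }
  const doomed: string[] = [];
  for (const key of Object.keys(groups)) {
    const newestFirst = groups[key].sort((a, b) => compareVersions(b, a));
    for (const old of newestFirst.slice(1)) doomed.push(old.name);
  }
  return doomed;
}
```

Given a listing that contains `scanner-v1.88.0` and `scanner-v1.9.0`, this sketch marks `scanner-v1.9.0` for deletion because version parts are compared numerically rather than as strings, and it leaves an unversioned `appmap.jar` untouched.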

Solution

This PR introduces:

  1. Cleanup Function: A function that removes all but the latest version of the binaries in the $HOME/.appmap directory.
  2. Automation: The cleanup function runs automatically whenever a new AppMap binary is downloaded or during startup.

Code Changes

1. Directory Management and Utility Enhancements

  • Introduced getAppmapDir to centralize the base directory path for .appmap.

2. Cleanup Logic Implementation

  • Added cleanUpOldBinaries and cleanUpOldBinariesFromDir to handle the removal of older binaries.

3. Integration

  • Integrated cleanUpOldBinaries to run after downloading a binary.

Modifications:

import * as fs from 'fs';
import * as path from 'path';
import * as semver from 'semver';

...
export default class AssetService {
   ...
  public static async updateAll(throwOnError = false): Promise<void> {
    ...
    await this.cleanUpOldBinaries();
    ...
  }

  public static async updateOne(assetId: AssetIdentifier): Promise<void> {
    ...
    await this.cleanUpOldBinaries();
    return result;
  }

  static async cleanUpOldBinaries() {
    // Each asset type keeps its versioned binaries in its own subdirectory of lib.
    const libDir = path.join(AssetService.getAppmapDir(), 'lib');
    for (const name of ['appmap', 'java', 'scanner']) {
      await this.cleanUpOldBinariesFromDir(path.join(libDir, name));
    }
  }

  private static async cleanUpOldBinariesFromDir(directory: string) {
    try {
      const files = await fs.promises.readdir(directory);
      const fileGroups: { [key: string]: { name: string; version: string }[] } = {};

      files.forEach((file) => {
        const version = this.extractVersion(file);
        if (version && semver.valid(version)) {
          const groupName = file.replace(version, '');
          if (!fileGroups[groupName]) {
            fileGroups[groupName] = [];
          }
          fileGroups[groupName].push({ name: file, version });
        }
      });

      for (const groupName in fileGroups) {
        // Sort in descending order of versions
        const binaries = fileGroups[groupName].sort((a, b) => semver.compare(b.version, a.version));

        // Keep the latest version, delete the rest
        for (let i = 1; i < binaries.length; i++) {
          const filePath = path.join(directory, binaries[i].name);
          await fs.promises.unlink(filePath);
          log.info(`Deleted old binary: ${binaries[i].name}`);
        }
      }
    } catch (error) {
      log.error(`Failed to clean up old binaries: ${error}`);
    }
  }

  private static extractVersion(fileName: string): string | null {
    // Match patterns like '-v1.2.3', '-v1.2.3-alpha', '-v1.2.3+build', '-1.2.3-alpha+build'
    // with or without file extension
    const versionMatch = fileName.match(
      /-v?(\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?(\+[a-zA-Z0-9-]+)?)\b(\.[^.]+)?$/
    );
    return versionMatch ? versionMatch[1] : null;
  }
}

4. Testing

  • Added a unit test to verify the cleanup process.

Test Modifications:

it('cleans up old binaries', async () => {
  ...
  await AssetService.updateAll(true);

  // Verify the results
  const verifyFiles = async (directory: string, expectedFiles: string[]) => {
    const files = await readdir(join(homeDir, '.appmap', directory));
    const expectedSet = new Set(expectedFiles);
    const remainingSet = new Set(files);

    expect(remainingSet).to.deep.equal(expectedSet);
  };

  // Verify each directory's expected files
  await verifyFiles(join('lib', 'appmap'), ['appmap-v0.0.0-TEST']);
  await verifyFiles(join('lib', 'java'), [
    'appmap-1.26.2.jar',
    'appmap-agent-1.26.2.jar',
    'appmap.jar',
  ]);
  await verifyFiles(join('lib', 'scanner'), ['scanner-v1.88.0']);
});

PR Summary

This PR addresses storage bloat by cleaning up old binaries in the AppMap plugin, so that only the latest version of each binary is kept and old versions no longer accumulate on disk.

Comment on lines 162 to 189
await fs.promises.unlink(filePath);
log.info(`Deleted old binary: ${binaries[i].name}`);
Contributor

I think it's worth some sort of assertion that we're not destroying a linked object. These binaries are put in the lib directory, then a symlink is created in the bin directory pointing at the latest binary in lib.

It seems unlikely that this would happen, but if for some reason an asset download failed and the symlink wasn't updated, it may be possible that the linked binary gets deleted here.

Comment on lines 125 to 126
const result = asset();
await result;
Contributor

@dustinbyrne dustinbyrne Aug 28, 2024

Suggested change
- const result = asset();
- await result;
+ const result = await asset();

@dustinbyrne dustinbyrne self-requested a review August 28, 2024 16:30
Contributor

@dustinbyrne dustinbyrne left a comment
The reason will be displayed to describe this comment to others. Learn more.

This is looking good. Please see the requested changes above. Also, the build is failing but it's not clear to me that it's related. Could you look into it?

@zermelo-wisen zermelo-wisen force-pushed the fix/clean-up-old-binaries branch 2 times, most recently from 72990d0 to 8e2d488 Compare August 29, 2024 15:26
@zermelo-wisen
Collaborator Author

This is looking good. Please see the requested changes above. Also, the build is failing but it's not clear to me that it's related. Could you look into it?

Changes are done. The integration test is failing, but I re-ran the build for the last release commit and it fails there as well, so the failure is probably unrelated to this PR. I can still try to fix it.

Comment on lines 135 to 142
const symlinks = (await fs.promises.readdir(binDir, { withFileTypes: true }))
.filter((f) => f.isSymbolicLink())
.map((f) => join(binDir, f.name));
Contributor

Might be worth resolving the linked path once here instead of N times in the loop below (hasSymlink). It won't change during this time.
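One way to read this suggestion, together with the earlier concern about deleting a binary that a symlink still points at, is sketched below. The helper names `resolveLinkTargets` and `withoutLinkedBinaries` are hypothetical and do not come from the PR, and synchronous `fs` calls are used only to keep the example short. The idea is to resolve every symlink in the bin directory once into a set, then check deletion candidates against that set.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Hypothetical helper: resolve every symlink in binDir once and return the
// real paths they point at.
function resolveLinkTargets(binDir: string): Set<string> {
  const targets = new Set<string>();
  for (const entry of fs.readdirSync(binDir, { withFileTypes: true })) {
    if (!entry.isSymbolicLink()) continue;
    try {
      targets.add(fs.realpathSync(path.join(binDir, entry.name)));
    } catch {
      // Dangling symlink: there is no target to protect.
    }
  }
  return targets;
}

// Hypothetical helper: drop any deletion candidate that a symlink still points at.
// Candidates are expected to exist on disk.
function withoutLinkedBinaries(candidates: string[], protectedTargets: Set<string>): string[] {
  return candidates.filter((candidate) => !protectedTargets.has(fs.realpathSync(candidate)));
}
```

With this shape, a stale symlink left behind by a failed download keeps its target out of the deletion list, and the symlink resolution cost is paid once per cleanup rather than once per candidate.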

// Match patterns like '-v1.2.3', '-v1.2.3-alpha', '-v1.2.3+build', '-1.2.3-alpha+build'
// with or without file extension
const versionMatch = fileName.match(
/-v?(\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?(\+[a-zA-Z0-9-]+)?)\b(\.[^.]+)?$/
Contributor

Ensure that we are using a recommended RegExp from here - https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string
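A minimal sketch of what this could look like follows. The pattern is the numbered-capture-group variant suggested on semver.org; everything around it is an assumption for illustration: the `extractVersion` signature, the `knownExtensions` parameter, and the rule that a version starts after a `-` or `-v`. A fixed extension list is assumed because a trailing `.jar` is otherwise ambiguous with a dotted pre-release tag such as `-alpha.1`.

```typescript
// Suggested pattern from semver.org (numbered capture groups variant).
const SEMVER_PATTERN =
  /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;

// Hypothetical wrapper: strip a known extension, then test every "-" or "-v"
// followed by a digit as a possible start of the version.
function extractVersion(
  fileName: string,
  knownExtensions: string[] = ['.jar', '.exe'] // assumed list, not from the PR
): string | null {
  let base = fileName;
  for (const ext of knownExtensions) {
    if (base.endsWith(ext)) {
      base = base.slice(0, -ext.length);
      break;
    }
  }
  const versionStart = /-v?(?=\d)/g;
  let match: RegExpExecArray | null;
  while ((match = versionStart.exec(base)) !== null) {
    const candidate = base.slice(match.index + match[0].length);
    if (SEMVER_PATTERN.test(candidate)) return candidate;
  }
  return null;
}
```

Unlike the hand-written pattern under review, this rejects numeric parts with leading zeros (for example `01.2.3`) and accepts dotted pre-release and build identifiers.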

@@ -94,13 +102,14 @@ export default class AssetService {
sync.emit('error', e);
if (e instanceof AbortError) return reject(e);
}
await this.cleanUpOldBinaries();
Contributor

Trap, log, and ignore any errors that occur here, so that the cleanup doesn't interfere with any other update functionality.

Collaborator Author

I was doing it inside the innermost function, AssetService.cleanUpOldBinariesFromDir, which is called separately for the appmap, java, and scanner folders. To cover symlink resolution as well, I have now moved the try/catch logic to AssetService.cleanUpOldBinaries rather than to the call site above, because AssetService.cleanUpOldBinaries is called from two different places.
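The "trap, log, and ignore" behavior discussed in this thread can be sketched generically as below. `runCleanupSafely` and its `logError` parameter are hypothetical names, not part of AssetService; in the PR the equivalent try/catch lives inside cleanUpOldBinaries itself.

```typescript
// Hypothetical wrapper: run a cleanup step, log any failure, and never rethrow,
// so a failed cleanup cannot break the surrounding update.
async function runCleanupSafely(
  cleanup: () => Promise<void>,
  logError: (message: string) => void = console.error
): Promise<boolean> {
  try {
    await cleanup();
    return true;
  } catch (error) {
    logError(`Failed to clean up old binaries: ${error}`);
    return false;
  }
}
```

Placing the guard at this level means one failing directory or symlink lookup aborts the rest of that cleanup pass, but the update that triggered it still completes.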

continue;
}

await fs.promises.unlink(binaryPath);
Contributor

Other processes may be concurrently cleaning up the same binaries. Therefore, there is a race condition here.

Collaborator Author

This runs inside the execute method of a new LockfileSynchronizer(appmapDir). Doesn't that prevent the race condition, since binaryPath is inside appmapDir? Or maybe I'm missing something.
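Whether or not the lock fully covers this case, one defensive option is to make the deletion itself tolerate a file that another process has already removed. The sketch below shows that idea only; `unlinkIfPresent` is a hypothetical helper, and it uses the synchronous `fs.unlinkSync` for brevity where the PR uses `fs.promises.unlink`.

```typescript
import * as fs from 'fs';

// Hypothetical helper: delete a file, treating "already gone" as a non-error.
// Returns true if this call removed the file, false if it was already missing.
function unlinkIfPresent(filePath: string): boolean {
  try {
    fs.unlinkSync(filePath);
    return true;
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code === 'ENOENT') return false;
    throw error; // permission problems and other failures still surface
  }
}
```

This does not replace locking; it only keeps a lost race from being reported as a cleanup failure.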

@zermelo-wisen zermelo-wisen force-pushed the fix/clean-up-old-binaries branch 6 times, most recently from c3faa8c to f7412c3 Compare September 10, 2024 15:12
@zermelo-wisen zermelo-wisen force-pushed the fix/clean-up-old-binaries branch 7 times, most recently from 7e31c09 to d9747f2 Compare September 11, 2024 13:51
@dustinbyrne
Contributor

Closing this for now. The plan is to instead implement this within the appmap CLI.

@dustinbyrne dustinbyrne closed this Oct 4, 2024
@dustinbyrne dustinbyrne deleted the fix/clean-up-old-binaries branch October 4, 2024 15:43
Successfully merging this pull request may close these issues.

Delete old AppMap binaries
3 participants