[BUG] Output redirection of the check-mount script in the PostStartHook fails intermittently when the Fuse container starts slowly. #4455

Open
Syspretor opened this issue Dec 23, 2024 · 0 comments · May be fixed by #4456
Labels
bug Something isn't working

The following PostStartHook is injected into the fuse-sidecar to check the mount status.

lifecycle:
  postStart:
    exec:
      command: [ "/bin/sh", "-c", "time /check-mount.sh >> /proc/1/fd/1" ]

However, the container's startup and the execution of the PostStartHook happen in parallel, so the PID 1 process inside the container may not have finished starting when the PostStartHook runs. In that case, redirecting the output of the check-mount script to /proc/1/fd/1 can fail.

The relevant error log in Kubelet (1.31) is as follows:

PostStartHookError: Exec lifecycle hook ([bash -c time /check-mount.sh >> /proc/1/fd/1]) for Container "fluid-fuse-0" in Pod "" failed - error: command 'bash -c time /check-mount.sh >> /proc/1/fd/1' exited with 1: bash: /proc/1/fd/1: Permission denied
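
One possible mitigation, sketched here under the assumption that the failure is transient and goes away once PID 1 has finished starting, is to wait until /proc/1/fd/1 can be opened for append before redirecting to it. The helper name `wait_for_writable` and its parameters (retry count and delay) are hypothetical and are not taken from Fluid or from #4456; this is only an illustration of the idea, not the actual fix.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fluid): poll until a target path can be
# opened for append, or give up after a fixed number of attempts.
#   $1: target path, e.g. /proc/1/fd/1
#   $2: maximum number of attempts (default 10)
#   $3: delay between attempts in seconds (default 1)
wait_for_writable() {
  target="$1"
  retries="${2:-10}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$retries" ]; do
    # Try the open in a subshell so that a failed redirection does not
    # terminate the calling shell; the error message is discarded.
    if ( : >> "$target" ) 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A hook command could then call `wait_for_writable /proc/1/fd/1` first, redirect the output of /check-mount.sh to /proc/1/fd/1 only if that call succeeds, and otherwise run the script without the redirection so that the hook itself does not fail on the redirect.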