-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add code for the GCS FUSE guide #1380
base: main
Are you sure you want to change the base?
add code for the GCS FUSE guide #1380
Conversation
@kiryl-filatau Can you pls add licenses in all of the to be merged filed?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kiryl-filatau and reviewers,
I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:
This pull request adds code for a Google Cloud Storage (GCS) FUSE guide. The intent is to provide sample code and configurations demonstrating how to use GCS FUSE to integrate GCS with Kubernetes. The changes include:
- New Cloud Build configurations: Several new YAML files (
cloudbuild-preload.yaml
,cloudbuild-prepare-autopilot.yaml
,cloudbuild-prepare-standard.yaml
) define the build process for setting up the environment. These scripts handle bucket creation, service account setup, model cloning, and uploading to GCS. - Model deployment YAMLs: Two new YAML files (
model-deployment-autopilot.yaml
,model-deployment-standard.yaml
) define Kubernetes deployments for the model, leveraging GCS FUSE for data access. These files specify containers, resources, and volume mounts for GCS FUSE. - Run script: A
run-commands.sh
script orchestrates the entire process, including Cloud Build invocations, kubectl commands, and cleanup actions. This script manages secrets, permissions, and deployment steps.
The changes are substantial and well-organized, demonstrating a clear understanding of the GCS FUSE integration process. The use of Cloud Build for automation is a significant strength. I'll be looking for potential improvements in terms of error handling and documentation in my full review.
Here's a little haiku to brighten your day:
Cloud storage flows,
Fuse mounts, models load with ease,
Code review's delight.
Looking forward to discussing this further!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request adds code for a GCS FUSE guide. The YAML files for Cloud Build and the Kubernetes deployment look good. However, the run-commands.sh
script could be improved for readability and maintainability. The commit message and PR description could also be more detailed. No style guide was provided, so I'm defaulting to the Google Shell Style Guide for the shell script, and general Kubernetes best practices for the YAML files.
Here's a summary of the style guide elements I referenced:
- Google Shell Style Guide: I've suggested changes to variable naming and quoting in the shell script to adhere to this style guide.
- Kubernetes best practices: I've recommended adding more descriptive comments to the YAML files to improve their readability and maintainability.
It's also important to ensure the placeholder values in the deployment YAML files are replaced with actual values before deployment. The PR description should clearly state how these placeholders should be replaced and what the expected values are. Additionally, consider adding a section to the README explaining how to use these new files.
export PROJECT_ID=$(gcloud config get project) \ | ||
&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \ | ||
&& export REGION=REGION \ | ||
&& export ZONE=ZONE \ | ||
&& export CLUSTER_NAME=CLUSTER_NAME \ | ||
&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \ | ||
&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \ | ||
&& export MODEL_PATH=MODEL_PATH_NAME \ | ||
&& export ROLE_NAME=ROLE_NAME \ | ||
&& export DISK_IMAGE=DISK_IMAGE_NAME \ | ||
&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \ | ||
&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the Google Shell Style Guide, variable names should be lowercase with underscores. Also, it's generally a good practice to quote variables to prevent word splitting and globbing issues. Consider using local
variables when possible to limit scope.
export PROJECT_ID=$(gcloud config get project) \ | |
&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \ | |
&& export REGION=REGION \ | |
&& export ZONE=ZONE \ | |
&& export CLUSTER_NAME=CLUSTER_NAME \ | |
&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \ | |
&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \ | |
&& export MODEL_PATH=MODEL_PATH_NAME \ | |
&& export ROLE_NAME=ROLE_NAME \ | |
&& export DISK_IMAGE=DISK_IMAGE_NAME \ | |
&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \ | |
&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME | |
export project_id=$(gcloud config get-value project) \ | |
&& export project_number=$(gcloud projects describe "${project_id}" --format="value(projectNumber)") \ | |
&& local region="REGION" \ | |
&& local zone="ZONE" \ | |
&& local cluster_name="CLUSTER_NAME" \ | |
&& local bucket_name="MODEL_FILES_BUCKET_NAME" \ | |
&& local ksa_name="K8S_SERVICE_ACCOUNT_NAME" \ | |
&& local model_path="MODEL_PATH_NAME" \ | |
&& local role_name="ROLE_NAME" \ | |
&& local disk_image="DISK_IMAGE_NAME" \ | |
&& local log_bucket_name="${bucket_name}-disk-creation-logs" \ | |
&& local container_image="CONTAINER_IMAGE_NAME" |
echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \ | ||
&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good practice to quote the variables here as well. Also, consider checking the exit codes of these commands to ensure they succeeded.
echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \ | |
&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=- | |
echo -n "${hf_username}" | gcloud secrets create hf-username --data-file=- || exit 1 \ | |
&& echo -n "${hf_token}" | gcloud secrets create hf-token --data-file=- || exit 1 |
gcloud builds submit \ | ||
--config cloudbuild-prepare-autopilot.yaml --no-source \ | ||
--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment explaining what this Cloud Build step does? This will improve readability. Also, quote the substituted variables.
gcloud builds submit \ | |
--config cloudbuild-prepare-autopilot.yaml --no-source \ | |
--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID | |
# Prepare the Autopilot cluster | |
gcloud builds submit \ | |
--config cloudbuild-prepare-autopilot.yaml --no-source \ | |
--substitutions=_DISK_IMAGE="${disk_image}",_CONTAINER_IMAGE="${container_image}",_BUCKET_NAME="${log_bucket_name}",_REGION="${region}",_ZONE="${zone}",_CLUSTER_NAME="${cluster_name}",_PROJECT_ID="${project_id}" |
--location=${REGION} | ||
|
||
# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in AUTOPILOT | ||
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quoting here is a bit complex and could be error-prone. Consider using a heredoc instead to make it more readable and ensure that single quotes within the YAML are handled correctly.
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f - | |
cat model-deployment-autopilot.yaml | sed -E "s|<BUCKET_NAME>|${bucket_name}|g; s|<KSA_NAME>|${ksa_name}|g; s|<CONTAINER_IMAGE>|${container_image}|g; s|<DISK_IMAGE_NAME>|${disk_image}|g; s|<PROJECT_ID>|${project_id}|g" | kubectl apply -f - |
ai.gke.io/inference-server: vllm | ||
examples.ai.gke.io/source: user-guide | ||
spec: | ||
containers: | ||
- name: inference-server | ||
image: us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20240930_0945_RC00 | ||
image: <CONTAINER_IMAGE> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
driver: gcsfuse.csi.storage.gke.io | ||
readOnly: false | ||
volumeAttributes: | ||
bucketName: <BUCKET_NAME> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ai.gke.io/inference-server: vllm | ||
examples.ai.gke.io/source: user-guide | ||
spec: | ||
containers: | ||
- name: inference-server | ||
image: us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20240930_0945_RC00 | ||
image: <CONTAINER_IMAGE> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- name: gcs-fuse-csi-ephemeral | ||
mountPath: /model | ||
readOnly: false | ||
serviceAccountName: <KSA_NAME> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
driver: gcsfuse.csi.storage.gke.io | ||
readOnly: false | ||
volumeAttributes: | ||
bucketName: <BUCKET_NAME> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spec: | ||
allowedResourcePatterns: | ||
- "projects/${_PROJECT_ID}/global/images/.*" | ||
EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kiryl-filatau and reviewers,
I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:
This pull request adds code for the GCS FUSE guide. The intent is to provide complete, working examples demonstrating how to use GCS FUSE to integrate Google Cloud Storage with Kubernetes. The changes include:
- New Cloud Build configuration files:
cloudbuild-preload.yaml
,cloudbuild-prepare-autopilot.yaml
,cloudbuild-prepare-standard.yaml
These files define the steps for building and deploying the GCS FUSE example, including creating a GCS bucket, setting up Kubernetes service accounts, cloning model weights, and uploading files to the bucket. They also handle the creation of a GKE cluster with the GCS Fuse addon enabled, and the creation of a node pool with a secondary boot disk containing a cached image for serving the model. - New Kubernetes deployment manifests:
model-deployment-autopilot.yaml
,model-deployment-standard.yaml
These files define the deployment of the inference server using the GCS Fuse CSI driver to mount the model from GCS. - New shell script:
run-commands.sh
This script orchestrates the entire process, from setting up environment variables and secrets to running Cloud Build jobs, deploying the Kubernetes manifests, and finally cleaning up resources.
The code additions are substantial and well-structured, leveraging Cloud Build for automation and Kubernetes for deployment. The use of separate configuration files for Autopilot and Standard cluster setups is a nice touch, offering flexibility to users.
However, I'll need to perform a more thorough review to assess the robustness of error handling and the overall security posture. I'll also verify the correctness of the deployment and cleanup steps.
Here's a little haiku to lighten the mood:
Cloud storage mounts,
Containers dance with data streams,
Code flows smoothly.
Looking forward to a productive review!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request adds code for a GCS FUSE guide. The YAML files for Cloud Build and the Kubernetes deployment look good. However, the run-commands.sh
script could be improved for readability and maintainability. I've also included some suggestions based on general shell scripting best practices and the Google Shell Style Guide.
Here's a summary of the style guide elements I referenced:
- Shell Scripting Best Practices: Use of
[[ ]]
for conditional expressions, quoting variables, and using-o pipefail
to catch errors in pipelines. - Google Shell Style Guide: Use of lowercase variable names where applicable, and consistent indentation.
export PROJECT_ID=$(gcloud config get project) \ | ||
&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \ | ||
&& export REGION=REGION \ | ||
&& export ZONE=ZONE \ | ||
&& export CLUSTER_NAME=CLUSTER_NAME \ | ||
&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \ | ||
&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \ | ||
&& export MODEL_PATH=MODEL_PATH_NAME \ | ||
&& export ROLE_NAME=ROLE_NAME \ | ||
&& export DISK_IMAGE=DISK_IMAGE_NAME \ | ||
&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \ | ||
&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The long line for setting environment variables can be made more readable by separating each variable declaration onto its own line. This adheres to the Google Shell Style Guide for better readability. It also makes it easier to modify individual variables later.
export PROJECT_ID=$(gcloud config get project) \ | |
&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \ | |
&& export REGION=REGION \ | |
&& export ZONE=ZONE \ | |
&& export CLUSTER_NAME=CLUSTER_NAME \ | |
&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \ | |
&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \ | |
&& export MODEL_PATH=MODEL_PATH_NAME \ | |
&& export ROLE_NAME=ROLE_NAME \ | |
&& export DISK_IMAGE=DISK_IMAGE_NAME \ | |
&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \ | |
&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME | |
export PROJECT_ID=$(gcloud config get project) | |
export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") | |
export REGION=REGION | |
export ZONE=ZONE | |
export CLUSTER_NAME=CLUSTER_NAME | |
export BUCKET_NAME=MODEL_FILES_BUCKET_NAME | |
export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME | |
export MODEL_PATH=MODEL_PATH_NAME | |
export ROLE_NAME=ROLE_NAME | |
export DISK_IMAGE=DISK_IMAGE_NAME | |
export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs | |
export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME |
echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \ | ||
&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally a good practice to quote variables, especially when dealing with secrets, to prevent word splitting or globbing issues. This is a general shell scripting best practice.
echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \ | |
&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=- | |
echo -n "${HF_USERNAME}" | gcloud secrets create hf-username --data-file=- \ | |
&& echo -n "${HF_TOKEN}" | gcloud secrets create hf-token --data-file=- |
gcloud projects add-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/storage.admin" \ | ||
--condition=None \ | ||
&& gcloud projects add-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/secretmanager.secretAccessor" \ | ||
--condition=None \ | ||
&& gcloud projects add-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/container.clusterAdmin" \ | ||
--condition=None \ | ||
&& gcloud projects add-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/compute.instanceAdmin.v1" \ | ||
--condition=None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a loop here would make this code more concise and easier to maintain, following the Google Shell Style Guide's preference for avoiding unnecessary repetition. Also, consider using gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:[email protected] --roles $roles
to bind multiple roles at once.
gcloud projects add-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/storage.admin" \ | |
--condition=None \ | |
&& gcloud projects add-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/secretmanager.secretAccessor" \ | |
--condition=None \ | |
&& gcloud projects add-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/container.clusterAdmin" \ | |
--condition=None \ | |
&& gcloud projects add-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/compute.instanceAdmin.v1" \ | |
--condition=None | |
roles="roles/storage.admin,roles/secretmanager.secretAccessor,roles/container.clusterAdmin,roles/compute.instanceAdmin.v1" | |
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com --roles $roles |
--location=${REGION} | ||
|
||
# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in AUTOPILOT | ||
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a variable for the deployment file name to avoid repetition and improve maintainability. This is a general shell scripting best practice.
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f - | |
DEPLOYMENT_FILE=model-deployment-autopilot.yaml | |
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" "$DEPLOYMENT_FILE" | kubectl apply -f - |
|
||
|
||
# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in STANDARD | ||
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous suggestion, use a variable for the deployment file name here as well for consistency and maintainability. This is a general shell scripting best practice.
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f - | |
DEPLOYMENT_FILE=model-deployment-standard.yaml | |
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" "$DEPLOYMENT_FILE" | kubectl apply -f - |
gcloud secrets delete hf-username \ | ||
--quiet \ | ||
&& gcloud secrets delete hf-token \ | ||
--quiet \ | ||
&& gcloud container clusters delete ${CLUSTER_NAME} \ | ||
--region=${REGION} \ | ||
--quiet \ | ||
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/storage.admin" \ | ||
--condition=None \ | ||
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/secretmanager.secretAccessor" \ | ||
--condition=None \ | ||
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/container.clusterAdmin" \ | ||
--condition=None \ | ||
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | ||
--member="serviceAccount:[email protected]" \ | ||
--role="roles/compute.instanceAdmin.v1" \ | ||
--condition=None \ | ||
&& gcloud storage rm --recursive gs://$BUCKET_NAME \ | ||
&& gcloud storage rm --recursive gs://$LOG_BUCKET_NAME \ | ||
&& gcloud compute images delete $DISK_IMAGE \ | ||
--quiet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cleanup section can also benefit from a loop to reduce redundancy and improve readability, per the Google Shell Style Guide. Also, group the remove-iam-policy-binding
calls similarly to the add-iam-policy-binding
calls.
gcloud secrets delete hf-username \ | |
--quiet \ | |
&& gcloud secrets delete hf-token \ | |
--quiet \ | |
&& gcloud container clusters delete ${CLUSTER_NAME} \ | |
--region=${REGION} \ | |
--quiet \ | |
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/storage.admin" \ | |
--condition=None \ | |
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/secretmanager.secretAccessor" \ | |
--condition=None \ | |
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/container.clusterAdmin" \ | |
--condition=None \ | |
&& gcloud projects remove-iam-policy-binding $PROJECT_ID \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--role="roles/compute.instanceAdmin.v1" \ | |
--condition=None \ | |
&& gcloud storage rm --recursive gs://$BUCKET_NAME \ | |
&& gcloud storage rm --recursive gs://$LOG_BUCKET_NAME \ | |
&& gcloud compute images delete $DISK_IMAGE \ | |
--quiet | |
for secret in hf-username hf-token; do | |
gcloud secrets delete "$secret" --quiet | |
done | |
gcloud container clusters delete "${CLUSTER_NAME}" --region="${REGION}" --quiet | |
roles="roles/storage.admin,roles/secretmanager.secretAccessor,roles/container.clusterAdmin,roles/compute.instanceAdmin.v1" | |
gcloud projects remove-iam-policy-binding "$PROJECT_ID" \ | |
--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \ | |
--roles="$roles" \ | |
--condition=None | |
gcloud storage rm --recursive gs://"$BUCKET_NAME" | |
gcloud storage rm --recursive gs://"$LOG_BUCKET_NAME" | |
gcloud compute images delete "$DISK_IMAGE" --quiet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kiryl-filatau and reviewers,
I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:
This pull request adds code for the GCS FUSE guide. The intent is to provide complete, runnable examples demonstrating how to use GCS FUSE to integrate Google Cloud Storage with Kubernetes. The changes include:
- New Cloud Build configurations: Several new
.yaml
files (cloudbuild-preload.yaml
,cloudbuild-prepare-autopilot.yaml
,cloudbuild-prepare-standard.yaml
) define the build process for setting up the environment. These scripts handle bucket creation, service account setup, model cloning, and deployment. - Updated Deployment Manifests: Two Kubernetes deployment manifests (
model-deployment-autopilot.yaml
,model-deployment-standard.yaml
) are added, configuring deployments for both Autopilot and Standard GKE clusters. These leverage GCS FUSE for model storage. - Run Script: A
run-commands.sh
script orchestrates the entire process, including setting up secrets, managing permissions, running Cloud Build jobs, applying Kubernetes manifests, and performing cleanup.
The changes are extensive and well-structured, using Cloud Build for automation and clearly separating concerns. The use of placeholders in the YAML files and the run-commands.sh
script makes it easy to adapt the examples to different environments.
Here's a little something to lighten the mood:
Code flows like a river,
Building bridges, strong and true,
Testing makes it soar.
Looking forward to a thorough discussion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request adds code for a GCS FUSE guide. The YAML files for Cloud Build and the Kubernetes deployment look good. However, the run-commands.sh
script could be improved for readability and maintainability. The commit message and PR description could also be more detailed. No style guide was provided, so I'm defaulting to the Google Shell Style Guide for the shell script and commonly used YAML best practices.
Specifically, I've noted areas in the shell script where quoting and variable usage could be more robust, and where functions could be used to improve organization. I've also suggested improvements to the error handling and logging. Additionally, I've recommended adding more detail to the PR description and commit messages to clearly explain the purpose and functionality of the added code. It's also important to ensure the placeholder values in the deployment YAMLs and the script are replaced with actual values before merging.
export PROJECT_ID=$(gcloud config get project) \ | ||
&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \ | ||
&& export REGION=REGION \ | ||
&& export ZONE=ZONE \ | ||
&& export CLUSTER_NAME=CLUSTER_NAME \ | ||
&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \ | ||
&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \ | ||
&& export MODEL_PATH=MODEL_PATH_NAME \ | ||
&& export ROLE_NAME=ROLE_NAME \ | ||
&& export DISK_IMAGE=DISK_IMAGE_NAME \ | ||
&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \ | ||
&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally a good practice to quote variable assignments, especially when dealing with command substitutions, to prevent word splitting and globbing issues. Refer to the Google Shell Style Guide section on variable expansion. Also, consider using lowercase variable names for better portability and consistency with common shell scripting conventions.
export PROJECT_ID=$(gcloud config get project) \ | |
&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \ | |
&& export REGION=REGION \ | |
&& export ZONE=ZONE \ | |
&& export CLUSTER_NAME=CLUSTER_NAME \ | |
&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \ | |
&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \ | |
&& export MODEL_PATH=MODEL_PATH_NAME \ | |
&& export ROLE_NAME=ROLE_NAME \ | |
&& export DISK_IMAGE=DISK_IMAGE_NAME \ | |
&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \ | |
&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME | |
export project_id="$(gcloud config get project)" \ | |
&& export project_number="$(gcloud projects list --filter="$project_id" --format="value(PROJECT_NUMBER)")" \ | |
&& export region="REGION" \ | |
&& export zone="ZONE" \ | |
&& export cluster_name="CLUSTER_NAME" \ | |
&& export bucket_name="MODEL_FILES_BUCKET_NAME" \ | |
&& export ksa_name="K8S_SERVICE_ACCOUNT_NAME" \ | |
&& export model_path="MODEL_PATH_NAME" \ | |
&& export role_name="ROLE_NAME" \ | |
&& export disk_image="DISK_IMAGE_NAME" \ | |
&& export log_bucket_name="${bucket_name}-disk-creation-logs" \ | |
&& export container_image="CONTAINER_IMAGE_NAME" |
echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \ | ||
&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using printf
instead of echo -n
for better portability and to avoid potential issues with certain special characters in the variables. See the Google Shell Style Guide section on echo. Also, ensure that the variables HF_USERNAME
and HF_TOKEN
are properly set before this script is run. It's best practice to avoid relying on implicit shell variable inheritance.
echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \ | |
&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=- | |
printf "%s" "${hf_username}" | gcloud secrets create hf-username --data-file=- \ | |
&& printf "%s" "${hf_token}" | gcloud secrets create hf-token --data-file=- |
gcloud builds submit \ | ||
--config cloudbuild-prepare-autopilot.yaml --no-source \ | ||
--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID | ||
|
||
# Run the Cloud Build command to prepare the cluster with all required substitutions | ||
gcloud builds submit \ | ||
--config cloudbuild-prepare-standard.yaml --no-source \ | ||
--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These gcloud builds submit
commands could be refactored into functions for better readability, organization, and error handling. This would also make it easier to reuse the logic if needed. Additionally, consider adding error checking after each command to ensure the build steps complete successfully. Refer to the Google Shell Style Guide section on functions and exit codes.
gcloud builds submit \ | |
--config cloudbuild-prepare-autopilot.yaml --no-source \ | |
--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID | |
# Run the Cloud Build command to prepare the cluster with all required substitutions | |
gcloud builds submit \ | |
--config cloudbuild-prepare-standard.yaml --no-source \ | |
--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID | |
build_image() { | |
gcloud builds submit \ | |
--config "$1" --no-source \ | |
--substitutions="$2" \ | |
|| { echo "gcloud builds submit failed for $1"; exit 1; } | |
} | |
build_image cloudbuild-prepare-autopilot.yaml "_DISK_IMAGE=${disk_image},_CONTAINER_IMAGE=${container_image},_BUCKET_NAME=${log_bucket_name},_REGION=${region},_ZONE=${zone},_CLUSTER_NAME=${cluster_name},_PROJECT_ID=${project_id}" | |
build_image cloudbuild-prepare-standard.yaml "_DISK_IMAGE=${disk_image},_CONTAINER_IMAGE=${container_image},_BUCKET_NAME=${log_bucket_name},_REGION=${region},_ZONE=${zone},_CLUSTER_NAME=${cluster_name},_PROJECT_ID=${project_id}" | |
build_image cloudbuild-preload.yaml "_BUCKET_NAME=${bucket_name},_CLUSTER_NAME=${cluster_name},_REGION=${region},_KSA_NAME=${ksa_name},_PROJECT_NUMBER=${project_number},_PROJECT_ID=${project_id},_ROLE_NAME=${role_name},_MODEL_PATH=${model_path}" |
--location=${REGION} | ||
|
||
# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in AUTOPILOT | ||
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quoting in this sed
command looks a bit complex and could lead to errors if the replacement values contain special characters. Consider using alternative delimiters for the s
command to avoid escaping the single quotes, or breaking the command into multiple, simpler sed
commands. Also, ensure that the placeholder values are actually replaced with real values before running this command.
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f - | |
sed -E "s|<BUCKET_NAME>|${bucket_name}|g; s|<KSA_NAME>|${ksa_name}|g; s|<CONTAINER_IMAGE>|'${container_image}'|g; s|<DISK_IMAGE_NAME>|${disk_image}|g; s|<PROJECT_ID>|${project_id}|g" model-deployment-autopilot.yaml | kubectl apply -f - |
|
||
|
||
# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in STANDARD | ||
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous comment, simplify the quoting in this sed
command and ensure placeholders are replaced with actual values.
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f - | |
sed -E "s|<BUCKET_NAME>|${bucket_name}|g; s|<KSA_NAME>|${ksa_name}|g; s|<CONTAINER_IMAGE>|'${container_image}'|g" model-deployment-standard.yaml | kubectl apply -f - |
sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f - | ||
|
||
# Check the logs of the pod | ||
kubectl logs $(kubectl get pods -o jsonpath='{.items[0].metadata.name}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good practice to check the exit code of commands and handle potential errors. For example, what happens if no pods are running? The Google Shell Style Guide recommends checking exit codes and providing informative error messages. See Exit Codes.
kubectl logs $(kubectl get pods -o jsonpath='{.items[0].metadata.name}') | |
if pod_name=$(kubectl get pods -o jsonpath='{.items[0].metadata.name}') &> /dev/null; then | |
kubectl logs "$pod_name" | |
else | |
echo "No pods found or error retrieving pod name." | |
fi |
spec: | ||
allowedResourcePatterns: | ||
- "projects/${_PROJECT_ID}/global/images/.*" | ||
EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…netes-engine-samples into gcsfuse-adapter-preload-model gitlab.com,172.65.251.78 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY=
Description
This PR....
Tasks