add code for the GCS FUSE guide #1380

kiryl-filatau · 2024-07-18T14:25:00Z

Description

This PR....

Tasks

The contributing guide has been read and followed.
The samples added / modified have been fully tested.
Workflow files have been added / modified, if applicable.
Region tags have been properly added, if new samples.
All dependencies are set to up-to-date versions, as applicable.

brandonroyal · 2024-11-08T16:20:25Z

@kiryl-filatau Can you pls add licenses in all of the to be merged filed?

# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

code-review-assist

Hi @kiryl-filatau and reviewers,

I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:

This pull request adds code for a Google Cloud Storage (GCS) FUSE guide. The intent is to provide sample code and configurations demonstrating how to use GCS FUSE to integrate GCS with Kubernetes. The changes include:

New Cloud Build configurations: Several new YAML files (cloudbuild-preload.yaml, cloudbuild-prepare-autopilot.yaml, cloudbuild-prepare-standard.yaml) define the build process for setting up the environment. These scripts handle bucket creation, service account setup, model cloning, and uploading to GCS.
Model deployment YAMLs: Two new YAML files (model-deployment-autopilot.yaml, model-deployment-standard.yaml) define Kubernetes deployments for the model, leveraging GCS FUSE for data access. These files specify containers, resources, and volume mounts for GCS FUSE.
Run script: A run-commands.sh script orchestrates the entire process, including Cloud Build invocations, kubectl commands, and cleanup actions. This script manages secrets, permissions, and deployment steps.

The changes are substantial and well-organized, demonstrating a clear understanding of the GCS FUSE integration process. The use of Cloud Build for automation is a significant strength. I'll be looking for potential improvements in terms of error handling and documentation in my full review.

Here's a little haiku to brighten your day:

Cloud storage flows,
Fuse mounts, models load with ease,
Code review's delight.

Looking forward to discussing this further!

code-review-assist

This pull request adds code for a GCS FUSE guide. The YAML files for Cloud Build and the Kubernetes deployment look good. However, the run-commands.sh script could be improved for readability and maintainability. The commit message and PR description could also be more detailed. No style guide was provided, so I'm defaulting to the Google Shell Style Guide for the shell script, and general Kubernetes best practices for the YAML files.

Here's a summary of the style guide elements I referenced:

Google Shell Style Guide: I've suggested changes to variable naming and quoting in the shell script to adhere to this style guide.
Kubernetes best practices: I've recommended adding more descriptive comments to the YAML files to improve their readability and maintainability.

It's also important to ensure the placeholder values in the deployment YAML files are replaced with actual values before deployment. The PR description should clearly state how these placeholders should be replaced and what the expected values are. Additionally, consider adding a section to the README explaining how to use these new files.

code-review-assist · 2024-12-30T16:05:19Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+export PROJECT_ID=$(gcloud config get project) \
+&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \
+&& export REGION=REGION \
+&& export ZONE=ZONE \
+&& export CLUSTER_NAME=CLUSTER_NAME \
+&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \
+&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \
+&& export MODEL_PATH=MODEL_PATH_NAME \
+&& export ROLE_NAME=ROLE_NAME \
+&& export DISK_IMAGE=DISK_IMAGE_NAME \
+&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \
+&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME


According to the Google Shell Style Guide, variable names should be lowercase with underscores. Also, it's generally a good practice to quote variables to prevent word splitting and globbing issues. Consider using local variables when possible to limit scope.

Suggested change

export PROJECT_ID=$(gcloud config get project) \

&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \

&& export REGION=REGION \

&& export ZONE=ZONE \

&& export CLUSTER_NAME=CLUSTER_NAME \

&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \

&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \

&& export MODEL_PATH=MODEL_PATH_NAME \

&& export ROLE_NAME=ROLE_NAME \

&& export DISK_IMAGE=DISK_IMAGE_NAME \

&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \

&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME

export project_id=$(gcloud config get-value project) \

&& export project_number=$(gcloud projects describe "${project_id}" --format="value(projectNumber)") \

&& local region="REGION" \

&& local zone="ZONE" \

&& local cluster_name="CLUSTER_NAME" \

&& local bucket_name="MODEL_FILES_BUCKET_NAME" \

&& local ksa_name="K8S_SERVICE_ACCOUNT_NAME" \

&& local model_path="MODEL_PATH_NAME" \

&& local role_name="ROLE_NAME" \

&& local disk_image="DISK_IMAGE_NAME" \

&& local log_bucket_name="${bucket_name}-disk-creation-logs" \

&& local container_image="CONTAINER_IMAGE_NAME"

code-review-assist · 2024-12-30T16:05:19Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \
+&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=-


It's good practice to quote the variables here as well. Also, consider checking the exit codes of these commands to ensure they succeeded.

Suggested change

echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \

&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=-

echo -n "${hf_username}" | gcloud secrets create hf-username --data-file=- || exit 1 \

&& echo -n "${hf_token}" | gcloud secrets create hf-token --data-file=- || exit 1

code-review-assist · 2024-12-30T16:05:19Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+gcloud builds submit \
+  --config cloudbuild-prepare-autopilot.yaml --no-source \
+  --substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID


Can you add a comment explaining what this Cloud Build step does? This will improve readability. Also, quote the substituted variables.

Suggested change

gcloud builds submit \

--config cloudbuild-prepare-autopilot.yaml --no-source \

--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID

# Prepare the Autopilot cluster

gcloud builds submit \

--config cloudbuild-prepare-autopilot.yaml --no-source \

--substitutions=_DISK_IMAGE="${disk_image}",_CONTAINER_IMAGE="${container_image}",_BUCKET_NAME="${log_bucket_name}",_REGION="${region}",_ZONE="${zone}",_CLUSTER_NAME="${cluster_name}",_PROJECT_ID="${project_id}"

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+  --location=${REGION}
+
+# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in AUTOPILOT
+sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f -


The quoting here is a bit complex and could be error-prone. Consider using a heredoc instead to make it more readable and ensure that single quotes within the YAML are handled correctly.

Suggested change

sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f -

cat model-deployment-autopilot.yaml | sed -E "s|<BUCKET_NAME>|${bucket_name}|g; s|<KSA_NAME>|${ksa_name}|g; s|<CONTAINER_IMAGE>|${container_image}|g; s|<DISK_IMAGE_NAME>|${disk_image}|g; s|<PROJECT_ID>|${project_id}|g" | kubectl apply -f -

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/model-deployment-autopilot.yaml

        ai.gke.io/inference-server: vllm
        examples.ai.gke.io/source: user-guide
    spec:
      containers:
      - name: inference-server
-        image: us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20240930_0945_RC00
+        image: <CONTAINER_IMAGE>


Replace <CONTAINER_IMAGE> with the actual container image to be used.

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/model-deployment-autopilot.yaml

+          driver: gcsfuse.csi.storage.gke.io
+          readOnly: false
+          volumeAttributes:
+            bucketName: <BUCKET_NAME>


Replace <BUCKET_NAME> with the actual bucket name.

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/model-deployment-standard.yaml

        ai.gke.io/inference-server: vllm
        examples.ai.gke.io/source: user-guide
    spec:
      containers:
      - name: inference-server
-        image: us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20240930_0945_RC00
+        image: <CONTAINER_IMAGE>


Replace <CONTAINER_IMAGE> with the actual container image to be used.

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/model-deployment-standard.yaml

+        - name: gcs-fuse-csi-ephemeral
+          mountPath: /model
+          readOnly: false
+      serviceAccountName: <KSA_NAME>


Replace <KSA_NAME> with the actual Kubernetes service account name.

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/model-deployment-standard.yaml

+          driver: gcsfuse.csi.storage.gke.io
+          readOnly: false
+          volumeAttributes:
+            bucketName: <BUCKET_NAME>


Replace <BUCKET_NAME> with the actual bucket name.

code-review-assist · 2024-12-30T16:05:20Z

ai-ml/gcs-fuse-adapter/cloudbuild-prepare-autopilot.yaml

+        spec:
+          allowedResourcePatterns:
+          - "projects/${_PROJECT_ID}/global/images/.*"
+        EOF


Nit: Trailing newline.

Suggested change

EOF

EOF

code-review-assist

Hi @kiryl-filatau and reviewers,

I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:

This pull request adds code for the GCS FUSE guide. The intent is to provide complete, working examples demonstrating how to use GCS FUSE to integrate Google Cloud Storage with Kubernetes. The changes include:

New Cloud Build configuration files: cloudbuild-preload.yaml, cloudbuild-prepare-autopilot.yaml, cloudbuild-prepare-standard.yaml These files define the steps for building and deploying the GCS FUSE example, including creating a GCS bucket, setting up Kubernetes service accounts, cloning model weights, and uploading files to the bucket. They also handle the creation of a GKE cluster with the GCS Fuse addon enabled, and the creation of a node pool with a secondary boot disk containing a cached image for serving the model.
New Kubernetes deployment manifests: model-deployment-autopilot.yaml, model-deployment-standard.yaml These files define the deployment of the inference server using the GCS Fuse CSI driver to mount the model from GCS.
New shell script: run-commands.sh This script orchestrates the entire process, from setting up environment variables and secrets to running Cloud Build jobs, deploying the Kubernetes manifests, and finally cleaning up resources.

The code additions are substantial and well-structured, leveraging Cloud Build for automation and Kubernetes for deployment. The use of separate configuration files for Autopilot and Standard cluster setups is a nice touch, offering flexibility to users.

However, I'll need to perform a more thorough review to assess the robustness of error handling and the overall security posture. I'll also verify the correctness of the deployment and cleanup steps.

Here's a little haiku to lighten the mood:

Cloud storage mounts,
Containers dance with data streams,
Code flows smoothly.

Looking forward to a productive review!

code-review-assist

This pull request adds code for a GCS FUSE guide. The YAML files for Cloud Build and the Kubernetes deployment look good. However, the run-commands.sh script could be improved for readability and maintainability. I've also included some suggestions based on general shell scripting best practices and the Google Shell Style Guide.

Here's a summary of the style guide elements I referenced:

Shell Scripting Best Practices: Use of [[ ]] for conditional expressions, quoting variables, and using -o pipefail to catch errors in pipelines.
Google Shell Style Guide: Use of lowercase variable names where applicable, and consistent indentation.

code-review-assist · 2024-12-30T16:07:41Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+export PROJECT_ID=$(gcloud config get project) \
+&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \
+&& export REGION=REGION \
+&& export ZONE=ZONE \
+&& export CLUSTER_NAME=CLUSTER_NAME \
+&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \
+&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \
+&& export MODEL_PATH=MODEL_PATH_NAME \
+&& export ROLE_NAME=ROLE_NAME \
+&& export DISK_IMAGE=DISK_IMAGE_NAME \
+&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \
+&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME


The long line for setting environment variables can be made more readable by separating each variable declaration onto its own line. This adheres to the Google Shell Style Guide for better readability. It also makes it easier to modify individual variables later.

Suggested change

export PROJECT_ID=$(gcloud config get project) \

&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \

&& export REGION=REGION \

&& export ZONE=ZONE \

&& export CLUSTER_NAME=CLUSTER_NAME \

&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \

&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \

&& export MODEL_PATH=MODEL_PATH_NAME \

&& export ROLE_NAME=ROLE_NAME \

&& export DISK_IMAGE=DISK_IMAGE_NAME \

&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \

&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME

export PROJECT_ID=$(gcloud config get project)

export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)")

export REGION=REGION

export ZONE=ZONE

export CLUSTER_NAME=CLUSTER_NAME

export BUCKET_NAME=MODEL_FILES_BUCKET_NAME

export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME

export MODEL_PATH=MODEL_PATH_NAME

export ROLE_NAME=ROLE_NAME

export DISK_IMAGE=DISK_IMAGE_NAME

export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs

export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME

code-review-assist · 2024-12-30T16:07:41Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \
+&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=-


It's generally a good practice to quote variables, especially when dealing with secrets, to prevent word splitting or globbing issues. This is a general shell scripting best practice.

Suggested change

echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \

&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=-

echo -n "${HF_USERNAME}" | gcloud secrets create hf-username --data-file=- \

&& echo -n "${HF_TOKEN}" | gcloud secrets create hf-token --data-file=-

code-review-assist · 2024-12-30T16:07:41Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+gcloud projects add-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/storage.admin" \
+    --condition=None \
+&& gcloud projects add-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/secretmanager.secretAccessor" \
+    --condition=None \
+&& gcloud projects add-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/container.clusterAdmin" \
+    --condition=None \
+&& gcloud projects add-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/compute.instanceAdmin.v1" \
+    --condition=None


Using a loop here would make this code more concise and easier to maintain, following the Google Shell Style Guide's preference for avoiding unnecessary repetition. Also, consider using gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:[email protected] --roles $roles to bind multiple roles at once.

Suggested change

gcloud projects add-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/storage.admin" \

--condition=None \

&& gcloud projects add-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/secretmanager.secretAccessor" \

--condition=None \

&& gcloud projects add-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/container.clusterAdmin" \

--condition=None \

&& gcloud projects add-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/compute.instanceAdmin.v1" \

--condition=None

roles="roles/storage.admin,roles/secretmanager.secretAccessor,roles/container.clusterAdmin,roles/compute.instanceAdmin.v1"

gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com --roles $roles

code-review-assist · 2024-12-30T16:07:41Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+  --location=${REGION}
+
+# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in AUTOPILOT
+sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f -


code-review-assist · 2024-12-30T16:07:41Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+
+
+# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in STANDARD
+sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f -


code-review-assist · 2024-12-30T16:07:41Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+gcloud secrets delete hf-username \
+  --quiet \
+&& gcloud secrets delete hf-token \
+    --quiet \
+&& gcloud container clusters delete ${CLUSTER_NAME} \
+    --region=${REGION} \
+    --quiet \
+&& gcloud projects remove-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/storage.admin" \
+    --condition=None \
+&& gcloud projects remove-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/secretmanager.secretAccessor" \
+    --condition=None \
+&& gcloud projects remove-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/container.clusterAdmin" \
+    --condition=None \
+&& gcloud projects remove-iam-policy-binding $PROJECT_ID \
+    --member="serviceAccount:[email protected]" \
+    --role="roles/compute.instanceAdmin.v1" \
+    --condition=None \
+&& gcloud storage rm --recursive gs://$BUCKET_NAME \
+&& gcloud storage rm --recursive gs://$LOG_BUCKET_NAME \
+&& gcloud compute images delete $DISK_IMAGE \
+    --quiet


The cleanup section can also benefit from a loop to reduce redundancy and improve readability, per the Google Shell Style Guide. Also, group the remove-iam-policy-binding calls similarly to the add-iam-policy-binding calls.

Suggested change

gcloud secrets delete hf-username \

--quiet \

&& gcloud secrets delete hf-token \

--quiet \

&& gcloud container clusters delete ${CLUSTER_NAME} \

--region=${REGION} \

--quiet \

&& gcloud projects remove-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/storage.admin" \

--condition=None \

&& gcloud projects remove-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/secretmanager.secretAccessor" \

--condition=None \

&& gcloud projects remove-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/container.clusterAdmin" \

--condition=None \

&& gcloud projects remove-iam-policy-binding $PROJECT_ID \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--role="roles/compute.instanceAdmin.v1" \

--condition=None \

&& gcloud storage rm --recursive gs://$BUCKET_NAME \

&& gcloud storage rm --recursive gs://$LOG_BUCKET_NAME \

&& gcloud compute images delete $DISK_IMAGE \

--quiet

for secret in hf-username hf-token; do

gcloud secrets delete "$secret" --quiet

done

gcloud container clusters delete "${CLUSTER_NAME}" --region="${REGION}" --quiet

roles="roles/storage.admin,roles/secretmanager.secretAccessor,roles/container.clusterAdmin,roles/compute.instanceAdmin.v1"

gcloud projects remove-iam-policy-binding "$PROJECT_ID" \

--member="serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \

--roles="$roles" \

--condition=None

gcloud storage rm --recursive gs://"$BUCKET_NAME"

gcloud storage rm --recursive gs://"$LOG_BUCKET_NAME"

gcloud compute images delete "$DISK_IMAGE" --quiet

code-review-assist

Hi @kiryl-filatau and reviewers,

I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:

This pull request adds code for the GCS FUSE guide. The intent is to provide complete, runnable examples demonstrating how to use GCS FUSE to integrate Google Cloud Storage with Kubernetes. The changes include:

New Cloud Build configurations: Several new .yaml files (cloudbuild-preload.yaml, cloudbuild-prepare-autopilot.yaml, cloudbuild-prepare-standard.yaml) define the build process for setting up the environment. These scripts handle bucket creation, service account setup, model cloning, and deployment.
Updated Deployment Manifests: Two Kubernetes deployment manifests (model-deployment-autopilot.yaml, model-deployment-standard.yaml) are added, configuring deployments for both Autopilot and Standard GKE clusters. These leverage GCS FUSE for model storage.
Run Script: A run-commands.sh script orchestrates the entire process, including setting up secrets, managing permissions, running Cloud Build jobs, applying Kubernetes manifests, and performing cleanup.

The changes are extensive and well-structured, using Cloud Build for automation and clearly separating concerns. The use of placeholders in the YAML files and the run-commands.sh script makes it easy to adapt the examples to different environments.

Here's a little something to lighten the mood:

Code flows like a river,
Building bridges, strong and true,
Testing makes it soar.

Looking forward to a thorough discussion!

code-review-assist

This pull request adds code for a GCS FUSE guide. The YAML files for Cloud Build and the Kubernetes deployment look good. However, the run-commands.sh script could be improved for readability and maintainability. The commit message and PR description could also be more detailed. No style guide was provided, so I'm defaulting to the Google Shell Style Guide for the shell script and commonly used YAML best practices.

Specifically, I've noted areas in the shell script where quoting and variable usage could be more robust, and where functions could be used to improve organization. I've also suggested improvements to the error handling and logging. Additionally, I've recommended adding more detail to the PR description and commit messages to clearly explain the purpose and functionality of the added code. It's also important to ensure the placeholder values in the deployment YAMLs and the script are replaced with actual values before merging.

code-review-assist · 2024-12-30T16:10:15Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+export PROJECT_ID=$(gcloud config get project) \
+&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \
+&& export REGION=REGION \
+&& export ZONE=ZONE \
+&& export CLUSTER_NAME=CLUSTER_NAME \
+&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \
+&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \
+&& export MODEL_PATH=MODEL_PATH_NAME \
+&& export ROLE_NAME=ROLE_NAME \
+&& export DISK_IMAGE=DISK_IMAGE_NAME \
+&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \
+&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME


It's generally a good practice to quote variable assignments, especially when dealing with command substitutions, to prevent word splitting and globbing issues. Refer to the Google Shell Style Guide section on variable expansion. Also, consider using lowercase variable names for better portability and consistency with common shell scripting conventions.

Suggested change

export PROJECT_ID=$(gcloud config get project) \

&& export PROJECT_NUMBER=$(gcloud projects list --filter="$PROJECT_ID" --format="value(PROJECT_NUMBER)") \

&& export REGION=REGION \

&& export ZONE=ZONE \

&& export CLUSTER_NAME=CLUSTER_NAME \

&& export BUCKET_NAME=MODEL_FILES_BUCKET_NAME \

&& export KSA_NAME=K8S_SERVICE_ACCOUNT_NAME \

&& export MODEL_PATH=MODEL_PATH_NAME \

&& export ROLE_NAME=ROLE_NAME \

&& export DISK_IMAGE=DISK_IMAGE_NAME \

&& export LOG_BUCKET_NAME=$BUCKET_NAME-disk-creation-logs \

&& export CONTAINER_IMAGE=CONTAINER_IMAGE_NAME

export project_id="$(gcloud config get project)" \

&& export project_number="$(gcloud projects list --filter="$project_id" --format="value(PROJECT_NUMBER)")" \

&& export region="REGION" \

&& export zone="ZONE" \

&& export cluster_name="CLUSTER_NAME" \

&& export bucket_name="MODEL_FILES_BUCKET_NAME" \

&& export ksa_name="K8S_SERVICE_ACCOUNT_NAME" \

&& export model_path="MODEL_PATH_NAME" \

&& export role_name="ROLE_NAME" \

&& export disk_image="DISK_IMAGE_NAME" \

&& export log_bucket_name="${bucket_name}-disk-creation-logs" \

&& export container_image="CONTAINER_IMAGE_NAME"

code-review-assist · 2024-12-30T16:10:15Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \
+&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=-


Consider using printf instead of echo -n for better portability and to avoid potential issues with certain special characters in the variables. See the Google Shell Style Guide section on echo. Also, ensure that the variables HF_USERNAME and HF_TOKEN are properly set before this script is run. It's best practice to avoid relying on implicit shell variable inheritance.

Suggested change

echo -n ${HF_USERNAME} | gcloud secrets create hf-username --data-file=- \

&& echo -n ${HF_TOKEN} | gcloud secrets create hf-token --data-file=-

printf "%s" "${hf_username}" | gcloud secrets create hf-username --data-file=- \

&& printf "%s" "${hf_token}" | gcloud secrets create hf-token --data-file=-

code-review-assist · 2024-12-30T16:10:15Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+gcloud builds submit \
+  --config cloudbuild-prepare-autopilot.yaml --no-source \
+  --substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID
+
+# Run the Cloud Build command to prepare the cluster with all required substitutions
+gcloud builds submit \
+  --config cloudbuild-prepare-standard.yaml --no-source \
+  --substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID


These gcloud builds submit commands could be refactored into functions for better readability, organization, and error handling. This would also make it easier to reuse the logic if needed. Additionally, consider adding error checking after each command to ensure the build steps complete successfully. Refer to the Google Shell Style Guide section on functions and exit codes.

Suggested change

gcloud builds submit \

--config cloudbuild-prepare-autopilot.yaml --no-source \

--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID

# Run the Cloud Build command to prepare the cluster with all required substitutions

gcloud builds submit \

--config cloudbuild-prepare-standard.yaml --no-source \

--substitutions=_DISK_IMAGE=$DISK_IMAGE,_CONTAINER_IMAGE=$CONTAINER_IMAGE,_BUCKET_NAME=$LOG_BUCKET_NAME,_REGION=$REGION,_ZONE=$ZONE,_CLUSTER_NAME=$CLUSTER_NAME,_PROJECT_ID=$PROJECT_ID

build_image() {

gcloud builds submit \

--config "$1" --no-source \

--substitutions="$2" \

|| { echo "gcloud builds submit failed for $1"; exit 1; }

}

build_image cloudbuild-prepare-autopilot.yaml "_DISK_IMAGE=${disk_image},_CONTAINER_IMAGE=${container_image},_BUCKET_NAME=${log_bucket_name},_REGION=${region},_ZONE=${zone},_CLUSTER_NAME=${cluster_name},_PROJECT_ID=${project_id}"

build_image cloudbuild-prepare-standard.yaml "_DISK_IMAGE=${disk_image},_CONTAINER_IMAGE=${container_image},_BUCKET_NAME=${log_bucket_name},_REGION=${region},_ZONE=${zone},_CLUSTER_NAME=${cluster_name},_PROJECT_ID=${project_id}"

build_image cloudbuild-preload.yaml "_BUCKET_NAME=${bucket_name},_CLUSTER_NAME=${cluster_name},_REGION=${region},_KSA_NAME=${ksa_name},_PROJECT_NUMBER=${project_number},_PROJECT_ID=${project_id},_ROLE_NAME=${role_name},_MODEL_PATH=${model_path}"

code-review-assist · 2024-12-30T16:10:15Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+  --location=${REGION}
+
+# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in AUTOPILOT
+sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f -


The quoting in this sed command looks a bit complex and could lead to errors if the replacement values contain special characters. Consider using alternative delimiters for the s command to avoid escaping the single quotes, or breaking the command into multiple, simpler sed commands. Also, ensure that the placeholder values are actually replaced with real values before running this command.

Suggested change

sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g; s|<DISK_IMAGE_NAME>|$DISK_IMAGE|g; s|<PROJECT_ID>|$PROJECT_ID|g" model-deployment-autopilot.yaml | kubectl apply -f -

sed -E "s|<BUCKET_NAME>|${bucket_name}|g; s|<KSA_NAME>|${ksa_name}|g; s|<CONTAINER_IMAGE>|'${container_image}'|g; s|<DISK_IMAGE_NAME>|${disk_image}|g; s|<PROJECT_ID>|${project_id}|g" model-deployment-autopilot.yaml | kubectl apply -f -

code-review-assist · 2024-12-30T16:10:16Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+
+
+# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in STANDARD
+sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f -


Similar to the previous comment, simplify the quoting in this sed command and ensure placeholders are replaced with actual values.

Suggested change

sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f -

sed -E "s|<BUCKET_NAME>|${bucket_name}|g; s|<KSA_NAME>|${ksa_name}|g; s|<CONTAINER_IMAGE>|'${container_image}'|g" model-deployment-standard.yaml | kubectl apply -f -

code-review-assist · 2024-12-30T16:10:16Z

ai-ml/gcs-fuse-adapter/run-commands.sh

+sed "s|<BUCKET_NAME>|$BUCKET_NAME|g; s|<KSA_NAME>|$KSA_NAME|g; s|<CONTAINER_IMAGE>|'$CONTAINER_IMAGE'|g" model-deployment-standard.yaml | kubectl apply -f -
+
+# Check the logs of the pod
+kubectl logs $(kubectl get pods -o jsonpath='{.items[0].metadata.name}')


It's good practice to check the exit code of commands and handle potential errors. For example, what happens if no pods are running? The Google Shell Style Guide recommends checking exit codes and providing informative error messages. See Exit Codes.

Suggested change

kubectl logs $(kubectl get pods -o jsonpath='{.items[0].metadata.name}')

if pod_name=$(kubectl get pods -o jsonpath='{.items[0].metadata.name}') &> /dev/null; then

kubectl logs "$pod_name"

else

echo "No pods found or error retrieving pod name."

fi

code-review-assist · 2024-12-30T16:10:16Z

ai-ml/gcs-fuse-adapter/cloudbuild-prepare-autopilot.yaml

+        spec:
+          allowedResourcePatterns:
+          - "projects/${_PROJECT_ID}/global/images/.*"
+        EOF


The EOF heredoc marker should be on its own line, and preferably indented to match the surrounding code. This improves readability and avoids potential issues with certain shells.

Suggested change

EOF

EOF

…netes-engine-samples into gcsfuse-adapter-preload-model gitlab.com,172.65.251.78 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY=

kiryl-filatau added 11 commits July 18, 2024 16:24

add code for the GCS FUSE guide

6e9255b

file name fix

1625945

add the HF secrets to the secret manager

4c97577

add clean-up section and check the files

c89e710

typo fix

7e05a23

minor updates

a693db5

remove cluster version, change machineType for cloudbuild

a59789c

add updates and checks

2031d8f

minor updates in comments

f327d13

clean the code

5a4bfca

debug line removed

0b62b97

kiryl-filatau added 4 commits November 8, 2024 20:41

add lines with license

d737aac

typo fix

a363de9

indent fix

a20c8a1

add autopilot

4d5c813

kiryl-filatau marked this pull request as ready for review December 30, 2024 16:02

kiryl-filatau requested review from alizaidis, yoshi-approver and a team as code owners December 30, 2024 16:02

code-review-assist bot reviewed Dec 30, 2024

View reviewed changes

Merge branch 'main' into gcsfuse-adapter-preload-model

095318a

code-review-assist bot reviewed Dec 30, 2024

View reviewed changes

kiryl-filatau added 3 commits December 30, 2024 17:32

minor updates

90938aa

minor updates

d791c89

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add code for the GCS FUSE guide #1380

add code for the GCS FUSE guide #1380

kiryl-filatau commented Jul 18, 2024

brandonroyal commented Nov 8, 2024

code-review-assist bot left a comment

code-review-assist bot left a comment

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot left a comment

code-review-assist bot left a comment

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot left a comment

code-review-assist bot left a comment

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

code-review-assist bot Dec 30, 2024

		echo -n ${HF_USERNAME} \| gcloud secrets create hf-username --data-file=- \
		&& echo -n ${HF_TOKEN} \| gcloud secrets create hf-token --data-file=-

	sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g; s\|<DISK_IMAGE_NAME>\|$DISK_IMAGE\|g; s\|<PROJECT_ID>\|$PROJECT_ID\|g" model-deployment-autopilot.yaml \| kubectl apply -f -
	cat model-deployment-autopilot.yaml \| sed -E "s\|<BUCKET_NAME>\|${bucket_name}\|g; s\|<KSA_NAME>\|${ksa_name}\|g; s\|<CONTAINER_IMAGE>\|${container_image}\|g; s\|<DISK_IMAGE_NAME>\|${disk_image}\|g; s\|<PROJECT_ID>\|${project_id}\|g" \| kubectl apply -f -

	sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g; s\|<DISK_IMAGE_NAME>\|$DISK_IMAGE\|g; s\|<PROJECT_ID>\|$PROJECT_ID\|g" model-deployment-autopilot.yaml \| kubectl apply -f -
	DEPLOYMENT_FILE=model-deployment-autopilot.yaml
	sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g; s\|<DISK_IMAGE_NAME>\|$DISK_IMAGE\|g; s\|<PROJECT_ID>\|$PROJECT_ID\|g" "$DEPLOYMENT_FILE" \| kubectl apply -f -



		# Apply the manifest and change the placeholder with the name of the requred bucket and required k8s service account in STANDARD
		sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g" model-deployment-standard.yaml \| kubectl apply -f -

	sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g" model-deployment-standard.yaml \| kubectl apply -f -
	DEPLOYMENT_FILE=model-deployment-standard.yaml
	sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g" "$DEPLOYMENT_FILE" \| kubectl apply -f -

	sed "s\|<BUCKET_NAME>\|$BUCKET_NAME\|g; s\|<KSA_NAME>\|$KSA_NAME\|g; s\|<CONTAINER_IMAGE>\|'$CONTAINER_IMAGE'\|g; s\|<DISK_IMAGE_NAME>\|$DISK_IMAGE\|g; s\|<PROJECT_ID>\|$PROJECT_ID\|g" model-deployment-autopilot.yaml \| kubectl apply -f -
	sed -E "s\|<BUCKET_NAME>\|${bucket_name}\|g; s\|<KSA_NAME>\|${ksa_name}\|g; s\|<CONTAINER_IMAGE>\|'${container_image}'\|g; s\|<DISK_IMAGE_NAME>\|${disk_image}\|g; s\|<PROJECT_ID>\|${project_id}\|g" model-deployment-autopilot.yaml \| kubectl apply -f -

-kubectl logs $(kubectl get pods -o jsonpath='{.items[0].metadata.name}')
+if pod_name=$(kubectl get pods -o jsonpath='{.items[0].metadata.name}') &> /dev/null; then
+  kubectl logs "$pod_name"
+else
+  echo "No pods found or error retrieving pod name."
+fi

add code for the GCS FUSE guide #1380

Are you sure you want to change the base?

add code for the GCS FUSE guide #1380

Conversation

kiryl-filatau commented Jul 18, 2024

Description

Tasks

brandonroyal commented Nov 8, 2024

code-review-assist bot left a comment

Choose a reason for hiding this comment

code-review-assist bot left a comment

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot left a comment

Choose a reason for hiding this comment

code-review-assist bot left a comment

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot left a comment

Choose a reason for hiding this comment

code-review-assist bot left a comment

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment

code-review-assist bot Dec 30, 2024

Choose a reason for hiding this comment