feat: Add tutorial colab for grounding with Vertex AI Search. #1672

Open: @diemtvu wants to merge 8 commits into base: main


@diemtvu diemtvu commented Jan 30, 2025

Description

Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Follow the CONTRIBUTING Guide.
  • You are listed as the author in your notebook or README file.
    • Your account is listed in CODEOWNERS for the file(s).
  • Make your Pull Request title follow the https://www.conventionalcommits.org/ specification.
  • Ensure the tests and linter pass (Run nox -s format from the repository root to format).
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@diemtvu diemtvu requested a review from a team as a code owner January 30, 2025 01:39
Contributor

@code-review-assist code-review-assist bot left a comment

Hi team,

I'm currently reviewing this pull request and will provide my feedback in a few minutes. In the meantime, I'm providing this summary of the pull request's content for you and other reviewers to quickly get up to speed on the changes and intent.

This pull request, authored by @diemtvu, adds a tutorial Colab notebook for grounding with Vertex AI Search. The notebook demonstrates how to:

  • Create a Vertex AI Search datastore.
  • Ingest PDF documents into the datastore from Google Cloud Storage.
  • Use the Vertex AI Search datastore for grounding with a Gemini LLM.
  • Clean up resources (delete the datastore and documents).

The main changes are within the file gemini/grounding/grounding_with_vais.ipynb. This file is a Jupyter Notebook containing the complete tutorial. The notebook includes code for authentication (using service accounts or user credentials), datastore creation, document ingestion, grounding with Gemini, and cleanup. The notebook is quite comprehensive, covering various aspects of using Vertex AI Search for grounding.
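
The grounding step in that list could look roughly like the following. This is a minimal sketch assuming the `google-genai` SDK running against Vertex AI; the notebook under review may use a different SDK. `datastore_path`, `ask_with_grounding`, and all parameter values are hypothetical names chosen for illustration, and the model ID is left to the caller.

```python
"""Sketch: ground a Gemini response on a Vertex AI Search data store."""


def datastore_path(project_id: str, data_store_id: str, location: str = "global") -> str:
    """Build the full data store resource name (hypothetical helper)."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/default_collection/dataStores/{data_store_id}"
    )


def ask_with_grounding(
    project_id: str,
    data_store_id: str,
    question: str,
    model_id: str,
    vertex_location: str = "us-central1",  # assumption: region used for Gemini calls
) -> str:
    """Send one question to Gemini with the data store attached as a retrieval tool."""
    # Imported lazily so the path helper works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(vertexai=True, project=project_id, location=vertex_location)

    # The retrieval tool points the model at the data store for grounding.
    tool = types.Tool(
        retrieval=types.Retrieval(
            vertex_ai_search=types.VertexAISearch(
                datastore=datastore_path(project_id, data_store_id)
            )
        )
    )

    response = client.models.generate_content(
        model=model_id,
        contents=question,
        config=types.GenerateContentConfig(tools=[tool]),
    )
    return response.text
```

Note that the data store location (often `global`) and the region used for model calls are separate settings, which is why the sketch keeps them as separate parameters.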

I've reviewed the diff and it appears to be well-structured and the code is clean. The comments are helpful and easy to understand. I'll be checking the functionality and testing the notebook shortly.


Here's a little poem to brighten your day:

Code flows like a river,
Grounding models, ever clear,
Knowledge finds its way.

Looking forward to your feedback!

Best,
Gemini

Contributor

Gemini encountered an error creating the review. You can try again by commenting @code-review-assist review.

@holtskinner changed the title from "Add tutorial colab for grounding with Vertex AI Search." to "feat: Add tutorial colab for grounding with Vertex AI Search." on Jan 30, 2025
Collaborator

@holtskinner holtskinner left a comment

Thanks for the PR, I'll add more comments once these are resolved.

gemini/grounding/grounding_with_vais.ipynb
Comment on lines +371 to +395
"# Write the html documents into GCS\n",
"from urllib.request import urlopen\n",
"import requests\n",
"\n",
"file_urls = [\n",
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q1_Earnings_Transcript.pdf\",\n",
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q2_Earnings_Transcript.pdf\",\n",
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q3_Earnings_Transcript.pdf\",\n",
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q4_Earnings_Transcript.pdf\"\n",
"]\n",
"\n",
"bucket = storage_client.bucket(bucket_name)\n",
"\n",
"for url in file_urls:\n",
" file_name = url.split(\"/\")[-1]\n",
" print(f\"Downloading: {file_name}\")\n",
"\n",
" try:\n",
" response = requests.get(url)\n",
" response.raise_for_status()\n",
"\n",
" # Construct the full blob path (including prefix)\n",
" blob_name = f\"{file_name}\"\n",
" blob = bucket.blob(blob_name)\n",
"\n",
Collaborator

Change this to use the public bucket shown in the docs so the user doesn't have to create their own bucket.

Author

I'm not sure which one you mean.

Author

Oh, I guess you mean the one in the other colab (gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee)? We can probably pre-create one like that too. What do you think, @undertwig?
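
One way to follow the suggestion in this thread is to skip the download-and-upload loop and import straight from a shared bucket. The sketch below assumes the `google-cloud-discoveryengine` client library, an already created data store, and that the bucket path quoted above is readable by the caller. `build_import_uris` and `import_from_bucket` are hypothetical helper names, not code from the notebook.

```python
"""Sketch: import documents into a Vertex AI Search data store from a shared bucket."""

# Bucket path quoted in this thread; whether it is the bucket the reviewer
# meant is an assumption.
PUBLIC_BUCKET_URI = "gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee"


def build_import_uris(bucket_uri: str, pattern: str = "*") -> list[str]:
    """Return a one-element wildcard URI list for a GCS import (hypothetical helper)."""
    if not bucket_uri.startswith("gs://"):
        raise ValueError(f"Expected a gs:// URI, got: {bucket_uri!r}")
    return [f"{bucket_uri.rstrip('/')}/{pattern}"]


def import_from_bucket(
    project_id: str,
    location: str,
    data_store_id: str,
    bucket_uri: str = PUBLIC_BUCKET_URI,
):
    """Start an unstructured-document import and wait for it to finish."""
    # Imported lazily so the URI helper works without the client library installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import discoveryengine

    # Non-global locations need a regional API endpoint.
    client_options = (
        ClientOptions(api_endpoint=f"{location}-discoveryengine.googleapis.com")
        if location != "global"
        else None
    )
    client = discoveryengine.DocumentServiceClient(client_options=client_options)

    parent = client.branch_path(
        project=project_id,
        location=location,
        data_store=data_store_id,
        branch="default_branch",
    )
    request = discoveryengine.ImportDocumentsRequest(
        parent=parent,
        gcs_source=discoveryengine.GcsSource(
            input_uris=build_import_uris(bucket_uri),
            data_schema="content",  # unstructured files such as PDFs
        ),
        reconciliation_mode=(
            discoveryengine.ImportDocumentsRequest.ReconciliationMode.INCREMENTAL
        ),
    )
    operation = client.import_documents(request=request)
    return operation.result()  # blocks until the long-running import completes
```

With this shape the user only needs a project and a data store; no `storage_client`, bucket creation, or `requests` download is involved.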

gemini/grounding/grounding_with_vais.ipynb
"source": [
"## Clean up\n",
"\n",
"Use [`DeleteCorpusRequest`](https://ai.google.dev/api/python/google/generativeai/protos/DeleteCorpusRequest) to delete a user corpus and all associated `Document`s & `Chunk`s.\n",
Collaborator

I don't think this is the correct information

Author

@undertwig Can you check?
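
For context on the question above: the linked `DeleteCorpusRequest` lives under the `google.generativeai` package, per its URL. If the cleanup cell is meant to remove the Vertex AI Search data store created earlier in the notebook, one possible shape is sketched below. It assumes the `google-cloud-discoveryengine` client library; `data_store_name`, `api_endpoint`, and `delete_data_store` are hypothetical helper names, and this is not confirmed to be what the author intends.

```python
"""Sketch: delete a Vertex AI Search data store during cleanup."""
from typing import Optional


def data_store_name(
    project_id: str,
    location: str,
    data_store_id: str,
    collection: str = "default_collection",
) -> str:
    """Build the full data store resource name (hypothetical helper)."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/{collection}/dataStores/{data_store_id}"
    )


def api_endpoint(location: str) -> Optional[str]:
    """Return the regional endpoint, or None to use the default for 'global'."""
    if location == "global":
        return None
    return f"{location}-discoveryengine.googleapis.com"


def delete_data_store(project_id: str, location: str, data_store_id: str):
    """Delete the data store and wait for the operation to finish."""
    # Imported lazily so the helpers above work without the client library installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import discoveryengine

    endpoint = api_endpoint(location)
    client_options = ClientOptions(api_endpoint=endpoint) if endpoint else None
    client = discoveryengine.DataStoreServiceClient(client_options=client_options)

    request = discoveryengine.DeleteDataStoreRequest(
        name=data_store_name(project_id, location, data_store_id)
    )
    operation = client.delete_data_store(request=request)
    return operation.result()  # blocks until the deletion completes
```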
