feat: Add tutorial colab for grounding with Vertex AI Search. #1672
Hi team,
I'm currently reviewing this pull request and will provide my feedback in a few minutes. In the meantime, I'm providing this summary of the pull request's content for you and other reviewers to quickly get up to speed on the changes and intent.
This pull request, authored by @diemtvu, adds a tutorial Colab notebook for grounding with Vertex AI Search. The notebook demonstrates how to:
- Create a Vertex AI Search datastore.
- Ingest PDF documents into the datastore from Google Cloud Storage.
- Use the Vertex AI Search datastore for grounding with a Gemini LLM.
- Clean up resources (delete the datastore and documents).
The main changes are within the file gemini/grounding/grounding_with_vais.ipynb. This file is a Jupyter Notebook containing the complete tutorial. The notebook includes code for authentication (using service accounts or user credentials), datastore creation, document ingestion, grounding with Gemini, and cleanup. The notebook is quite comprehensive, covering various aspects of using Vertex AI Search for grounding.
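For readers who have not opened the notebook, the grounding step summarized above might look roughly like the sketch below. It is not taken from the PR: the function names, the `us-central1` region, and the use of the `default_collection` path segment are assumptions, and the model name is left as a parameter. The SDK is imported lazily so the path helper can be used without it installed.

```python
"""Sketch: ground a Gemini response on a Vertex AI Search datastore."""


def datastore_path(project_id: str, location: str, datastore_id: str) -> str:
    """Build the full resource name of a datastore in the default collection."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/default_collection/dataStores/{datastore_id}"
    )


def ask_with_grounding(
    prompt: str, project_id: str, datastore: str, model_name: str
) -> str:
    """Send `prompt` to Gemini with the datastore attached as a retrieval tool."""
    # Imported lazily: these calls need the Vertex AI SDK and valid credentials.
    import vertexai
    from vertexai.generative_models import GenerativeModel, Tool, grounding

    # The region is an assumption; use whichever region the project runs in.
    vertexai.init(project=project_id, location="us-central1")

    # Wrap the datastore in a retrieval tool so answers can cite its documents.
    tool = Tool.from_retrieval(
        grounding.Retrieval(grounding.VertexAISearch(datastore=datastore))
    )
    model = GenerativeModel(model_name)
    response = model.generate_content(prompt, tools=[tool])
    return response.text
```

A call would pass `datastore_path(PROJECT_ID, "global", DATASTORE_ID)` as the `datastore` argument, where both identifiers are placeholders for the values created earlier in the notebook.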
I've reviewed the diff and it appears to be well-structured and the code is clean. The comments are helpful and easy to understand. I'll be checking the functionality and testing the notebook shortly.
Here's a little poem to brighten your day:
Code flows like a river,
Grounding models, ever clear,
Knowledge finds its way.
Looking forward to your feedback!
Best,
Gemini
Thanks for the PR, I'll add more comments once these are resolved.
"# Write the html documents into GCS\n", | ||
"from urllib.request import urlopen\n", | ||
"import requests\n", | ||
"\n", | ||
"file_urls = [\n", | ||
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q1_Earnings_Transcript.pdf\",\n", | ||
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q2_Earnings_Transcript.pdf\",\n", | ||
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q3_Earnings_Transcript.pdf\",\n", | ||
" \"https://abc.xyz/assets/investor/static/pdf/2022_Q4_Earnings_Transcript.pdf\"\n", | ||
"]\n", | ||
"\n", | ||
"bucket = storage_client.bucket(bucket_name)\n", | ||
"\n", | ||
"for url in file_urls:\n", | ||
" file_name = url.split(\"/\")[-1]\n", | ||
" print(f\"Downloading: {file_name}\")\n", | ||
"\n", | ||
" try:\n", | ||
" response = requests.get(url)\n", | ||
" response.raise_for_status()\n", | ||
"\n", | ||
" # Construct the full blob path (including prefix)\n", | ||
" blob_name = f\"{file_name}\"\n", | ||
" blob = bucket.blob(blob_name)\n", | ||
"\n", |
Change this to use the public bucket shown in the docs so the user doesn't have to create their own bucket.
I'm not sure which one you mean.
Oh, I guess you mean the one in the other colab (gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee)? We can probably pre-create one like that too. WDYT, @undertwig?
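To make the suggestion concrete, here is a rough sketch of importing straight from a public Cloud Storage folder, so the download-and-upload loop and the user-owned bucket would no longer be needed. It is an illustration, not code from the PR: `gcs_pattern` and `import_from_gcs` are hypothetical helpers, the `/*` wildcard and the `"content"` schema are assumptions about how the files should be ingested, and a `global` datastore location is assumed so that the default API endpoint applies.

```python
"""Sketch: import documents into a datastore from a public GCS folder."""


def gcs_pattern(folder_uri: str, suffix: str = "*") -> str:
    """Turn a gs:// folder URI into a wildcard pattern matching its objects."""
    if not folder_uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {folder_uri}")
    return f"{folder_uri.rstrip('/')}/{suffix}"


def import_from_gcs(branch: str, folder_uri: str):
    """Start an import of every object under `folder_uri` and wait for it.

    `branch` is the full branch resource name of the target datastore,
    for example one ending in `/dataStores/<id>/branches/default_branch`.
    """
    # Imported lazily: requires google-cloud-discoveryengine and credentials.
    from google.cloud import discoveryengine

    client = discoveryengine.DocumentServiceClient()
    request = discoveryengine.ImportDocumentsRequest(
        parent=branch,
        gcs_source=discoveryengine.GcsSource(
            input_uris=[gcs_pattern(folder_uri)],
            data_schema="content",  # unstructured files such as PDFs
        ),
        reconciliation_mode=(
            discoveryengine.ImportDocumentsRequest.ReconciliationMode.INCREMENTAL
        ),
    )
    operation = client.import_documents(request=request)
    return operation.result()  # blocks until the long-running import finishes
```

With the folder mentioned above, the pattern passed to the import would be `gs://cloud-samples-data/gen-app-builder/search/cymbal-bank-employee/*`.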
"source": [ | ||
"## Clean up\n", | ||
"\n", | ||
"Use [`DeleteCorpusRequest`](https://ai.google.dev/api/python/google/generativeai/protos/DeleteCorpusRequest) to delete a user corpus and all associated `Document`s & `Chunk`s.\n", |
I don't think this is the correct information.
@undertwig Can you check?
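For context on the concern above: the linked `DeleteCorpusRequest` page belongs to the `google.generativeai` package, not to Vertex AI Search. One possible shape for the cleanup cell, assuming the notebook manages its datastore through the `google-cloud-discoveryengine` client, is sketched below. The helper names are hypothetical, and whether deleting the datastore is sufficient to remove its documents should be confirmed against the Vertex AI Search documentation.

```python
"""Sketch: delete a Vertex AI Search datastore during cleanup."""


def datastore_name(project_id: str, location: str, datastore_id: str) -> str:
    """Build the datastore resource name, rejecting IDs that contain slashes."""
    for part in (project_id, location, datastore_id):
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return (
        f"projects/{project_id}/locations/{location}"
        f"/collections/default_collection/dataStores/{datastore_id}"
    )


def delete_datastore(project_id: str, location: str, datastore_id: str) -> None:
    """Delete the datastore and wait for the long-running operation."""
    # Imported lazily: requires google-cloud-discoveryengine and credentials.
    from google.cloud import discoveryengine

    client = discoveryengine.DataStoreServiceClient()
    operation = client.delete_data_store(
        name=datastore_name(project_id, location, datastore_id)
    )
    operation.result()  # raises if the deletion fails
```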
Co-authored-by: Holt Skinner <[email protected]>
Description
Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- Follow the CONTRIBUTING Guide.
- Your account is listed in CODEOWNERS for the file(s).
- Ensure the tests and linter pass (run nox -s format from the repository root to format).
Fixes #<issue_number_goes_here> 🦕