quickstart neuralchat notebook for workshops (#1269)

* hackathon content to introduce neuralchat

Signed-off-by: kta-intel <[email protected]>
Showing 2 changed files with 411 additions and 0 deletions.
...nsion_for_transformers/neural_chat/docs/notebooks/workshop/01_quickstart_neuralchat.ipynb (382 additions, 0 deletions)
@@ -0,0 +1,382 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5c6ddfc4-ecd6-4f0a-9e32-5ff4ab014db7",
   "metadata": {},
   "source": [
    "# QuickStart: Intel® Extension for Transformers*: NeuralChat on 4th Generation Intel® Xeon® Scalable Processors"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a628ae6-ddd2-41d3-acf1-855b6f31aef5",
   "metadata": {},
   "source": [
    "## Prepare Environment"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ae386d8-6138-478d-ae4d-8572a561967c",
   "metadata": {},
   "source": [
    "Follow the README to install the necessary requirements for this tutorial; a hedged install sketch also follows in the next cell. In summary, you will need:\n",
    "- Intel® Extension for Transformers* installed from source (to get the latest updates)\n",
    "- NeuralChat requirements\n",
    "- Retrieval plugin requirements\n",
    "- Audio plugin (TTS and ASR) requirements"
   ]
  },
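  {
   "cell_type": "markdown",
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000001",
   "metadata": {},
   "source": [
    "A minimal install sketch under stated assumptions: a pip-based environment and a fresh clone of the repository. The requirements-file path below is an assumption drawn from the repository layout, so defer to the README if it differs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000002",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Assumed install steps -- follow the README if your paths differ\n",
    "!git clone https://github.com/intel/intel-extension-for-transformers.git\n",
    "!pip install ./intel-extension-for-transformers  # source install picks up the latest updates\n",
    "# NeuralChat requirements (path is an assumption; plugin requirements install similarly)\n",
    "!pip install -r ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/requirements_cpu.txt"
   ]
  },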
  {
   "cell_type": "markdown",
   "id": "5733fa69-5e68-453c-8f6f-401ee096f4a7",
   "metadata": {},
   "source": [
    "Check hardware"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20d39d47-087f-4e2a-b766-b3707ee342e9",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!lscpu"
   ]
  },
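  {
   "cell_type": "markdown",
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000003",
   "metadata": {},
   "source": [
    "Optionally, filter the `lscpu` output for the instruction sets the optimizations below rely on. This is a quick sketch that assumes a Linux host whose `lscpu` output includes a Flags line; 4th Generation Xeon Scalable processors add Intel® AMX with BF16 and INT8 support."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000004",
   "metadata": {},
   "outputs": [],
   "source": [
    "# List the AMX / AVX-512 capability flags reported by the CPU\n",
    "!lscpu | grep -oE 'amx[a-z_0-9]*|avx512[a-z_0-9]*' | sort -u"
   ]
  },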
  {
   "cell_type": "markdown",
   "id": "c3c6d1e2-61f1-4ee4-98c7-4f6202e7f2ea",
   "metadata": {},
   "source": [
    "## Building a simple chatbot\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5555f5b-84d1-4154-9388-8aa17041fec0",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
    "config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1')\n",
    "chatbot = build_chatbot(config)\n",
    "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db00b822-eef0-4b92-9a9a-9d7a8d148d6f",
   "metadata": {},
   "source": [
    "## Optimizing your chatbot\n",
    "Enable mixed precision with bfloat16"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e832b1a-0f7a-4715-b245-85666729e206",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
    "from intel_extension_for_transformers.transformers import MixedPrecisionConfig\n",
    "config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1',\n",
    "                        optimization_config=MixedPrecisionConfig(dtype='bfloat16'))\n",
    "chatbot = build_chatbot(config)\n",
    "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ff9ca77-149b-47cc-9a5b-a6f70877f09e",
   "metadata": {},
   "source": [
    "INT4 weight-only quantization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a804023f-5a79-47ce-9b4b-61ea9746df5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
    "from intel_extension_for_transformers.transformers import WeightOnlyQuantConfig\n",
    "from intel_extension_for_transformers.neural_chat.config import LoadingModelConfig\n",
    "config = PipelineConfig(model_name_or_path=\"Intel/neural-chat-7b-v3-1\",\n",
    "                        optimization_config=WeightOnlyQuantConfig(compute_dtype=\"int8\", weight_dtype=\"int4_fullrange\"),\n",
    "                        loading_config=LoadingModelConfig(use_llm_runtime=False))\n",
    "chatbot = build_chatbot(config)\n",
    "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
    "print(response)"
   ]
  },
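  {
   "cell_type": "markdown",
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000005",
   "metadata": {},
   "source": [
    "To compare the configurations above, a rough wall-clock measurement around `predict` is often enough. This is a sketch, not a benchmark: single-query latency also depends on how many tokens the model generates, so average several runs for anything quantitative."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000006",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "# Time one query against whichever chatbot was built last\n",
    "start = time.perf_counter()\n",
    "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
    "print(f\"Latency: {time.perf_counter() - start:.1f} s\")\n",
    "print(response)"
   ]
  },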
  {
   "cell_type": "markdown",
   "id": "adc089c9-623f-43ec-a470-31e5f8d4cac4",
   "metadata": {},
   "source": [
    "## Customizing your chatbot\n",
    "### Plugin: Retrieval\n",
    "Without the retrieval plugin"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "640b5c7d-15e0-4438-b451-9172159398fc",
   "metadata": {},
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
    "config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1')\n",
    "chatbot = build_chatbot(config)\n",
    "response = chatbot.predict(query=\"How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4a65c5a6-1d33-43f6-8c61-92322b74e2cc",
   "metadata": {},
   "source": [
    "With the retrieval plugin"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aaef691b-048a-4d5c-ada9-39acec7eeea2",
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p docs\n",
    "!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.txt\n",
    "!mv sample.txt ./docs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "40e6474f-565b-4399-b250-a314447c2120",
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat ./docs/sample.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3779e396-a963-4503-8a85-a0d897445c58",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import PipelineConfig\n",
    "from intel_extension_for_transformers.neural_chat import build_chatbot\n",
    "from intel_extension_for_transformers.neural_chat import plugins\n",
    "plugins.retrieval.enable = True\n",
    "plugins.retrieval.args[\"input_path\"] = \"./docs/sample.txt\"\n",
    "config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1',\n",
    "                        plugins=plugins)\n",
    "chatbot = build_chatbot(config)\n",
    "response = chatbot.predict(query=\"How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?\")\n",
    "print(response)\n",
    "\n",
    "plugins.retrieval.enable = False  # disable retrieval"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e2da2eef-e383-4e54-b9c1-4079cf053a33",
   "metadata": {},
   "source": [
    "### Plugin: ASR & TTS\n",
    "Enable voice chat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72fe3865-b23a-405d-8199-4a869f448ec0",
   "metadata": {},
   "outputs": [],
   "source": [
    "!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "103340b5-1b57-486c-9fba-3060de63ab42",
   "metadata": {},
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
    "from intel_extension_for_transformers.neural_chat import plugins\n",
    "plugins.tts.enable = True\n",
    "plugins.tts.args[\"output_audio_path\"] = \"./response.wav\"\n",
    "plugins.asr.enable = True\n",
    "\n",
    "config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1',\n",
    "                        plugins=plugins)\n",
    "chatbot = build_chatbot(config)\n",
    "result = chatbot.predict(query=\"./sample.wav\")\n",
    "print(result)\n",
    "\n",
    "plugins.tts.enable = False\n",
    "plugins.asr.enable = False"
   ]
  },
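  {
   "cell_type": "markdown",
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000007",
   "metadata": {},
   "source": [
    "You can listen to the synthesized reply directly in the notebook. This assumes the previous cell ran successfully and the TTS plugin wrote `./response.wav` as configured above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f3a9b1e-1111-4aaa-9bbb-000000000008",
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import Audio\n",
    "\n",
    "# Play the TTS output produced by the voice-chat cell\n",
    "Audio(\"./response.wav\")"
   ]
  },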
  {
   "cell_type": "markdown",
   "id": "e93a0779-8c26-4c62-a514-64bf9c58896f",
   "metadata": {},
   "source": [
    "### [Optional]: Fine-tuning"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8419deb-8a06-41ba-af86-875d1d1c0fc7",
   "metadata": {},
   "source": [
    "We use the [Alpaca dataset](https://github.com/tatsu-lab/stanford_alpaca) from Stanford University as the general-domain dataset to fine-tune the model. The dataset is provided as a JSON file, [alpaca_data.json](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json). For Alpaca, researchers manually crafted 175 seed tasks to guide `text-davinci-003` in generating 52K instruction-following examples across diverse tasks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "45f5c30c-ecb4-4341-a586-4efdb3906bbd",
   "metadata": {},
   "outputs": [],
   "source": [
    "!curl -OL https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json"
   ]
  },
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "660a3a94-230a-497f-9ad9-2645b8f2d186", | ||
"metadata": {}, | ||
"source": [ | ||
"Finetune the model on Alpaca-format dataset to conduct text generation.\n", | ||
"\n", | ||
"We employ the [LoRA approach](https://arxiv.org/pdf/2106.09685.pdf) to finetune the LLM efficiently." | ||
] | ||
}, | ||
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1008618d-a82e-4249-813e-ae46b95c64d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from transformers import TrainingArguments\n",
    "from intel_extension_for_transformers.neural_chat.config import (\n",
    "    ModelArguments,\n",
    "    DataArguments,\n",
    "    FinetuningArguments,\n",
    "    TextGenerationFinetuningConfig,\n",
    ")\n",
    "from intel_extension_for_transformers.neural_chat.chatbot import finetune_model\n",
    "\n",
    "model_args = ModelArguments(model_name_or_path=\"Intel/neural-chat-7b-v3-1\")\n",
    "data_args = DataArguments(train_file=\"alpaca_data.json\")\n",
    "training_args = TrainingArguments(\n",
    "    output_dir='./finetuned_model_path',\n",
    "    do_train=True,\n",
    "    do_eval=True,\n",
    "    num_train_epochs=3,\n",
    "    overwrite_output_dir=True,\n",
    "    per_device_train_batch_size=4,\n",
    "    per_device_eval_batch_size=4,\n",
    "    gradient_accumulation_steps=1,\n",
    "    save_strategy=\"no\",\n",
    "    log_level=\"info\",\n",
    "    save_total_limit=2,\n",
    "    bf16=True\n",
    ")\n",
    "finetune_args = FinetuningArguments()  # defaults cover the LoRA approach described above\n",
    "finetune_cfg = TextGenerationFinetuningConfig(\n",
    "    model_args=model_args,\n",
    "    data_args=data_args,\n",
    "    training_args=training_args,\n",
    "    finetune_args=finetune_args,\n",
    ")\n",
    "finetune_model(finetune_cfg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "701abb4e-41fa-4aa4-81e0-2abc3f15f093",
   "metadata": {},
   "source": [
    "Load the fine-tuned model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4112b3b3-ef0d-4157-8c23-cf3ed5d59e8a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.neural_chat import build_chatbot\n",
    "from intel_extension_for_transformers.neural_chat import PipelineConfig\n",
    "from intel_extension_for_transformers.neural_chat.config import LoadingModelConfig\n",
    "\n",
    "config = PipelineConfig(model_name_or_path=\"Intel/neural-chat-7b-v3-1\",\n",
    "                        loading_config=LoadingModelConfig(peft_path=\"./finetuned_model_path\"))\n",
    "chatbot = build_chatbot(config)\n",
    "response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "186e46d1-3f94-423c-abcb-4a533777e26c",
   "metadata": {},
   "source": [
    "### Congratulations! You have completed the NeuralChat quickstart\n",
    "Visit the [notebooks directory](https://github.com/intel/intel-extension-for-transformers/blob/c30353fcb0e5ceab440a7508b5980ccebcac8750/intel_extension_for_transformers/neural_chat/docs/full_notebooks.md) to see more examples."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "itrex",
   "language": "python",
   "name": "itrex"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}