This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

quickstart neuralchat notebook for workshops (#1269)
* hackathon content to introduce neuralchat

Signed-off-by: kta-intel <[email protected]>
kta-intel authored Feb 8, 2024
1 parent f5a7435 commit 033b7d5
Showing 2 changed files with 411 additions and 0 deletions.
@@ -0,0 +1,382 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "5c6ddfc4-ecd6-4f0a-9e32-5ff4ab014db7",
"metadata": {},
"source": [
"# QuickStart: Intel® Extension for Transformers*: NeuralChat on 4th Generation Intel® Xeon® Scalable Processors"
]
},
{
"cell_type": "markdown",
"id": "0a628ae6-ddd2-41d3-acf1-855b6f31aef5",
"metadata": {},
"source": [
"## Prepare Environment"
]
},
{
"cell_type": "markdown",
"id": "7ae386d8-6138-478d-ae4d-8572a561967c",
"metadata": {},
"source": [
"Follow the README to install the requirements for this tutorial. In summary, you will need to install the following:\n",
"- Intel® Extension for Transformers* from source (to get the latest updates)\n",
"- NeuralChat requirements\n",
"- Retrieval plugin requirements\n",
"- Audio plugin (TTS and ASR) requirements"
]
},
{
"cell_type": "markdown",
"id": "5733fa69-5e68-453c-8f6f-401ee096f4a7",
"metadata": {},
"source": [
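"Check the hardware. On 4th Generation Intel® Xeon® Scalable processors, the `Flags` line of the `lscpu` output should include `amx_bf16` and `amx_int8`."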
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "20d39d47-087f-4e2a-b766-b3707ee342e9",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!lscpu"
]
},
{
"cell_type": "markdown",
"id": "c3c6d1e2-61f1-4ee4-98c7-4f6202e7f2ea",
"metadata": {},
"source": [
"## Building a simple chatbot\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a5555f5b-84d1-4154-9388-8aa17041fec0",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
"config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1')\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "db00b822-eef0-4b92-9a9a-9d7a8d148d6f",
"metadata": {},
"source": [
"## Optimizing your chatbot\n",
"Enable mixed precision with bfloat16"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6e832b1a-0f7a-4715-b245-85666729e206",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
"from intel_extension_for_transformers.transformers import MixedPrecisionConfig\n",
"config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1',\n",
" optimization_config=MixedPrecisionConfig(dtype='bfloat16'))\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "4ff9ca77-149b-47cc-9a5b-a6f70877f09e",
"metadata": {},
"source": [
"INT4 weight-only quantization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a804023f-5a79-47ce-9b4b-61ea9746df5f",
"metadata": {},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
"from intel_extension_for_transformers.transformers import WeightOnlyQuantConfig\n",
"from intel_extension_for_transformers.neural_chat.config import LoadingModelConfig\n",
"config = PipelineConfig(model_name_or_path=\"Intel/neural-chat-7b-v3-1\",\n",
" optimization_config=WeightOnlyQuantConfig(compute_dtype=\"int8\", weight_dtype=\"int4_fullrange\"), \n",
" loading_config=LoadingModelConfig(use_llm_runtime=False))\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "adc089c9-623f-43ec-a470-31e5f8d4cac4",
"metadata": {},
"source": [
"## Customizing your chatbot\n",
"### Plugin: Retrieval\n",
"Without the retrieval plugin"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "640b5c7d-15e0-4438-b451-9172159398fc",
"metadata": {},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
"config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1')\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "4a65c5a6-1d33-43f6-8c61-92322b74e2cc",
"metadata": {},
"source": [
"With the retrieval plugin"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aaef691b-048a-4d5c-ada9-39acec7eeea2",
"metadata": {},
"outputs": [],
"source": [
"!mkdir -p docs\n",
"!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.txt\n",
"!mv sample.txt ./docs"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40e6474f-565b-4399-b250-a314447c2120",
"metadata": {},
"outputs": [],
"source": [
"!cat ./docs/sample.txt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3779e396-a963-4503-8a85-a0d897445c58",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import PipelineConfig\n",
"from intel_extension_for_transformers.neural_chat import build_chatbot\n",
"from intel_extension_for_transformers.neural_chat import plugins\n",
"plugins.retrieval.enable=True\n",
"plugins.retrieval.args[\"input_path\"]=\"./docs/sample.txt\"\n",
"config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1',\n",
" plugins=plugins)\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?\")\n",
"print(response)\n",
"\n",
"plugins.retrieval.enable=False # disable retrieval"
]
},
{
"cell_type": "markdown",
"id": "e2da2eef-e383-4e54-b9c1-4079cf053a33",
"metadata": {},
"source": [
"### Plugin: ASR & TTS\n",
"Enable voice chat"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "72fe3865-b23a-405d-8199-4a869f448ec0",
"metadata": {},
"outputs": [],
"source": [
"!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "103340b5-1b57-486c-9fba-3060de63ab42",
"metadata": {},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig\n",
"from intel_extension_for_transformers.neural_chat import plugins\n",
"plugins.tts.enable = True\n",
"plugins.tts.args[\"output_audio_path\"] = \"./response.wav\"\n",
"plugins.asr.enable = True\n",
"\n",
"config = PipelineConfig(model_name_or_path='Intel/neural-chat-7b-v3-1',\n",
" plugins=plugins)\n",
"chatbot = build_chatbot(config)\n",
"result = chatbot.predict(query=\"./sample.wav\")\n",
"print(result)\n",
"\n",
"plugins.tts.enable = False\n",
"plugins.asr.enable = False"
]
},
{
"cell_type": "markdown",
"id": "e93a0779-8c26-4c62-a514-64bf9c58896f",
"metadata": {},
"source": [
"### [Optional]: Fine-tuning"
]
},
{
"cell_type": "markdown",
"id": "b8419deb-8a06-41ba-af86-875d1d1c0fc7",
"metadata": {},
"source": [
"We use the [Alpaca dataset](https://github.com/tatsu-lab/stanford_alpaca) from Stanford University as the general-domain dataset to fine-tune the model. The dataset is provided as a JSON file, [alpaca_data.json](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json). For Alpaca, researchers manually crafted 175 seed tasks to guide `text-davinci-003` in generating 52K instruction-following examples covering diverse tasks."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45f5c30c-ecb4-4341-a586-4efdb3906bbd",
"metadata": {},
"outputs": [],
"source": [
"!curl -OL https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json"
]
},
{
"cell_type": "markdown",
"id": "660a3a94-230a-497f-9ad9-2645b8f2d186",
"metadata": {},
"source": [
"Fine-tune the model on the Alpaca-format dataset for text generation.\n",
"\n",
"We employ the [LoRA approach](https://arxiv.org/pdf/2106.09685.pdf) to fine-tune the LLM efficiently."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1008618d-a82e-4249-813e-ae46b95c64d5",
"metadata": {},
"outputs": [],
"source": [
"from transformers import TrainingArguments\n",
"from intel_extension_for_transformers.neural_chat.config import (\n",
" ModelArguments,\n",
" DataArguments,\n",
" FinetuningArguments,\n",
" TextGenerationFinetuningConfig,\n",
")\n",
"from intel_extension_for_transformers.neural_chat.chatbot import finetune_model\n",
"model_args = ModelArguments(model_name_or_path=\"Intel/neural-chat-7b-v3-1\")\n",
"data_args = DataArguments(train_file=\"alpaca_data.json\")\n",
"training_args = TrainingArguments(\n",
" output_dir='./finetuned_model_path',\n",
" do_train=True,\n",
" do_eval=True,\n",
" num_train_epochs=3,\n",
" overwrite_output_dir=True,\n",
" per_device_train_batch_size=4,\n",
" per_device_eval_batch_size=4,\n",
" gradient_accumulation_steps=1,\n",
" save_strategy=\"no\",\n",
" log_level=\"info\",\n",
" save_total_limit=2,\n",
" bf16=True\n",
")\n",
"finetune_args = FinetuningArguments()\n",
"finetune_cfg = TextGenerationFinetuningConfig(\n",
" model_args=model_args,\n",
" data_args=data_args,\n",
" training_args=training_args,\n",
" finetune_args=finetune_args,\n",
" )\n",
"finetune_model(finetune_cfg)"
]
},
{
"cell_type": "markdown",
"id": "701abb4e-41fa-4aa4-81e0-2abc3f15f093",
"metadata": {},
"source": [
"Load the fine-tuned model"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4112b3b3-ef0d-4157-8c23-cf3ed5d59e8a",
"metadata": {},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot\n",
"from intel_extension_for_transformers.neural_chat import PipelineConfig\n",
"from intel_extension_for_transformers.neural_chat.config import LoadingModelConfig\n",
"\n",
"config = PipelineConfig(model_name_or_path=\"Intel/neural-chat-7b-v3-1\",\n",
" loading_config=LoadingModelConfig(peft_path=\"./finetuned_model_path\"))\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "186e46d1-3f94-423c-abcb-4a533777e26c",
"metadata": {},
"source": [
"### Congratulations! You have completed the NeuralChat quickstart\n",
"Visit the [notebooks directory](https://github.com/intel/intel-extension-for-transformers/blob/c30353fcb0e5ceab440a7508b5980ccebcac8750/intel_extension_for_transformers/neural_chat/docs/full_notebooks.md) to see more examples."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "itrex",
"language": "python",
"name": "itrex"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
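The `!lscpu` cell in the notebook above leaves the hardware check to the reader's eye. As a rough sketch only, and not part of the commit, the check could be scripted on a Linux host by reading the `flags` line of `/proc/cpuinfo`. The helper names `parse_cpu_flags` and `isa_report` are hypothetical. The flag names are the ones the Linux kernel reports for AVX-512 VNNI/BF16 and AMX; the notebook itself does not state which of them NeuralChat requires.

```python
from pathlib import Path

# Linux kernel flag names for ISA extensions relevant to bfloat16 and int8 compute.
# Assumption: which of these NeuralChat actually needs is not stated in the notebook.
FLAGS_OF_INTEREST = ("avx512_vnni", "avx512_bf16", "amx_tile", "amx_bf16", "amx_int8")


def parse_cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags:' line of /proc/cpuinfo or lscpu text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # /proc/cpuinfo uses 'flags', lscpu uses 'Flags'; compare case-insensitively.
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def isa_report(cpuinfo_text):
    """Map each flag of interest to True/False depending on whether it is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only; absent on other platforms
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    for name, present in isa_report(text).items():
        print(f"{name:12s} {'yes' if present else 'no'}")
```

If every entry prints `no` on a machine that should be a 4th Generation Xeon, the likely causes are an older kernel that does not report the AMX flags or a virtual machine that hides them.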