Releases: heshengtao/comfyui_LLM_party
(v0.6.0) 【New Year's party】 The convergence of the OpenAI ecosystem and the MCP ecosystem.
✨v0.6.0✨【New Year's party】
This release includes the following features:
- The MCP tool has been updated. You can modify the configuration in the 'mcp_config.json' file located in the party project folder to connect to your desired MCP server. You can find various MCP server configuration parameters that you may want to add here: modelcontextprotocol/servers. The default configuration for this project is the Everything server, which serves as a testing MCP server to verify its functionality. Reference workflow: start_with_MCP. Developer note: The MCP tool node can connect to the MCP server you have configured and convert the tools from the server into tools that can be directly used by LLMs. By configuring different local or cloud servers, you can experience all LLM tools available in the world.
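For reference, a minimal `mcp_config.json` entry for the default Everything test server could look like the following. This is an illustration based on the configuration convention published in modelcontextprotocol/servers; check the file shipped in the party project folder for the exact schema it expects:

```json
{
  "mcpServers": {
    "everything": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-everything"]
    }
  }
}
```

Swapping in a different server entry (local command or cloud endpoint) is how you point the MCP tool node at other tool providers.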
- A new browser tool node has been developed based on browser-use, which allows the LLM to automatically perform the browser tasks you publish.
- The nodes for loading files, loading folders, and loading web content, as well as all word-embedding-related nodes, have been upgraded. The file content you load now always includes the file name and paragraph index. The folder-loading node can filter the files you wish to load through `related_characters`.
- A local speech-to-text model tool has been added, which is theoretically compatible with all ASR models on HF, for example openai/whisper-small, nyrahealth/CrisperWhisper, and so forth.
- Added ASR and TTS nodes for fish audio; please refer to the fish audio API documentation for usage instructions.
- Added the aisuite loader node, which is compatible with all APIs that aisuite supports, including: ["openai", "anthropic", "aws", "azure", "vertex", "huggingface"]. Example workflow: start_with_aisuite.
- A new category has been added: memory nodes, which can be utilized to manage your LLM conversation history. Currently, memory nodes support three modes for managing your conversation history: local JSON files, Redis, and SQL. By decoupling the LLM's conversation history from the LLM itself, you can employ word embedding models to compress and organize your conversation history, thus saving tokens and context windows for the LLM. Example workflow: External Memory.
(v0.5.0) 【Starry Night Fireworks】 Thank you all, a thousand stars have gathered at the party!
✨v0.5.0✨【Starry Night Fireworks】
This release includes the following features:
- Updated a series of conversion nodes: markdown to HTML, svg to image, HTML to image, mermaid to image, markdown to Excel.
- Compatible with the llama3.2 vision model, supports multi-turn dialogue, visual functions. Model address: meta-llama/Llama-3.2-11B-Vision-Instruct. Example workflow: llama3.2_vision.
- Adapted GOT-OCR2; it supports formatted output and fine-grained text recognition using position boxes and colors. Model address: GOT-OCR2. An example workflow converts a screenshot of a webpage into HTML code and then opens the browser to display the page: img2web.
- The local LLM loader nodes have been significantly adjusted, so you no longer need to choose the model type yourself. The llava loader node and GGUF loader node have been re-added. The model type on the local LLM model chain node has been changed to LLM, VLM-GGUF, and LLM-GGUF, corresponding to directly loading LLM models, loading VLM models, and loading GGUF format LLM models. VLM models and GGUF format LLM models are now supported again. Local calls can now be compatible with more models! Example workflows: LLM_local, llava, GGUF
- Added an EasyOCR node for recognizing text and positions in images. It can generate corresponding masks and return a JSON string for the LLM to view. There are standard and premium versions available for everyone to choose from!
- In comfyui LLM party, the strawberry system of the ChatGPT o1-series models has been reproduced, referring to the prompts of Llamaberry. Example workflow: Strawberry system compared to o1.
- A new GPT-sovits node has been added, allowing you to call the GPT-sovits model to convert text into speech based on your reference audio. You can also fill in the path of your fine-tuned model (if left empty, the base model is used for inference) to get any voice you want. To use it, download the GPT-sovits project and the corresponding base model locally, then start the API service in the GPT-sovits project folder with `runtime\python.exe api_v2.py`. Additionally, the chatTTS node has been moved to comfyui LLM mafia, because chatTTS has many dependencies and its license on PyPI is CC BY-NC 4.0, a non-commercial license. Even though the chatTTS GitHub project is under the AGPL license, we moved the chatTTS node to comfyui LLM mafia to avoid unnecessary trouble. We hope everyone understands!
- Now supports OpenAI's latest models, the o1 series!
- Added a local file control tool that allows the LLM to control files in your specified folder, such as reading, writing, appending, deleting, renaming, moving, and copying files. Due to the potential danger of this node, it is included in comfyui LLM mafia.
- New SQL tools allow LLM to query SQL databases.
- Updated the multilingual version of the README. Workflow for translating the README document: translate_readme
- Updated 4 iterator nodes (text iterator, picture iterator, Excel iterator, JSON iterator). The iterator modes are sequential, random, and infinite: sequential outputs items in order until the index limit is exceeded, at which point the process is automatically aborted and the index is reset to 0; random chooses a random index to output each time; infinite loops over the output.
- Added a Gemini API loader node, now compatible with the official Gemini API! Because Gemini returns a 500 error when a parameter returned during a tool call contains Chinese characters, some tool nodes are unavailable. Example workflow: start_with_gemini.
- Added a lorebook node; you can insert your background settings when talking to the LLM. Example workflow: lorebook.
- Added a FLUX prompt word generator mask node, which can generate prompts in styles such as Hearthstone cards, Yu-Gi-Oh! cards, posters, and comics, so the FLUX model can produce good results directly. Reference workflow: FLUX prompt word.
- A local file reading tool has been added. In comparison to the previous local file control tool in ComfyUI LLM Mafia, this tool can only read files or the file tree within a specific folder, thus ensuring greater security.
- Forked chatgpt-on-wechat, created a new repository party-on-wechat. The installation and usage methods are the same as the original project, no configuration is required, just start the party's FastAPI. By default, it calls the wx_api workflow and supports image output. It will be updated gradually to ensure a smooth experience of party on WeChat.
- Added an In-Context-LoRA mask node, used for generating consistent In-Context-LoRA prompts.
- We have added a frontend component with features laid out from left to right as follows:
  - Saves your API key and Base URL to the `config.ini` file. When you use the "fix node" function on the API LLM loader node, it will automatically read the updated API key and Base URL from the `config.ini` file.
  - Starts a FastAPI service that can be used to call your ComfyUI workflow. If you run it directly, you get an OpenAI interface at `http://127.0.0.1:8817/v1/`. You need to connect the start and end of your workflow to the 'Start Workflow' and 'End Workflow' nodes, then save in API format to the `workflow_api` folder. Then, in any frontend that can call the OpenAI interface, input `model name=<your workflow name without the .json extension>`, `Base URL=http://127.0.0.1:8817/v1/`, and any value for the API key.
  - Starts a Streamlit application; the workflow-saving process is as above. You can select your saved workflow in the 'Settings' of the Streamlit app and interact with your workflow agent in the 'Chat'.
  - 'About Us', which introduces this project.
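To illustrate what an OpenAI-compatible frontend sends to the FastAPI service: the `model` field is simply the saved workflow's file name without the `.json` extension. A minimal sketch using only the standard library (the workflow name `draw` and the prompt are placeholders; the payload shape is the standard chat-completions format, not party-specific code):

```python
import json
from urllib import request

BASE_URL = "http://127.0.0.1:8817/v1/"  # the party's FastAPI service

def build_chat_request(workflow_name, user_prompt):
    """Build an OpenAI-style chat-completions request whose model field
    is the saved workflow's file name without the .json extension."""
    payload = {
        "model": workflow_name,
        "messages": [{"role": "user", "content": user_prompt}],
    }
    return request.Request(
        BASE_URL + "chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer any-value",  # the API key can be anything
        },
    )

# Requires the FastAPI service to be running:
# with request.urlopen(build_chat_request("draw", "a cat in a spacesuit")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```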
- The automatic model name list node has been removed and replaced with a simple API LLM loader node, which automatically retrieves your model name list from the configuration in your config.ini file. You just need to select a name to load the model. Additionally, the simple LLM loader, simple LLM-GGUF loader, simple VLM loader, simple VLM-GGUF loader, and simple LLM lora loader nodes have been updated. They all automatically read the model paths from the model folder within the party folder, making it easier for everyone to load various local models.
- LLMs can now dynamically load lora like SD and FLUX. You can chain multiple loras to load more loras on the same LLM. Example workflow: start_with_LLM_LORA.
- Added the searxng tool, which can aggregate searches across the entire web. Perplexica also relies on this aggregated search tool, so you can set up a Perplexica at your party. You can deploy the searxng/searxng public image in Docker, start it with `docker run -d -p 8080:8080 searxng/searxng`, and access it at `http://localhost:8080`. Fill in this URL in the party's searxng tool, and then you can use searxng as a tool for the LLM.
- Major update!!! Now you can encapsulate any ComfyUI workflow into an LLM tool node. You can have your LLM control multiple ComfyUI workflows simultaneously. When you want it to complete some tasks, it can choose the appropriate ComfyUI workflow based on your prompt, complete your task, and return the result to you. Example workflow: comfyui_workflows_tool. The specific steps are as follows:
- First, connect the text input interface of the workflow you want to encapsulate as a tool to the "user_prompt" output of the "Start Workflow" node. This is where the prompt is passed in when the LLM calls the tool.
- Connect the positions where you want to output text and images to the corresponding input positions of the "End Workflow" node.
- Save this workflow as an API (you need to enable developer mode in the settings to see this button).
- Save this workflow to the workflow_api folder of this project.
- Restart ComfyUI and create a simple LLM workflow, such as: start_with_LLM_api.
- Add a "Workflow Tool" node to this LLM node and connect it to the tool input of the LLM node.
- In the "Workflow Tool" node, write the name of the workflow file you want to call in the first input box, for example: draw.json. You can write multiple workflow file names. In the second input box, write the function of each workflow so that the LLM understands how to use these workflows.
- Run it to see the LLM call your encapsulated workflow and return the result to you. If the result is an image, connect a "Preview Image" node to the image output of the LLM node to view the generated image. Note! This method launches a new ComfyUI instance on port 8190, so please do not occupy this port. On Windows and Mac, a new terminal will be opened; please do not close it. On Linux, this is achieved with a screen process; when you no longer need it, close this screen process, otherwise it will a...
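Conceptually, each workflow file listed in the "Workflow Tool" node becomes one callable tool for the LLM. A hypothetical sketch of the OpenAI-style tool definition such a node could hand to the model (the field names below follow the standard function-tool schema; the party node's internal representation may differ):

```python
def workflow_to_tool(file_name, description):
    """Map a saved workflow_api file to an OpenAI-style function-tool entry.
    The LLM picks a workflow by name and supplies the user_prompt argument,
    which is fed into that workflow's 'Start Workflow' node."""
    return {
        "type": "function",
        "function": {
            "name": file_name.removesuffix(".json"),
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "user_prompt": {
                        "type": "string",
                        "description": "Prompt forwarded to the workflow's Start Workflow node",
                    }
                },
                "required": ["user_prompt"],
            },
        },
    }

# One entry per file name / description pair written into the Workflow Tool node:
tools = [workflow_to_tool("draw.json", "Generate an image from a text prompt")]
```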
(v0.4.0)【Bookish Tea Talk】chatTTS support! KG graphRAG neo4j support! More access to social apps!
✨v0.4.0✨【Bookish Tea Talk】
This release includes the following features:
- Added a text iterator that outputs a portion of the characters each time, safely splitting the text based on newline characters and chunk size without splitting in the middle of a line. `chunk_overlap` refers to how many characters overlap between consecutive chunks. This allows batch input of long texts: just click repeatedly, or enable loop execution in ComfyUI, and it will execute automatically. Remember to enable the `is_locked` attribute to automatically lock the workflow when the input is exhausted, preventing further execution. Example workflow: Text Iterative Input.
- Added the `model name` attribute to the local LLM loader, local llava loader, and local GGUF loader. If it is empty, the node uses the local paths set on the node itself. If it is not empty, the node uses the path parameters you filled in `config.ini`. If it is not empty and not in `config.ini`, the model is downloaded from Hugging Face or loaded from the Hugging Face model save directory. To download from Hugging Face, fill in the `model name` attribute in a format like `THUDM/glm-4-9b-chat`. Note! A model loaded this way must be compatible with the transformers library.
- Adapted CosyVoice; you can now use TTS functionality without downloading any models or API keys. Currently, this interface only supports Chinese.
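The text iterator's newline-safe splitting rule can be sketched as follows. This is a simplified illustration of the chunking behavior (split only at newline boundaries, with `chunk_overlap` characters carried over), not the node's actual code:

```python
def split_text(text, chunk_size, chunk_overlap=0):
    """Split text into chunks of at most chunk_size characters, cutting only
    at newline boundaries so no line is split in the middle; consecutive
    chunks share roughly chunk_overlap characters of context."""
    chunks, current = [], ""
    for line in text.split("\n"):
        candidate = current + ("\n" if current else "") + line
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            # carry over the tail of the previous chunk as overlap
            current = (current[-chunk_overlap:] + "\n" + line) if chunk_overlap else line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```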
- Added JSON file parsing node and JSON value extraction node, allowing you to get the value of a key from a file or text. Thanks to guobalove for the contribution!
- Improved the tool-invocation code; now LLMs without native tool-calling functionality can also enable the `is_tools_in_sys_prompt` attribute (local LLMs do not need to enable it; it is adapted automatically). Once enabled, tool information is added to the system prompt so the LLM can call tools. Related paper on the implementation principle: Achieving Tool Calling Functionality in LLMs Using Only Prompt Engineering Without Fine-Tuning.
- Created a `custom_tool` folder for storing custom tool code. You can refer to the code in the `custom_tool` folder, place your custom tool code there, and then call the custom tool from the LLM.
- Added a knowledge graph tool, allowing the LLM to interact perfectly with the knowledge graph. The LLM can modify the knowledge graph based on your input and reason over it to get the answers you need. Example workflow reference: graphRAG_neo4j.
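The idea behind `is_tools_in_sys_prompt` is prompt-engineered tool calling: tool schemas are serialized into the system prompt and the model is asked to reply with a structured call as plain text. A minimal sketch of that injection (the exact prompt wording and tool schema the party uses are not shown here; this only illustrates the technique):

```python
import json

def inject_tools_into_system_prompt(system_prompt, tools):
    """Append tool descriptions to the system prompt so that models without
    native tool-calling support can still emit tool calls as JSON text."""
    tool_descriptions = json.dumps(tools, ensure_ascii=False, indent=2)
    return (
        system_prompt
        + "\n\nYou can use the following tools. To call one, reply with a JSON "
        + 'object like {"name": <tool name>, "arguments": {...}} and nothing else.\n'
        + tool_descriptions
    )

tools = [{"name": "get_weather",
          "description": "Look up the weather for a city",
          "parameters": {"city": "string"}}]
prompt = inject_tools_into_system_prompt("You are a helpful assistant.", tools)
```

The host code then parses the model's JSON reply, runs the matching tool, and feeds the result back as another message.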
- Added the functionality to connect agents to Discord. (Still in testing)
- Added the functionality to connect agents to Feishu, thanks a lot to guobalove for the contribution! Reference workflow Feishu Bot.
- Added a universal API call node and many auxiliary nodes for constructing request bodies and capturing information from responses.
- Added a model clearing node, allowing you to unload LLM from memory at any position!
- Added the chatTTS node, thanks a lot to guobalove for the contribution! The `model_path` parameter can be empty! It is recommended to use HF mode to load the model, which downloads it automatically from Hugging Face with no manual download needed. If using local loading, place the model's `asset` and `config` folders in the root directory (Baidu Cloud address, extraction code: qyhu). If using `custom` mode loading, place the model's `asset` and `config` folders in the `model_path`.
(v0.3.0)【Masked Ball】Unlimited dialogue, permanent memory, stable personality!
✨v0.3.0✨【Masked Ball】
This release includes the following features:
- Added Knowledge Graph tool, so that LLM and Knowledge Graph can interact perfectly. LLM can modify Knowledge Graph according to your input, and can reason on Knowledge Graph to get the answers you need. Example workflow reference: Knowledge_Graph
- Added a personality AI function: develop your own girlfriend or boyfriend AI with zero code, with unlimited dialogue, permanent memory, and a stable personality. Example workflow reference: Mylover Personality AI
- You can build your own interactive novel game, and go to different endings according to the user's choice! Example workflow reference: interactive_novel
- Adapted OpenAI's whisper and TTS functions; voice input and output can now be realized. Example workflow reference: voice_input&voice_output
- Compatible with Omost!!! Please download omost-llama-3-8b-4bits to experience it now! Sample workflow reference: start_with_OMOST
- Added LLM tools to send messages to WeCom, DingTalk, and Feishu, as well as external functions to call.
(v0.2.0)【Model Debut】More model support and more flexible workflow gameplay!
✨v0.2.0✨【Model Debut】
This release includes the following features:
LLM
- The LLM node has been split, separating the LLM loader and the LLM model chain for more flexible reuse of these models.
- Adapted all models with an interface similar to OpenAI, such as: Tongyi Qianwen/qwen, Zhipu Qingyan/GLM, deepseek, kimi/moonshot. Please fill in the base_url, api_key, and model_name of these models into the LLM node to call them.
- Added a new LVM loader; you can now call LVM models locally, supporting the llava-llama-3-8b-v1_1-gguf model. Other LVM models in GGUF format should theoretically also be runnable.
- MacOS and mps devices are now supported! Thanks to bigcat88 for their contribution!
Tools
- Added Wikipedia tool, web summary tool, and arXiv paper tool.
- Another workflow can now be used as a tool. However, this version is not yet perfected and is just barely usable.
Workflow
- Wrote a `fastapi.py` file; if you run it directly, you get an OpenAI interface at `http://127.0.0.1:8817/v1/`, and any application that can call GPT can now use your ComfyUI workflow! I will produce a tutorial to demonstrate how to operate it in detail~
- Wrote an Excel iterator, which can output your CSV table row by row to the next node.
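Row-by-row CSV iteration of the kind the Excel iterator performs can be sketched with the standard library (illustrative only; the node itself handles ComfyUI's execution loop and outputs one row per run):

```python
import csv
import io

def iter_rows(csv_text):
    """Yield one row (as a list of cell strings) at a time, in file order --
    the same contract as feeding a table row by row to the next node."""
    yield from csv.reader(io.StringIO(csv_text))

table = "name,score\nalice,3\nbob,5"
rows = list(iter_rows(table))
```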
(v0.1.0)【Party Invitation】The first release of comfyui_LLM_party!
✨v0.1.0✨【Party Invitation】
This release includes the following features:
LLM
- Supports two modes of operation: API calls and local deployment
- Allows the LLM to be used as a tool attached to another larger model
- Supports the visual capabilities of GPT-4
- Can load word embedding models independently, or attach word embedding models as a tool to the LLM.
Tools
- Supports basic intelligent agent needs such as knowledge bases, code interpreters, and online queries
- Added practical tools for time, weather, Wikipedia, and academic paper queries
- Introduced an omnipotent interpreter tool that enables the LLM to perform any task. It works by automatically installing the libraries required by LLM-generated code in a virtual environment, so essentially any task that can be done in code can be executed automatically
- Added an API tool that allows the LLM to call user-defined APIs
Workflow
- Added start_dialog and end_dialog nodes, enabling workflows to create loopback links
- Added classify persona and classify function, allowing the workflow to execute different parts based on user input
- Introduced string logic nodes for quick construction of conditional judgments within workflows
- Added workflow transfer nodes that allow one workflow to be embedded within another, working in conjunction with string logic nodes to execute different large-scale workflows based on the LLM's output
Frontend Application
- Added the setup_streamlit_app.bat file for quick setup of your LLM workflow application