Building a Private Knowledge Base with InternLM and LangChain

This tutorial walks through building a private knowledge base with InternLM and LangChain.

Environment Setup

!conda create --name internlm_langchain --clone=/root/share/conda_envs/internlm-base
!/root/.conda/envs/internlm_langchain/bin/python -m pip install ipykernel ipywidgets
!/root/.conda/envs/internlm_langchain/bin/python -m ipykernel install --user --name internlm_langchain --display-name internlm_langchain

# Refresh the page and switch to the new kernel internlm_langchain

# Upgrade pip
%pip install -q --upgrade pip

%pip install -q modelscope==1.9.5 transformers==4.35.2 streamlit==1.24.0 sentencepiece==0.1.99 accelerate==0.24.1

%pip install -q langchain==0.0.292 gradio==4.4.0 chromadb==0.4.15 sentence-transformers==2.2.2 unstructured==0.10.30 markdown==3.3.7

%pip install -q -U huggingface_hub

Downloading the Model and NLTK Resources

%mkdir -p /root/data/model/Shanghai_AI_Laboratory
%cp -r /root/share/temp/model_repos/internlm-chat-7b /root/data/model/Shanghai_AI_Laboratory/internlm-chat-7b

import sys, os

# Set the environment variable so downloads go through the mirror endpoint
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'

# Download the embedding model
os.system(f'{os.path.join(sys.exec_prefix, "bin/huggingface-cli")} download --resume-download sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 --local-dir /root/data/model/sentence-transformer')

%cd /root
!git clone https://gitee.com/yzy0612/nltk_data.git --branch gh-pages

/root
Cloning into 'nltk_data'...


/root/.conda/envs/internlm_langchain/lib/python3.10/site-packages/IPython/core/magics/osm.py:417: UserWarning: using dhist requires you to install the `pickleshare` library.
  self.shell.db['dhist'] = compress_dhist(dhist)[-100:]


remote: Enumerating objects: 1692, done.
remote: Counting objects: 100% (1692/1692), done.
remote: Compressing objects: 100% (775/775), done.
remote: Total 1692 (delta 909), reused 1692 (delta 909), pack-reused 0
Receiving objects: 100% (1692/1692), 952.80 MiB | 5.84 MiB/s, done.
Resolving deltas: 100% (909/909), done.
Updating files: 100% (244/244), done.

%cd nltk_data
%mv packages/* ./

/root/nltk_data

%cd tokenizers
!unzip punkt.zip

/root/nltk_data/tokenizers
Archive:  punkt.zip
   creating: punkt/
  inflating: punkt/greek.pickle      
  inflating: punkt/estonian.pickle   
  inflating: punkt/turkish.pickle    
  inflating: punkt/polish.pickle     
   creating: punkt/PY3/
  inflating: punkt/PY3/greek.pickle  
  inflating: punkt/PY3/estonian.pickle  
  inflating: punkt/PY3/turkish.pickle  
  inflating: punkt/PY3/polish.pickle  
  inflating: punkt/PY3/russian.pickle  
  inflating: punkt/PY3/czech.pickle  
  inflating: punkt/PY3/portuguese.pickle  
  inflating: punkt/PY3/README        
  inflating: punkt/PY3/dutch.pickle  
  inflating: punkt/PY3/norwegian.pickle  
  inflating: punkt/PY3/slovene.pickle  
  inflating: punkt/PY3/english.pickle  
  inflating: punkt/PY3/danish.pickle  
  inflating: punkt/PY3/finnish.pickle  
  inflating: punkt/PY3/swedish.pickle  
  inflating: punkt/PY3/spanish.pickle  
  inflating: punkt/PY3/german.pickle  
  inflating: punkt/PY3/italian.pickle  
  inflating: punkt/PY3/french.pickle  
  inflating: punkt/russian.pickle    
  inflating: punkt/czech.pickle      
  inflating: punkt/portuguese.pickle  
  inflating: punkt/README            
  inflating: punkt/dutch.pickle      
  inflating: punkt/norwegian.pickle  
  inflating: punkt/slovene.pickle    
  inflating: punkt/english.pickle    
  inflating: punkt/danish.pickle     
  inflating: punkt/finnish.pickle    
  inflating: punkt/swedish.pickle    
  inflating: punkt/spanish.pickle    
  inflating: punkt/german.pickle     
  inflating: punkt/italian.pickle    
  inflating: punkt/french.pickle     
  inflating: punkt/.DS_Store         
  inflating: punkt/PY3/malayalam.pickle  
  inflating: punkt/malayalam.pickle

%cd ../taggers
!unzip averaged_perceptron_tagger.zip

/root/nltk_data/taggers
Archive:  averaged_perceptron_tagger.zip
   creating: averaged_perceptron_tagger/
  inflating: averaged_perceptron_tagger/averaged_perceptron_tagger.pickle

Downloading the Project Code

%cd /root/data
!git clone https://github.com/InternLM/tutorial

/root/data
Cloning into 'tutorial'...
remote: Enumerating objects: 352, done.
remote: Counting objects: 100% (202/202), done.
remote: Compressing objects: 100% (127/127), done.
remote: Total 352 (delta 117), reused 136 (delta 74), pack-reused 150
Receiving objects: 100% (352/352), 11.28 MiB | 8.07 MiB/s, done.
Resolving deltas: 100% (142/142), done.

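Building the Knowledge Base

Data Collection

To keep corpus processing simple, we use a set of open-source large-model tool repositories released by Shanghai AI Laboratory as the corpus source, and take all markdown and txt files in those repositories as the example corpus. Code files from these repositories can also be added to the knowledge base, but they need extra handling: code depends heavily on logical structure and follows strict conventions, so it is best to split it by code module before adding it to the vector database.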
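The remark about splitting code files by module can be sketched with Python's standard `ast` module. This is only an illustration and not part of the original tutorial: the helper name `split_python_source` is hypothetical, and it handles only top-level functions and classes in Python source.

```python
import ast
from typing import List


def split_python_source(source: str) -> List[str]:
    """Return one text chunk per top-level function or class definition."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment recovers the source text that produced this node
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                chunks.append(segment)
    return chunks


sample = '''
import os

def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hello"
'''

for chunk in split_python_source(sample):
    print(chunk)
    print("-" * 20)
```

Each chunk could then be stored as its own document, so a retrieved piece of code is always a complete function or class rather than an arbitrary character window.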

# Change to the data directory
%cd /root/data
# Clone the open-source repositories mentioned above
!git clone https://gitee.com/open-compass/opencompass.git
!git clone https://gitee.com/InternLM/lmdeploy.git
!git clone https://gitee.com/InternLM/xtuner.git
!git clone https://gitee.com/InternLM/InternLM-XComposer.git
!git clone https://gitee.com/InternLM/lagent.git
!git clone https://gitee.com/InternLM/InternLM.git

/root/data
Cloning into 'opencompass'...
remote: Enumerating objects: 4843, done.
remote: Total 4843 (delta 0), reused 0 (delta 0), pack-reused 4843
Receiving objects: 100% (4843/4843), 1.48 MiB | 1.39 MiB/s, done.
Resolving deltas: 100% (2941/2941), done.
Updating files: 100% (1154/1154), done.
Cloning into 'lmdeploy'...
remote: Enumerating objects: 4485, done.
remote: Counting objects: 100% (4485/4485), done.
remote: Compressing objects: 100% (1494/1494), done.
remote: Total 4485 (delta 2914), reused 4485 (delta 2914), pack-reused 0
Receiving objects: 100% (4485/4485), 2.23 MiB | 1.34 MiB/s, done.
Resolving deltas: 100% (2914/2914), done.
Updating files: 100% (455/455), done.
Cloning into 'xtuner'...
remote: Enumerating objects: 3735, done.
remote: Counting objects: 100% (1150/1150), done.
remote: Compressing objects: 100% (252/252), done.
remote: Total 3735 (delta 920), reused 1106 (delta 895), pack-reused 2585
Receiving objects: 100% (3735/3735), 742.80 KiB | 864.00 KiB/s, done.
Resolving deltas: 100% (2741/2741), done.
Updating files: 100% (450/450), done.
Cloning into 'InternLM-XComposer'...
remote: Enumerating objects: 680, done.
remote: Counting objects: 100% (680/680), done.
remote: Compressing objects: 100% (273/273), done.
remote: Total 680 (delta 361), reused 680 (delta 361), pack-reused 0
Receiving objects: 100% (680/680), 10.74 MiB | 2.61 MiB/s, done.
Resolving deltas: 100% (361/361), done.
Cloning into 'lagent'...
remote: Enumerating objects: 414, done.
remote: Counting objects: 100% (414/414), done.
remote: Compressing objects: 100% (188/188), done.
remote: Total 414 (delta 197), reused 414 (delta 197), pack-reused 0
Receiving objects: 100% (414/414), 214.97 KiB | 974.00 KiB/s, done.
Resolving deltas: 100% (197/197), done.
Cloning into 'InternLM'...
remote: Enumerating objects: 2604, done.
remote: Counting objects: 100% (592/592), done.
remote: Compressing objects: 100% (264/264), done.
remote: Total 2604 (delta 324), reused 581 (delta 318), pack-reused 2012
Receiving objects: 100% (2604/2604), 4.87 MiB | 1.69 MiB/s, done.
Resolving deltas: 100% (1608/1608), done.

First we find the paths of all qualifying files in these repositories. We define a function that recursively walks a given directory and returns the paths of all files whose suffix is .md or .txt:

import os

def get_files(dir_path):
    # args: dir_path, path of the target directory
    file_list = []
    # os.walk recursively traverses the given directory
    for filepath, dirnames, filenames in os.walk(dir_path):
        for filename in filenames:
            # Keep only files whose suffix is .md or .txt
            if filename.endswith((".md", ".txt")):
                # Add the full path (directory plus file name) to the result list
                file_list.append(os.path.join(filepath, filename))
    return file_list
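To check the behavior of `get_files` without cloning any repository, one can run it on a temporary directory. The sketch below repeats the function in compact form so the block is self-contained; the file names are made up for the demo.

```python
import os
import tempfile


def get_files(dir_path):
    """Recursively collect paths of .md and .txt files under dir_path."""
    file_list = []
    for dirpath, _dirnames, filenames in os.walk(dir_path):
        for filename in filenames:
            # str.endswith accepts a tuple of suffixes
            if filename.endswith((".md", ".txt")):
                file_list.append(os.path.join(dirpath, filename))
    return file_list


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "docs"))
    # Two files that should match and one that should be ignored
    for name in ("README.md", "docs/notes.txt", "docs/script.py"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("demo")
    found = sorted(os.path.relpath(p, tmp) for p in get_files(tmp))

print(found)
```

The `.py` file is skipped, which matches the tutorial's choice to index only markdown and text files.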

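Loading the Data

Once we have all target file paths, we can use the FileLoader objects provided by LangChain to load the files and obtain the plain text parsed from them. Different file types need different loaders, so we check each file's type and call the matching loader, then call the loader's load method to get the loaded plain-text objects: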

from tqdm import tqdm
from langchain.document_loaders import UnstructuredFileLoader
from langchain.document_loaders import UnstructuredMarkdownLoader

def get_text(dir_path):
    # args: dir_path, path of the target directory
    # First call the function defined above to get the list of target file paths
    file_lst = get_files(dir_path)
    # docs holds the loaded plain-text objects, one list entry per loaded document
    docs = []
    # Iterate over all target files
    for one_file in tqdm(file_lst):
        file_type = one_file.split('.')[-1]
        if file_type == 'md':
            loader = UnstructuredMarkdownLoader(one_file)
        elif file_type == 'txt':
            loader = UnstructuredFileLoader(one_file)
        else:
            # Skip files that do not match
            continue
        docs.extend(loader.load())
    return docs

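Building the Vector Database

With this list in hand, we can bring it into LangChain to build the vector database. To build a vector database from plain-text objects, we first split the text into chunks and then embed each chunk.

LangChain provides several text-splitting tools. For brevity the splitting result is not shown here, and readers can try it themselves; for a deeper look at text splitting in LangChain, see the tutorial "LangChain - Chat With Your Data". Here we use the recursive character text splitter with a chunk size of 500 and a chunk overlap of 150: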

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Target directories
tar_dir = [
    "/root/data/InternLM",
    "/root/data/InternLM-XComposer",
    "/root/data/lagent",
    "/root/data/lmdeploy",
    "/root/data/opencompass",
    "/root/data/xtuner"
]

# Load the target files
docs = []
for dir_path in tar_dir:
    docs.extend(get_text(dir_path))

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=150)
split_docs = text_splitter.split_documents(docs)

  0%|          | 0/25 [00:00<?, ?it/s]/root/.conda/envs/internlm_langchain/lib/python3.10/site-packages/unstructured/documents/html.py:498: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
  rows = body.findall("tr") if body else []
 40%|████      | 10/25 [00:28<00:23,  1.56s/it]/root/.conda/envs/internlm_langchain/lib/python3.10/site-packages/unstructured/documents/html.py:498: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
  rows = body.findall("tr") if body else []
100%|██████████| 25/25 [00:28<00:00,  1.16s/it]
100%|██████████| 9/9 [00:00<00:00, 19.51it/s]
100%|██████████| 18/18 [00:00<00:00, 34.18it/s]
100%|██████████| 72/72 [00:02<00:00, 24.73it/s]
100%|██████████| 113/113 [00:05<00:00, 18.85it/s]
100%|██████████| 26/26 [00:01<00:00, 18.29it/s]
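To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class tries a list of separators first); the function `sliding_chunks` is a hypothetical helper that only shows how consecutive chunks share overlapping characters.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# 1200 characters of dummy text
text = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(text, chunk_size=500, chunk_overlap=150)
print([len(c) for c in chunks])
print(chunks[0][-150:] == chunks[1][:150])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.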

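Next we use the open-source embedding model Sentence Transformer to vectorize the text. LangChain provides an interface for loading models from the HuggingFace open-source community for embedding: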

from langchain.embeddings.huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="/root/data/model/sentence-transformer")

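Chroma is a commonly used entry-level vector database, so we choose it here. Using the chunked documents and the embedding model loaded above, we load the corpus into a vector database at the specified path: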

from langchain.vectorstores import Chroma

# Define the persistence path
persist_directory = 'data_base/vector_db/chroma'
# Build the database from the chunked documents
vectordb = Chroma.from_documents(
    documents=split_docs,
    embedding=embeddings,
    persist_directory=persist_directory  # directory where the database is saved on disk
)
# Persist the vector database to disk
vectordb.persist()

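You can create a demo directory under /root/data and run this script and the later ones from there. Running the script above builds a persisted vector database locally; later steps can load it directly without rebuilding it.

Connecting InternLM to LangChain

To build LLM applications conveniently, we define an InternLM subclass of LangChain's LLM class on top of the locally deployed InternLM, which connects InternLM to the LangChain framework. Once the custom LLM subclass is in place, LangChain's interfaces can be called in exactly the same way as for any other model, without worrying about differences in the underlying model invocation.

Defining a custom LLM class for a locally deployed InternLM is not complicated: we subclass langchain.llms.base.LLM and override the constructor and the _call function: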

from langchain.llms.base import LLM
from typing import Any, List, Optional
from langchain.callbacks.manager import CallbackManagerForLLMRun
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

class InternLM_LLM(LLM):
    # Custom LLM class based on a local InternLM
    tokenizer: AutoTokenizer = None
    model: AutoModelForCausalLM = None

    def __init__(self, model_path: str):
        # model_path: path to the InternLM model
        # Initialize the model from local files
        super().__init__()
        print("正在从本地加载模型...")
        self.tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
        self.model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True).to(torch.bfloat16).cuda()
        self.model = self.model.eval()
        print("完成本地模型的加载")

    def _call(self, prompt: str, stop: Optional[List[str]] = None,
                run_manager: Optional[CallbackManagerForLLMRun] = None,
                **kwargs: Any) -> str:
        # Override the call function
        system_prompt = """You are an AI assistant whose name is InternLM (书生·浦语).
        - InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.
        - InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文.
        """
        
        messages = [(system_prompt, '')]
        response, history = self.model.chat(self.tokenizer, prompt , history=messages)
        return response
        
    @property
    def _llm_type(self) -> str:
        return "InternLM"

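In the class definition above we override the constructor and the _call function. The constructor loads the locally deployed InternLM model when the object is instantiated, so the model is not reloaded on every call, which would take too long. _call is the core function of the LLM class: LangChain calls it to invoke the LLM. Inside it we call the chat method of the instantiated model and return the result.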
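The design choice here (load the model once in the constructor, then serve every request through `_call`) can be seen in isolation with a dependency-free stand-in. `FakeModel`, `TinyLLMBase`, and `CountingLLM` are hypothetical names for this sketch; they mimic the shape of the pattern, not LangChain's actual implementation.

```python
class FakeModel:
    """Stand-in for a model that is expensive to load."""
    load_count = 0

    def __init__(self):
        # Count how many times a model object is created
        FakeModel.load_count += 1

    def chat(self, prompt):
        return f"echo: {prompt}"


class TinyLLMBase:
    """Simplified base class: the public call delegates to _call."""

    def __call__(self, prompt):
        return self._call(prompt)

    def _call(self, prompt):
        raise NotImplementedError


class CountingLLM(TinyLLMBase):
    def __init__(self):
        # Load once here so repeated calls reuse the same model object
        self.model = FakeModel()

    def _call(self, prompt):
        return self.model.chat(prompt)


llm = CountingLLM()
answers = [llm("hello"), llm("world")]
print(answers, FakeModel.load_count)
```

Two calls are served while the model is constructed only once, which is the behavior the InternLM subclass relies on.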

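Building the Retrieval QA Chain

LangChain wraps the whole RAG pipeline in a retrieval QA chain object. A retrieval QA chain is a single object that carries out the full retrieval-augmented question answering (RAG) flow. More RAG concepts are covered in the video material, and readers are welcome to consult the tutorial "LLM Universe" to learn more. We can use the RetrievalQA object provided by LangChain and pass the database we built and the custom LLM as arguments at initialization. LangChain then automatically retrieves based on the user's question, fetches the relevant documents, assembles them into a suitable prompt, and hands it to the LLM for an answer.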
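The flow that the retrieval QA chain automates (retrieve, assemble a prompt, ask the LLM) can be written out by hand. The sketch below uses a toy word-overlap retriever and a fake LLM so that it runs without any model; `retrieve`, `fake_llm`, and `answer` are hypothetical helpers, not LangChain APIs.

```python
def retrieve(question, documents, k=2):
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def fake_llm(prompt):
    """Placeholder for a real model call; it only reports the prompt length."""
    return f"(model received {len(prompt)} characters)"


def answer(question, documents):
    # Step 1: retrieve, step 2: build the prompt, step 3: call the model
    context = "\n".join(retrieve(question, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return prompt, fake_llm(prompt)


docs = [
    "InternLM is a large language model",
    "LangChain is a framework for LLM applications",
    "Chroma is a vector database",
]
prompt, reply = answer("what is InternLM", docs)
print(prompt)
print(reply)
```

In the real chain the retriever compares embeddings rather than counting shared words, and the LLM is the InternLM subclass.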

Loading the Vector Database

First we load the vector database built above. We can load it directly with Chroma and the embedding model defined earlier:

from langchain.vectorstores import Chroma
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
import os

# Define the embeddings
embeddings = HuggingFaceEmbeddings(model_name="/root/data/model/sentence-transformer")

# Persistence path of the vector database
persist_directory = 'data_base/vector_db/chroma'

# Load the database
vectordb = Chroma(
    persist_directory=persist_directory, 
    embedding_function=embeddings
)

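The vectordb object obtained above is the vector database we built. It can run semantic vector retrieval for a user's query and return the knowledge chunks related to the question.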
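Semantic vector retrieval boils down to comparing the query's embedding with the stored embeddings. The sketch below uses hand-written 3-dimensional vectors in place of real sentence embeddings (an assumption for illustration; a real embedding model produces much longer vectors) and ranks them by cosine similarity. `cosine` and `top_k` are hypothetical helper names, and the distance metric Chroma actually uses may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=1):
    """Return the texts of the k entries most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _vec in ranked[:k]]


# Made-up (text, embedding) pairs standing in for stored chunks
store = [
    ("chunk about model training", [0.9, 0.1, 0.0]),
    ("chunk about deployment", [0.1, 0.9, 0.0]),
    ("chunk about evaluation", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]
print(top_k(query, store, k=2))
```

The query vector points mostly along the first axis, so the training chunk ranks first and the deployment chunk second.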

Instantiating the Custom LLM and Prompt Template

Next we instantiate an LLM object based on our custom InternLM class:

llm = InternLM_LLM(model_path = "/root/data/model/Shanghai_AI_Laboratory/internlm-chat-7b")
llm.predict("你是谁")  # "Who are you?"

正在从本地加载模型...



Loading checkpoint shards:   0%|          | 0/8 [00:00<?, ?it/s]


完成本地模型的加载





'我是一个语言模型，我的名字是书生·浦语。我来自上海人工智能实验室。我可以回答各种问题，包括日常生活、历史、文化、科技、艺术、政治等各种话题。如果您有任何问题，欢迎随时问我。'

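To build the retrieval QA chain we also need a Prompt Template. The template is a string with variables: after retrieval, LangChain fills the retrieved document chunks into the template's variables, producing a prompt that carries the knowledge. We can instantiate such a template object with LangChain's PromptTemplate class:

from langchain.prompts import PromptTemplate

# The prompt template we construct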
template = """使用以下上下文来回答用户的问题。如果你不知道答案，就说你不知道。总是使用中文回答。
问题: {question}
可参考的上下文：
···
{context}
···
如果给定的上下文无法让你做出回答，请回答你不知道。
有用的回答:"""

# Instantiate a template object with LangChain. It has two variables, context and question,
# which are filled at call time with the retrieved document chunks and the user's question.
QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"], template=template)
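What happens to the template at query time is, in essence, string substitution. The sketch below fills the two variables with plain `str.format`; the sample context and question are made up for illustration, and the template is an English paraphrase rather than the tutorial's exact one.

```python
template = (
    "Use the following context to answer the question. "
    "If you do not know the answer, say so.\n"
    "Question: {question}\n"
    "Context:\n{context}\n"
    "Helpful answer:"
)

# Made-up retrieved chunks and user question
retrieved_chunks = ["InternLM is an open-source model.", "LangChain builds LLM apps."]
filled = template.format(
    context="\n".join(retrieved_chunks),
    question="What is InternLM?",
)
print(filled)
```

The filled string is what the LLM finally receives, so the wording and ordering of the template directly shape the answer.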

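Building the Retrieval QA Chain

Finally, we call the retrieval QA chain constructor provided by LangChain to build an InternLM-based retrieval QA chain from our custom LLM, the Prompt Template, and the vector knowledge base: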

from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectordb.as_retriever(),
    return_source_documents=True,
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)

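The resulting qa_chain object implements our core feature: a domain knowledge-base assistant built on the InternLM model. We can compare the answers of the retrieval QA chain with those of the plain LLM:

# Answer from the retrieval QA chain
question = "什么是InternLM"  # "What is InternLM?"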
result = qa_chain({"query": question})
print("检索问答链回答 question 的结果：")
print(result["result"])

# Answer from the LLM alone
result_2 = llm(question)
print("大模型回答 question 的结果：")
print(result_2)

检索问答链回答 question 的结果：
根据您提供的问题，InternLM是一个开源的轻量级训练框架，旨在支持大模型训练，而无需大量的依赖。它支持在拥有数千个GPU的大型集群上进行预训练，并在单个GPU上进行微调，同时实现了卓越的性能优化。在1024个GPU上训练时，InternLM可以实现近90%的加速效率。

InternLM团队已经发布了两个开源的预训练模型：InternLM-7B和InternLM-20B。更新包括InternLM-20B发布和InternLM-7B-Chat v1.1发布，后者增加了代码解释器和函数调用能力。InternLM模型的特点包括：

1. 支持训练高质量的对话模型，实现强大的知识库和推理功能；
2. 支持8k上下文窗口长，允许更长的输入序列和强大的推理能力；
3. 提供灵活的通用工具，使用户能够创建自己的工作流程；
4. 提供轻量级的学习框架，无需大量的依赖即可进行模型的前向学习和微调，并实现了卓越的性能优化。

总的来说，InternLM是一个非常实用的工具，可以帮助用户高效地训练大模型，并支持各种常见的预训练模型。
大模型回答 question 的结果：
书生·浦语

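Web Deployment

With the core functionality complete, we can deploy it to a web page with the Gradio framework to build a small demo for testing and use.

We first wrap the code above in a function that returns the retrieval QA chain object, and call that function as soon as Gradio starts. Later conversations reuse that object, which avoids reloading the model: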

from langchain.vectorstores import Chroma
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
import os
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

def load_chain():
    # Load the QA chain
    # Define the embeddings
    embeddings = HuggingFaceEmbeddings(model_name="/root/data/model/sentence-transformer")

    # Persistence path of the vector database
    persist_directory = 'data_base/vector_db/chroma'

    # Load the database
    vectordb = Chroma(
        persist_directory=persist_directory,  # directory the persisted database is read from
        embedding_function=embeddings
    )

    # Load the custom LLM
    llm = InternLM_LLM(model_path = "/root/data/model/Shanghai_AI_Laboratory/internlm-chat-7b")

    # Define a Prompt Template
    template = """使用以下上下文来回答最后的问题。如果你不知道答案，就说你不知道，不要试图编造答
    案。尽量使答案简明扼要。总是在回答的最后说“谢谢你的提问！”。
    {context}
    问题: {question}
    有用的回答:"""

    QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context","question"],template=template)

    # Build the chain
    qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectordb.as_retriever(),return_source_documents=True,chain_type_kwargs={"prompt":QA_CHAIN_PROMPT})
    
    return qa_chain

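# Next we define a class that loads and stores the retrieval QA chain and handles requests from the web interface to answer with it:

class Model_center():
    """
    Object that stores the retrieval QA chain
    """
    def __init__(self):
        # Constructor: load the retrieval QA chain
        self.chain = load_chain()

    def qa_chain_self_answer(self, question: str, chat_history: list = None):
        """
        Answer a question with the QA chain
        """
        # Use a fresh list when no history is passed, instead of a shared mutable default
        if chat_history is None:
            chat_history = []
        if question is None or len(question) < 1:
            return "", chat_history
        try:
            chat_history.append(
                (question, self.chain({"query": question})["result"]))
            # Append the result to the chat history; Gradio will display it
            return "", chat_history
        except Exception as e:
            # Return the error as text so it can be shown in the textbox
            return str(e), chat_history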
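The handler's contract with Gradio can be checked without loading a model by swapping in a stub chain. In this sketch `StubChain` and `answer_with` are hypothetical; the function mirrors the method's logic and creates a fresh history list when none is passed, so no list is shared across calls.

```python
class StubChain:
    """Mimics the chain's call shape: dict in, dict with a 'result' key out."""

    def __call__(self, inputs):
        return {"result": f"answer to: {inputs['query']}"}


def answer_with(chain, question, chat_history=None):
    # Create a new list per call instead of using a mutable default argument
    if chat_history is None:
        chat_history = []
    if not question:
        return "", chat_history
    try:
        chat_history.append((question, chain({"query": question})["result"]))
        # The empty string clears the input textbox; the history feeds the chatbot
        return "", chat_history
    except Exception as e:
        return str(e), chat_history


chain = StubChain()
box, history = answer_with(chain, "What is InternLM?")
# An empty question leaves the history unchanged
box, history = answer_with(chain, "", history)
print(box, history)
```

The two return values line up with the `outputs=[msg, chatbot]` binding: the first updates the textbox and the second updates the chat display.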

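# Then we follow Gradio's usual pattern: create a web interface and bind the click action to the answer method of the class above:
import gradio as gr

# Instantiate the core object
model_center = Model_center()
# Create a web interface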
block = gr.Blocks()
with block as demo:
    with gr.Row(equal_height=True):   
        with gr.Column(scale=15):
            # Page title
            gr.Markdown("""<h1><center>InternLM</center></h1>
                <center>书生浦语</center>
                """)

    with gr.Row():
        with gr.Column(scale=4):
            # Create a chatbot component
            chatbot = gr.Chatbot(height=450, show_copy_button=True)
            # Create a textbox for entering the prompt
            msg = gr.Textbox(label="Prompt/问题")

            with gr.Row():
                # Create the submit button
                db_wo_his_btn = gr.Button("Chat")
            with gr.Row():
                # Create a clear button that clears the chatbot component
                clear = gr.ClearButton(
                    components=[chatbot], value="Clear console")
                
        # Set the button's click event. On click, call qa_chain_self_answer defined above with the user's message and the chat history, then update the textbox and the chatbot component.
        db_wo_his_btn.click(model_center.qa_chain_self_answer, inputs=[
                            msg, chatbot], outputs=[msg, chatbot])

    gr.Markdown("""提醒：<br>
    1. 初始化数据库时间可能较长，请耐心等待。
    2. 使用中如果出现异常，将会在文本输入框进行展示，请不要惊慌。 <br>
    """)
gr.close_all()
# Launch directly
demo.launch()

正在从本地加载模型...



Loading checkpoint shards:   0%|          | 0/8 [00:00<?, ?it/s]


完成本地模型的加载
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.