LLM Session Memory

Large Language Models are inherently stateless: they have no knowledge of previous interactions with a user, or even of the earlier parts of the current conversation. While this may not be noticeable when asking simple questions, it becomes a hindrance when carrying on long conversations that depend on conversational context.

The way around this is to append the previous conversation history to each subsequent call to the LLM.
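The core pattern is simple; here is a rough sketch, where llm_api_call is a hypothetical placeholder rather than a function from any real library:

history = []

def ask(prompt):
    # llm_api_call stands in for your actual LLM client
    response = llm_api_call(prompt=prompt, context=history)
    # Record the exchange so the next call carries it as context
    history.append({"role": "user", "content": prompt})
    history.append({"role": "llm", "content": response})
    return response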

This notebook will show how to use Redis to structure, store, and retrieve this conversational session memory.

from redisvl.extensions.session_manager import StandardSessionManager
chat_session = StandardSessionManager(name='student tutor')
12:24:11 redisvl.index.index INFO   Index already exists, not overwriting.

To align with common LLM APIs, Redis stores messages with role and content fields. The supported roles are "system", "user", and "llm".

You can store messages one at a time or all at once.

chat_session.add_message({"role":"system", "content":"You are a helpful geography tutor, giving simple and short answers to questions about Europen countries."})
chat_session.add_messages([
    {"role":"user", "content":"What is the capital of France?"},
    {"role":"llm", "content":"The capital is Paris."},
    {"role":"user", "content":"And what is the capital of Spain?"},
    {"role":"llm", "content":"The capital is Madrid."},
    {"role":"user", "content":"What is the population of Great Britain?"},
    {"role":"llm", "content":"As of 2023 the population of Great Britain is approximately 67 million people."},]
    )

At any point we can retrieve the recent history of the conversation. It will be ordered by entry time.

context = chat_session.get_recent()
for message in context:
    print(message)
{'role': 'llm', 'content': 'The capital is Paris.'}
{'role': 'user', 'content': 'And what is the capital of Spain?'}
{'role': 'llm', 'content': 'The capital is Madrid.'}
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
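Note that RedisVL stores the assistant role as "llm", while OpenAI-style chat APIs expect "assistant". If you pass this context to such an API, a small mapping step is needed; a minimal sketch (the to_api_messages helper is our own, not part of RedisVL):

def to_api_messages(context):
    # OpenAI-style APIs use "assistant" where RedisVL stores "llm"
    role_map = {"llm": "assistant"}
    return [{"role": role_map.get(m["role"], m["role"]), "content": m["content"]}
            for m in context]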

In many LLM flows the conversation progresses as a series of prompt-and-response pairs. The session manager provides a convenience function, store(), to add these easily.

prompt = "what is the size of England compared to Portugal?"
response = "England is larger in land area than Portal by about 15000 square miles."
chat_session.store(prompt, response)

context = chat_session.get_recent(top_k=6)
for message in context:
    print(message)
{'role': 'user', 'content': 'And what is the capital of Spain?'}
{'role': 'llm', 'content': 'The capital is Madrid.'}
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}

Managing multiple users and conversations

For applications that need to handle multiple conversations concurrently, Redis supports tagging messages to keep conversations separated.

chat_session.add_message({"role":"system", "content":"You are a helpful algebra tutor, giving simple answers to math problems."}, session_tag='student two')
chat_session.add_messages([
    {"role":"user", "content":"What is the value of x in the equation 2x + 3 = 7?"},
    {"role":"llm", "content":"The value of x is 2."},
    {"role":"user", "content":"What is the value of y in the equation 3y - 5 = 7?"},
    {"role":"llm", "content":"The value of y is 4."}],
    session_tag='student two'
    )

for math_message in chat_session.get_recent(session_tag='student two'):
    print(math_message)
{'role': 'system', 'content': 'You are a helpful algebra tutor, giving simple answers to math problems.'}
{'role': 'user', 'content': 'What is the value of x in the equation 2x + 3 = 7?'}
{'role': 'llm', 'content': 'The value of x is 2.'}
{'role': 'user', 'content': 'What is the value of y in the equation 3y - 5 = 7?'}
{'role': 'llm', 'content': 'The value of y is 4.'}
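In a multi-user application, a natural pattern is to use each user's ID as the session tag. A sketch of that routing (handle_prompt and llm_api_call are our own placeholders, and we assume store() accepts the same session_tag parameter as add_message()):

def handle_prompt(user_id, prompt):
    # Retrieve only this user's conversation history
    context = chat_session.get_recent(session_tag=user_id)
    response = llm_api_call(prompt=prompt, context=context)
    # Persist the new exchange under the same tag
    chat_session.store(prompt, response, session_tag=user_id)
    return response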

Semantic conversation memory

For longer conversations our list of messages keeps growing. Since LLMs are stateless, we have to keep passing this conversation history on every subsequent call to ensure the LLM has the correct context.

A typical flow looks like this:

while True:
    prompt = input('enter your next question')
    context = chat_session.get_recent()
    response = LLM_api_call(prompt=prompt, context=context)  # LLM_api_call is a stand-in for your LLM client
    chat_session.store(prompt, response)

This works, but as the context keeps growing, so does our LLM token count, which increases latency and cost.

Conversation histories can be truncated, but that risks losing relevant information that appeared early on.
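The simplest form of truncation is to cap top_k when fetching context; anything outside that window becomes invisible to the LLM:

# Keep only the four most recent messages as context
context = chat_session.get_recent(top_k=4)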

A better solution is to pass only the relevant conversational context on each subsequent call.

For this, RedisVL provides the SemanticSessionManager, which uses vector similarity search to return only the semantically relevant sections of the conversation.

from redisvl.extensions.session_manager import SemanticSessionManager
semantic_session = SemanticSessionManager(name='tutor')

semantic_session.add_messages(chat_session.get_recent(top_k=8))
12:24:15 redisvl.index.index INFO   Index already exists, not overwriting.
prompt = "what have I learned about the size of England?"
semantic_session.set_distance_threshold(0.35)
context = semantic_session.get_relevant(prompt)
for message in context:
    print(message)
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}

You can adjust the degree of semantic similarity required for a message to be included in your context.

Setting a distance threshold close to 0.0 will require an exact semantic match, while a threshold of 1.0 will include everything.

semantic_session.set_distance_threshold(0.7)

larger_context = semantic_session.get_relevant(prompt)
for message in larger_context:
    print(message)
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
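Revisiting the earlier flow, the only change needed is to swap get_recent() for get_relevant(); LLM_api_call remains a stand-in for your own client:

while True:
    prompt = input('enter your next question')
    # Fetch only the messages semantically related to the new prompt
    context = semantic_session.get_relevant(prompt)
    response = LLM_api_call(prompt=prompt, context=context)
    semantic_session.store(prompt, response)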

Conversation control

LLMs can hallucinate on occasion, and when this happens it can be useful to prune incorrect information from the conversation history so that it doesn't continue to be passed as context.

semantic_session.store(
    prompt="what is the smallest country in Europe?",
    response="Monaco is the smallest country in Europe at 0.78 square miles." # Incorrect. Vatican City is the smallest country in Europe
    )

# Retrieve the most recent entry in raw form so we can access its key
context = semantic_session.get_recent(top_k=1, raw=True)
bad_key = context[0]['entry_id']
# Remove the hallucinated response from the conversation history
semantic_session.drop(bad_key)

corrected_context = semantic_session.get_recent()
for message in corrected_context:
    print(message)
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
{'role': 'user', 'content': 'what is the smallest country in Europe?'}

Once a conversation is complete, the session history can be cleared entirely.

chat_session.clear()