LLM Session Memory
Large language models are inherently stateless: they have no knowledge of previous interactions with a user, or even of earlier parts of the current conversation. While this may not be apparent when asking simple questions, it becomes a hindrance in longer conversations that rely on conversational context.
The solution to this problem is to append the previous conversation history to each subsequent call to the LLM.
This notebook shows how to use Redis to structure, store, and retrieve this conversational session memory.
from redisvl.extensions.session_manager import StandardSessionManager
chat_session = StandardSessionManager(name='student tutor')
12:24:11 redisvl.index.index INFO Index already exists, not overwriting.
To align with common LLM APIs, Redis stores messages with role and content fields. The supported roles are "system", "user", and "llm".
You can store messages one at a time or all at once.
chat_session.add_message({"role":"system", "content":"You are a helpful geography tutor, giving simple and short answers to questions about European countries."})
chat_session.add_messages([
    {"role":"user", "content":"What is the capital of France?"},
    {"role":"llm", "content":"The capital is Paris."},
    {"role":"user", "content":"And what is the capital of Spain?"},
    {"role":"llm", "content":"The capital is Madrid."},
    {"role":"user", "content":"What is the population of Great Britain?"},
    {"role":"llm", "content":"As of 2023 the population of Great Britain is approximately 67 million people."},
])
At any point we can retrieve the recent history of the conversation. It will be ordered by entry time.
context = chat_session.get_recent()
for message in context:
    print(message)
{'role': 'llm', 'content': 'The capital is Paris.'}
{'role': 'user', 'content': 'And what is the capital of Spain?'}
{'role': 'llm', 'content': 'The capital is Madrid.'}
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
In many LLM flows the conversation proceeds as a series of prompt-and-response pairs. The session manager provides the convenience function store() to add these easily.
prompt = "what is the size of England compared to Portugal?"
response = "England is larger in land area than Portugal by about 15000 square miles."
chat_session.store(prompt, response)
context = chat_session.get_recent(top_k=6)
for message in context:
    print(message)
{'role': 'user', 'content': 'And what is the capital of Spain?'}
{'role': 'llm', 'content': 'The capital is Madrid.'}
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
Managing Multiple Users and Conversations
For applications that need to handle multiple conversations concurrently, Redis supports tagging messages to keep conversations separated.
chat_session.add_message({"role":"system", "content":"You are a helpful algebra tutor, giving simple answers to math problems."}, session_tag='student two')
chat_session.add_messages([
    {"role":"user", "content":"What is the value of x in the equation 2x + 3 = 7?"},
    {"role":"llm", "content":"The value of x is 2."},
    {"role":"user", "content":"What is the value of y in the equation 3y - 5 = 7?"},
    {"role":"llm", "content":"The value of y is 4."}],
    session_tag='student two'
)
for math_message in chat_session.get_recent(session_tag='student two'):
    print(math_message)
{'role': 'system', 'content': 'You are a helpful algebra tutor, giving simple answers to math problems.'}
{'role': 'user', 'content': 'What is the value of x in the equation 2x + 3 = 7?'}
{'role': 'llm', 'content': 'The value of x is 2.'}
{'role': 'user', 'content': 'What is the value of y in the equation 3y - 5 = 7?'}
{'role': 'llm', 'content': 'The value of y is 4.'}
Semantic Conversation Memory
For longer conversations our list of messages keeps growing. Since LLMs are stateless, we have to keep passing this conversation history on each subsequent call to ensure the LLM has the correct context.
A typical flow looks like this:
while True:
    prompt = input('enter your next question')
    context = chat_session.get_recent()
    response = LLM_api_call(prompt=prompt, context=context)
    chat_session.store(prompt, response)
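The LLM_api_call above is left undefined; one way to sketch it, under the assumption of an OpenAI-style chat API, is a thin helper that converts the stored session messages into a chat payload. Note that the session manager stores assistant turns under the role "llm", which most chat APIs call "assistant", so the helper remaps that role (this helper and its names are illustrative, not part of RedisVL):

```python
def to_chat_payload(context, prompt):
    """Convert stored session messages plus a new prompt into
    OpenAI-style chat messages, remapping the "llm" role to "assistant"."""
    messages = []
    for msg in context:
        role = "assistant" if msg["role"] == "llm" else msg["role"]
        messages.append({"role": role, "content": msg["content"]})
    # The new user prompt goes last so the model answers it with full context.
    messages.append({"role": "user", "content": prompt})
    return messages

payload = to_chat_payload(
    [{"role": "llm", "content": "The capital is Paris."}],
    "And Spain?",
)
print(payload)
```

The resulting list can be passed directly as the messages argument of a chat-completion request.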
This works, but as the context keeps growing so does our LLM token count, which increases latency and cost.
Conversation histories can be truncated, but that risks losing relevant information that appeared early on.
A better solution is to pass only the relevant conversational context on each subsequent call.
For this, RedisVL provides the SemanticSessionManager, which uses vector similarity search to return only the semantically relevant sections of the conversation.
from redisvl.extensions.session_manager import SemanticSessionManager
semantic_session = SemanticSessionManager(name='tutor')
semantic_session.add_messages(chat_session.get_recent(top_k=8))
12:24:15 redisvl.index.index INFO Index already exists, not overwriting.
prompt = "what have I learned about the size of England?"
semantic_session.set_distance_threshold(0.35)
context = semantic_session.get_relevant(prompt)
for message in context:
    print(message)
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
You can adjust the level of semantic similarity required for a message to be included in the context.
Setting the distance threshold close to 0.0 requires an exact semantic match, while a threshold of 1.0 includes everything.
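To build intuition for what the threshold does, here is a minimal sketch of threshold-based filtering over precomputed vector distances. The distances are toy values chosen for illustration; this is not RedisVL's internal implementation:

```python
def filter_by_distance(scored_messages, threshold):
    """Keep only messages whose vector distance to the query is below the threshold.
    Smaller distance means more semantically similar."""
    return [msg for msg, dist in scored_messages if dist < threshold]

# Hypothetical (message, distance-to-query) pairs for the query
# "what have I learned about the size of England?"
scored = [
    ("what is the size of England compared to Portugal?", 0.21),
    ("What is the population of Great Britain?", 0.52),
    ("What is the value of x in the equation 2x + 3 = 7?", 0.88),
]

print(filter_by_distance(scored, 0.35))  # only the closest match
print(filter_by_distance(scored, 0.7))   # both geography messages
```

A tight threshold returns only near-duplicates of the query, while a looser one pulls in related but less specific turns, mirroring the two get_relevant calls in this section.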
semantic_session.set_distance_threshold(0.7)
larger_context = semantic_session.get_relevant(prompt)
for message in larger_context:
    print(message)
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
Conversation Control
LLMs can hallucinate. When this happens it can be useful to prune the incorrect information from the conversation history so that it doesn't continue to be passed as context.
semantic_session.store(
    prompt="what is the smallest country in Europe?",
    response="Monaco is the smallest country in Europe at 0.78 square miles."  # Incorrect. Vatican City is the smallest country in Europe.
)
# get the key of the incorrect message
context = semantic_session.get_recent(top_k=1, raw=True)
bad_key = context[0]['entry_id']
semantic_session.drop(bad_key)
corrected_context = semantic_session.get_recent()
for message in corrected_context:
    print(message)
{'role': 'user', 'content': 'What is the population of Great Britain?'}
{'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}
{'role': 'user', 'content': 'what is the size of England compared to Portugal?'}
{'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}
{'role': 'user', 'content': 'what is the smallest country in Europe?'}
chat_session.clear()