LLM Cache

SemanticCache

`class SemanticCache(name='llmcache', distance_threshold=0.1, ttl=None, vectorizer=None, filterable_fields=None, redis_client=None, redis_url='redis://:6379', connection_kwargs={}, overwrite=False, **kwargs)`

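Bases: BaseLLMCache

Semantic cache for large language models.

Parameters
- name (str, optional) – The name of the semantic cache search index. Defaults to "llmcache".
- distance_threshold (float, optional) – The semantic distance threshold for the cache. Must be between 0 and 1. Defaults to 0.1.
- ttl (Optional[int], optional) – The time-to-live (TTL) for records cached in Redis. Defaults to None.
- vectorizer (Optional[BaseVectorizer], optional) – The vectorizer for the cache. Defaults to HFTextVectorizer.
- filterable_fields (Optional[List[Dict[str, Any]]]) – An optional list of RedisVL fields that can be used to customize cache retrieval with filters.
- redis_client (Optional[Redis], optional) – A Redis client connection instance. Defaults to None.
- redis_url (str, optional) – The Redis URL. Defaults to redis://:6379.
- connection_kwargs (Dict[str, Any]) – The connection arguments for the Redis client. Defaults to an empty dict, {}.
- overwrite (bool) – Whether to force an overwrite of the schema for the semantic cache index. Defaults to False.
Raises
- TypeError – If an invalid vectorizer is provided.
- TypeError – If the TTL value is not an int.
- ValueError – If the threshold is not between 0 and 1.
- ValueError – If the existing schema does not match the new schema and overwrite is False.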
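
The role of `distance_threshold` can be illustrated with a minimal sketch. This is not RedisVL's implementation: `cosine_distance` and `is_cache_hit` are hypothetical helpers, and the use of cosine distance is an assumption made for illustration. The sketch only shows the idea that a cached prompt counts as a match when its vector lies within the threshold of the query vector, and that a threshold outside 0 to 1 is rejected.

```python
import math
from typing import List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def is_cache_hit(
    query: List[float],
    cached: List[float],
    distance_threshold: float = 0.1,
) -> bool:
    """Hypothetical helper: a hit means the distance is within the threshold."""
    if not 0 <= distance_threshold <= 1:
        raise ValueError("distance_threshold must be between 0 and 1")
    return cosine_distance(query, cached) <= distance_threshold
```

With the default threshold of 0.1, two nearly parallel vectors count as a hit and two orthogonal vectors do not. A lower threshold makes matching stricter, and a higher one makes it more permissive.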

`async acheck(prompt=None, vector=None, num_results=1, return_fields=None, filter_expression=None, distance_threshold=None)`

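Asynchronously checks the semantic cache for results similar to the specified prompt or vector.

This method searches the cache using vector similarity, taking either a raw text prompt (which is converted to a vector) or a provided vector as input. It checks for semantically similar prompts and fetches the cached LLM responses.

Parameters
- prompt (Optional[str], optional) – The text prompt to search for in the cache.
- vector (Optional[List[float]], optional) – The vector representation of the prompt to search for in the cache.
- num_results (int, optional) – The number of cached results to return. Defaults to 1.
- return_fields (Optional[List[str]], optional) – The fields to include in each returned result. If None, defaults to all available fields in the cached entry.
- filter_expression (Optional[FilterExpression]) – An optional filter expression that can be used to filter cache results. Defaults to None, in which case the full cache is searched.
- distance_threshold (Optional[float]) – The threshold for the semantic vector distance. Defaults to None.
Returns: A list of dicts containing the requested return fields for each similar cached response.
Return type: List[Dict[str, Any]]
Raises
- ValueError – If neither prompt nor vector is specified.
- ValueError – If 'vector' has incorrect dimensions.
- TypeError – If return_fields is provided but is not a list.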

response = await cache.acheck(
    prompt="What is the capital city of France?"
)
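
A common way to use `acheck` together with `astore` is a check-then-store flow: look in the cache first, and call the model only on a miss. The sketch below runs without Redis. `InMemoryCache` is a hypothetical stand-in that matches prompts exactly instead of by vector similarity, `call_llm` is a hypothetical placeholder for a real model call, and the `"response"` key in each result dict is an assumption. Only the method names and the list-of-dicts return shape are taken from this reference.

```python
import asyncio
from typing import Any, Dict, List

llm_calls: List[str] = []  # records every prompt that reached the fake model


class InMemoryCache:
    """Hypothetical stand-in: exact-match lookup, no Redis, no vectors."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    async def acheck(self, prompt: str, num_results: int = 1) -> List[Dict[str, Any]]:
        if prompt in self._entries:
            hits = [{"prompt": prompt, "response": self._entries[prompt]}]
            return hits[:num_results]
        return []

    async def astore(self, prompt: str, response: str) -> str:
        self._entries[prompt] = response
        return f"stub:{prompt}"  # placeholder for the entry's Redis key


async def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real model call."""
    llm_calls.append(prompt)
    return f"answer to: {prompt}"


async def answer(cache: InMemoryCache, prompt: str) -> str:
    hits = await cache.acheck(prompt=prompt)
    if hits:
        # Cache hit: reuse the stored response and skip the model call.
        return hits[0]["response"]
    # Cache miss: call the model, then store the result for next time.
    response = await call_llm(prompt)
    await cache.astore(prompt=prompt, response=response)
    return response
```

Calling `answer` twice with the same prompt reaches the fake model only once; the second call is served from the cache.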

`async adrop(ids=None, keys=None)`

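Asynchronously expires specific entries from the cache by ID or by specific Redis key.

Parameters
- ids (Optional[str]) – The document ID or IDs to remove from the cache.
- keys (Optional[str]) – The Redis keys to remove from the cache.
Return type: None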

`async astore(prompt, response, vector=None, metadata=None, filters=None, ttl=None)`

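Asynchronously stores the specified prompt-response pair in the cache along with any metadata.

Parameters
- prompt (str) – The user prompt to cache.
- response (str) – The LLM response to cache.
- vector (Optional[List[float]], optional) – The prompt vector to cache. Defaults to None, in which case the prompt vector is generated on demand.
- metadata (Optional[Dict[str, Any]], optional) – Optional metadata to cache alongside the prompt and response. Defaults to None.
- filters (Optional[Dict[str, Any]]) – Optional tags to assign to the cache entry. Defaults to None.
- ttl (Optional[int]) – An optional TTL override to use for this individual cache entry. Defaults to the global TTL setting.
Returns: The Redis key for the entry added to the semantic cache.
Return type: str
Raises
- ValueError – If neither prompt nor vector is specified.
- ValueError – If vector has incorrect dimensions.
- TypeError – If the provided metadata is not a dict.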

key = await cache.astore(
    prompt="What is the capital city of France?",
    response="Paris",
    metadata={"city": "Paris", "country": "France"}
)

`async aupdate(key, **kwargs)`

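Asynchronously updates specific fields within an existing cache entry. If no fields are passed, only the document's TTL is refreshed.

Parameters: key (str) – The key of the document to update using kwargs.
Raises
- ValueError – If an incorrect mapping is provided as a kwarg.
- TypeError – If metadata is provided and is not of type dict.
Return type: None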

key = await cache.astore('this is a prompt', 'this is a response')
await cache.aupdate(
    key,
    metadata={"hit_count": 1, "model_name": "Llama-2-7b"}
)

`check(prompt=None, vector=None, num_results=1, return_fields=None, filter_expression=None, distance_threshold=None)`

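Checks the semantic cache for results similar to the specified prompt or vector.

This method searches the cache using vector similarity, taking either a raw text prompt (which is converted to a vector) or a provided vector as input. It checks for semantically similar prompts and fetches the cached LLM responses.

Parameters
- prompt (Optional[str], optional) – The text prompt to search for in the cache.
- vector (Optional[List[float]], optional) – The vector representation of the prompt to search for in the cache.
- num_results (int, optional) – The number of cached results to return. Defaults to 1.
- return_fields (Optional[List[str]], optional) – The fields to include in each returned result. If None, defaults to all available fields in the cached entry.
- filter_expression (Optional[FilterExpression]) – An optional filter expression that can be used to filter cache results. Defaults to None, in which case the full cache is searched.
- distance_threshold (Optional[float]) – The threshold for the semantic vector distance. Defaults to None.
Returns: A list of dicts containing the requested return fields for each similar cached response.
Return type: List[Dict[str, Any]]
Raises
- ValueError – If neither prompt nor vector is specified.
- ValueError – If 'vector' has incorrect dimensions.
- TypeError – If return_fields is provided but is not a list.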

response = cache.check(
    prompt="What is the capital city of France?"
)
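
The three error conditions listed for `check` can be pictured as an argument check that runs before any search. The following is a sketch under stated assumptions, not the library's code: `validate_check_args` is a hypothetical helper, and `dims` is a hypothetical parameter standing in for the vector dimensionality expected by the cache's vectorizer.

```python
from typing import Any, List, Optional


def validate_check_args(
    prompt: Optional[str],
    vector: Optional[List[float]],
    return_fields: Any,
    dims: int,
) -> None:
    """Hypothetical helper mirroring the documented Raises conditions."""
    if prompt is None and vector is None:
        raise ValueError("Either prompt or vector must be specified.")
    if vector is not None and len(vector) != dims:
        raise ValueError(
            f"vector has {len(vector)} dimensions, expected {dims}."
        )
    if return_fields is not None and not isinstance(return_fields, list):
        raise TypeError("return_fields must be a list of field names.")
```

Passing a bare string such as `"response"` for `return_fields` is the mistake the TypeError guards against; wrap it in a list instead.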

`clear()`

Clears the cache of all keys while preserving the index.

Return type: None

`delete()`

Clears the semantic cache of all keys and removes the underlying search index.

Return type: None

`drop(ids=None, keys=None)`

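Manually expires specific entries from the cache by ID or by specific Redis key.

Parameters
- ids (Optional[str]) – The document ID or IDs to remove from the cache.
- keys (Optional[str]) – The Redis keys to remove from the cache.
Return type: None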

`set_threshold(distance_threshold)`

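Sets the semantic distance threshold for the cache.

Parameters: distance_threshold (float) – The semantic distance threshold for the cache.
Raises: ValueError – If the threshold is not between 0 and 1.
Return type: None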

`set_ttl(ttl=None)`

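Sets the default TTL, in seconds, for entries in the cache.

Parameters: ttl (Optional[int], optional) – The optional time-to-live for the cache, in seconds.
Raises: ValueError – If the time-to-live value is not an integer.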
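
The default TTL set here interacts with the per-entry `ttl` argument of `store` and `astore`, which overrides it for a single entry. A minimal sketch of that precedence, assuming nothing about the library's internals (`validate_ttl` and `resolve_ttl` are hypothetical helpers):

```python
from typing import Optional


def validate_ttl(ttl: Optional[int]) -> Optional[int]:
    """Hypothetical helper: accept None or an integer number of seconds."""
    if ttl is None:
        return None  # None means entries do not expire
    if not isinstance(ttl, int):
        raise ValueError(f"TTL must be an integer number of seconds, got {ttl!r}")
    return ttl


def resolve_ttl(entry_ttl: Optional[int], default_ttl: Optional[int]) -> Optional[int]:
    """Per-entry TTL wins when given; otherwise fall back to the cache default."""
    if entry_ttl is not None:
        return validate_ttl(entry_ttl)
    return validate_ttl(default_ttl)
```

Under this sketch, an entry stored with `ttl=30` in a cache whose default is 60 expires after 30 seconds, while an entry stored without a `ttl` inherits the 60-second default.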

`store(prompt, response, vector=None, metadata=None, filters=None, ttl=None)`

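Stores the specified prompt-response pair in the cache along with any metadata.

Parameters
- prompt (str) – The user prompt to cache.
- response (str) – The LLM response to cache.
- vector (Optional[List[float]], optional) – The prompt vector to cache. Defaults to None, in which case the prompt vector is generated on demand.
- metadata (Optional[Dict[str, Any]], optional) – Optional metadata to cache alongside the prompt and response. Defaults to None.
- filters (Optional[Dict[str, Any]]) – Optional tags to assign to the cache entry. Defaults to None.
- ttl (Optional[int]) – An optional TTL override to use for this individual cache entry. Defaults to the global TTL setting.
Returns: The Redis key for the entry added to the semantic cache.
Return type: str
Raises
- ValueError – If neither prompt nor vector is specified.
- ValueError – If vector has incorrect dimensions.
- TypeError – If the provided metadata is not a dict.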

key = cache.store(
    prompt="What is the capital city of France?",
    response="Paris",
    metadata={"city": "Paris", "country": "France"}
)

`update(key, **kwargs)`

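Updates specific fields within an existing cache entry. If no fields are passed, only the document's TTL is refreshed.

Parameters: key (str) – The key of the document to update using kwargs.
Raises
- ValueError – If an incorrect mapping is provided as a kwarg.
- TypeError – If metadata is provided and is not of type dict.
Return type: None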

key = cache.store('this is a prompt', 'this is a response')
cache.update(key, metadata={"hit_count": 1, "model_name": "Llama-2-7b"})
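
The update behavior described above (apply the given fields, reject invalid ones, and refresh the TTL even when no fields are passed) can be sketched on a plain dict. This is an illustration only: `apply_update`, the `ALLOWED_FIELDS` set, and the `expires_at` bookkeeping are all hypothetical and do not reflect how RedisVL stores entries.

```python
from typing import Any, Dict

# Assumption for illustration: the fields a caller may update.
ALLOWED_FIELDS = {"prompt", "response", "metadata"}


def apply_update(
    entry: Dict[str, Any], now: float, ttl: int, **kwargs: Any
) -> Dict[str, Any]:
    """Return a copy of entry with kwargs applied and its expiry refreshed."""
    for name, value in kwargs.items():
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"{name!r} is not an updatable field")
        if name == "metadata" and not isinstance(value, dict):
            raise TypeError("metadata must be a dict")
    updated = {**entry, **kwargs}
    # The expiry is refreshed even when kwargs is empty.
    updated["expires_at"] = now + ttl
    return updated
```

Calling `apply_update(entry, now=200.0, ttl=60)` with no keyword arguments leaves every field untouched and only moves `expires_at` to 260.0, which mirrors the TTL-only refresh described for `update`.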

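`property aindex: AsyncSearchIndex | None`

The underlying AsyncSearchIndex for the cache.

Returns: The async search index.
Return type: AsyncSearchIndex

`property distance_threshold: float`

The semantic distance threshold for the cache.

Returns: The semantic distance threshold.
Return type: float

`property index: SearchIndex`

The underlying SearchIndex for the cache.

Returns: The search index.
Return type: SearchIndex

`property ttl: int | None`

The default TTL, in seconds, for entries in the cache.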
