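This tutorial demonstrates how to perform semantic search on product images using LangChain (OpenAI) and Redis. It walks through the demo application's architecture, generating image summaries with OpenAI, storing those summaries as embeddings in Redis, and the API that searches them.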
The following command clones the source code of the application used in this tutorial:
git clone --branch v9.2.0 https://github.com/redis-developer/redis-microservices-ecommerce-solutions
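Let's take a look at the architecture of the demo application:
products service: handles querying products from the database and returning them to the frontend
orders service: handles validating and creating orders
order history service: handles querying a customer's order history
payments service: handles processing orders for payment
api gateway: unifies the services under a single endpoint
mongodb/postgresql: serves as the write-optimized database for storing orders, order history, products, etc.
You don't need to use MongoDB/PostgreSQL as your write-optimized database in the demo application; you can use other Prisma-supported databases as well. This is just an example.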
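The e-commerce microservices application consists of a frontend built with Next.js and TailwindCSS. The application backend uses Node.js. Data is stored in Redis and in MongoDB or PostgreSQL, accessed through Prisma. Below are screenshots showcasing the frontend of the e-commerce app.
Dashboard: displays a list of products with different search functionalities, configurable in the settings page.
Settings: accessible by clicking the gear icon at the top right of the dashboard. Control the search bar, chatbot visibility, and other features here.
Dashboard (semantic text search): when configured for semantic text search, the search bar accepts natural language queries. Example: "pure cotton blue shirts".
Dashboard (semantic image-based queries): when configured for semantic image summary search, the search bar accepts image-based queries. Example: "Left chest nike logo".
Chat bot: located at the bottom right corner of the page, assisting with product searches and detailed views.
Selecting a product in the chat displays its details on the dashboard.
Shopping cart: add products to the cart and check out using the "Buy Now" button.
Order history: after a purchase, the "Orders" link in the top navigation bar shows the order status and history.
Admin panel: accessible via the "admin" link in the top navigation bar. Displays purchase statistics and trending products.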
Sign up for an OpenAI account to get the API key used in the demo (add the OPEN_AI_API_KEY variable to the .env file). You can also refer to the OpenAI API documentation for more information.
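In this tutorial, we'll use a simplified e-commerce dataset. Specifically, our JSON structure contains product details and a key named styleImages_default_imageURL, which links to the product's image. This image is the focus of our AI-driven semantic search.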
const products = [
  {
    productId: '11000',
    price: 3995,
    productDisplayName: 'Puma Men Slick 3HD Yellow Black Watches',
    variantName: 'Slick 3HD Yellow',
    brandName: 'Puma',
    // Additional product details...
    styleImages_default_imageURL:
      'https://host.docker.internal:8080/images/11000.jpg',
    // Other properties...
  },
  // Additional products...
];
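Because every later step depends on the image URL, it can help to drop records without a usable one before calling OpenAI. The following is a minimal sketch, not part of the demo application: the ProductRecord interface is a trimmed-down, hypothetical shape, and isHttpUrl and withUsableImage are illustrative helper names.

```typescript
// Hypothetical, trimmed-down product shape; the real dataset has more fields.
interface ProductRecord {
  productId: string;
  productDisplayName: string;
  styleImages_default_imageURL?: string;
}

// Returns true when the value parses as an http(s) URL.
const isHttpUrl = (value: string | undefined): boolean => {
  if (!value) return false;
  try {
    const url = new URL(value);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
};

// Keeps only the products whose image URL looks fetchable.
const withUsableImage = (items: ProductRecord[]): ProductRecord[] =>
  items.filter((item) => isHttpUrl(item.styleImages_default_imageURL));

const sample: ProductRecord[] = [
  {
    productId: '11000',
    productDisplayName: 'Puma Men Slick 3HD Yellow Black Watches',
    styleImages_default_imageURL:
      'https://host.docker.internal:8080/images/11000.jpg',
  },
  // Made-up records that should be filtered out
  { productId: 'demo-1', productDisplayName: 'Missing image URL' },
  {
    productId: 'demo-2',
    productDisplayName: 'Malformed image URL',
    styleImages_default_imageURL: 'not a url',
  },
];

const usable = withUsableImage(sample);
console.log(usable.map((p) => p.productId));
```

Filtering up front avoids spending an OpenAI request on a product whose image can never be downloaded.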
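The following code outlines how to generate a text summary of a product image using OpenAI. We first convert the image URL to a base64 string with the fetchImageAndConvertToBase64 function, then generate a summary of the image with the getOpenAIImageSummary function.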
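// axios and the Prisma types are used by the functions below
import axios from 'axios';
import type { Prisma } from '@prisma/client';
import {
  ChatOpenAI,
  ChatOpenAICallOptions,
} from 'langchain/chat_models/openai';
import { HumanMessage } from 'langchain/schema';
import { Document } from 'langchain/document';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { RedisVectorStore } from 'langchain/vectorstores/redis';

// Cached ChatOpenAI instance, created on first use
let llm: ChatOpenAI<ChatOpenAICallOptions>;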
// Instantiates the LangChain ChatOpenAI instance
const getOpenAIVisionInstance = (_openAIApiKey: string) => {
  // OpenAI supports images with text in input messages with their gpt-4-vision-preview model.
  if (!llm) {
    llm = new ChatOpenAI({
      openAIApiKey: _openAIApiKey,
      modelName: 'gpt-4-vision-preview',
      maxTokens: 1024,
    });
  }
  return llm;
};

// Downloads an image and returns it as a base64 string ('' on failure)
const fetchImageAndConvertToBase64 = async (_imageURL: string) => {
  let base64Image = '';
  try {
    const response = await axios.get(_imageURL, {
      responseType: 'arraybuffer',
    });
    // Convert image to Base64
    base64Image = Buffer.from(response.data, 'binary').toString('base64');
  } catch (error) {
    console.error(
      `Error fetching or converting the image: ${_imageURL}`,
      error,
    );
  }
  return base64Image;
};
// Generates an OpenAI summary for a given base64 image string
const getOpenAIImageSummary = async (
  _openAIApiKey: string,
  _base64Image: string,
  _product: Prisma.ProductCreateInput,
) => {
  /*
   Reference : https://js.langchain.ac.cn/docs/integrations/chat/openai#multimodal-messages

  - This function utilizes OpenAI's multimodal capabilities to generate a summary from the image.
  - It constructs a prompt that combines the product description with the image.
  - OpenAI's vision model then processes this prompt to generate a detailed summary.
  */
  let imageSummary = '';
  try {
    if (_openAIApiKey && _base64Image && _product) {
      const llmInst = getOpenAIVisionInstance(_openAIApiKey);
      const text = `Below are the product details and image of an e-commerce product for reference. Please conduct and provide a comprehensive analysis of the product depicted in the image.
Product Details:
${_product.productDescriptors_description_value}
Image:
`;
      // Constructing a multimodal message combining text and image
      const imagePromptMessage = new HumanMessage({
        content: [
          {
            type: 'text',
            text: text,
          },
          {
            type: 'image_url',
            image_url: {
              url: `data:image/jpeg;base64,${_base64Image}`,
              detail: 'high', // 'low' or 'high' (use 'high' for more detail)
            },
          },
        ],
      });
      // Invoking the LangChain ChatOpenAI model with the constructed message
      const response = await llmInst.invoke([imagePromptMessage]);
      if (response?.content) {
        imageSummary = <string>response.content;
      }
    }
  } catch (err) {
    console.log(
      `Error generating OpenAIImageSummary for product id ${_product.productId}`,
      err,
    );
  }
  return imageSummary;
};
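The code above always labels the image as image/jpeg in the data URL, while the API response example later in this tutorial shows a .webp image URL. A possible refinement, sketched here under the assumption that the raw bytes are available, is to detect the format from the leading "magic" bytes. The helper names sniffImageMime and toImageDataUrl are hypothetical and not part of the demo application.

```typescript
// Detects a few common image formats from their leading "magic" bytes.
const sniffImageMime = (bytes: Uint8Array): string | null => {
  const startsWith = (signature: number[], offset = 0): boolean =>
    signature.every((byte, i) => bytes[offset + i] === byte);

  if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return 'image/png';
  // WebP files start with "RIFF" and carry "WEBP" at byte offset 8
  if (
    startsWith([0x52, 0x49, 0x46, 0x46]) &&
    startsWith([0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return 'image/webp';
  }
  return null;
};

// Builds a data URL, falling back to a default MIME type when detection fails.
const toImageDataUrl = (
  bytes: Uint8Array,
  fallbackMime = 'image/jpeg',
): string => {
  const mime = sniffImageMime(bytes) ?? fallbackMime;
  return `data:${mime};base64,${Buffer.from(bytes).toString('base64')}`;
};

// The first eight bytes of any PNG file
const pngHeader = Uint8Array.from([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
]);
const dataUrl = toImageDataUrl(pngHeader);
console.log(dataUrl);
```

With a helper like this, the url field of the image_url message part could carry the detected type instead of a fixed one.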
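The following section shows the result of the process above. We'll use an image of a Puma T-shirt and generate a summary with OpenAI.
The comprehensive summary generated by the OpenAI model is as follows: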
This product is a black round neck T-shirt featuring a design consistent with the Puma brand aesthetic, which includes their iconic leaping cat logo in a contrasting yellow color placed prominently across the chest area. The T-shirt is made from 100% cotton, suggesting it is likely to be breathable and soft to the touch. It has a classic short-sleeve design with a ribbed neckline for added texture and durability. There is also mention of a vented hem, which may offer additional comfort and mobility.
The T-shirt is described to have a 'comfort' fit, which typically means it is designed to be neither too tight nor too loose, allowing for ease of movement without being baggy. This could be ideal for casual wear or active use.
Care instructions are also comprehensive, advising a gentle machine wash with similar colors in cool water at 30 degrees Celsius, indicating it is relatively easy to care for. However, one should avoid bleaching, tumble drying, and dry cleaning it, but a warm iron is permissible.
Looking at the image provided:
- The T-shirt appears to fit the model well, in accordance with the described 'comfort' fit.
- The color contrast between the T-shirt and the graphic gives the garment a modern, sporty look.
- The model is paired with denim jeans, showcasing the T-shirt's versatility for casual occasions. However, the product description suggests it can be part of an athletic ensemble when combined with Puma shorts and shoes.
- Considering the model's statistics, prospective buyers could infer how this T-shirt might fit on a person with similar measurements.
Overall, the T-shirt is positioned as a versatile item suitable for both lifestyle and sporting activities, with a strong brand identity through the graphic, and is likely comfortable and easy to maintain based on the product details provided.
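The addImageSummaryEmbeddingsToRedis function plays a central role in integrating the AI-generated image summaries with Redis. The process involves two main steps:
- With the getImageSummaryVectorDocuments function, we convert the image summaries into vector documents. Each summary is wrapped, together with the product ID and image URL, in a LangChain Document, which is the format the Redis vector store accepts.
- The seedImageSummaryEmbeddings function stores these vector documents in Redis. This step is what makes efficient retrieval and search possible in the Redis database.
// Function to generate vector documents from image summaries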
const getImageSummaryVectorDocuments = async (
  _products: Prisma.ProductCreateInput[],
  _openAIApiKey: string,
) => {
  const vectorDocs: Document[] = [];
  if (_products?.length > 0) {
    let count = 1;
    for (const product of _products) {
      if (product) {
        const imageURL = product.styleImages_default_imageURL; //cdn url
        const imageData = await fetchImageAndConvertToBase64(imageURL);
        // Declared locally; the original snippet assigned to an undeclared variable
        const imageSummary = await getOpenAIImageSummary(
          _openAIApiKey,
          imageData,
          product,
        );
        console.log(
          `openAI imageSummary #${count++} generated for product id: ${
            product.productId
          }`,
        );
        // Only products with a non-empty summary become vector documents
        if (imageSummary) {
          const doc = new Document({
            metadata: {
              productId: product.productId,
              imageURL: imageURL,
            },
            pageContent: imageSummary,
          });
          vectorDocs.push(doc);
        }
      }
    }
  }
  return vectorDocs;
};
// Seeding vector documents into Redis
const seedImageSummaryEmbeddings = async (
  vectorDocs: Document[],
  _redisClient: NodeRedisClientType,
  _openAIApiKey: string,
) => {
  if (vectorDocs?.length && _redisClient && _openAIApiKey) {
    const embeddings = new OpenAIEmbeddings({
      openAIApiKey: _openAIApiKey,
    });
    const vectorStore = await RedisVectorStore.fromDocuments(
      vectorDocs,
      embeddings,
      {
        redisClient: _redisClient,
        indexName: 'openAIProductImgIdx',
        keyPrefix: 'openAIProductImgText:',
      },
    );
    console.log('seeding imageSummaryEmbeddings completed');
  }
};

// Generates the vector documents, then seeds them into Redis
const addImageSummaryEmbeddingsToRedis = async (
  _products: Prisma.ProductCreateInput[],
  _redisClient: NodeRedisClientType,
  _openAIApiKey: string,
) => {
  const vectorDocs = await getImageSummaryVectorDocuments(
    _products,
    _openAIApiKey,
  );
  await seedImageSummaryEmbeddings(vectorDocs, _redisClient, _openAIApiKey);
};
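The document-building step above mixes network calls with the mapping from a summary to a document. One way to make the mapping easy to test on its own is to isolate it as a pure function. The sketch below assumes nothing beyond the pageContent and metadata fields used above; SummaryDoc, SummarizedProduct and toSummaryDocs are hypothetical names that only mirror the shape of a LangChain Document rather than importing it.

```typescript
// Mirrors the two Document fields used in this tutorial (hypothetical local type).
interface SummaryDoc {
  pageContent: string;
  metadata: { productId: string; imageURL: string };
}

// A product paired with the summary generated for its image (hypothetical).
interface SummarizedProduct {
  productId: string;
  imageURL: string;
  imageSummary: string;
}

// Pure mapping step: skips empty summaries and builds document-shaped objects.
const toSummaryDocs = (items: SummarizedProduct[]): SummaryDoc[] =>
  items
    .filter((item) => item.imageSummary.trim().length > 0)
    .map((item) => ({
      pageContent: item.imageSummary.trim(),
      metadata: { productId: item.productId, imageURL: item.imageURL },
    }));

const summaryDocs = toSummaryDocs([
  {
    productId: '11000',
    imageURL: 'https://host.docker.internal:8080/images/11000.jpg',
    imageSummary: 'A yellow and black sports watch.', // made-up summary text
  },
  {
    productId: 'demo-1',
    imageURL: 'https://host.docker.internal:8080/images/demo-1.jpg',
    imageSummary: '   ', // summary generation failed, so this entry is skipped
  },
]);
console.log(summaryDocs);
```

Keeping this step free of I/O means a failed image download or OpenAI call only affects which items reach it, not how documents are built.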
The image below shows the JSON structure of the OpenAI image summaries in RedisInsight.
Download RedisInsight to visually explore your Redis data or to run raw Redis commands in the workbench.
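This section covers the API request and response structure for getProductsByVSSImageSummary, which retrieves products through semantic search over the image summaries.
Request format
A sample request for the API looks like this: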
POST https://localhost:3000/products/getProductsByVSSImageSummary
{
  "searchText": "Left chest nike logo",

  // optional
  "maxProductCount": 4, // 2 (default)
  "similarityScoreLimit": 0.2 // 0.2 (default)
}
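The optional fields and their defaults can be applied in one place before the search runs. The following is a minimal sketch under the defaults stated in the request comments (2 and 0.2); SearchRequestBody and normalizeSearchRequest are hypothetical names, not the demo application's own types.

```typescript
// Hypothetical request shape based on the sample request above.
interface SearchRequestBody {
  searchText: string;
  maxProductCount?: number;
  similarityScoreLimit?: number;
}

// Trims the query and fills in the documented defaults.
const normalizeSearchRequest = (
  body: SearchRequestBody,
): Required<SearchRequestBody> => {
  const searchText = body.searchText.trim();
  if (!searchText) {
    throw new Error('searchText must not be empty');
  }
  return {
    searchText,
    // `??` only falls back on null/undefined, so an explicit 0 is preserved
    maxProductCount: body.maxProductCount ?? 2,
    similarityScoreLimit: body.similarityScoreLimit ?? 0.2,
  };
};

const normalized = normalizeSearchRequest({
  searchText: '  Left chest nike logo  ',
});
console.log(normalized);

const strict = normalizeSearchRequest({
  searchText: 'Left chest nike logo',
  similarityScoreLimit: 0,
});
console.log(strict.similarityScoreLimit);
```

Using `??` rather than `||` is a deliberate choice here: with `||`, a caller who passes 0 as the score limit would silently get the default instead.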
Response structure
The API response is a JSON object containing an array of product details that match the semantic search criteria:
{
  "data": [
    {
      "productId": "10017",
      "price": 3995,
      "productDisplayName": "Nike Women As The Windru Blue Jackets",
      "brandName": "Nike",
      "styleImages_default_imageURL": "https://host.docker.internal:8080/products/01/10017/product-img.webp",
      "productDescriptors_description_value": " Blue and White jacket made of 100% polyester, with an interior pocket ...",
      "stockQty": 25,
      "similarityScore": 0.163541972637,
      "imageSummary": "The product in the image is a blue and white jacket featuring a design consistent with the provided description. ..."
    }
    // Additional products...
  ],
  "error": null,
  "auth": "SES_fd57d7f4-3deb-418f-9a95-6749cd06e348"
}
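On the client side, the response can be given a type and ordered by score before rendering. This is a sketch only: ProductHit and SearchResponse list a subset of the fields shown above, the type of error is assumed to be a string or null, and lower similarityScore values are assumed to mean closer matches. The second sample entry is made up for illustration.

```typescript
// Subset of the product fields shown in the sample response (hypothetical type).
interface ProductHit {
  productId: string;
  productDisplayName: string;
  brandName: string;
  similarityScore: number;
  imageSummary: string;
}

// Assumed envelope: `error` is taken to be a string or null.
interface SearchResponse {
  data: ProductHit[] | null;
  error: string | null;
  auth?: string;
}

// Returns the hits ordered from closest to farthest match.
const bestMatches = (response: SearchResponse): ProductHit[] => {
  if (response.error) {
    throw new Error(response.error);
  }
  // Copy before sorting so the original response is left untouched
  return [...(response.data ?? [])].sort(
    (a, b) => a.similarityScore - b.similarityScore,
  );
};

const sampleResponse: SearchResponse = {
  data: [
    {
      productId: 'demo-2', // made-up entry
      productDisplayName: 'Demo Jacket',
      brandName: 'Demo',
      similarityScore: 0.19,
      imageSummary: '...',
    },
    {
      productId: '10017',
      productDisplayName: 'Nike Women As The Windru Blue Jackets',
      brandName: 'Nike',
      similarityScore: 0.163541972637,
      imageSummary: '...',
    },
  ],
  error: null,
};

const ranked = bestMatches(sampleResponse);
console.log(ranked.map((hit) => hit.productId));
```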
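The backend implementation of this API involves the following steps:
- The getProductsByVSSImageSummary function handles the API request.
- The getSimilarProductsScoreByVSSImageSummary function performs the semantic search on the image summaries. It embeds searchText with OpenAI embeddings and runs a similarity search against the Redis vector store to identify relevant products.
const getSimilarProductsScoreByVSSImageSummary = async (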
  _params: IParamsGetProductsByVSS,
) => {
  let {
    standAloneQuestion,
    openAIApiKey,

    //optional
    KNN,
    scoreLimit,
  } = _params;

  let vectorDocs: Document[] = [];
  const client = getNodeRedisClient();

  KNN = KNN || 2;
  scoreLimit = scoreLimit || 1;

  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: openAIApiKey,
  });

  // create vector store
  const vectorStore = new RedisVectorStore(embeddings, {
    redisClient: client,
    indexName: 'openAIProductImgIdx',
    keyPrefix: 'openAIProductImgText:',
  });

  // search for similar products
  const vectorDocsWithScore = await vectorStore.similaritySearchWithScore(
    standAloneQuestion,
    KNN,
  );

  // filter by scoreLimit
  for (let [doc, score] of vectorDocsWithScore) {
    if (score <= scoreLimit) {
      doc['similarityScore'] = score;
      vectorDocs.push(doc);
    }
  }

  return vectorDocs;
};
const getProductsByVSSImageSummary = async (
  productsVSSFilter: IProductsVSSBodyFilter,
) => {
  let { searchText, maxProductCount, similarityScoreLimit } = productsVSSFilter;
  let products: IProduct[] = [];

  const openAIApiKey = process.env.OPEN_AI_API_KEY || '';
  maxProductCount = maxProductCount || 2;
  similarityScoreLimit = similarityScoreLimit || 0.2;

  //VSS search
  const vectorDocs = await getSimilarProductsScoreByVSSImageSummary({
    standAloneQuestion: searchText,
    openAIApiKey: openAIApiKey,
    KNN: maxProductCount,
    scoreLimit: similarityScoreLimit,
  });

  if (vectorDocs?.length) {
    const productIds = vectorDocs.map((doc) => doc?.metadata?.productId);

    //get product with details
    products = await getProductByIds(productIds, true);
  }

  //...

  return products;
};
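The `score <= scoreLimit` filter above keeps results with small scores, which suggests the score is a distance rather than a similarity. The sketch below assumes the index uses cosine distance (1 minus cosine similarity), so 0 means the vectors point the same way and larger values mean less similar. The cosineDistance and withinLimit helpers and the tiny two-dimensional vectors are illustrative only; real embeddings have far more dimensions.

```typescript
// Cosine distance: 0 for identical direction, 1 for orthogonal, 2 for opposite.
const cosineDistance = (a: number[], b: number[]): number => {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error('vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('cosine distance is undefined for a zero vector');
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

// Keeps the items whose distance does not exceed the limit.
const withinLimit = <T>(scored: Array<[T, number]>, scoreLimit: number): T[] =>
  scored.filter(([, score]) => score <= scoreLimit).map(([item]) => item);

const query = [1, 0];
const candidates: Array<[string, number[]]> = [
  ['same direction', [2, 0]],
  ['orthogonal', [0, 3]],
  ['opposite', [-1, 0]],
];

// Score every candidate against the query vector
const scored = candidates.map(
  ([name, vector]): [string, number] => [name, cosineDistance(query, vector)],
);
const kept = withinLimit(scored, 0.2);
console.log(scored, kept);
```

Under this reading, lowering similarityScoreLimit makes the search stricter, and raising it admits looser matches.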
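To try it in the application, open the settings page and select the "Semantic image summary search" option.
On the dashboard, enter a query such as "Left chest nike logo"; the search results show products such as a Nike jacket with a logo on the left chest, matching the query.
Performing semantic search on image summaries is a powerful tool for e-commerce applications. It lets users search for products by describing what appears in their images, which makes for a more intuitive and efficient shopping experience. This tutorial demonstrated how to integrate OpenAI's semantic analysis capabilities with Redis to build a robust search engine for e-commerce applications.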