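This tutorial demonstrates how to perform semantic search on product images using LangChain (OpenAI) and Redis. Specifically, it covers the demo e-commerce application, generating image summaries with OpenAI, storing their embeddings in Redis, and querying them through a semantic search API.
Below is the command to clone the source code for the application used in this tutorial: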
git clone --branch v9.2.0 https://github.com/redis-developer/redis-microservices-ecommerce-solutions
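Let's take a look at the architecture of the demo application:
products service: handles querying products from the database and returning them to the frontend
orders service: handles validating and creating orders
order history service: handles querying a customer's order history
payments service: handles processing orders for payment
api gateway: unifies the services under a single endpoint
mongodb/postgresql: serves as the write-optimized database for storing orders, order history, products, and so on
You don't have to use MongoDB or PostgreSQL as the write-optimized database in the demo application; any other database supported by Prisma works as well. This is just an example.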
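The e-commerce microservices application has a frontend built with Next.js and TailwindCSS. The backend uses Node.js. Data is stored in Redis and in either MongoDB or PostgreSQL, accessed through Prisma. Below are screenshots showing the frontend of the e-commerce application.
Dashboard: Displays a list of products with different search functionalities, configurable on the settings page.
Settings: Accessible by clicking the gear icon at the top right of the dashboard. Controls the search bar, chatbot visibility, and other features.
Dashboard (semantic text search): When configured for semantic text search, the search bar accepts natural language queries. Example: "pure cotton blue shirts".
Dashboard (semantic image-based queries): When configured for semantic image summary search, the search bar accepts image-based queries. Example: "Left chest nike logo".
Chatbot: Located at the bottom right of the page, it assists with product searches and detailed views.
Selecting a product in the chat displays its details on the dashboard.
Shopping cart: Add products to the cart and check out using the "Buy Now" button.
Order history: After a purchase, the "Orders" link in the top navigation bar shows order status and history.
Admin panel: Accessible through the "admin" link in the top navigation. Displays purchase statistics and trending products.
Sign up for an OpenAI account to get the API key used in the demo (add the OPEN_AI_API_KEY variable to the .env file). You can also refer to the OpenAI API documentation for more information.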
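In this tutorial, we use a simplified e-commerce dataset. Specifically, our JSON structure includes product details and a key named styleImages_default_imageURL, which links to the product image. This image is the focus of our AI-driven semantic search.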
const products = [
  {
    productId: '11000',
    price: 3995,
    productDisplayName: 'Puma Men Slick 3HD Yellow Black Watches',
    variantName: 'Slick 3HD Yellow',
    brandName: 'Puma',
    // Additional product details...
    styleImages_default_imageURL:
      'http://host.docker.internal:8080/images/11000.jpg',
    // Other properties...
  },
  // Additional products...
];
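The following code segment outlines the process of generating a text summary for a product image using OpenAI's capabilities. We first convert the image URL to a base64 string with the fetchImageAndConvertToBase64 function, then use OpenAI to generate a summary of the image with the getOpenAIImageSummary function.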
import axios from 'axios';
// Prisma types are assumed to come from the generated Prisma client
import { Prisma } from '@prisma/client';
import {
  ChatOpenAI,
  ChatOpenAICallOptions,
} from 'langchain/chat_models/openai';
import { HumanMessage } from 'langchain/schema';
import { Document } from 'langchain/document';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { RedisVectorStore } from 'langchain/vectorstores/redis';

let llm: ChatOpenAI<ChatOpenAICallOptions>;

// Instantiates the LangChain ChatOpenAI instance (created once, then reused)
const getOpenAIVisionInstance = (_openAIApiKey: string) => {
  // OpenAI supports images with text in input messages with their gpt-4-vision-preview.
  if (!llm) {
    llm = new ChatOpenAI({
      openAIApiKey: _openAIApiKey,
      modelName: 'gpt-4-vision-preview',
      maxTokens: 1024,
    });
  }
  return llm;
};

// Downloads an image and returns it as a base64 string ('' on failure)
const fetchImageAndConvertToBase64 = async (_imageURL: string) => {
  let base64Image = '';
  try {
    const response = await axios.get(_imageURL, {
      responseType: 'arraybuffer',
    });
    // Convert image to Base64
    base64Image = Buffer.from(response.data, 'binary').toString('base64');
  } catch (error) {
    console.error(
      `Error fetching or converting the image: ${_imageURL}`,
      error,
    );
  }
  return base64Image;
};

// Generates an OpenAI summary for a given base64 image string
const getOpenAIImageSummary = async (
  _openAIApiKey: string,
  _base64Image: string,
  _product: Prisma.ProductCreateInput,
) => {
  /*
    Reference : https://js.langchain.ac.cn/docs/integrations/chat/openai#multimodal-messages
    - This function utilizes OpenAI's multimodal capabilities to generate a summary from the image.
    - It constructs a prompt that combines the product description with the image.
    - OpenAI's vision model then processes this prompt to generate a detailed summary.
  */
  let imageSummary = '';
  try {
    if (_openAIApiKey && _base64Image && _product) {
      const llmInst = getOpenAIVisionInstance(_openAIApiKey);
      const text = `Below are the product details and image of an e-commerce product for reference. Please conduct and provide a comprehensive analysis of the product depicted in the image .
Product Details:
${_product.productDescriptors_description_value}
Image:
`;
      // Constructing a multimodal message combining text and image
      const imagePromptMessage = new HumanMessage({
        content: [
          {
            type: 'text',
            text: text,
          },
          {
            type: 'image_url',
            image_url: {
              url: `data:image/jpeg;base64,${_base64Image}`,
              detail: 'high', // low, high (if you want more detail)
            },
          },
        ],
      });
      // Invoking the LangChain ChatOpenAI model with the constructed message
      const response = await llmInst.invoke([imagePromptMessage]);
      if (response?.content) {
        imageSummary = response.content as string;
      }
    }
  } catch (err) {
    console.log(
      `Error generating OpenAIImageSummary for product id ${_product.productId}`,
      err,
    );
  }
  return imageSummary;
};
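The getOpenAIImageSummary function above hardcodes image/jpeg in the data URL, while some product image URLs in this tutorial end in .webp. One way to keep the declared type in line with the actual bytes is to sniff the format from the file's leading bytes before building the data URL. The following is a minimal sketch rather than code from the demo repository: sniffImageMimeType and toImageDataUrl are hypothetical helper names, and falling back to image/jpeg for unknown formats is an assumption.

```typescript
// Hypothetical helper (not part of the demo repository): detects the image
// format from its leading "magic" bytes. Returns undefined if unrecognized.
const sniffImageMimeType = (bytes: Uint8Array): string | undefined => {
  const startsWith = (signature: number[], offset = 0): boolean =>
    signature.every((value, i) => bytes[offset + i] === value);

  if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return 'image/png';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'image/gif';
  // WebP files start with "RIFF" and carry "WEBP" at byte offset 8
  if (
    startsWith([0x52, 0x49, 0x46, 0x46]) &&
    startsWith([0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return 'image/webp';
  }
  return undefined;
};

// Hypothetical helper: builds a data URL whose MIME type matches the bytes.
// The 'image/jpeg' fallback is an assumption for unrecognized formats.
const toImageDataUrl = (
  bytes: Uint8Array,
  fallbackMimeType = 'image/jpeg',
): string => {
  const mimeType = sniffImageMimeType(bytes) ?? fallbackMimeType;
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
};

// Usage: the first eight bytes of any PNG file
const pngSignature = new Uint8Array([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
]);
console.log(toImageDataUrl(pngSignature));
```

In the tutorial's flow, the bytes would be the response.data returned by axios, and the resulting string would replace the hardcoded url value in the image_url part of the message.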
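The following section demonstrates the result of the above process. We use an image of a Puma T-shirt and generate a summary using OpenAI's capabilities.
The detailed summary generated by the OpenAI model is as follows: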
This product is a black round neck T-shirt featuring a design consistent with the Puma brand aesthetic, which includes their iconic leaping cat logo in a contrasting yellow color placed prominently across the chest area. The T-shirt is made from 100% cotton, suggesting it is likely to be breathable and soft to the touch. It has a classic short-sleeve design with a ribbed neckline for added texture and durability. There is also mention of a vented hem, which may offer additional comfort and mobility.
The T-shirt is described to have a 'comfort' fit, which typically means it is designed to be neither too tight nor too loose, allowing for ease of movement without being baggy. This could be ideal for casual wear or active use.
Care instructions are also comprehensive, advising a gentle machine wash with similar colors in cool water at 30 degrees Celsius, indicating it is relatively easy to care for. However, one should avoid bleaching, tumble drying, and dry cleaning it, but a warm iron is permissible.
Looking at the image provided:
- The T-shirt appears to fit the model well, in accordance with the described 'comfort' fit.
- The color contrast between the T-shirt and the graphic gives the garment a modern, sporty look.
- The model is paired with denim jeans, showcasing the T-shirt's versatility for casual occasions. However, the product description suggests it can be part of an athletic ensemble when combined with Puma shorts and shoes.
- Considering the model's statistics, prospective buyers could infer how this T-shirt might fit on a person with similar measurements.
Overall, the T-shirt is positioned as a versatile item suitable for both lifestyle and sporting activities, with a strong brand identity through the graphic, and is likely comfortable and easy to maintain based on the product details provided.
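The addImageSummaryEmbeddingsToRedis function plays a key role in integrating the AI-generated image summaries with Redis. The process involves two main steps:
First, using the getImageSummaryVectorDocuments function, we convert the image summaries into vector documents. This conversion puts the text summaries into a format suitable for storage in Redis.
Second, the seedImageSummaryEmbeddings function stores these vector documents in Redis. This step is what enables efficient retrieval and search in the Redis database.
// Function to generate vector documents from image summaries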
const getImageSummaryVectorDocuments = async (
  _products: Prisma.ProductCreateInput[],
  _openAIApiKey: string,
) => {
  const vectorDocs: Document[] = [];
  if (_products?.length > 0) {
    let count = 1;
    for (const product of _products) {
      if (product) {
        const imageURL = product.styleImages_default_imageURL; //cdn url
        const imageData = await fetchImageAndConvertToBase64(imageURL);
        const imageSummary = await getOpenAIImageSummary(
          _openAIApiKey,
          imageData,
          product,
        );
        console.log(
          `openAI imageSummary #${count++} generated for product id: ${
            product.productId
          }`,
        );
        // Products without a summary (fetch or model failure) are skipped
        if (imageSummary) {
          const doc = new Document({
            metadata: {
              productId: product.productId,
              imageURL: imageURL,
            },
            pageContent: imageSummary,
          });
          vectorDocs.push(doc);
        }
      }
    }
  }
  return vectorDocs;
};

// Seeding vector documents into Redis
const seedImageSummaryEmbeddings = async (
  vectorDocs: Document[],
  _redisClient: NodeRedisClientType,
  _openAIApiKey: string,
) => {
  if (vectorDocs?.length && _redisClient && _openAIApiKey) {
    const embeddings = new OpenAIEmbeddings({
      openAIApiKey: _openAIApiKey,
    });
    // Embeds each document and writes it to Redis under the given index and key prefix
    await RedisVectorStore.fromDocuments(
      vectorDocs,
      embeddings,
      {
        redisClient: _redisClient,
        indexName: 'openAIProductImgIdx',
        keyPrefix: 'openAIProductImgText:',
      },
    );
    console.log('seeding imageSummaryEmbeddings completed');
  }
};

const addImageSummaryEmbeddingsToRedis = async (
  _products: Prisma.ProductCreateInput[],
  _redisClient: NodeRedisClientType,
  _openAIApiKey: string,
) => {
  const vectorDocs = await getImageSummaryVectorDocuments(
    _products,
    _openAIApiKey,
  );
  await seedImageSummaryEmbeddings(vectorDocs, _redisClient, _openAIApiKey);
};
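Because the summary step calls a paid external API, it can help to exercise the document-building logic without the network. The sketch below mirrors the shape of getImageSummaryVectorDocuments but takes the summarizer as a parameter, so a stub can stand in for OpenAI. It is an illustration under stated assumptions: ProductLike, SummaryDoc, Summarize, and buildSummaryDocs are hypothetical names, SummaryDoc only imitates the pageContent/metadata shape of a LangChain Document, and the second sample image URL is made up.

```typescript
// Hypothetical, trimmed-down product shape (the tutorial uses Prisma.ProductCreateInput)
interface ProductLike {
  productId: string;
  styleImages_default_imageURL: string;
}

// Plain object imitating the pageContent/metadata shape of a LangChain Document
interface SummaryDoc {
  pageContent: string;
  metadata: { productId: string; imageURL: string };
}

// The summarizer is injected so tests can replace the OpenAI call with a stub
type Summarize = (product: ProductLike) => Promise<string>;

const buildSummaryDocs = async (
  products: ProductLike[],
  summarize: Summarize,
): Promise<SummaryDoc[]> => {
  const docs: SummaryDoc[] = [];
  for (const product of products) {
    let summary = '';
    try {
      summary = await summarize(product);
    } catch (err) {
      // One failing product should not abort the whole seeding run
      console.error(`Summary failed for product id ${product.productId}`, err);
    }
    if (!summary) continue; // skip products without a usable summary
    docs.push({
      pageContent: summary,
      metadata: {
        productId: product.productId,
        imageURL: product.styleImages_default_imageURL,
      },
    });
  }
  return docs;
};

// Usage with a stub summarizer: product 11001 yields no summary and is skipped
const stubSummarize: Summarize = async (product) =>
  product.productId === '11001' ? '' : `Summary for product ${product.productId}`;

buildSummaryDocs(
  [
    {
      productId: '11000',
      styleImages_default_imageURL:
        'http://host.docker.internal:8080/images/11000.jpg',
    },
    {
      productId: '11001',
      styleImages_default_imageURL:
        'http://host.docker.internal:8080/images/11001.jpg', // made-up sample URL
    },
  ],
  stubSummarize,
).then((docs) => console.log(docs.map((doc) => doc.metadata.productId)));
```

In production, the injected function would wrap fetchImageAndConvertToBase64 and getOpenAIImageSummary; the products are processed one at a time here, as in the tutorial's loop.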
The image below shows the JSON structure of an OpenAI image summary in RedisInsight.
Download RedisInsight to visually explore your Redis data or to run raw Redis commands in the workbench.
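This section covers the API request and response structure for getProductsByVSSImageSummary, which retrieves products by running a semantic search over their image summaries.
Request format
A sample request for the API looks like this: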
POST http://localhost:3000/products/getProductsByVSSImageSummary
{
  "searchText": "Left chest nike logo",

  // optional
  "maxProductCount": 4, // 2 (default)
  "similarityScoreLimit": 0.2 // 0.2 (default)
}
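The comments in the sample request are annotations; an actual request body has to be plain JSON. As a rough sketch, a client could fill in the optional fields and serialize the payload as shown here. The field names and default values come from the sample request, while VssImageSummaryRequest and normalizeVssRequest are hypothetical names introduced for illustration.

```typescript
// Hypothetical request type; field names follow the sample request
interface VssImageSummaryRequest {
  searchText: string;
  maxProductCount?: number;
  similarityScoreLimit?: number;
}

// Defaults as documented in the sample request
const DEFAULT_MAX_PRODUCT_COUNT = 2;
const DEFAULT_SIMILARITY_SCORE_LIMIT = 0.2;

// Hypothetical helper: validates searchText and fills in the optional fields
const normalizeVssRequest = (
  request: VssImageSummaryRequest,
): Required<VssImageSummaryRequest> => {
  const searchText = request.searchText.trim();
  if (!searchText) {
    throw new Error('searchText must not be empty');
  }
  return {
    searchText,
    // `??` only replaces null/undefined, so an explicit 0 is kept
    maxProductCount: request.maxProductCount ?? DEFAULT_MAX_PRODUCT_COUNT,
    similarityScoreLimit:
      request.similarityScoreLimit ?? DEFAULT_SIMILARITY_SCORE_LIMIT,
  };
};

// Usage: build the JSON body for the POST request
const requestBody = JSON.stringify(
  normalizeVssRequest({ searchText: 'Left chest nike logo', maxProductCount: 4 }),
);
console.log(requestBody);
```

The backend code in this tutorial applies its defaults with `||`, which also replaces an explicit 0; the sketch uses `??` so that a caller who really wants a similarityScoreLimit of 0 gets it.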
Response structure
The API response is a JSON object containing an array of product details that match the semantic search criteria.
{
  "data": [
    {
      "productId": "10017",
      "price": 3995,
      "productDisplayName": "Nike Women As The Windru Blue Jackets",
      "brandName": "Nike",
      "styleImages_default_imageURL": "http://host.docker.internal:8080/products/01/10017/product-img.webp",
      "productDescriptors_description_value": " Blue and White jacket made of 100% polyester, with an interior pocket ...",
      "stockQty": 25,
      "similarityScore": 0.163541972637,
      "imageSummary": "The product in the image is a blue and white jacket featuring a design consistent with the provided description. ..."
    }
    // Additional products...
  ],
  "error": null,
  "auth": "SES_fd57d7f4-3deb-418f-9a95-6749cd06e348"
}
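The backend implementation of this API involves the following steps:
The getProductsByVSSImageSummary function handles the API request.
The getSimilarProductsScoreByVSSImageSummary function performs the semantic search over the image summaries. It embeds searchText with OpenAI embeddings and retrieves the most relevant products from the Redis vector store.
const getSimilarProductsScoreByVSSImageSummary = async (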
  _params: IParamsGetProductsByVSS,
) => {
  let {
    standAloneQuestion,
    openAIApiKey,
    //optional
    KNN,
    scoreLimit,
  } = _params;
  let vectorDocs: Document[] = [];
  const client = getNodeRedisClient();
  KNN = KNN || 2;
  scoreLimit = scoreLimit || 1;
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: openAIApiKey,
  });
  // create vector store
  const vectorStore = new RedisVectorStore(embeddings, {
    redisClient: client,
    indexName: 'openAIProductImgIdx',
    keyPrefix: 'openAIProductImgText:',
  });
  // search for similar products
  const vectorDocsWithScore = await vectorStore.similaritySearchWithScore(
    standAloneQuestion,
    KNN,
  );
  // filter by scoreLimit
  for (let [doc, score] of vectorDocsWithScore) {
    if (score <= scoreLimit) {
      doc['similarityScore'] = score;
      vectorDocs.push(doc);
    }
  }
  return vectorDocs;
};

const getProductsByVSSImageSummary = async (
  productsVSSFilter: IProductsVSSBodyFilter,
) => {
  let { searchText, maxProductCount, similarityScoreLimit } = productsVSSFilter;
  let products: IProduct[] = [];
  const openAIApiKey = process.env.OPEN_AI_API_KEY || '';
  maxProductCount = maxProductCount || 2;
  similarityScoreLimit = similarityScoreLimit || 0.2;
  //VSS search
  const vectorDocs = await getSimilarProductsScoreByVSSImageSummary({
    standAloneQuestion: searchText,
    openAIApiKey: openAIApiKey,
    KNN: maxProductCount,
    scoreLimit: similarityScoreLimit,
  });
  if (vectorDocs?.length) {
    const productIds = vectorDocs.map((doc) => doc?.metadata?.productId);
    //get product with details
    products = await getProductByIds(productIds, true);
  }
  //...
  return products;
};
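The filter `score <= scoreLimit` in the code above only makes sense if a lower score means a closer match. The sketch below illustrates that idea in memory, assuming the score is a cosine distance (0 for vectors pointing the same way, 1 for orthogonal ones); whether the Redis index in the demo uses that metric is an assumption here, not something this tutorial states. The names cosineDistance and nearestWithinLimit and the toy two-dimensional vectors are hypothetical.

```typescript
// Cosine distance: 1 - cosine similarity. Lower means more similar.
const cosineDistance = (a: number[], b: number[]): number => {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error('vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('cosine distance is undefined for zero vectors');
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

interface Candidate<T> {
  item: T;
  vector: number[];
}

// Takes the k nearest candidates first, then drops any whose distance exceeds
// scoreLimit (the same order as the KNN search plus filter loop above).
function nearestWithinLimit<T>(
  query: number[],
  candidates: Candidate<T>[],
  k: number,
  scoreLimit: number,
): { item: T; score: number }[] {
  return candidates
    .map((c) => ({ item: c.item, score: cosineDistance(query, c.vector) }))
    .sort((x, y) => x.score - y.score)
    .slice(0, k)
    .filter((scored) => scored.score <= scoreLimit);
}

// Usage with toy 2-D "embeddings"
const toyCandidates: Candidate<string>[] = [
  { item: 'jacket', vector: [1, 0] },
  { item: 'watch', vector: [0, 1] },
  { item: 'shirt', vector: [1, 1] },
];
console.log(nearestWithinLimit([1, 0], toyCandidates, 2, 0.3));
```

Because the limit is applied after the top-k cut, a request can return fewer than maxProductCount products, or none at all, when the nearest summaries are still too far from the query.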
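To try this in the demo application, select the Semantic image summary search option on the settings page.
Then enter an image-based query in the dashboard search bar. For example, for the query Left chest nike logo, the search results show products such as a Nike jacket with a logo on the left chest, reflecting the intent of the query.
Semantic search over image summaries is a powerful tool for e-commerce applications. It lets users search for products by description or by what appears in the product image, making for a more intuitive and efficient shopping experience. This tutorial demonstrated how to integrate OpenAI's semantic analysis capabilities with Redis to build a robust search engine for e-commerce applications.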