Semantic Image Based Queries Using LangChain (OpenAI) and Redis

Author

Prasan Kumar, Technical Solutions Developer at Redis

Author

Will Johnston, Developer Growth Manager at Redis

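What you will learn in this tutorial #

This tutorial demonstrates how to perform semantic search on product images using LangChain (OpenAI) and Redis. Specifically, we'll cover the following topics:

•E-commerce context : Consider a sample e-commerce application in which customers search for products with image-based queries, add items to their shopping cart, and complete purchases. This scenario highlights a real-world use of semantic search.
•Database setup : This involves generating descriptive summaries for product images, creating semantic embeddings for the generated summaries, and storing them efficiently in Redis.
•Setting up the search API : This API handles user queries about image content. It combines OpenAI's semantic analysis with Redis's data retrieval and storage.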

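Terminology #

LangChain is a library for building language model applications. It provides a structured way to combine components such as language models (for example, OpenAI's models), storage solutions (like Redis), and custom logic. This modular approach helps in building sophisticated AI applications.

OpenAI provides advanced language models such as GPT-3, which can understand and generate human-like text. These models form the backbone of many modern AI applications, including semantic text/image search and chatbots.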

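Microservices architecture for an e-commerce application #

GITHUB CODE

Below is the command to clone the source code for the application used in this tutorial: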

git clone --branch v9.2.0 https://github.com/redis-developer/redis-microservices-ecommerce-solutions

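Let's take a look at the architecture of the demo application:

1.products service: handles querying products from the database and returning them to the frontend
2.orders service: handles validating and creating orders
3.order history service: handles querying a customer's order history
4.payments service: handles processing orders for payment
5.api gateway: unifies the services under a single endpoint
6.mongodb/ postgresql: serves as the write-optimized database for storing orders, order history, products, etc.

Note

You don't need to use MongoDB/PostgreSQL as your write-optimized database in the demo application; you can use other Prisma-supported databases as well. This is just an example.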

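E-commerce application frontend using Next.js and Tailwind #

The e-commerce microservices application consists of a frontend built with Next.js and TailwindCSS. The application backend uses Node.js. The data is stored in Redis and in either MongoDB or PostgreSQL, using Prisma. Below are screenshots showcasing the frontend of the e-commerce app.

Dashboard: Displays a list of products with different search functionalities, configurable on the settings page.

Settings: Accessible by clicking the gear icon at the top right of the dashboard. Control the search bar, chatbot visibility, and other features here.

Dashboard (Semantic Text Search): When configured for semantic text search, the search bar accepts natural language queries. Example: "pure cotton blue shirts".

Dashboard (Semantic Image-Based Queries): When configured for semantic image summary search, the search bar accepts image-based queries. Example: "Left chest nike logo".

Chat Bot: Located at the bottom right corner of the page, it assists with product searches and detailed views.

Selecting a product in the chat displays its details on the dashboard.

Shopping Cart: Add products to the cart and check out using the "Buy Now" button.

Order History: After a purchase, the "Orders" link in the top navigation bar shows the order status and history.

Admin Panel: Accessible via the "admin" link in the top navigation. Displays purchase statistics and trending products.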

Database setup #

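Sample data #

For this tutorial we use a simplified e-commerce dataset. Our JSON structure includes product details and a key named styleImages_default_imageURL, which links to an image of the product. This image is the focus of our AI-driven semantic search.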

database/fashion-dataset/001/products/*.json

const products = [
  {
    productId: '11000',
    price: 3995,
    productDisplayName: 'Puma Men Slick 3HD Yellow Black Watches',
    variantName: 'Slick 3HD Yellow',
    brandName: 'Puma',
    // Additional product details...
    styleImages_default_imageURL:
      'http://host.docker.internal:8080/images/11000.jpg',
    // Other properties...
  },
  // Additional products...
];
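
To make the shape of a sample record explicit, the sketch below types only the fields visible in the snippet above and checks that a record carries a usable image URL. The SampleProduct interface and the hasImageUrl helper are hypothetical names used for illustration; they are not taken from the repository.

```typescript
// Hypothetical type covering only the fields shown in the sample above.
interface SampleProduct {
  productId: string;
  price: number;
  productDisplayName: string;
  variantName?: string;
  brandName: string;
  styleImages_default_imageURL?: string;
}

// Returns true when the product has a parseable http(s) image URL.
const hasImageUrl = (product: SampleProduct): boolean => {
  const raw = product.styleImages_default_imageURL;
  if (!raw) return false;
  try {
    const { protocol } = new URL(raw);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // the value is not a valid absolute URL
  }
};

const sampleProduct: SampleProduct = {
  productId: '11000',
  price: 3995,
  productDisplayName: 'Puma Men Slick 3HD Yellow Black Watches',
  variantName: 'Slick 3HD Yellow',
  brandName: 'Puma',
  styleImages_default_imageURL:
    'http://host.docker.internal:8080/images/11000.jpg',
};

console.log(hasImageUrl(sampleProduct));
```

A check like this could run before the summary step so that records without an image are reported early instead of failing during the download.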

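Generating an OpenAI image summary #

The following code segment outlines how to generate a text summary for a product image using OpenAI. We first convert the image URL to a base64 string with the fetchImageAndConvertToBase64 function, and then ask OpenAI to summarize the image with the getOpenAIImageSummary function.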

database/src/open-ai-image.ts

import axios from 'axios';
import { Prisma } from '@prisma/client';
import {
  ChatOpenAI,
  ChatOpenAICallOptions,
} from 'langchain/chat_models/openai';
import { HumanMessage } from 'langchain/schema';
import { Document } from 'langchain/document';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { RedisVectorStore } from 'langchain/vectorstores/redis';

let llm: ChatOpenAI<ChatOpenAICallOptions>;

// Instantiates the LangChain ChatOpenAI instance
const getOpenAIVisionInstance = (_openAIApiKey: string) => {
  //OpenAI supports  images with text in input messages with their gpt-4-vision-preview.
  if (!llm) {
    llm = new ChatOpenAI({
      openAIApiKey: _openAIApiKey,
      modelName: 'gpt-4-vision-preview',
      maxTokens: 1024,
    });
  }
  return llm;
};

const fetchImageAndConvertToBase64 = async (_imageURL: string) => {
  let base64Image = '';
  try {
    const response = await axios.get(_imageURL, {
      responseType: 'arraybuffer',
    });
    // Convert image to Base64
    base64Image = Buffer.from(response.data, 'binary').toString('base64');
  } catch (error) {
    console.error(
      `Error fetching or converting the image: ${_imageURL}`,
      error,
    );
  }
  return base64Image;
};

// Generates an OpenAI summary for a given base64 image string
const getOpenAIImageSummary = async (
  _openAIApiKey: string,
  _base64Image: string,
  _product: Prisma.ProductCreateInput,
) => {
  /*
     Reference : https://js.langchain.ac.cn/docs/integrations/chat/openai#multimodal-messages

    - This function utilizes OpenAI's multimodal capabilities to generate a summary from the image. 
    - It constructs a prompt that combines the product description with the image.
    - OpenAI's vision model then processes this prompt to generate a detailed summary.

   */
  let imageSummary = '';

  try {
    if (_openAIApiKey && _base64Image && _product) {
      const llmInst = getOpenAIVisionInstance(_openAIApiKey);

      const text = `Below are the product details and image of an e-commerce product for reference. Please conduct and provide a comprehensive analysis of the product depicted in the image . 
        
            Product Details:
            ${_product.productDescriptors_description_value}
            
            Image:
        `;
      // Constructing a multimodal message combining text and image
      const imagePromptMessage = new HumanMessage({
        content: [
          {
            type: 'text',
            text: text,
          },
          {
            type: 'image_url',
            image_url: {
              url: `data:image/jpeg;base64,${_base64Image}`,
              detail: 'high', // low, high (if you want more detail)
            },
          },
        ],
      });

      // Invoking the LangChain ChatOpenAI model with the constructed message
      const response = await llmInst.invoke([imagePromptMessage]);
      if (response?.content) {
        imageSummary = <string>response.content;
      }
    }
  } catch (err) {
    console.log(
      `Error generating OpenAIImageSummary for product id ${_product.productId}`,
      err,
    );
  }
  return imageSummary;
};
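
The image part of the multimodal message is a base64 data URL. As a minimal sketch of that encoding step in isolation, with no network call, the helper below turns raw bytes into the data:<mime>;base64,... form. The name toDataUrl and the default MIME type are assumptions made for this example.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: encode raw image bytes as a base64 data URL.
const toDataUrl = (bytes: Uint8Array, mimeType = 'image/jpeg'): string => {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
};

// The first three bytes of a typical JPEG file.
const jpegMagic = new Uint8Array([0xff, 0xd8, 0xff]);

console.log(toDataUrl(jpegMagic)); // data:image/jpeg;base64,/9j/
```

The code above hardcodes image/jpeg in the data URL. A dataset that mixes image formats might instead derive the MIME type from the Content-Type header of the download response.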

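Sample image and OpenAI summary #

The following section demonstrates the result of the above process. We use an image of a Puma T-shirt and generate a summary with OpenAI.

The detailed summary generated by the OpenAI model is as follows: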

This product is a black round neck T-shirt featuring a design consistent with the Puma brand aesthetic, which includes their iconic leaping cat logo in a contrasting yellow color placed prominently across the chest area. The T-shirt is made from 100% cotton, suggesting it is likely to be breathable and soft to the touch. It has a classic short-sleeve design with a ribbed neckline for added texture and durability. There is also mention of a vented hem, which may offer additional comfort and mobility.

The T-shirt is described to have a 'comfort' fit, which typically means it is designed to be neither too tight nor too loose, allowing for ease of movement without being baggy. This could be ideal for casual wear or active use.

Care instructions are also comprehensive, advising a gentle machine wash with similar colors in cool water at 30 degrees Celsius, indicating it is relatively easy to care for. However, one should avoid bleaching, tumble drying, and dry cleaning it, but a warm iron is permissible.

Looking at the image provided:

- The T-shirt appears to fit the model well, in accordance with the described 'comfort' fit.
- The color contrast between the T-shirt and the graphic gives the garment a modern, sporty look.
- The model is paired with denim jeans, showcasing the T-shirt's versatility for casual occasions. However, the product description suggests it can be part of an athletic ensemble when combined with Puma shorts and shoes.
- Considering the model's statistics, prospective buyers could infer how this T-shirt might fit on a person with similar measurements.

Overall, the T-shirt is positioned as a versatile item suitable for both lifestyle and sporting activities, with a strong brand identity through the graphic, and is likely comfortable and easy to maintain based on the product details provided.

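Seeding image summary embeddings #

The addImageSummaryEmbeddingsToRedis function integrates the AI-generated image summaries with Redis. This process involves two main steps:

1.Generating vector documents: The getImageSummaryVectorDocuments function wraps each image summary in a LangChain Document that carries the product ID and image URL as metadata. This puts the text summaries into a form that can be embedded and stored in Redis.
2.Seeding embeddings into Redis: The seedImageSummaryEmbeddings function then creates OpenAI embeddings for these documents and stores them in Redis. This step enables efficient retrieval and search within the Redis database.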

// Function to generate vector documents from image summaries
const getImageSummaryVectorDocuments = async (
  _products: Prisma.ProductCreateInput[],
  _openAIApiKey: string,
) => {
  const vectorDocs: Document[] = [];

  if (_products?.length > 0) {
    let count = 1;
    for (let product of _products) {
      if (product) {
        let imageURL = product.styleImages_default_imageURL; //cdn url
        const imageData = await fetchImageAndConvertToBase64(imageURL);
        const imageSummary = await getOpenAIImageSummary(
          _openAIApiKey,
          imageData,
          product,
        );
        console.log(
          `openAI imageSummary #${count++} generated for product id: ${
            product.productId
          }`,
        );

        if (imageSummary) {
          let doc = new Document({
            metadata: {
              productId: product.productId,
              imageURL: imageURL,
            },
            pageContent: imageSummary,
          });
          vectorDocs.push(doc);
        }
      }
    }
  }
  return vectorDocs;
};

// Seeding vector documents into Redis
const seedImageSummaryEmbeddings = async (
  vectorDocs: Document[],
  _redisClient: NodeRedisClientType,
  _openAIApiKey: string,
) => {
  if (vectorDocs?.length && _redisClient && _openAIApiKey) {
    const embeddings = new OpenAIEmbeddings({
      openAIApiKey: _openAIApiKey,
    });
    const vectorStore = await RedisVectorStore.fromDocuments(
      vectorDocs,
      embeddings,
      {
        redisClient: _redisClient,
        indexName: 'openAIProductImgIdx',
        keyPrefix: 'openAIProductImgText:',
      },
    );
    console.log('seeding imageSummaryEmbeddings completed');
  }
};

const addImageSummaryEmbeddingsToRedis = async (
  _products: Prisma.ProductCreateInput[],
  _redisClient: NodeRedisClientType,
  _openAIApiKey: string,
) => {
  const vectorDocs = await getImageSummaryVectorDocuments(
    _products,
    _openAIApiKey,
  );

  await seedImageSummaryEmbeddings(vectorDocs, _redisClient, _openAIApiKey);
};
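
Both helper functions in the previous section return an empty string on failure, so a product whose image cannot be fetched or summarized is skipped when the documents are built. The sketch below isolates that behavior with plain objects and also records which products were skipped. SummaryDoc, SummaryResult, and partitionSummaries are hypothetical names, and SummaryDoc is only a stand-in for a LangChain Document.

```typescript
// Hypothetical plain-object stand-in for a LangChain Document.
interface SummaryDoc {
  pageContent: string;
  metadata: { productId: string; imageURL: string };
}

interface SummaryResult {
  productId: string;
  imageURL: string;
  summary: string; // '' when the download or the OpenAI call failed
}

// Splits results into documents to embed and product IDs that were skipped.
const partitionSummaries = (results: SummaryResult[]) => {
  const docs: SummaryDoc[] = [];
  const skippedIds: string[] = [];
  for (const { productId, imageURL, summary } of results) {
    if (summary.trim()) {
      docs.push({ pageContent: summary, metadata: { productId, imageURL } });
    } else {
      skippedIds.push(productId);
    }
  }
  return { docs, skippedIds };
};

const { docs, skippedIds } = partitionSummaries([
  { productId: '1', imageURL: 'http://example.test/1.jpg', summary: 'A black T-shirt.' },
  { productId: '2', imageURL: 'http://example.test/2.jpg', summary: '' },
]);

console.log(docs.length, skippedIds);
```

Keeping a list of skipped IDs is one possible way to retry failed products later without regenerating the summaries that succeeded.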

The image below shows the JSON structure of the OpenAI image summary in RedisInsight.

TIP

Download RedisInsight to visually explore your Redis data or to run raw Redis commands in the workbench.

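Setting up the search API #

API endpoint #

This section covers the API request and response structure for getProductsByVSSImageSummary, which retrieves products through semantic search over image summaries.

Request format

The example request format for the API is as follows: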

POST http://localhost:3000/products/getProductsByVSSImageSummary
{
   "searchText":"Left chest nike logo",

   //optional
   "maxProductCount": 4, // 2 (default)
   "similarityScoreLimit":0.2, // 0.2 (default)
}
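
As a rough sketch of how a client might call this endpoint, the code below builds the request from the fields shown above and sends it with fetch. The names ImageSummarySearchBody, buildSearchRequest, and searchByImageSummary are hypothetical, and the Content-Type header is an assumption about what the gateway expects.

```typescript
// Request body fields as shown in the example request.
interface ImageSummarySearchBody {
  searchText: string;
  maxProductCount?: number;
  similarityScoreLimit?: number;
}

const ENDPOINT =
  'http://localhost:3000/products/getProductsByVSSImageSummary';

// Builds the URL and fetch options without sending anything.
const buildSearchRequest = (body: ImageSummarySearchBody) => ({
  url: ENDPOINT,
  init: {
    method: 'POST',
    // Assumption: the API accepts a JSON body with this header.
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  },
});

// Sends the request; this only works while the demo services run locally.
const searchByImageSummary = async (body: ImageSummarySearchBody) => {
  const { url, init } = buildSearchRequest(body);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  return response.json();
};

const exampleRequest = buildSearchRequest({
  searchText: 'Left chest nike logo',
  maxProductCount: 4,
});
console.log(exampleRequest.init.body);
```

Separating request construction from the network call keeps the body easy to inspect and test without a running server.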

Response structure

The response from the API is a JSON object containing an array of product details that match the semantic search criteria.

{
  "data": [
    {
      "productId": "10017",
      "price": 3995,
      "productDisplayName": "Nike Women As The Windru Blue Jackets",
      "brandName": "Nike",
      "styleImages_default_imageURL": "http://host.docker.internal:8080/products/01/10017/product-img.webp",
      "productDescriptors_description_value": " Blue and White jacket made of 100% polyester, with an interior pocket ...",
      "stockQty": 25,
      "similarityScore": 0.163541972637,
      "imageSummary": "The product in the image is a blue and white jacket featuring a design consistent with the provided description. ..."
    }
    // Additional products...
  ],
  "error": null,
  "auth": "SES_fd57d7f4-3deb-418f-9a95-6749cd06e348"
}
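
A client could type this response and order the hits before rendering them. The sketch below covers only some of the fields shown in the sample and assumes that a lower similarityScore means a closer match. ProductHit, SearchResponse, and rankHits are hypothetical names, and the type of the error field is not specified by the sample, so it is left as unknown.

```typescript
// Hypothetical types covering a subset of the fields in the sample response.
interface ProductHit {
  productId: string;
  productDisplayName: string;
  similarityScore: number;
  imageSummary?: string;
}

interface SearchResponse {
  data: ProductHit[] | null;
  error: unknown; // null in the sample; the error shape is an open assumption
}

// Returns hits ordered from closest to farthest match.
// Assumption: a lower similarityScore means a closer match.
const rankHits = (response: SearchResponse): ProductHit[] => {
  if (response.error) {
    throw new Error(String(response.error));
  }
  return [...(response.data ?? [])].sort(
    (a, b) => a.similarityScore - b.similarityScore,
  );
};

const ranked = rankHits({
  data: [
    { productId: 'b', productDisplayName: 'Second', similarityScore: 0.19 },
    { productId: 'a', productDisplayName: 'First', similarityScore: 0.16 },
  ],
  error: null,
});

console.log(ranked.map((hit) => hit.productId));
```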

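API implementation #

The backend implementation of this API involves the following steps:

1.The getProductsByVSSImageSummary function handles the API request.
2.The getSimilarProductsScoreByVSSImageSummary function performs semantic search on image summaries. It embeds searchText with OpenAI and identifies relevant products in the Redis vector store.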

server/src/services/products/src/open-ai-prompt.ts

const getSimilarProductsScoreByVSSImageSummary = async (
  _params: IParamsGetProductsByVSS,
) => {
  let {
    standAloneQuestion,
    openAIApiKey,

    //optional
    KNN,
    scoreLimit,
  } = _params;

  let vectorDocs: Document[] = [];
  const client = getNodeRedisClient();

  KNN = KNN || 2;
  scoreLimit = scoreLimit || 1;

  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: openAIApiKey,
  });

  // create vector store
  const vectorStore = new RedisVectorStore(embeddings, {
    redisClient: client,
    indexName: 'openAIProductImgIdx',
    keyPrefix: 'openAIProductImgText:',
  });

  // search for similar products
  const vectorDocsWithScore = await vectorStore.similaritySearchWithScore(
    standAloneQuestion,
    KNN,
  );

  // filter by scoreLimit
  for (let [doc, score] of vectorDocsWithScore) {
    if (score <= scoreLimit) {
      doc['similarityScore'] = score;
      vectorDocs.push(doc);
    }
  }

  return vectorDocs;
};
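
The filter above keeps a document only when its score is at or below scoreLimit, which suggests the score is a distance where smaller means more similar. As an illustration, assuming the index uses a cosine distance metric, the sketch below computes that distance for small vectors and applies the same threshold. cosineDistance and withinLimit are hypothetical helpers; in the application, Redis computes the score.

```typescript
// Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones.
const cosineDistance = (a: number[], b: number[]): number => {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine distance is undefined for a zero vector');
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

// Keeps only items within the distance limit, mirroring `score <= scoreLimit`.
const withinLimit = <T>(scored: [T, number][], scoreLimit: number): T[] =>
  scored.filter(([, score]) => score <= scoreLimit).map(([item]) => item);

// Toy two-dimensional "embeddings"; real embeddings have many more dimensions.
const queryVector = [1, 0];
const candidates: [string, number][] = [
  ['jacket', cosineDistance(queryVector, [1, 0])],
  ['watch', cosineDistance(queryVector, [0, 1])],
];

console.log(withinLimit(candidates, 0.2)); // [ 'jacket' ]
```

Under this reading, lowering similarityScoreLimit makes the search stricter, and raising it admits looser matches.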

server/src/services/products/src/service-impl.ts

const getProductsByVSSImageSummary = async (
  productsVSSFilter: IProductsVSSBodyFilter,
) => {
  let { searchText, maxProductCount, similarityScoreLimit } = productsVSSFilter;
  let products: IProduct[] = [];

  const openAIApiKey = process.env.OPEN_AI_API_KEY || '';
  maxProductCount = maxProductCount || 2;
  similarityScoreLimit = similarityScoreLimit || 0.2;

  //VSS search
  const vectorDocs = await getSimilarProductsScoreByVSSImageSummary({
    standAloneQuestion: searchText,
    openAIApiKey: openAIApiKey,
    KNN: maxProductCount,
    scoreLimit: similarityScoreLimit,
  });

  if (vectorDocs?.length) {
    const productIds = vectorDocs.map((doc) => doc?.metadata?.productId);

    //get product with details
    products = await getProductByIds(productIds, true);
  }

  //...

  return products;
};
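
The defaults above use ||, so an explicit similarityScoreLimit of 0 is replaced by 0.2. If that is not the intended behavior, one alternative is the nullish coalescing operator, which applies a default only when the value is null or undefined. The sketch below is a variation on the code above, not the repository's implementation; SearchFilter and resolveSearchOptions are hypothetical names.

```typescript
interface SearchFilter {
  searchText: string;
  maxProductCount?: number;
  similarityScoreLimit?: number;
}

// Default values taken from the request format described earlier.
const DEFAULT_MAX_PRODUCT_COUNT = 2;
const DEFAULT_SIMILARITY_SCORE_LIMIT = 0.2;

// Applies defaults only for missing values and validates the rest.
const resolveSearchOptions = (filter: SearchFilter) => {
  const maxProductCount = filter.maxProductCount ?? DEFAULT_MAX_PRODUCT_COUNT;
  const similarityScoreLimit =
    filter.similarityScoreLimit ?? DEFAULT_SIMILARITY_SCORE_LIMIT;

  if (!Number.isInteger(maxProductCount) || maxProductCount < 1) {
    throw new RangeError('maxProductCount must be a positive integer');
  }
  // Written as a negated comparison so that NaN is rejected as well.
  if (!(similarityScoreLimit >= 0)) {
    throw new RangeError('similarityScoreLimit must be zero or greater');
  }
  return {
    searchText: filter.searchText,
    maxProductCount,
    similarityScoreLimit,
  };
};

console.log(resolveSearchOptions({ searchText: 'Left chest nike logo' }));
```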

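Frontend UI #

•Settings configuration: First, make sure the Semantic image summary search option is enabled on the settings page.

•Performing a search: On the dashboard page, users can search with image-based queries. For example, for the query Left chest nike logo, the search results show products such as a Nike jacket with a logo on the left chest, reflecting the intent of the query.

•Viewing image summaries: Users can click any product image to view the corresponding image summary generated by OpenAI. This shows how the AI interprets and summarizes the visual content.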

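Ready to use Redis for semantic image based queries? #

Performing semantic search on image summaries is a powerful tool for e-commerce applications. It lets users search for products by their descriptions or by what appears in their images, which makes for a more intuitive and efficient shopping experience. This tutorial has demonstrated how to integrate OpenAI's semantic analysis capabilities with Redis to build a search engine for e-commerce applications.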
