Intermittent timeouts between MongoDB Atlas and AWS Lambda

Question

V.D*_*zel 6 mongodb amazon-web-services node.js aws-lambda

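I'm getting a bit desperate with this problem: we run AWS Lambda for our API, which talks to a MongoDB cluster on MongoDB Atlas (M20). To avoid creating a new connection on every Lambda invocation, we follow the pattern from https://docs.atlas.mongodb.com/best-practices-connecting-to-aws-lambda/ and cache the connection for the lifetime of the Lambda container. Our version differs slightly: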

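const { MongoClient } = require('mongodb')

// Cached at module scope so a warm Lambda container can reuse the connection
let cachedClient = null
let cachedDb = null

async function getProdDB() {
  const url = `mongodb+srv://${process.env.DB_USER}:${process.env.DB_PASSWORD}@xxxxx-yyyy.zzzzz.net?retryWrites=true`

  // Connect only if there is no cached Db or it reports itself as disconnected
  if (!cachedDb || !cachedDb.serverConfig.isConnected()) {
    cachedClient = await MongoClient.connect(
      url,
      { useNewUrlParser: true, useUnifiedTopology: true }
    )
    cachedDb = cachedClient.db(process.env.DB_NAME)
  }

  return cachedDb
}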

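This also checks whether we are still connected. It works about 98% of the time, but every now and then our Lambda invocations time out. Here is what we did to diagnose it: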

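- We raised the Lambda timeout from 6 seconds to 30 and then 60 seconds, and the Lambda functions still time out now and then. Mongo never throws an error; the invocation is always ended by Lambda's TimeoutError.
- In both successful and unsuccessful invocations, cachedDb.serverConfig.isConnected() returns true.
- The part of the business logic that times out is the MongoDB query itself: common operations such as findOne or updateOne on very small collections (under 100 documents).
- We upgraded our Node.js MongoDB driver from 3.3.1 to 3.3.5, following https://github.com/Automattic/mongoose/issues/8180 (we don't use Mongoose, only the official MongoDB Node.js driver), and the problem persisted.
- We queried our MongoDB cluster directly from a Node.js script using the same driver version, and across thousands of queries not a single timeout occurred. So we concluded that the problem is not the cluster itself but the connection.
- Frequently invoked functions do not time out more often than rarely invoked ones; if anything, they time out less often. It looks as if our cached connection to MongoDB somehow goes stale, even though isConnected() returns true when called, and can no longer be reused after the Lambda container has stayed open for a while without invocations. We use the default timeouts: https://scalegrid.io/blog/understanding-mongodb-client-timeout-options/
- We checked the MongoDB log entries on Atlas and found nothing suspicious.
- Not caching the database connection at all makes the problem go away, but it makes most API calls 2-3 times slower, and we would still like to understand the root cause.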
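Because Mongo never throws and the invocation is always ended by Lambda, one thing worth ruling out is that the driver's own timeouts are simply longer than the Lambda timeout. The sketch below is only an illustration of that idea: serverSelectionTimeoutMS, connectTimeoutMS and socketTimeoutMS are connection options of the Node.js driver, but the values and the LAMBDA_TIMEOUT_MS constant are assumptions made up for this example, not recommendations.

```javascript
// Assumed values for illustration only: each driver timeout is kept below
// the Lambda timeout so that the driver can reject before Lambda gives up.
const LAMBDA_TIMEOUT_MS = 6000; // the original Lambda limit from the question

const clientOptions = {
  useNewUrlParser: true,
  useUnifiedTopology: true,
  // How long an operation may wait for a usable server before it rejects.
  serverSelectionTimeoutMS: 3000,
  // How long opening a new TCP connection may take.
  connectTimeoutMS: 3000,
  // How long a socket may stay silent before the driver treats it as dead.
  socketTimeoutMS: 4000,
};

// Usage (not executed here): MongoClient.connect(url, clientOptions)
```

With settings along these lines, a hung query should surface as a driver error in the logs instead of a bare Lambda timeout, which would at least show which phase is stalling.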

Has anyone run into a similar problem, or can anyone suggest how we could continue debugging this effectively?
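One possible way to narrow this down is to race each query against a deadline that is shorter than the Lambda timeout, so that a hung findOne or updateOne is logged by name instead of disappearing into a generic Lambda timeout. This is a minimal sketch; withTimeout and its label parameter are hypothetical helpers, not part of the MongoDB driver.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
// `label` is a hypothetical tag that only makes the error easy to find in logs.
function withTimeout(promise, ms, label) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins. The timer is always cleared so it cannot
  // keep the event loop alive after the query has finished.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage (not executed here, collection name is made up):
//   const db = await getProdDB()
//   const doc = await withTimeout(
//     db.collection('users').findOne({ _id: id }), 2000, 'users.findOne')
```

Note that this does not cancel the underlying query; it only lets the handler fail fast and record which operation was stuck.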

Answer 1

Bor*_*uhh 0

Our company currently uses the same architecture (Lambda -> MongoDB Atlas Cluster) with more than 500 Lambdas, and we have not seen any connection timeout issues. We use version 3.6 of the Node.js driver.

This is the code we are using:

public static async connect(url?: string): Promise<Db> {
  /** If MongoDB is already connected, return that */
  if (this.cachedDb && this.client && this.client.isConnected())
    return Promise.resolve(this.cachedDb);

  /** Check for MongoURL in env if not provided */
  const mongoUrl: string = url || process.env.MONGO_CONNECTION_URL || '';

  this.client = await MongoClient.connect(mongoUrl, {
    useUnifiedTopology: true,
    useNewUrlParser: true,
    ignoreUndefined: true,
  });

  this.cachedDb = this.client.db();

  return Promise.resolve(this.cachedDb);
}

We then use the cached database and MongoClient later on:

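import { Db, FilterQuery, FindOneOptions, MongoClient } from 'mongodb';

/** Assumption: not shown in the original answer; an empty options object stands in for it */
const defaultDocumentOptions = {};

/**
 * MongoDB Client Class
 */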
class MongoDbClient {
  private static cachedDb: Db | null = null;
  private static client: MongoClient | null = null;

  /**
   * Connects to a MongoDB database instance
   *
   * @param url The MongoDB connection string
   *
   * @returns An instance of a MongoDB database
   */
  public static async connect(url?: string): Promise<Db> {
    /** If MongoDB is already connected, return that */
    if (this.cachedDb && this.client && this.client.isConnected())
      return Promise.resolve(this.cachedDb);

    /** Check for MongoURL in env if not provided */
    const mongoUrl: string = url || process.env.MONGO_CONNECTION_URL || '';

    this.client = await MongoClient.connect(mongoUrl, {
      useUnifiedTopology: true,
      useNewUrlParser: true,
      ignoreUndefined: true,
    });

    this.cachedDb = this.client.db();

    return Promise.resolve(this.cachedDb);
  }

  /**
   * Retrieves the first document that matches a filter condition
   *
   * @param collectionName The name of the collection to search
   * @param filter The filter to search for
   * @param options MongoDB find one option to include
   *
   * @returns The first document that matches the filter condition
   */
  public static async findOne<T>(
    collectionName: string,
    filter: FilterQuery<T> = {},
    options: FindOneOptions<T extends T ? T : T> = defaultDocumentOptions
  ): Promise<T | null> {
    const db = await this.connect();
    return db.collection<T>(collectionName).findOne<T>(filter, options);
  }
}

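The difference I noticed is that we check whether the MongoClient is connected rather than the Db. That method has since been deprecated, and in version 4+ the driver handles this internally with the following code: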

  // If a connection already been established, we can terminate early
  if (mongoClient.topology && mongoClient.topology.isConnected()) {
    return callback(undefined, mongoClient);
  }

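If the driver already tracks its connection state internally, one way to stop calling isConnected() yourself is to cache the connection promise at module scope and reuse it across invocations. The following is a sketch under that assumption, not the driver's or the answer author's implementation; createCachedConnector and connectFn are hypothetical names, and connectFn stands for something like () => MongoClient.connect(url, options).

```javascript
// Builds a getter that connects at most once per container and shares the
// in-flight promise between concurrent callers. `connectFn` is a hypothetical
// injected function, e.g. () => MongoClient.connect(url, options).
function createCachedConnector(connectFn) {
  let clientPromise = null;

  return function getClient() {
    if (!clientPromise) {
      clientPromise = connectFn().catch((err) => {
        // Forget the failed attempt so the next invocation reconnects.
        clientPromise = null;
        throw err;
      });
    }
    return clientPromise;
  };
}

// Usage (not executed here):
//   const getClient = createCachedConnector(() => MongoClient.connect(url, options))
//   const db = (await getClient()).db()
```

Caching the promise rather than the resolved client also avoids opening two connections when two calls arrive before the first connect has finished. It does not by itself detect a stale socket; that is still left to the driver's timeouts.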

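Another thing that may be happening is that a Lambda occasionally breaks or gets corrupted. This is normal behavior for Lambda and cannot be avoided, which is why AWS recommends building a retry strategy into your program.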

Note that even the AWS SDK has a retry strategy, according to its documentation:

Clients such as the AWS CLI and the AWS SDK retry on client timeouts, throttling errors (429), and other errors that aren't caused by a bad request.
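A retry strategy of the kind mentioned above could look roughly like this. It is a minimal sketch with exponential backoff; withRetry, sleep and the retries and baseDelayMs options are names invented for this example, and it should only wrap operations that are safe to repeat (a findOne, or an update that sets fixed values).

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `operation` until it resolves or `retries` extra attempts are used up.
// Waits baseDelayMs, then 2x, then 4x, ... between attempts.
async function withRetry(operation, { retries = 2, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage (not executed here, collection name is made up):
//   const doc = await withRetry(() => db.collection('users').findOne({ _id: id }))
```

Retrying hides the symptom rather than explaining it, so it is probably worth logging each failed attempt while the root cause is still unknown.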
