DynamoDB scan: filter for all records where an attribute does not exist

Question


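I can't seem to get this right. I want to scan a table and return only the records where a particular field does not exist.

I have tried the following two things: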

HashMap<String, Condition> scanFilter = new HashMap<>();
Condition scanFilterCondition = new Condition().withComparisonOperator(ComparisonOperator.NULL.toString());
scanFilter.put("field", scanFilterCondition);

ScanRequest scan = new ScanRequest()
    .withTableName("table name")
    .withScanFilter(scanFilter)
etc


and

ScanRequest scan = new ScanRequest()
          .withTableName("table")
          .withFilterExpression("attribute_not_exists(attributeName)")
          .withLimit(100)
          etc


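However, neither of them returns any records (and most records are missing this field). Note that if I remove the filter, the scan returns and processes all records as expected, so the basic query is correct. How do I do this?

Edited to add the full method, in case it helps: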

// Get information on the table so that we can set the read capacity for the operation.
List<String> tables = client.listTables().getTableNames();
String tableName = tables.stream().filter(table -> table.equals(configuration.getTableName())).findFirst().orElse(null);
if (Strings.isNullOrEmpty(tableName)) {
  return 0;
}
TableDescription table = client.describeTable(tableName).getTable();

//Set the rate limit to a third of the provisioned read capacity.
int rateLimit = (int) (table.getProvisionedThroughput().getReadCapacityUnits() / 3);
RateLimiter rateLimiter = RateLimiter.create(rateLimit);
// Track how much throughput we consume on each page
int permitsToConsume = 1;
// Initialize the pagination token
Map<String, AttributeValue> exclusiveStartKey = null;
int count = 1;
int writtenCount = 0;

do {
  // Let the rate limiter wait until our desired throughput "recharges"
  rateLimiter.acquire(permitsToConsume);

  //We only want to process records that don't have the field key set.
  HashMap<String, Condition> scanFilter = new HashMap<>();
  Condition scanFilterCondition = new Condition().withComparisonOperator(ComparisonOperator.NULL.toString());
  scanFilter.put("field", scanFilterCondition);

  ScanRequest scan = new ScanRequest()
      .withTableName(configuration.getNotificationsTableName())
      .withScanFilter(scanFilter)
      .withLimit(100)
      .withReturnConsumedCapacity(ReturnConsumedCapacity.TOTAL)
      .withExclusiveStartKey(exclusiveStartKey);

  ScanResult result = client.scan(scan);
  exclusiveStartKey = result.getLastEvaluatedKey();

  // Account for the rest of the throughput we consumed,
  // now that we know how much that scan request cost
  double consumedCapacity = result.getConsumedCapacity().getCapacityUnits();
  permitsToConsume = (int)(consumedCapacity - 1.0);
  if (permitsToConsume <= 0) {
    permitsToConsume = 1;
  }

  // Process results here
} while (exclusiveStartKey != null);


Answer 1

not*_*est 3

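The condition looks fine. You need to call Scan repeatedly, in a loop. A DynamoDB scan does not read the entire table in one request. How much it reads per request depends on the provisioned throughput it consumes (and on any Limit you set, which is applied before the filter), so a single page can contain few or no matching items.

Sample code that repeats the scan based on LastEvaluatedKey: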

ScanResult result = null;

do {
    HashMap<String, Condition> scanFilter = new HashMap<>();
    Condition scanFilterCondition = new Condition().withComparisonOperator(ComparisonOperator.NULL);
    scanFilter.put("title", scanFilterCondition);

    ScanRequest scanRequest = new ScanRequest().withTableName(tableName).withScanFilter(scanFilter);
    if (result != null) {
        scanRequest.setExclusiveStartKey(result.getLastEvaluatedKey());
    }

    result = dynamoDBClient.scan(scanRequest);

    LOGGER.info("Number of records ==============>" + result.getItems().size());
    for (Map<String, AttributeValue> item : result.getItems()) {
        LOGGER.info("Movies ==================>" + item.get("title"));
    }
} while (result.getLastEvaluatedKey() != null);

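NULL: The attribute does not exist. NULL is supported for all data types, including lists and maps. Note that this operator tests for the nonexistence of an attribute, not its data type. If the data type of attribute "a" is null, and you evaluate it using NULL, the result is a Boolean false. This is because the attribute "a" exists; its data type is not relevant to the NULL comparison operator.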
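To make the distinction concrete, here is a minimal in-memory sketch of how the NULL operator treats a missing attribute versus an attribute that is present with a Null data type. It does not use the AWS SDK: `NullOperatorSketch`, `matchesNull`, and `NULL_TYPED` are hypothetical names chosen for illustration, and a plain `Map` stands in for a DynamoDB item.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: models the documented NULL semantics with plain maps.
final class NullOperatorSketch {

    /** Hypothetical marker standing in for an attribute whose data type is Null. */
    static final Object NULL_TYPED = new Object();

    /** Models the NULL operator: true only when the attribute is absent from the item. */
    static boolean matchesNull(Map<String, Object> item, String attributeName) {
        return !item.containsKey(attributeName);
    }

    public static void main(String[] args) {
        // Item that has no attribute "a" at all.
        Map<String, Object> missing = new HashMap<>();
        missing.put("id", "1");

        // Item where "a" exists, but its value is of the Null data type.
        Map<String, Object> nullTyped = new HashMap<>();
        nullTyped.put("id", "2");
        nullTyped.put("a", NULL_TYPED);

        System.out.println(matchesNull(missing, "a"));   // prints "true"
        System.out.println(matchesNull(nullTyped, "a")); // prints "false"
    }
}
```

Under this model, only the first item would be returned by a scan filtered with NULL on "a".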

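LastEvaluatedKey: The primary key of the item where the operation stopped, inclusive of the previous result set. Use this value to start a new operation, excluding this value in the new request.

If LastEvaluatedKey is empty, then the "last page" of results has been processed and there is no more data to be retrieved.

If LastEvaluatedKey is not empty, it does not necessarily mean that there is more data in the result set. The only way to know when you have reached the end of the result set is when LastEvaluatedKey is empty.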
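The paging rules above can be sketched with a small in-memory model, which may help explain why a filtered scan with a limit can return an empty page while more matching items remain. This is not the AWS SDK: `PagedScanSketch`, `Page`, `item`, `scanPage`, and `scanAll` are hypothetical names, and an integer index stands in for the real primary-key map. The model assumes the page limit is applied before the filter, as the DynamoDB documentation describes for Limit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of a paged, filtered scan. Hypothetical names; not the AWS SDK.
final class PagedScanSketch {

    /** One page: the items that passed the filter, plus where to resume (null when finished). */
    static final class Page {
        final List<Map<String, String>> items;
        final Integer lastEvaluatedKey;

        Page(List<Map<String, String>> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Builds an item from alternating attribute names and values. */
    static Map<String, String> item(String... namesAndValues) {
        Map<String, String> result = new HashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            result.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return result;
    }

    /** Reads up to `limit` items first, then keeps only those lacking `attribute`. */
    static Page scanPage(List<Map<String, String>> table, Integer startKey, int limit, String attribute) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        int from = (startKey == null) ? 0 : startKey;
        int to = Math.min(from + limit, table.size());
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> candidate : table.subList(from, to)) {
            if (!candidate.containsKey(attribute)) {
                matches.add(candidate);
            }
        }
        // A key is returned whenever the limit was reached, even if no items remain after it.
        Integer next = (to - from == limit) ? Integer.valueOf(to) : null;
        return new Page(matches, next);
    }

    /** Keeps requesting pages until the key comes back null, collecting every match. */
    static List<Map<String, String>> scanAll(List<Map<String, String>> table, int limit, String attribute) {
        List<Map<String, String>> all = new ArrayList<>();
        Integer key = null;
        do {
            Page page = scanPage(table, key, limit, attribute);
            all.addAll(page.items);
            key = page.lastEvaluatedKey;
        } while (key != null);
        return all;
    }

    public static void main(String[] args) {
        List<Map<String, String>> table = new ArrayList<>();
        table.add(item("id", "1", "field", "x"));
        table.add(item("id", "2", "field", "y"));
        table.add(item("id", "3"));
        table.add(item("id", "4"));

        Page first = scanPage(table, null, 2, "field");
        System.out.println(first.items.size());      // prints "0": both items read were filtered out
        System.out.println(first.lastEvaluatedKey);  // prints "2": the scan is not finished
        System.out.println(scanAll(table, 2, "field").size()); // prints "2"
    }
}
```

In this model the first page is empty even though two matching items exist, and the third request returns nothing with a null key, which mirrors the point that a non-empty LastEvaluatedKey does not guarantee more data.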

Asked: 9 years, 4 months ago
Viewed: 2,503 times
Last active: 8 years, 8 months ago