use*_*024 4 data-modeling nosql aerospike
Say I have the following model in Java:
class Shape {
String type;
String color;
String size;
}
And say I have the following data based on that model:
Triangle, Blue, Small
Triangle, Red, Large
Circle, Blue, Small
Circle, Blue, Medium
Square, Green, Medium
Star, Blue, Large
I want to answer the following questions:
Given the type Circle how many unique colors?
Answer: 1
Given the type Circle how many unique sizes?
Answer: 2
Given the color Blue how many unique shapes?
Answer: 3
Given the color Blue how many unique sizes?
Answer: 3
Given the size Small how many unique shapes?
Answer: 2
Given the size Small how many unique colors?
Answer: 1
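As a sanity check on the expected answers, the questions above can be computed directly from the sample rows. This is only a minimal in-memory sketch in plain Java, not an Aerospike query; the class name `DistinctCounts` and the helper `countDistinct` are hypothetical names introduced for illustration. Note that for the color Blue the sample rows contain three distinct types (Triangle, Circle, Star).

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper class; not part of the original question.
class DistinctCounts {
    // Mirrors the Shape model from the question.
    static class Shape {
        final String type;
        final String color;
        final String size;

        Shape(String type, String color, String size) {
            this.type = type;
            this.color = color;
            this.size = size;
        }
    }

    // The six sample rows from the question.
    static final List<Shape> SAMPLE = List.of(
            new Shape("Triangle", "Blue", "Small"),
            new Shape("Triangle", "Red", "Large"),
            new Shape("Circle", "Blue", "Small"),
            new Shape("Circle", "Blue", "Medium"),
            new Shape("Square", "Green", "Medium"),
            new Shape("Star", "Blue", "Large"));

    // Counts the distinct values of `counted` among shapes whose
    // `filtered` field equals `value`.
    static long countDistinct(List<Shape> shapes,
                              Function<Shape, String> filtered, String value,
                              Function<Shape, String> counted) {
        return shapes.stream()
                .filter(s -> filtered.apply(s).equals(value))
                .map(counted)
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        System.out.println("Circle colors: " + countDistinct(SAMPLE, s -> s.type, "Circle", s -> s.color));
        System.out.println("Circle sizes:  " + countDistinct(SAMPLE, s -> s.type, "Circle", s -> s.size));
        System.out.println("Blue types:    " + countDistinct(SAMPLE, s -> s.color, "Blue", s -> s.type));
        System.out.println("Blue sizes:    " + countDistinct(SAMPLE, s -> s.color, "Blue", s -> s.size));
        System.out.println("Small types:   " + countDistinct(SAMPLE, s -> s.size, "Small", s -> s.type));
        System.out.println("Small colors:  " + countDistinct(SAMPLE, s -> s.size, "Small", s -> s.color));
    }
}
```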
I am wondering whether I should model it as follows...
set: shapes -> key: type -> bin(s): list of colors, list of sizes
set: colors -> key: color -> bin(s): list of shapes, list of sizes
set: sizes -> key: size -> bin(s): list of shapes, list of colors
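Or is there a better way? If I do it this way, I need 3x the storage.
I also expect each set to hold billions of entries. By the way, the model has been edited to protect the innocent code ;)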
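To make the trade-off of the three-set layout concrete, here is a rough in-memory sketch, assuming each "set" behaves like a map from key to named bins of unique values. It uses plain Java collections rather than the Aerospike client, and the class name `DenormalizedShapes` and the bin names `types`, `colors`, and `sizes` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the three proposed sets.
class DenormalizedShapes {
    // One map per proposed set: key -> bin name -> unique values.
    final Map<String, Map<String, Set<String>>> byType = new HashMap<>();
    final Map<String, Map<String, Set<String>>> byColor = new HashMap<>();
    final Map<String, Map<String, Set<String>>> bySize = new HashMap<>();

    // Every shape is written three times, once under each lookup key.
    void add(String type, String color, String size) {
        put(byType, type, "colors", color);
        put(byType, type, "sizes", size);
        put(byColor, color, "types", type);
        put(byColor, color, "sizes", size);
        put(bySize, size, "types", type);
        put(bySize, size, "colors", color);
    }

    private static void put(Map<String, Map<String, Set<String>>> set,
                            String key, String bin, String value) {
        set.computeIfAbsent(key, k -> new HashMap<>())
           .computeIfAbsent(bin, b -> new TreeSet<>())
           .add(value);
    }

    // A unique count is one key lookup followed by a size check.
    static int count(Map<String, Map<String, Set<String>>> set,
                     String key, String bin) {
        return set.getOrDefault(key, Map.of())
                  .getOrDefault(bin, Set.of())
                  .size();
    }

    public static void main(String[] args) {
        DenormalizedShapes model = new DenormalizedShapes();
        model.add("Circle", "Blue", "Small");
        model.add("Circle", "Blue", "Medium");
        System.out.println(count(model.byType, "Circle", "sizes"));
    }
}
```

In this sketch reads are cheap because each answer is a single lookup, while every insert touches all three maps, which is the write and storage amplification the question is worried about.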
Data modeling in NoSQL is always about how you plan to retrieve the data, and at what throughput and latency.
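There are several ways to model this data; the simplest is to mimic the class structure, with each field becoming a Bin. You can define Secondary Indexes on each bin and use Aggregation Queries to answer the questions above.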
But that is only one way; you may need a different data model to meet your latency and throughput requirements.
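The one-bin-per-field approach can be illustrated conceptually without a database. The following is only a sketch in plain Java and does not use the Aerospike client API: records are maps of bin name to value, a nested map stands in for the secondary indexes, and the distinct count plays the role of the aggregation step. The class name `IndexedShapes` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for one set with a bin per field.
class IndexedShapes {
    // Each record is stored once, as bin name -> value.
    private final List<Map<String, String>> records = new ArrayList<>();
    // Stand-in for secondary indexes: bin name -> bin value -> record positions.
    private final Map<String, Map<String, List<Integer>>> indexes = new HashMap<>();

    void put(String type, String color, String size) {
        Map<String, String> bins = Map.of("type", type, "color", color, "size", size);
        int position = records.size();
        records.add(bins);
        // Register the new record under every bin value it carries.
        bins.forEach((bin, value) -> indexes
                .computeIfAbsent(bin, b -> new HashMap<>())
                .computeIfAbsent(value, v -> new ArrayList<>())
                .add(position));
    }

    // Look up matching records through the index on filterBin, then
    // aggregate the distinct values of countBin across them.
    int countDistinct(String filterBin, String value, String countBin) {
        Set<String> seen = new HashSet<>();
        for (int position : indexes.getOrDefault(filterBin, Map.of())
                                   .getOrDefault(value, List.of())) {
            seen.add(records.get(position).get(countBin));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        IndexedShapes shapes = new IndexedShapes();
        shapes.put("Circle", "Blue", "Small");
        shapes.put("Circle", "Blue", "Medium");
        System.out.println(shapes.countDistinct("type", "Circle", "size"));
    }
}
```

Here each shape is stored once, but the aggregation has to visit every matching record at query time, so the cost of a question grows with the number of matches. That is the latency and throughput consideration that may push toward a different model.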