This article uses a user feed as its case study.

MongoDB cluster topology

Small data volume

Build the cluster as a three-node MongoDB replica set.
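
As a small illustration of how a client would address such a replica set, the sketch below builds a standard MongoDB connection string that lists all three members. The host names, the replica set name `rs0`, and the database name `feed_db` are placeholders invented for this example, not values from the deployment described here.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, database=""):
    """Build a mongodb:// URI that lists every replica set member.

    Listing all members lets the driver discover the current primary
    even if one of the seed hosts is down.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    query = urlencode({"replicaSet": replica_set})
    return f"mongodb://{','.join(hosts)}/{database}?{query}"


# Hypothetical three-node deployment (names are assumptions).
URI = replica_set_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    replica_set="rs0",
    database="feed_db",
)
```

The resulting string can be passed to any MongoDB driver; the driver then handles primary discovery and failover.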

Large data volume

Use sharding to expand the capacity of a single cluster.
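
Sharding a collection takes two admin commands, `enableSharding` and `shardCollection`. The sketch below only builds the command documents so it can run without a server. The database name `feed_db`, the collection name `feed`, the shard key field `receiver_id`, and the choice of a hashed key are all assumptions made for illustration.

```python
def sharding_commands(database, collection, shard_key, hashed=True):
    """Return the admin command documents that shard one collection.

    The command name must be the first key of each document, which
    Python dicts guarantee by preserving insertion order.
    """
    key_spec = {shard_key: "hashed" if hashed else 1}
    return [
        {"enableSharding": database},
        {"shardCollection": f"{database}.{collection}", "key": key_spec},
    ]


# Assumed names; a hashed key spreads users evenly across shards.
COMMANDS = sharding_commands("feed_db", "feed", "receiver_id")

# With PyMongo and a connection to a mongos router, each document could
# be sent as: client.admin.command(cmd)
```

A hashed key favors even write distribution, while a ranged key (`hashed=False`) keeps adjacent key values together; which one fits depends on the access pattern.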

Very large data volume

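Feed data from different time periods is written to different MongoDB clusters. This avoids the operational problems that come with letting a single MongoDB cluster grow too large.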

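Each MongoDB cluster stores:
- Metadata
  - Time range: the period whose feed data this cluster holds
- Feed data
  - A single collection holds the feeds of all users
  - The collection is sharded by the user's user_id, so both writes and reads can scale out

The client program uses the metadata of each MongoDB cluster to write an incoming feed message to the matching cluster.
The client program loads the metadata from all MongoDB clusters when it starts.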

Structure of the feed DB

metadata collection

{
    "_id": "cluster_1",
    "name": "cluster_1",
    "start_date": new Date("2016-05-01"),
    "end_date": new Date("2016-08-01"),
    "creator_name": "ethan",
    "created_at": new Date("2016-05-01 00:00:00")
}
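
The `start_date` and `end_date` fields are what a client would match a feed's event time against to pick the target cluster. The following is a minimal sketch of that lookup, assuming the range is half-open (`start_date` inclusive, `end_date` exclusive); the original does not state how the boundaries are treated. `cluster_2` is a hypothetical second cluster added only to make the example meaningful.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class ClusterMeta:
    """In-memory copy of one metadata document."""
    name: str
    start_date: datetime
    end_date: datetime


def pick_cluster(clusters: Iterable[ClusterMeta],
                 event_time: datetime) -> Optional[ClusterMeta]:
    """Return the cluster whose time range contains event_time."""
    for cluster in clusters:
        # Assumption: [start_date, end_date) so ranges never overlap.
        if cluster.start_date <= event_time < cluster.end_date:
            return cluster
    return None  # no cluster covers this period


# Metadata as it might look after being loaded at client startup.
CLUSTERS = [
    ClusterMeta("cluster_1", datetime(2016, 5, 1), datetime(2016, 8, 1)),
    ClusterMeta("cluster_2", datetime(2016, 8, 1), datetime(2016, 11, 1)),
]

target = pick_cluster(CLUSTERS, datetime(2016, 5, 1, 10, 0, 0))
```

A feed whose event time falls outside every range yields `None`, which the caller has to handle, for example by rejecting the write or reloading the metadata.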

feed collection

{
    "_id": ObjectId(""),
    "data_key": "e6755cfae343b6719cc2121e888b0a41",
    "receiver_id": 1000386,
    "sender_id": 1000765,
    "event_time": new Date("2016-05-01 10:00:00"),
    "type": 1,
    "data": {
        "fabula_id": 1000983
    }
}

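feed.data_key is an identifier used to find the feed records that belong to a given business object. It is mainly used when deleting feeds. It is generated as follows: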

feed.data_key = MMH3Hash("fabula_" + $fabulaId)
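
The sample `data_key` above is 32 hex characters long, which suggests a 128-bit MurmurHash3 variant. The sketch below is a pure-Python MurmurHash3 x64 128-bit implementation applied to the `"fabula_" + id` string. The seed of 0 and the hex layout (first 64-bit half, then second half) are assumptions, so the output is not expected to reproduce the sample value in this article.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
C1 = 0x87C37B91114253D5
C2 = 0x4CF5AD432745937F


def _rotl64(x, r):
    return ((x << r) | (x >> (64 - r))) & MASK64


def _fmix64(k):
    # Final avalanche step of MurmurHash3.
    k ^= k >> 33
    k = (k * 0xFF51AFD7ED558CCD) & MASK64
    k ^= k >> 33
    k = (k * 0xC4CEB9FE1A85EC53) & MASK64
    k ^= k >> 33
    return k


def _mix_k1(k1):
    k1 = (k1 * C1) & MASK64
    return (_rotl64(k1, 31) * C2) & MASK64


def _mix_k2(k2):
    k2 = (k2 * C2) & MASK64
    return (_rotl64(k2, 33) * C1) & MASK64


def murmur3_x64_128(data: bytes, seed: int = 0):
    """Return the two unsigned 64-bit halves (h1, h2) of the hash."""
    h1 = h2 = seed & MASK64
    n = len(data)
    full = n - (n % 16)
    for i in range(0, full, 16):  # 16-byte body blocks
        k1 = int.from_bytes(data[i:i + 8], "little")
        k2 = int.from_bytes(data[i + 8:i + 16], "little")
        h1 ^= _mix_k1(k1)
        h1 = (_rotl64(h1, 27) + h2) & MASK64
        h1 = (h1 * 5 + 0x52DCE729) & MASK64
        h2 ^= _mix_k2(k2)
        h2 = (_rotl64(h2, 31) + h1) & MASK64
        h2 = (h2 * 5 + 0x38495AB5) & MASK64
    tail = data[full:]  # remaining 0..15 bytes
    if len(tail) > 8:
        h2 ^= _mix_k2(int.from_bytes(tail[8:], "little"))
    if tail:
        h1 ^= _mix_k1(int.from_bytes(tail[:8], "little"))
    h1 ^= n
    h2 ^= n
    h1 = (h1 + h2) & MASK64
    h2 = (h2 + h1) & MASK64
    h1, h2 = _fmix64(h1), _fmix64(h2)
    h1 = (h1 + h2) & MASK64
    h2 = (h2 + h1) & MASK64
    return h1, h2


def data_key(fabula_id: int) -> str:
    """Hash "fabula_<id>" into a 32-character hex key (layout assumed)."""
    h1, h2 = murmur3_x64_128(f"fabula_{fabula_id}".encode("utf-8"))
    return f"{h1:016x}{h2:016x}"
```

Because the key depends only on the business object's ID, all feed records for one object can be removed with a single filter on `data_key`, without knowing who received them.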

Generation of feed._id:

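The algorithm is the same as for ObjectID, which consists of four parts: time, machine identifier, process id, and counter. The difference is that feed.event_time is used as the first part.
For the ObjectID generation algorithm, see: https://github.com/go-mgo/mgo/blob/v2/bson/bson.go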
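
The scheme can be sketched as follows, using the classic 12-byte ObjectID layout: 4 bytes of seconds, 3 bytes of machine identifier, 2 bytes of process id, and 3 bytes of counter. The function names are hypothetical, and deriving the machine identifier from an MD5 of the hostname is an assumption of this sketch rather than a detail confirmed by the article.

```python
import hashlib
import itertools
import os
import socket
import struct
from datetime import datetime, timezone

# Assumption: machine identifier = first 3 bytes of MD5(hostname).
_MACHINE_ID = hashlib.md5(socket.gethostname().encode("utf-8")).digest()[:3]
# The counter starts at a random value and wraps at 3 bytes.
_COUNTER = itertools.count(int.from_bytes(os.urandom(3), "big"))


def new_feed_id(event_time: datetime) -> str:
    """Build a 24-character hex id whose time part is event_time."""
    seconds = int(event_time.timestamp())
    counter = next(_COUNTER) & 0xFFFFFF
    raw = (
        struct.pack(">I", seconds)                  # 4 bytes: event time
        + _MACHINE_ID                               # 3 bytes: machine
        + struct.pack(">H", os.getpid() & 0xFFFF)   # 2 bytes: process id
        + counter.to_bytes(3, "big")                # 3 bytes: counter
    )
    return raw.hex()


def feed_id_time(feed_id: str) -> datetime:
    """Recover the event time embedded in the first 4 bytes."""
    seconds = struct.unpack(">I", bytes.fromhex(feed_id)[:4])[0]
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


event_time = datetime(2016, 5, 1, 10, 0, 0, tzinfo=timezone.utc)
feed_id = new_feed_id(event_time)
```

Putting the event time first means that sorting by `_id` orders feeds by event time (to one-second precision), and the time can be read back from the id without an extra field lookup.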
