Introduction
...
Code Block |
---|
{ "_id" : ObjectId("5d3f1c2a2f547625b0bbb397"), "string" : "AAPL", "int32" : 10, "double" : 23.23, "array" : [ "a1", "a2" ], "object" : { "inner_field" : "val" }, "binary" : { "$binary" : "YmluYXJ5IGRhdGE=", "$type" : "00" }, "undefined" : undefined, "boolean" : false, "date" : ISODate("2019-07-29T16:17:46.109Z"), "null" : null, "regex" : /./, "dbpointer" : DBRef("source", "5d079ee6d078c94008e4bb3a"), "javascript" : var l = 1;, "javascriptwithscope" : { "$code" : var l = 1; , "$scope" : { "scope" : "scope_val" } }, "symbol" : "a", "timestamp" : Timestamp(1564417066, 1), "long" : NumberLong(9223372036854775807), "decimal" : NumberDecimal("3.100000"), "minkey" : { "$minKey" : 1 }, "maxkey" : { "$maxKey" : 1 } } |
BSON
BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is available at bsonspec.org.
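To make the binary layout concrete, here is a minimal sketch of a BSON encoder in Python that covers only a handful of types (double, string, boolean, null, 32-bit integer). The function names `encode_element` and `encode_document` are hypothetical and exist only for illustration; a real application would rely on the BSON implementation shipped with its MongoDB driver.

```python
import struct


def _cstring(name: str) -> bytes:
    """Element names are stored as NUL-terminated UTF-8 strings."""
    return name.encode("utf-8") + b"\x00"


def encode_element(name: str, value) -> bytes:
    """Encode one key/value pair: type byte, element name, payload."""
    if isinstance(value, bool):  # must be checked before int (bool subclasses int)
        return b"\x08" + _cstring(name) + (b"\x01" if value else b"\x00")
    if isinstance(value, float):
        return b"\x01" + _cstring(name) + struct.pack("<d", value)
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # The string length prefix counts the trailing NUL byte.
        return b"\x02" + _cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, int):
        # Only 32-bit integers are handled in this sketch;
        # larger values make struct.pack raise struct.error.
        return b"\x10" + _cstring(name) + struct.pack("<i", value)
    if value is None:
        return b"\x0a" + _cstring(name)
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """A document is: int32 total size, the elements, and a terminating NUL."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    # Total size = 4 bytes of length prefix + body + 1 terminator byte.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


encoded = encode_document({"string": "AAPL", "int32": 10, "double": 23.23})
```

All multi-byte values are little-endian, and the leading size field lets a reader skip a whole document without parsing its elements.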
Document limitations
- The maximum BSON document size is 16 megabytes.
- In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.
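The two limitations above can be sketched in Python as follows. This is an illustration only: `with_id`, `new_object_id_hex`, and `fits_size_limit` are hypothetical helpers, and the generated value is a 24-character hex string rather than a driver `ObjectId` instance. It follows the ObjectId layout of a 4-byte timestamp, a 5-byte random value, and a 3-byte counter.

```python
import itertools
import os
import struct
import time

MAX_BSON_DOCUMENT_SIZE = 16 * 1024 * 1024  # 16 megabytes

# Per-process random value and a counter that starts at a random offset.
_process_random = os.urandom(5)
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))


def new_object_id_hex() -> str:
    """Build a 12-byte ObjectId-like value and return it as hex (hypothetical helper)."""
    timestamp = struct.pack(">I", int(time.time()))           # 4 bytes, big-endian seconds
    counter = (next(_counter) % 0x1000000).to_bytes(3, "big")  # 3 bytes, wraps around
    return (timestamp + _process_random + counter).hex()


def with_id(document: dict) -> dict:
    """Return the document unchanged if it has an _id, else a copy with a generated one."""
    if "_id" in document:
        return document
    return {"_id": new_object_id_hex(), **document}


def fits_size_limit(encoded: bytes) -> bool:
    """Check an already-encoded BSON document against the 16 MB limit."""
    return len(encoded) <= MAX_BSON_DOCUMENT_SIZE


doc = with_id({"string": "AAPL", "int32": 10})
```

Placing `_id` first in the copied document is a choice made for this sketch; an existing `_id` is never overwritten.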
...
The following example uses the '{ status: { $in: [ "A", "D" ] } }' query filter document to retrieve all documents from the 'inventory' collection where 'status' equals either "A" or "D":
...
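The matching behaviour of the `$in` filter described above can be sketched in plain Python over an in-memory list. This is not driver code: `matches_in` and `find` are hypothetical helpers, the sample `inventory` data is invented for illustration, and only filters of the shape `{field: {"$in": [...]}}` are handled.

```python
def matches_in(document: dict, field: str, allowed: list) -> bool:
    """Return True if the field's value is one of the allowed values."""
    value = document.get(field)  # a missing field is treated like null (None)
    # For array fields, a match on any single element is enough.
    if isinstance(value, list) and any(item in allowed for item in value):
        return True
    return value in allowed


def find(collection: list, query: dict) -> list:
    """Evaluate a filter of the form {field: {"$in": [...]}} against each document."""
    return [
        doc
        for doc in collection
        if all(matches_in(doc, field, cond["$in"]) for field, cond in query.items())
    ]


# Hypothetical sample data, not taken from the original page.
inventory = [
    {"item": "journal", "status": "A"},
    {"item": "notebook", "status": "P"},
    {"item": "paper", "status": "D"},
]

matched = find(inventory, {"status": {"$in": ["A", "D"]}})
```

With the sample data, `matched` contains the "journal" and "paper" documents and skips "notebook", whose status is "P".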
MongoDB Data Type | CDAP Schema Data Type | Support | Comment |
---|---|---|---|
Double | Schema.Type.DOUBLE | + | |
String | Schema.Type.STRING | + | |
Object | Schema.Type.RECORD | + | |
Array | Schema.Type.ARRAY | + | |
Binary data | Schema.Type.BYTES | * | Value can be mapped to Schema.Type.BYTES, but this can lead to subtype information loss. There are several options: 1) Support only 'generic' subtype. 2) Map using MongoDB extended JSON format: "binary": {"$binary": "YmluYXJ5IGRhdGE=", "$type": "00"} |
Undefined | Schema.Type.NULL | * | Can be mapped to Schema.Type.STRING using MongoDB extended JSON format: "undefined": {"$undefined": true} |
ObjectId | | * | Value can be mapped to Schema.Type.STRING, but this will lead to type information loss. There are several options: 1) Do not support this data type for the Sink 2) Map using MongoDB extended JSON format: {"$oid": "5d3f1c2a2f547625b0bbb397"} |
Boolean | Schema.Type.BOOLEAN | + | |
Date | Schema.LogicalType.TIMESTAMP_MILLIS | + | |
Null | Schema.Type.UNION | + | A nullable version of the actual type, corresponds to Schema.nullableOf(actualTypeSchema). |
Regular Expression | Schema.Type.STRING | * | Value can be mapped to Schema.Type.STRING, but this will lead to type information loss. There are several options: 1) Do not support this data type for the Sink 2) Map using MongoDB extended JSON format: "regex": {"$regex": ".", "$options": ""} |
DBPointer | Schema.Type.STRING | * | String in MongoDB extended JSON format: "dbpointer": {"$ref": "source", "$id": {"$oid": "5d079ee6d078c94008e4bb3a"}} |
JavaScript | Schema.Type.STRING | * | Value can be mapped to Schema.Type.STRING, but this will lead to type information loss. There are several options: 1) Do not support this data type for the Sink 2) Map using MongoDB extended JSON format: "javascript": {"$code": "var l = 1;"} |
Symbol | Schema.Type.STRING | * | Value can be mapped to Schema.Type.STRING, but this will lead to type information loss. There are several options: 1) Do not support this data type for the Sink 2) Map using MongoDB extended JSON format: "symbol": {"$symbol": "a"} |
JavaScript (with scope) | Schema.Type.STRING | * | Can be mapped to Schema.Type.STRING using MongoDB extended JSON format: "javascriptwithscope": {"$code": "var l = 1;", "$scope": {"scope": "scope_val"}} |
32-bit integer | Schema.Type.INT | + | |
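Timestamp | | * | Special type for internal MongoDB use, not associated with the regular Date type. Timestamp values are 64-bit values where the most significant 32 bits are a time_t value (seconds since the Unix epoch) and the least significant 32 bits are an incrementing ordinal for operations within a given second. Can be mapped to Schema.Type.STRING using MongoDB extended JSON format: "timestamp": {"$timestamp": {"t": 1564410161, "i": 1}} |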
64-bit integer | Schema.Type.LONG | + | |
Decimal128 | Schema.LogicalType.DECIMAL | + | |
Min key | | * | Is less than any other value of any type. This can be useful for always returning certain documents first (or last). Can be mapped to Schema.Type.STRING using MongoDB extended JSON format: "minkey": {"$minKey": 1} |
Max key | | * | Is greater than any other value of any type. This can be useful for always returning certain documents first (or last). Can be mapped to Schema.Type.STRING using MongoDB extended JSON format: "maxkey": {"$maxKey": 1} |
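As a rough illustration of the table, the sketch below expresses the directly supported mappings as a lookup and shows the extended JSON string fallback for two of the partially supported types. It assumes the "map using MongoDB extended JSON format" option is chosen. The dictionary keys and all function names are hypothetical, and the CDAP type names appear as plain strings rather than real `Schema` objects.

```python
import json

# Directly supported mappings ("+" rows in the table), as plain strings.
CDAP_TYPE_BY_MONGO_TYPE = {
    "double": "Schema.Type.DOUBLE",
    "string": "Schema.Type.STRING",
    "object": "Schema.Type.RECORD",
    "array": "Schema.Type.ARRAY",
    "boolean": "Schema.Type.BOOLEAN",
    "date": "Schema.LogicalType.TIMESTAMP_MILLIS",
    "int32": "Schema.Type.INT",
    "long": "Schema.Type.LONG",
    "decimal": "Schema.LogicalType.DECIMAL",
}


def cdap_type_for(mongo_type: str) -> str:
    """Look up a supported type; otherwise assume the STRING fallback is used."""
    return CDAP_TYPE_BY_MONGO_TYPE.get(mongo_type, "Schema.Type.STRING")


def object_id_to_string(hex_id: str) -> str:
    """Render an ObjectId as an extended JSON string."""
    return json.dumps({"$oid": hex_id})


def split_timestamp(raw: int) -> tuple:
    """Split a 64-bit Timestamp into (seconds, ordinal)."""
    return raw >> 32, raw & 0xFFFFFFFF


def timestamp_to_string(raw: int) -> str:
    """Render a 64-bit Timestamp as an extended JSON string."""
    seconds, ordinal = split_timestamp(raw)
    return json.dumps({"$timestamp": {"t": seconds, "i": ordinal}})


raw_timestamp = (1564410161 << 32) | 1
as_string = timestamp_to_string(raw_timestamp)
```

Keeping the type marker (`$oid`, `$timestamp`) inside the string lets a Sink recover the original MongoDB type later, at the cost of parsing the string on write.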
...