...
- User specifies how errors during ingestion should be handled; depending on the option chosen, processing errors are either skipped or fail the pipeline.
- User should be able to specify account credentials in the configuration.
User Configurations
Section | User Configuration Label | Label Description | Mandatory | Macro-enabled | Options | Default | Variable | User Widget
---|---|---|---|---|---|---|---|---
Standard | Reference Name | Uniquely identifies this source for lineage, annotating metadata, etc. | + | + | | | referenceName | Text Box
 | Table | Database table name. | + | + | | | table | Text Box
 | Column Family | Column family to use for all inserted rows. | + | + | | | columnFamily | Text Box
 | Instance ID | Bigtable instance ID. | + | + | | | instance | Text Box
 | Project ID | The ID of the project in Google Cloud. If not specified, it will be read automatically from the cluster environment. | | + | | | project | Text Box
 | Service Account File Path | Path on the local file system of the service account key used for authorization. If the plugin is run on a Google Cloud Dataproc cluster, the service account key does not need to be provided and can be set to 'auto-detect'. When running on other clusters, the file must be present on every node in the cluster. See Google's documentation on service account credentials for details. | | + | | | serviceFilePath | Text Box
 | Key Alias | Name of the field for the row key. | | + | | \_\_key\_\_ | keyAlias | Text Box
 | Error Handling | How to handle errors in record processing. An error will be thrown if the plugin fails to serialize a value according to the provided input schema. | | | Skip error, Fail pipeline | | |
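For illustration, the properties above might be supplied in a pipeline configuration like the following (all values are placeholders, and the exact JSON shape depends on the pipeline framework rather than on this document):

```json
{
  "referenceName": "BigtableSink",
  "table": "my-table",
  "columnFamily": "cf1",
  "instance": "my-bigtable-instance",
  "project": "my-gcp-project",
  "serviceFilePath": "auto-detect",
  "keyAlias": "__key__"
}
```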
Bigtable Overview
Storage model
...
- The task will be split using org.apache.hadoop.hbase.mapreduce.TableOutputFormat.
- Values will be converted into bytes using the input schema.
- Supported input field types: boolean, int, long, float, double, bytes, string.
- All information about logical types will be lost on insert because of the schema-less nature of the database.
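The byte conversion described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: it assumes HBase-style encoding conventions (fixed-width big-endian numbers, UTF-8 strings), and the function name `to_bytes` is hypothetical.

```python
import struct

def to_bytes(value, field_type):
    """Serialize a record value to bytes for one of the supported field types.

    Hedged sketch mirroring HBase Bytes.toBytes conventions:
    big-endian fixed-width numbers, UTF-8 strings.
    """
    if field_type == "boolean":
        # HBase encodes booleans as a single byte: -1 (0xFF) for true, 0 for false
        return b"\xff" if value else b"\x00"
    if field_type == "int":
        return struct.pack(">i", value)   # 4-byte big-endian signed int
    if field_type == "long":
        return struct.pack(">q", value)   # 8-byte big-endian signed long
    if field_type == "float":
        return struct.pack(">f", value)   # 4-byte IEEE 754
    if field_type == "double":
        return struct.pack(">d", value)   # 8-byte IEEE 754
    if field_type == "bytes":
        return bytes(value)
    if field_type == "string":
        return value.encode("utf-8")
    # Anything else (including logical types) is rejected; per the note above,
    # logical-type information is not preserved in a schema-less store.
    raise ValueError(f"Unsupported field type: {field_type}")
```

Note that because every value ends up as an opaque byte array, the original schema is required to interpret the data on read; nothing in the stored cells records the field type.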