...
Install the relevant cdap-hbase-compat package on all HBase nodes in your cluster. The compat packages are:
- cdap-hbase-compat-0.96
- cdap-hbase-compat-0.98
- cdap-hbase-compat-1.0
- cdap-hbase-compat-1.0-cdh
- cdap-hbase-compat-1.0-cdh5.5.0
- cdap-hbase-compat-1.1
- cdap-hbase-compat-1.2-cdh5.7.0
Modify hbase-site.xml on all HBase nodes to enable HBase replication and to use the CDAP replication status coprocessors:
Code Block
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>co.cask.cdap.data2.replication.LastReplicateTimeObserver</value>
</property>
<property>
  <name>hbase.coprocessor.wal.classes</name>
  <value>co.cask.cdap.data2.replication.LastWriteTimeObserver</value>
</property>
Modify hbase-env.sh on all HBase nodes to include the HBase coprocessor in the classpath:
Code Block
export HBASE_CLASSPATH="$HBASE_CLASSPATH:/opt/cdap/<hbase-compat-version>/coprocessor/*"
# For example, if you are on CDH 5.5.x and have installed the cdap-hbase-compat-1.0-cdh5.5.0 package:
export HBASE_CLASSPATH="$HBASE_CLASSPATH:/opt/cdap/hbase-compat-1.0-cdh5.5.0/coprocessor/*"
- Restart the HBase master and regionservers.
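Before restarting, it can help to confirm that each node's hbase-site.xml actually carries the three properties listed above. The following is a minimal sketch, not a CDAP or HBase tool: the class `HBaseSiteCheck` and its methods are hypothetical names, and it assumes the file follows the usual Hadoop layout of `<property>` elements under a single root element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of CDAP or HBase.
final class HBaseSiteCheck {

  // Property names taken from the hbase-site.xml step above.
  static final String[] REQUIRED = {
      "hbase.replication",
      "hbase.coprocessor.regionserver.classes",
      "hbase.coprocessor.wal.classes"
  };

  /** Parses <property><name/><value/></property> entries into a name-to-value map. */
  static Map<String, String> parse(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      Map<String, String> props = new HashMap<>();
      NodeList nodes = doc.getElementsByTagName("property");
      for (int i = 0; i < nodes.getLength(); i++) {
        Element property = (Element) nodes.item(i);
        String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
        String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
        props.put(name, value);
      }
      return props;
    } catch (Exception e) {
      // Malformed XML or a property without name/value ends up here.
      throw new IllegalArgumentException("Could not parse configuration", e);
    }
  }

  /** Returns the required property names that are absent or have an empty value. */
  static List<String> missing(Map<String, String> props) {
    List<String> result = new ArrayList<>();
    for (String key : REQUIRED) {
      String value = props.get(key);
      if (value == null || value.isEmpty()) {
        result.add(key);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Sample input with only one of the three properties set.
    String xml = "<configuration>"
        + "<property><name>hbase.replication</name><value>true</value></property>"
        + "</configuration>";
    System.out.println("Missing properties: " + missing(parse(xml)));
  }
}
```

The check only looks for presence; it does not verify that the coprocessor class names match the installed compat package.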
HDFS
Hive
Kafka
CDAP Setup
...
Code Block |
---|
<property>
<name>hbase.replication</name>
<value>true</value>
</property>
<property>
<name>hbase.coprocessor.regionserver.classes</name>
<value>co.cask.cdap.data2.replication.hbase10.LastReplicateTimeObserver</value>
</property>
<property>
<name>hbase.coprocessor.wal.classes</name>
<value>co.cask.cdap.data2.replication.hbase10.LastWriteTimeObserver</value>
</property> |
...
- Set up JAAS Configuration.
- Sync /etc/hosts on all the nodes
...
...
- This can be achieved by running the Hadoop distcp tool at regular intervals. For help, see the Hadoop DistCp guide.
- For secure clusters, distcp can be run with the cdap keytab.
...
Set up HDFS replication using the solution provided by your distribution. HDFS does not have true cross-cluster replication; it is usually approximated by scheduling regular distcp jobs.
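As an illustration of "regular distcp jobs", the sketch below builds a `hadoop distcp -update` command and re-runs it on a fixed delay. It is one possible approach, not the mechanism your distribution provides: the class `DistcpSchedule`, the NameNode URIs, the `/cdap` path, and the hourly interval are all assumptions, and a cron entry or a workflow scheduler would serve equally well. It does not handle Kerberos login for secure clusters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical scheduler sketch; not part of CDAP or Hadoop.
final class DistcpSchedule {

  /** Builds the command line. -update copies only files that are missing or differ at the target. */
  static List<String> buildCommand(String source, String target) {
    return Arrays.asList("hadoop", "distcp", "-update", source, target);
  }

  /** Starts a repeating copy; the next run is scheduled only after the previous one has ended. */
  static ScheduledExecutorService schedule(List<String> command, long delay, TimeUnit unit) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        int exit = new ProcessBuilder(command).inheritIO().start().waitFor();
        if (exit != 0) {
          System.err.println("distcp exited with code " + exit);
        }
      } catch (Exception e) {
        // Catch everything: an uncaught exception would cancel all future runs.
        System.err.println("distcp could not be run: " + e);
      }
    }, 0, delay, unit);
    return scheduler;
  }

  public static void main(String[] args) {
    // Assumed NameNode URIs and path; replace with your own.
    List<String> command =
        buildCommand("hdfs://master-nn:8020/cdap", "hdfs://slave-nn:8020/cdap");
    System.out.println(String.join(" ", command));
    // Only start the schedule when explicitly asked, so a plain run is a dry run.
    if (args.length > 0 && args[0].equals("--run")) {
      schedule(command, 1, TimeUnit.HOURS); // assumed hourly interval
    }
  }
}
```

A fixed delay (rather than a fixed rate) is used so that a slow copy never overlaps with the next one.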
Hive
Set up replication for the database backing your Hive Metastore. Note that this replicates only the Hive metadata (which tables exist, table metadata, etc.), not the data itself. It is assumed you will not be running Hive queries on the slave until a manual failover occurs.
For example, to set up MySQL replication, follow the steps at https://dev.mysql.com/doc/refman/5.7/en/replication-howto.html, which amount to:
Modify my.cnf on the master to set a server-id and enable binary logging
Code Block
[mysqld]
log-bin=mysql-bin
server-id=1
Restart mysql on master
Modify my.cnf on the slave to set a server-id
Code Block
[mysqld]
server-id=2
Restart mysql on slave
Create a replication user on the master
Code Block
mysql> CREATE USER 'repl'@'%.mydomain.com' IDENTIFIED BY 'slavepass';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%.mydomain.com';
Obtain the master status
Code Block
mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 | 73       | test         | manual,mysql     |
+------------------+----------+--------------+------------------+
Set master on the slave
Code Block
mysql> CHANGE MASTER TO
    ->     MASTER_HOST='master_host_name',
    ->     MASTER_USER='replication_user_name',      -- repl
    ->     MASTER_PASSWORD='replication_password',   -- slavepass
    ->     MASTER_LOG_FILE='recorded_log_file_name', -- mysql-bin.000003
    ->     MASTER_LOG_POS=recorded_log_position;     -- 73
Start slave
Code Block
mysql> START SLAVE;
Verify slave status
Code Block
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: localhost
                  Master_User: root
                  Master_Port: 13000
                Connect_Retry: 60
              Master_Log_File: master-bin.000002
          Read_Master_Log_Pos: 1307
               Relay_Log_File: slave-relay-bin.000003
                Relay_Log_Pos: 1508
        Relay_Master_Log_File: master-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1307
              Relay_Log_Space: 1858
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 3e11fa47-71ca-11e1-9e33-c80aa9429562
             Master_Info_File: /var/mysqld.2/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 10
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5
            Executed_Gtid_Set: 3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_name:
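The fields that matter most when verifying the slave are Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master. The sketch below shows one way to check them from the text of a `SHOW SLAVE STATUS\G` result. The class `SlaveStatusCheck`, its method names, and the lag threshold are hypothetical; it assumes you have already captured the command output as a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of MySQL or CDAP.
final class SlaveStatusCheck {

  /** Parses the "Key: value" lines of SHOW SLAVE STATUS\G output into a map. */
  static Map<String, String> parse(String output) {
    Map<String, String> fields = new LinkedHashMap<>();
    for (String line : output.split("\\R")) {
      int colon = line.indexOf(':');
      if (colon <= 0) {
        continue; // the row banner and the prompt line carry no "Key:" prefix
      }
      // Split at the first colon only, since values such as GTID sets contain colons.
      fields.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
    }
    return fields;
  }

  /** True when both replication threads run and the reported lag is within the limit. */
  static boolean isReplicating(Map<String, String> fields, long maxLagSeconds) {
    if (!"Yes".equals(fields.get("Slave_IO_Running"))) {
      return false;
    }
    if (!"Yes".equals(fields.get("Slave_SQL_Running"))) {
      return false;
    }
    String lag = fields.get("Seconds_Behind_Master");
    if (lag == null || !lag.matches("\\d+")) {
      return false; // NULL or absent means the lag is unknown
    }
    return Long.parseLong(lag) <= maxLagSeconds;
  }

  public static void main(String[] args) {
    String sample = "Slave_IO_Running: Yes\n"
        + "Slave_SQL_Running: Yes\n"
        + "Seconds_Behind_Master: 0\n";
    // 60 seconds is an assumed threshold, not a recommendation.
    System.out.println("Replicating: " + isReplicating(parse(sample), 60));
  }
}
```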
Kafka
Set up replication for the Kafka brokers you are using. Kafka MirrorMaker is the most common solution.
CDAP Setup
CDAP requires that you provide an extension that performs HBase-related DDL operations on both clusters instead of just one. To create an extension, you must implement the HBaseDDLExecutor interface:
Code Block |
---|
/**
 * Interface providing the HBase DDL operations.
 */
public interface HBaseDDLExecutor extends Closeable {
  /**
   * Initialize the {@link HBaseDDLExecutor}.
   * @param context the context for the executor
   */
  void initialize(HBaseDDLExecutorContext context);

  /**
   * Create the specified namespace if it does not exist.
   *
   * @param name the namespace to create
   * @throws IOException if a remote or network exception occurs
   */
  void createNamespaceIfNotExists(String name) throws IOException;

  /**
   * Delete the specified namespace if it exists.
   *
   * @param name the namespace to delete
   * @throws IOException if a remote or network exception occurs
   * @throws IllegalStateException if there are tables in the namespace
   */
  void deleteNamespaceIfExists(String name) throws IOException;

  /**
   * Create the specified table if it does not exist.
   *
   * @param descriptor the descriptor for the table to create
   * @param splitKeys the initial split keys for the table
   * @throws IOException if a remote or network exception occurs
   */
  void createTableIfNotExists(TableDescriptor descriptor, @Nullable byte[][] splitKeys)
    throws IOException;

  /**
   * Enable the specified table if it is disabled.
   *
   * @param namespace the namespace of the table to enable
   * @param name the name of the table to enable
   * @throws IOException if a remote or network exception occurs
   */
  void enableTableIfDisabled(String namespace, String name) throws IOException;

  /**
   * Disable the specified table if it is enabled.
   *
   * @param namespace the namespace of the table to disable
   * @param name the name of the table to disable
   * @throws IOException if a remote or network exception occurs
   */
  void disableTableIfEnabled(String namespace, String name) throws IOException;

  /**
   * Modify the specified table. The table must be disabled.
   *
   * @param namespace the namespace of the table to modify
   * @param name the name of the table to modify
   * @param descriptor the descriptor for the table
   * @throws IOException if a remote or network exception occurs
   * @throws IllegalStateException if the specified table is not disabled
   */
  void modifyTable(String namespace, String name, TableDescriptor descriptor) throws IOException;

  /**
   * Truncate the specified table. The table must be disabled.
   *
   * @param namespace the namespace of the table to truncate
   * @param name the name of the table to truncate
   * @throws IOException if a remote or network exception occurs
   * @throws IllegalStateException if the specified table is not disabled
   */
  void truncateTable(String namespace, String name) throws IOException;

  /**
   * Delete the table if it exists. The table must be disabled.
   *
   * @param namespace the namespace of the table to delete
   * @param name the table to delete
   * @throws IOException if a remote or network exception occurs
   * @throws IllegalStateException if the specified table is not disabled
   */
  void deleteTableIfExists(String namespace, String name) throws IOException;
} |
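To make the "both clusters instead of just one" requirement concrete, here is a minimal sketch of the fan-out pattern such an extension might use. It deliberately does not implement the real HBaseDDLExecutor: `NamespaceDdl` is a hypothetical stand-in that copies only the two namespace methods so the example compiles without CDAP on the classpath, and `Recording` is an in-memory fake in place of a real HBase admin client per cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the fan-out idea; not the CDAP SPI itself.
final class FanOutDdlSketch {

  /** Stand-in for two of the HBaseDDLExecutor methods shown above. */
  interface NamespaceDdl {
    void createNamespaceIfNotExists(String name) throws IOException;

    void deleteNamespaceIfExists(String name) throws IOException;
  }

  /** Applies each operation to every cluster in order and stops at the first failure. */
  static final class FanOut implements NamespaceDdl {
    private final List<NamespaceDdl> clusters;

    FanOut(List<NamespaceDdl> clusters) {
      this.clusters = clusters;
    }

    @Override
    public void createNamespaceIfNotExists(String name) throws IOException {
      for (NamespaceDdl cluster : clusters) {
        cluster.createNamespaceIfNotExists(name);
      }
    }

    @Override
    public void deleteNamespaceIfExists(String name) throws IOException {
      for (NamespaceDdl cluster : clusters) {
        cluster.deleteNamespaceIfExists(name);
      }
    }
  }

  /** In-memory fake that records each call instead of talking to HBase. */
  static final class Recording implements NamespaceDdl {
    private final String label;
    private final List<String> log;

    Recording(String label, List<String> log) {
      this.label = label;
      this.log = log;
    }

    @Override
    public void createNamespaceIfNotExists(String name) {
      log.add(label + ":create:" + name);
    }

    @Override
    public void deleteNamespaceIfExists(String name) {
      log.add(label + ":delete:" + name);
    }
  }

  /** Runs one create and one delete against two fake clusters and returns the call log. */
  static List<String> demo() {
    List<String> log = new ArrayList<>();
    NamespaceDdl both = new FanOut(Arrays.<NamespaceDdl>asList(
        new Recording("master", log), new Recording("slave", log)));
    try {
      both.createNamespaceIfNotExists("ns1");
      both.deleteNamespaceIfExists("ns1");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return log;
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

Because the operations are defined as "if not exists" and "if exists", re-running a call after a failure on the second cluster is harmless on the first, which is what makes a simple sequential fan-out workable.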
To deploy your extension:
Create an extension directory
Code Block
$ mkdir -p /opt/cdap/master/ext/hbase/repl
Copy your jar to the directory
Code Block
$ cp myextension.jar /opt/cdap/master/ext/hbase/repl/
Modify cdap-site.xml to use your implementation of HBaseDDLExecutor
Code Block
<property>
  <name>hbase.ddlexecutor.extension.dir</name>
  <value>/opt/cdap/master/ext/hbase</value>
</property>
Modify cdap-site.xml with any properties required by your executor. Any property prefixed with 'cdap.hbase.spi.hbase.' will be available through the Context object passed into your executor's initialize method.
Code Block
<property>
  <name>cdap.hbase.spi.hbase.zookeeper.quorum</name>
  <value>hbase-master-i18003-1000.dev.continuuity.net:2181/cdap</value>
</property>
<property>
  <name>cdap.hbase.spi.hbase.zookeeper.session.timeout</name>
  <value>60000</value>
</property>
<property>
  <name>cdap.hbase.spi.hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>cdap.hbase.spi.hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
<property>
  <name>cdap.hbase.spi.hbase.replication</name>
  <value>true</value>
</property>