...

HBase DDL requires hooks in CDAP, because replication must be set up for every table when it is created and before any data is written to it. CDAP will define an interface to create, modify, and delete HBase tables. Instead of creating a table only in the local HBase instance, we need to create it in both the master and slave instances and set up replication from the master to the slave. We can do this by introducing an SPI for HBase DDL operations: the default implementation is the current single-cluster implementation, and users can plug in their own implementation that creates tables and sets up replication as needed.
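The plug-in idea can be illustrated with a minimal sketch. It is not the CDAP implementation: `SimpleDdlExecutor`, `InMemoryCluster`, and `ReplicatingDdlExecutor` are hypothetical names, only table creation is modeled, the clusters are in-memory stand-ins, and the choice to create the table on the slave before the master is an assumption made so that the replication target exists before the master table can receive writes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified stand-in for the DDL SPI: only table creation is modeled. */
interface SimpleDdlExecutor {
  /** Creates the table if it does not exist; returns true if it was created. */
  boolean createTableIfNotExists(String namespace, String name);
}

/** In-memory stand-in for a single HBase cluster (illustration only, no HBase calls). */
class InMemoryCluster implements SimpleDdlExecutor {
  final Set<String> tables = new LinkedHashSet<>();

  @Override
  public boolean createTableIfNotExists(String namespace, String name) {
    return tables.add(namespace + ":" + name);
  }
}

/** Applies table creation to both clusters and records the tables that need replication. */
class ReplicatingDdlExecutor implements SimpleDdlExecutor {
  private final SimpleDdlExecutor master;
  private final SimpleDdlExecutor slave;
  // Placeholder for real replication setup: just remembers which tables were registered.
  final List<String> replicatedTables = new ArrayList<>();

  ReplicatingDdlExecutor(SimpleDdlExecutor master, SimpleDdlExecutor slave) {
    this.master = master;
    this.slave = slave;
  }

  @Override
  public boolean createTableIfNotExists(String namespace, String name) {
    // Assumption: create on the slave first, so the target exists before any write is replicated.
    slave.createTableIfNotExists(namespace, name);
    boolean createdOnMaster = master.createTableIfNotExists(namespace, name);
    if (createdOnMaster) {
      // A real implementation would enable master-to-slave replication for the table here.
      replicatedTables.add(namespace + ":" + name);
    }
    return createdOnMaster;
  }
}
```

A default single-cluster executor would correspond to using `InMemoryCluster` alone in this sketch; the replicating variant wraps two of them behind the same interface, which is the substitution the SPI is meant to allow.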

Java SPI

Code Block
/**
 * Interface providing the HBase DDL operations.
 */
@Beta
public interface HBaseDDLExecutor extends Closeable {
  /**
   * Initialize the {@link HBaseDDLExecutor}.
   * @param context the context for the executor
   */
  void initialize(HBaseDDLExecutorContext context);

  /**
   * Create the specified namespace if it does not exist.
   *
   * @param name the namespace to create
   * @return whether the namespace was created
   * @throws IOException if a remote or network exception occurs
   */
  boolean createNamespaceIfNotExists(String name) throws IOException;

  /**
   * Delete the specified namespace if it exists.
   *
   * @param name the namespace to delete
   * @throws IOException if a remote or network exception occurs
   * @throws IllegalStateException if there are tables in the namespace
   */
  void deleteNamespaceIfExists(String name) throws IOException;

  /**
   * Create the specified table if it does not exist.
   *
   * @param descriptor the descriptor for the table to create
   * @param splitKeys the initial split keys for the table
   * @throws IOException if a remote or network exception occurs
   * @throws NotFoundException if the namespace for the specified table does not exist
   */
  void createTableIfNotExists(TableDescriptor descriptor, @Nullable byte[][] splitKeys)
    throws IOException;

  /**
   * Enable the specified table if it is disabled.
   *
   * @param namespace the namespace of the table to enable
   * @param name the name of the table to enable
   * @throws IOException if a remote or network exception occurs
   * @throws NotFoundException if the specified table does not exist
   */
  void enableTableIfDisabled(String namespace, String name) throws IOException;

  /**
   * Disable the specified table if it is enabled.
   *
   * @param namespace the namespace of the table to disable
   * @param name the name of the table to disable
   * @throws IOException if a remote or network exception occurs
   * @throws NotFoundException if the specified table does not exist
   */
  void disableTableIfEnabled(String namespace, String name) throws IOException;

  /**
   * Modify the specified table. The table must be disabled.
   *
   * @param namespace the namespace of the table to modify
   * @param name the name of the table to modify
   * @param descriptor the descriptor for the table
   * @throws IOException if a remote or network exception occurs
   * @throws NotFoundException if the specified table does not exist
   * @throws IllegalStateException if the specified table is not disabled
   */
  void modifyTable(String namespace, String name, TableDescriptor descriptor) throws IOException;
 
  /**
   * Truncate the specified table. The table must be disabled.
   *
   * @param namespace the namespace of the table to truncate
   * @param name the name of the table to truncate
   * @throws IOException if a remote or network exception occurs
   * @throws NotFoundException if the specified table does not exist
   * @throws IllegalStateException if the specified table is not disabled
   */
  void truncateTable(String namespace, String name) throws IOException;

  /**
   * Delete the table if it exists. The table must be disabled.
   *
   * @param namespace the namespace of the table to delete
   * @param name the table to delete
   * @throws IOException if a remote or network exception occurs
   * @throws IllegalStateException if the specified table is not disabled
   */
  void deleteTableIfExists(String namespace, String name) throws IOException;

  /**
   * Grant permissions on a table or namespace to users or groups.
   *
   * @param namespace the namespace of the table
   * @param table the name of the table. If null, then the permissions are applied to the namespace
   * @param permissions A map from user or group name to the permissions for that user or group, given as a string
   *                    containing only characters 'a'(Admin), 'c'(Create), 'r'(Read), 'w'(Write), and 'x'(Execute).
   *                    Group names must be prefixed with the character '@'.
   * @throws IOException if anything goes wrong
   */
  void grantPermissions(String namespace, @Nullable String table, Map<String, String> permissions)
    throws IOException;
}

 
/**
 * Describes an HBase Table.
 */
public class TableDescriptor {
  private final String namespace;
  private final String name;
  private final Map<String, ColumnFamilyDescriptor> families; // family -> descriptor
  private final Map<String, CoprocessorDescriptor> coprocessors; // classname -> descriptor
  private final Map<String, String> properties;
}
 
/**
 * Describes an HBase table coprocessor.
 */
public class CoprocessorDescriptor {
  private final String dirPath;
  private final int priority;
  private final Map<String, String> properties;

  public String getCoprocessorPath(String cdapVersion, String hbaseVersion) {
    ...
  }
}

/**
 * Describes an HBase table column family.
 */
public class ColumnFamilyDescriptor {
  private final int maxVersions;
  private final CompressionType compressionType;
  private final BloomType bloomType;
  private final Map<String, String> properties;
}

/**
 * Types of column family compression.
 */
public enum CompressionType {
  LZO, SNAPPY, GZIP, NONE
}
 
/**
 * Types of column family bloom filters.
 */
public enum BloomType {
  ROW, ROWCOL, NONE
}
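The permission-string format described for `grantPermissions` can be checked before a call is made. The following is a minimal sketch, not part of the SPI: the class `PermissionStrings` and its methods are hypothetical helpers, and rejecting an empty principal name is an assumption that the interface documentation does not state.

```java
import java.util.Map;

/** Hypothetical helper that validates the permission map format described for grantPermissions. */
final class PermissionStrings {
  // The permission characters listed in the javadoc: Admin, Create, Read, Write, Execute.
  private static final String ALLOWED = "acrwx";

  private PermissionStrings() { }

  /** Returns true if the principal denotes a group, that is, if it starts with '@'. */
  static boolean isGroup(String principal) {
    return principal.startsWith("@");
  }

  /** Throws IllegalArgumentException if a principal is empty or a permission character is unknown. */
  static void validate(Map<String, String> permissions) {
    for (Map.Entry<String, String> entry : permissions.entrySet()) {
      String principal = entry.getKey();
      // Assumption: a bare "@" or an empty name is treated as invalid.
      if (principal.isEmpty() || principal.equals("@")) {
        throw new IllegalArgumentException("Empty user or group name");
      }
      for (char c : entry.getValue().toCharArray()) {
        if (ALLOWED.indexOf(c) < 0) {
          throw new IllegalArgumentException(
            "Invalid permission '" + c + "' for " + principal);
        }
      }
    }
  }
}
```

For example, a map with `"alice" -> "rw"` and `"@admins" -> "acrwx"` passes validation, while `"bob" -> "rz"` is rejected because `z` is not one of the five listed characters.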

...

Instead, each CDAP instance will include a tool that pre-builds a coprocessor jar and places it on HDFS in a predetermined location. Rather than being built on demand, the jar will always be present on HDFS. The jar name will include the CDAP version and the HBase version:

Code Block
hdfs:///cdap/cdap/lib/coprocessor-<cdap-version>-<hbase-version>.jar

The CDAP version and HBase version need to be in the jar name because HBase will not pick up coprocessor changes unless the path differs from the previous one (https://issues.apache.org/jira/browse/HBASE-9046).
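The naming scheme can be sketched as a small helper. This is an illustration under assumptions, not the body of `getCoprocessorPath`: the class `CoprocessorJarPaths` and the method `jarPath` are hypothetical, and the version strings in the usage note are made-up example values.

```java
/** Hypothetical helper that builds the coprocessor jar location from the two versions. */
final class CoprocessorJarPaths {
  private CoprocessorJarPaths() { }

  /**
   * Joins the base directory with a jar name of the form
   * coprocessor-<cdap-version>-<hbase-version>.jar.
   */
  static String jarPath(String baseDir, String cdapVersion, String hbaseVersion) {
    // Avoid a double slash when the base directory already ends with one.
    String dir = baseDir.endsWith("/") ? baseDir : baseDir + "/";
    return dir + "coprocessor-" + cdapVersion + "-" + hbaseVersion + ".jar";
  }
}
```

With example values, `CoprocessorJarPaths.jarPath("hdfs:///cdap/cdap/lib", "4.1.0", "1.0")` yields `hdfs:///cdap/cdap/lib/coprocessor-4.1.0-1.0.jar`. Because either version change produces a different path, an upgraded coprocessor is never served from the old location.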