Checklist

  •  User Stories Documented
  •  User Stories Reviewed
  •  Design Reviewed
  •  APIs reviewed
  •  Release priorities assigned
  •  Test cases reviewed
  •  Blog post

Introduction 

Enhancements to wrangler: the ability to connect to external sources such as a database or Kafka, read data from them, and apply wrangler directives to that data for data preparation.

Goals

Ability to dynamically load plugins from CDAP Service. 

User Stories 

  • User wants to load the database driver artifact in wrangler and execute database commands to load data. Once the data is loaded, they want to execute wrangler directives on it.
  • User adds a Kafka artifact and immediately wants to load it in wrangler and read data from a Kafka topic. Once the data is loaded, they want to execute wrangler directives on it.
  • User Story #3

Design

Background:

1) Currently, plugins required by a program are configured at configure time, and these plugin artifacts are localized to the program containers.
2) At runtime, these plugins can be instantiated by their plugin id.
3) We want to be able to dynamically instantiate plugins at runtime based on plugin type, name and version, which is not possible currently.

Example:
A user uploads a PostgreSQL JDBC driver,
wants to load this driver in wrangler,
configure a connection, execute a query to select data, and perform wrangler directives on the data.
Currently the wrangler service cannot load the PostgreSQL driver, because the driver was not configured as a plugin in the service configuration. As a result, it was not localized to the container and cannot be instantiated.

Cover details on assumptions made, design alternatives considered, high level design

Approach

Approach #1

Approach #2

API changes

New Programmatic APIs

New Java APIs introduced (both user facing and internal)

 

Code Block
public interface Admin extends DatasetManager, SecureStoreManager, MessagingAdmin, ArtifactManager {

} 
Code Block
/**
 * Provides access to information about artifacts
 */
interface ArtifactManager {

  /**
   * Gets the parent artifacts that are extended by the plugins identified by pluginType and pluginName.
   */
  Set<ArtifactInfo> getParentArtifactsForPlugins(String pluginType, String pluginName);

  /**
   * Gets the plugin artifacts identified by pluginType and pluginName. This can return more than one
   * artifact when multiple versions exist.
   */
  Set<ArtifactInfo> getPluginArtifacts(String pluginType, String pluginName);

  /**
   * Gets the ArtifactInfo for the artifact with the given name and version.
   */
  ArtifactInfo getArtifactInfo(String artifactName, String artifactVersion);
...
} 
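
Because getPluginArtifacts can return several versions of the same plugin artifact, a caller has to pick one. The following is a minimal sketch of one possible selection rule, not part of the proposed API: it assumes versions are plain dotted integers (for example "42.2.5") and works on version strings rather than on ArtifactInfo. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class LatestArtifactSketch {

  /**
   * Compares dotted numeric versions such as "9.4.1" and "42.2.5".
   * Assumption: every part is an integer; missing parts count as 0.
   */
  static int compareVersions(String a, String b) {
    String[] partsA = a.split("\\.");
    String[] partsB = b.split("\\.");
    int length = Math.max(partsA.length, partsB.length);
    for (int i = 0; i < length; i++) {
      int x = i < partsA.length ? Integer.parseInt(partsA[i]) : 0;
      int y = i < partsB.length ? Integer.parseInt(partsB[i]) : 0;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return 0;
  }

  /** Returns the highest version, or empty if no versions were found. */
  static Optional<String> latest(List<String> versions) {
    return versions.stream().max(LatestArtifactSketch::compareVersions);
  }

  public static void main(String[] args) {
    // Stand-in for the versions of the artifacts returned for one plugin type and name.
    List<String> versions = Arrays.asList("9.4.1", "42.2.5", "42.2");
    System.out.println(latest(versions).orElse("none"));
  }
}
```

Real artifact versions may carry qualifiers such as -SNAPSHOT, which this sketch does not handle.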
Code Block
interface PluginContext {
...
  /**
   * Given the parent classloader and the plugin artifact info, creates and returns a plugin classloader.
   * With this plugin classloader, the user can load the plugin classes and instantiate the plugin.
   */
  ClassLoader createPluginClassLoader(ClassLoader parentClassLoader, ArtifactInfo pluginArtifact);
} 

 

 

Notes:

1) The ArtifactInfo class is available in cdap-proto. Through Admin, CDAP developers can access the ArtifactManager and get information about artifacts.

2) Once they have an ArtifactInfo and wish to load a plugin from that artifact dynamically, they can use the PluginContext to create a plugin classloader by passing their own classloader and the ArtifactInfo for the plugin.

3) How can the user instantiate the plugin?

 InstantiatorFactory is available as part of cdap-common. Once the user has loaded the desired plugin class, they can use the InstantiatorFactory to instantiate the plugin.
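
The steps in these notes can be sketched with plain JDK classes. This is only an illustration under stated assumptions: URLClassLoader stands in for the proposed createPluginClassLoader, reflection stands in for InstantiatorFactory, and java.util.ArrayList stands in for a plugin class such as a JDBC driver. The class and method names below are hypothetical and are not CDAP APIs.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

public class DynamicPluginLoadSketch {

  /**
   * Stand-in for the proposed createPluginClassLoader: builds a classloader over
   * the plugin artifact jars, delegating to the caller's classloader as parent.
   */
  static ClassLoader createPluginClassLoader(ClassLoader parent, URL[] artifactJars) {
    return new URLClassLoader(artifactJars, parent);
  }

  /**
   * Loads the named class through the plugin classloader and instantiates it
   * with its no-argument constructor (assumption: the plugin class has one).
   */
  static <T> T instantiate(ClassLoader pluginClassLoader, String className, Class<T> expectedType) {
    try {
      Class<?> pluginClass = Class.forName(className, true, pluginClassLoader);
      return expectedType.cast(pluginClass.getDeclaredConstructor().newInstance());
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate plugin class " + className, e);
    }
  }

  public static void main(String[] args) {
    // No jar URLs here, so the class is resolved through the parent classloader.
    ClassLoader pluginClassLoader =
        createPluginClassLoader(DynamicPluginLoadSketch.class.getClassLoader(), new URL[0]);

    // java.util.ArrayList is only a placeholder for a real plugin class.
    List<?> plugin = instantiate(pluginClassLoader, "java.util.ArrayList", List.class);
    System.out.println(plugin.getClass().getName());
  }
}
```

In a real deployment the URL array would point at the localized plugin artifact jar, so the plugin's classes stay isolated in their own classloader while shared API classes resolve through the parent.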

 

 

 

Code Block (PluginContext.java)
public interface PluginContext {
...
 
  /**
   * Gets a read-only artifact repository. ReadOnlyArtifactRepository provides the ability to get artifacts
   * by plugin type and name, to get the plugin class for a given artifactId, pluginType and pluginName,
   * and to instantiate that plugin.
   * Only read operations on artifacts and plugin instantiation are supported; delete and modify operations
   * are not allowed.
   * Since this provides a live view of the artifact repository, the same call can return different results
   * if the repository changes.
   *
   * @return read-only access to the artifact repository
   */
  ReadOnlyArtifactRepository getArtifactRepository();
  
  ...
}
Code Block
/**
 * Read only access to artifact repository.
 */
interface ReadOnlyArtifactRepository {

  /**
   * For the given plugin type and plugin name, gets all the parent artifacts that the plugin can be used by.
   *
   * @param pluginType the type of the plugin
   * @param pluginName the name of the plugin
   * @return set of parent artifacts that are extended by the plugin identified by pluginType and pluginName
   */
  Set<ArtifactId> getParentArtifactsForPlugin(String pluginType, String pluginName);

  /**
   * Gets the plugin identified by parent artifact id, pluginType and pluginName.
   *
   * @param parentArtifactId the id of the parent artifact
   * @param pluginType the type of the plugin
   * @param pluginName the name of the plugin
   * @return the plugin
   * @throws PluginNotExistsException if the plugin is not found
   */
  Plugin getPlugin(ArtifactId parentArtifactId, String pluginType, String pluginName)
      throws PluginNotExistsException;

  /**
   * Instantiates the supplied plugin using the provided ClassLoader and returns the instantiated plugin.
   *
   * @param classLoader the classloader used to load the plugin class
   * @param plugin the loaded plugin
   * @return the instantiated plugin
   * @throws InstantiationException if the plugin cannot be instantiated
   */
  <T> T instantiatePlugin(ClassLoader classLoader, Plugin plugin) throws InstantiationException;

}
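
To illustrate how a caller might use such a repository, here is a minimal in-memory sketch. Everything in it is a hypothetical stand-in: the ArtifactId and Plugin classes are simplified placeholders rather than the CDAP types, a NoSuchElementException replaces PluginNotExistsException, and instantiatePlugin takes an extra expected-type argument that the proposed signature does not have.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.Set;

public class ReadOnlyRepositorySketch {

  /** Hypothetical stand-in for an artifact id (name plus version). */
  static final class ArtifactId {
    final String name;
    final String version;
    ArtifactId(String name, String version) { this.name = name; this.version = version; }
    @Override public boolean equals(Object o) {
      return o instanceof ArtifactId
          && ((ArtifactId) o).name.equals(name)
          && ((ArtifactId) o).version.equals(version);
    }
    @Override public int hashCode() { return Objects.hash(name, version); }
  }

  /** Hypothetical stand-in for a plugin entry registered against a parent artifact. */
  static final class Plugin {
    final ArtifactId parent;
    final String type;
    final String name;
    final String className;
    Plugin(ArtifactId parent, String type, String name, String className) {
      this.parent = parent; this.type = type; this.name = name; this.className = className;
    }
  }

  /** In-memory repository exposing only lookups and instantiation, no mutation. */
  static final class InMemoryRepository {
    private final List<Plugin> plugins;
    InMemoryRepository(List<Plugin> plugins) { this.plugins = new ArrayList<>(plugins); }

    Set<ArtifactId> getParentArtifactsForPlugin(String pluginType, String pluginName) {
      Set<ArtifactId> parents = new LinkedHashSet<>();
      for (Plugin p : plugins) {
        if (p.type.equals(pluginType) && p.name.equals(pluginName)) {
          parents.add(p.parent);
        }
      }
      return parents;
    }

    Plugin getPlugin(ArtifactId parent, String pluginType, String pluginName) {
      for (Plugin p : plugins) {
        if (p.parent.equals(parent) && p.type.equals(pluginType) && p.name.equals(pluginName)) {
          return p;
        }
      }
      throw new NoSuchElementException("No plugin " + pluginType + ":" + pluginName);
    }

    <T> T instantiatePlugin(ClassLoader classLoader, Plugin plugin, Class<T> expectedType) {
      try {
        Class<?> cls = Class.forName(plugin.className, true, classLoader);
        return expectedType.cast(cls.getDeclaredConstructor().newInstance());
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Cannot instantiate " + plugin.className, e);
      }
    }
  }

  public static void main(String[] args) {
    ArtifactId parent = new ArtifactId("wrangler-service", "1.0.0");
    List<Plugin> registered = new ArrayList<>();
    // java.util.ArrayList is a placeholder for a real driver class.
    registered.add(new Plugin(parent, "jdbc", "postgres", "java.util.ArrayList"));
    InMemoryRepository repo = new InMemoryRepository(registered);

    for (ArtifactId id : repo.getParentArtifactsForPlugin("jdbc", "postgres")) {
      Plugin plugin = repo.getPlugin(id, "jdbc", "postgres");
      Object instance = repo.instantiatePlugin(
          ReadOnlyRepositorySketch.class.getClassLoader(), plugin, Object.class);
      System.out.println(id.name + " -> " + instance.getClass().getName());
    }
  }
}
```

The lookup happens on every call, which mirrors the live-view behaviour described for the repository: a plugin added later would show up in a subsequent call.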

Deprecated Programmatic APIs

New REST APIs

Path | Method | Description | Response Code | Response
/v3/apps/<app-id> | GET | Returns the application spec for a given application | 200 - On success; 404 - When application is not available; 500 - Any internal errors |

 

     

Deprecated REST API

Path | Method | Description
/v3/apps/<app-id> | GET | Returns the application spec for a given application

CLI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

UI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

Security Impact 

What is the impact on authorization, and how does the design take care of this aspect?

Impact on Infrastructure Outages 

System behavior (if applicable, document the impact of downstream component failures such as YARN or HBase) and how the design takes care of these aspects

Test Scenarios

Test ID | Test Description | Expected Results
   
   
   
   

Releases

Release X.Y.Z

Release X.Y.Z

Related Work

  • Work #1
  • Work #2
  • Work #3

 

Future work