Overview

The purpose of this page is to illustrate the plan for ApplicationTemplate and Application consolidation. This work is being tracked in Jira.

 

Why do we want to consolidate templates and applications? In CDAP 3.0, an ApplicationTemplate is a way for somebody to write an Application that can be given some configuration to create an Adapter. The story is confusing; one would expect an ApplicationTemplate to create... Applications. Instead, we use the term Adapter because Application already means something else. In addition, an ApplicationTemplate can only include a single workflow or a single worker, so templates and applications give people different experiences.

 

Really, the goal of templates was to be able to write one piece of Application code that could be used to create multiple Applications. To do this requires that an Application can be configured at creation time instead of at compile time. For example, a user should be able to set the name of their dataset based on configuration instead of hardcoding it in the code. To support this, we plan on making it possible to get a configuration object from the ApplicationContext available in Application's configure() method. This allows somebody to pass in a config when creating an Application through the RESTful API, which can be used to configure an Application. The relevant programmatic API changes are shown below.

public interface ApplicationContext<T extends Config> {
  T getConfig();
}

public interface Application<T extends Config> {
  void configure(ApplicationConfigurer configurer, ApplicationContext<T> context);
}
 
public abstract class AbstractApplication<T extends Config> implements Application<T> {
  ...
  protected final ApplicationContext<T> getContext() { return context; }
}
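
To make the idea of creation-time configuration concrete, here is a minimal, self-contained sketch. It does not use the real CDAP classes: Config, ApplicationContext, SketchApplication, TableConfig, and TableApp are hypothetical stand-ins that only mimic the shapes shown above, and the dataset list is a placeholder for what a real configurer would record.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point first so the file can be run directly as a single source file.
class ConfigSketch {
  // Creates one application from the shared TableApp code and a per-instance config.
  static List<String> create(String table) {
    TableApp app = new TableApp();
    TableConfig config = new TableConfig(table);
    app.configure(() -> config);
    return app.datasets;
  }

  public static void main(String[] args) {
    System.out.println(create("events"));
    System.out.println(create("clicks"));
  }
}

// Hypothetical stand-ins for the CDAP types; only their shapes matter here.
class Config { }

interface ApplicationContext<T extends Config> {
  T getConfig();
}

abstract class SketchApplication<T extends Config> {
  // Names of the datasets this application would create.
  final List<String> datasets = new ArrayList<>();

  abstract void configure(ApplicationContext<T> context);
}

class TableConfig extends Config {
  final String table;

  TableConfig(String table) {
    this.table = table;
  }
}

class TableApp extends SketchApplication<TableConfig> {
  @Override
  void configure(ApplicationContext<TableConfig> context) {
    // The dataset name comes from the config instead of a hardcoded constant.
    datasets.add(context.getConfig().table);
  }
}
```

The same TableApp class yields two differently configured applications, which is the property templates were originally meant to provide.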

 

Use Case Walkthrough

We will use this example to walk through some use cases.

 

public class MyApp extends AbstractApplication<MyApp.MyConfig> {
 
  public static class MyConfig extends Config {
    @Nullable
    @Description("The name of the stream to read from. Defaults to 'A'.")
    private String stream;
 
    @Nullable
    @Description("The name of the table to write to. Defaults to 'X'.")
    private String table;
 
    @Name("flow")
    private MyFlow.FlowConfig flowConfig;
 
    private MyConfig() {
      this.stream = "A";
      this.table = "X";
    }
  }
 
  public void configure() {
    // ApplicationContext now has a method to get a custom config object whose fields will
    // be injected using the values given in the RESTful API
    MyConfig config = getContext().getConfig();
    addStream(new Stream(config.stream));
    createDataset(config.table, Table.class);
    addFlow(new MyFlow(config.stream, config.table, config.flowConfig));
  }
}
 
public class MyFlow implements Flow {
  @Property
  private String stream;
  @Property
  private String table;
  @Property
  private FlowConfig flowConfig;
 
  public static final class FlowConfig extends Config {
    private ReaderConfig reader;
    private WriterConfig writer;
  }
 
  MyFlow(String stream, String table, FlowConfig flowConfig) {
    this.stream = stream;
    this.table = table;
    this.flowConfig = flowConfig;
  }
 
  @Override
  public FlowSpecification configure() {
    return FlowSpecification.Builder.with()
      .setName("MyFlow")
      .setDescription("Reads from a stream and writes to a table")
      .withFlowlets()
        .add("reader", new StreamReader(flowConfig.reader))
        .add("writer", new TableWriter(flowConfig.writer))
      .connect()
        .fromStream(stream).to("reader")
        .from("reader").to("writer")
      .build();
  }
} 
 
public class StreamReader extends AbstractFlowlet {
  private OutputEmitter<Put> emitter;
  @Property
  private ReaderConfig readerConfig;
  private Reader reader; 

  public static class ReaderConfig extends Config {
    @Description("The name of the reader plugin to use.")
    String name; 

    @Description("The properties needed by the chosen reader plugin.")
    @PluginType("reader")
    PluginProperties properties;
  }

  public static interface Reader {
    Put read(StreamEvent event);
  }
 
  StreamReader(ReaderConfig readerConfig) {
    this.readerConfig = readerConfig;
  }
 
  @Override
  public FlowletSpecification configure() {
    // arguments are: type, name, id, properties
    usePlugin("reader", readerConfig.name, "streamReader", readerConfig.properties);
  }

  @Override
  public void initialize(FlowletContext context) throws Exception {
    reader = context.newPluginInstance("streamReader");
  }

  @ProcessInput
  public void process(StreamEvent event) {
    emitter.emit(reader.read(event));
  }
}
 
@Plugin(type = "reader")
@Name("default")
@Description("Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.")
public class DefaultStreamReader implements StreamReader.Reader {
  private DefaultConfig config;
 
  public static class DefaultConfig extends PluginConfig {
    @Description("The header that should be used as the row key to write to. Defaults to 'rowkey'.")
    @Nullable
    private String rowkey;
    
    private DefaultConfig() {
      rowkey = "rowkey";
    }
  }
 
  public Put read(StreamEvent event) {
    Put put = new Put(Bytes.toBytes(event.getHeaders().get(config.rowkey)));
    put.add("timestamp", event.getTimestamp());
    put.add("body", Bytes.toBytes(event.getBody()));
    return put;
  }
}
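
The comment in MyApp.configure() says that config fields are injected from the values given in the RESTful API, and the @Nullable fields fall back to constructor defaults. One way this could work is sketched below using reflection. This is an assumption about the mechanism, not the CDAP implementation; the nested MyConfig here is a simplified copy with only the two string fields, and a flat map stands in for the parsed JSON.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

class ConfigInjection {
  // Simplified copy of MyConfig from the walkthrough, with its defaults.
  static class MyConfig {
    String stream = "A";
    String table = "X";
  }

  // Copies each supplied value into the field of the same name.
  // Fields that are not mentioned keep their default values.
  static MyConfig inject(Map<String, String> values) {
    MyConfig config = new MyConfig();
    for (Map.Entry<String, String> entry : values.entrySet()) {
      try {
        Field field = MyConfig.class.getDeclaredField(entry.getKey());
        field.set(config, entry.getValue());
      } catch (ReflectiveOperationException e) {
        throw new IllegalArgumentException("Unknown config property: " + entry.getKey(), e);
      }
    }
    return config;
  }

  public static void main(String[] args) {
    Map<String, String> values = new HashMap<>();
    values.put("stream", "purchases");
    MyConfig config = inject(values);
    // "stream" was supplied, "table" falls back to its default.
    System.out.println(config.stream + " -> " + config.table);
  }
}
```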

 

1. Deploying an Artifact

A development team creates a project built on top of CDAP. Their CI build runs and produces a jar file. An administrator deploys the jar by making a REST call:

POST /namespaces/default/artifacts/myapp --data-binary @myapp-1.0.0.jar

CDAP opens the jar, determines the artifact version from the Bundle-Version in the manifest, determines which apps, programs, datasets, and plugins are in the artifact, then stores the artifact on the filesystem and its metadata in a table.
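
Reading the version from the manifest can be done with the standard java.util.jar API. The sketch below parses manifest text directly so that it runs without a jar on disk; for a real jar the Manifest would come from JarFile.getManifest(). The class and method names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

class ManifestVersion {
  // Returns the Bundle-Version main attribute, or null if the manifest has none.
  static String bundleVersion(Manifest manifest) {
    return manifest.getMainAttributes().getValue("Bundle-Version");
  }

  // Parses manifest text; a deployed jar would supply this via JarFile.getManifest().
  static Manifest parse(String text) {
    try {
      return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Manifest manifest = parse("Manifest-Version: 1.0\nBundle-Version: 1.0.0\n");
    System.out.println(bundleVersion(manifest));
  }
}
```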

The administrator can examine the metadata by making a call:

GET /namespaces/default/artifacts/myapp/versions/1.0.0
 
{
  "name": "myapp",
  "version": "1.0.0",
  "classes": {
    "apps": [
      {
        "className": "co.cask.cdap.examples.myapp.MyApp",
        "properties": {
          "stream": { 
            "name": "stream", 
            "description": "The name of the stream to read from. Defaults to 'A'.", 
            "type": "string", 
            "required": false 
          },
          "table": {
            "name": "table",
            "description": "The name of the table to write to. Defaults to 'X'.",
            "type": "string",
            "required": false
          },
          "flowConfig": {
            "name": "flow",
            "description": "",
            "type": "config",
            "fields": {
              "reader": {
                "name": "reader",
                "description": "",
                "type": "config",
                "required": true,
                "fields": {
                  "name": {
                    "name": "name",
                    "description": "The name of the reader plugin to use.",
                    "type": "string",
                    "required": true
                  },
                  "properties": {
                    "name": "properties",
                    "description": "The properties needed by the chosen reader plugin.",
                    "type": "plugin",
                    "plugintype": "reader",
                    "required": true
                  }
                }
              },
              "writer": { ... }
            }
          }
        }
      }
    ],
    "plugins": [
      {
        "name": "default",
        "type": "reader",
        "description": "Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.",
        "className": "co.cask.cdap.examples.myapp.plugins.DefaultStreamReader",
        "properties": {
          "rowkey": {
            "name": "rowkey",
            "description": "The header that should be used as the row key to write to. Defaults to 'rowkey'.",
            "type": "string",
            "required": false
          }
        }
      }
    ],
    "flows": [ ... ],
    "flowlets": [ ... ],
    "datasetModules": [ ... ]
  }
}

Reverse indices will be maintained to allow querying the classes in artifacts directly:

GET /namespaces/default/classes/plugintypes
[ "reader" ]
 
GET /namespaces/default/classes/plugintypes/reader
[
  {
    "type": "reader",
    "name": "default",
    "description": "Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.",
    "className": "co.cask.cdap.examples.myapp.plugins.DefaultStreamReader",
    "artifact": {
      "namespace": "default",
      "name": "myapp",
      "version": "1.0.0"
    }
  }
]
 
GET /namespaces/default/classes/apps
[
  {
    "className": "co.cask.cdap.examples.myapp.MyApp",
    "description": "",
    "artifact": {
      "namespace": "default",
      "name": "myapp",
      "version": "1.0.0"
    }
  }
]
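
A reverse index of this kind could be as simple as a map from plugin type to the plugins of that type, updated whenever an artifact is inspected. The sketch below is an in-memory illustration under that assumption; the PluginIndex class and its string encoding of entries are hypothetical, and the design above stores this metadata in a table rather than in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PluginIndex {
  // plugin type -> entries of the form "artifactName:version/pluginName"
  private final Map<String, List<String>> byType = new TreeMap<>();

  // Called once per plugin class found while inspecting a deployed artifact.
  void add(String type, String pluginName, String artifactName, String artifactVersion) {
    byType.computeIfAbsent(type, k -> new ArrayList<>())
      .add(artifactName + ":" + artifactVersion + "/" + pluginName);
  }

  // Backs a query such as GET .../classes/plugintypes
  List<String> pluginTypes() {
    return new ArrayList<>(byType.keySet());
  }

  // Backs a query such as GET .../classes/plugintypes/<plugin-type>
  List<String> plugins(String type) {
    return byType.getOrDefault(type, Collections.emptyList());
  }

  public static void main(String[] args) {
    PluginIndex index = new PluginIndex();
    index.add("reader", "default", "myapp", "1.0.0");
    System.out.println(index.pluginTypes());
    System.out.println(index.plugins("reader"));
  }
}
```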

2. Creating an Application

The administrator notices there is an app 'co.cask.cdap.examples.myapp.MyApp' contained in the artifact.  Based on the app properties, the admin gathers that it needs a config of the form:

{
  "stream": "A",
  "table": "X",
  "flow": {
    "reader": { 
      "name": "<some plugin name>",
      "properties": { <properties for plugins of type "reader"> }
    },
    "writer": { ... }
  }
}

The admin then makes a call to see what plugins of type 'reader' are available:

GET /namespaces/default/classes/plugintypes/reader
[
  {
    "type": "reader",
    "name": "default",
    "description": "Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.",
    "className": "co.cask.cdap.examples.myapp.plugins.DefaultStreamReader",
    "artifact": {
      "namespace": "default",
      "name": "myapp",
      "version": "1.0.0"
    }
  }
]

It looks like there is only one plugin of type reader available. Another call gives more details about what that plugin requires:

GET /namespaces/default/classes/plugintypes/reader/plugins/default
[  
  {    
    "type": "reader",
    "name": "default",
    "description": "Writes timestamp and body as two columns and expects the row key to come as a header in the stream event.",
    "className": "co.cask.cdap.examples.myapp.plugins.DefaultStreamReader",
    "properties": {         
      "rowkey": {
        "name": "rowkey",
        "description": "The header that should be used as the row key to write to. Defaults to 'rowkey'.",
        "type": "string",
        "required": false
      }
    },
    "artifact": {
      "namespace": "default",
      "name": "myapp",
      "version": "1.0.0"
    }
  }
]

Now the admin has all the information needed to create an application from the artifact:

PUT /namespaces/default/apps/purchaseDump -d '
{ 
  "artifact": {
    "name": "myapp",
    "version": "1.0.0"
  },
  "config": {
    "stream": "purchases",
    "table": "events",
    "flow": {
      "reader": { 
        "name": "default",
        "properties": { }
      }, 
      "writer": { ... }
    }
  }
}'

This creates an application that reads from the 'purchases' stream and writes to the 'events' table. In the same way, other applications can be created that read from and write to configurable data sources.
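
For clients that prefer Java over a command-line call, the same PUT can be built with the JDK's java.net.http API (Java 11 or later). The sketch below only constructs the request and does not send it. The base URL is an assumption to adjust for your cluster, the /v3 prefix follows the API tables later on this page, and the body omits the "flow" section because the writer config is elided in the walkthrough.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class CreateAppRequest {
  // Assumption: address of the CDAP router; adjust for a real cluster.
  static final String BASE = "http://localhost:10000/v3";

  // Builds the PUT request that creates an application from an existing artifact.
  static HttpRequest build(String namespace, String appName, String body) {
    return HttpRequest.newBuilder()
      .uri(URI.create(BASE + "/namespaces/" + namespace + "/apps/" + appName))
      .header("Content-Type", "application/json")
      .PUT(HttpRequest.BodyPublishers.ofString(body))
      .build();
  }

  public static void main(String[] args) {
    String body = "{\"artifact\":{\"name\":\"myapp\",\"version\":\"1.0.0\"},"
      + "\"config\":{\"stream\":\"purchases\",\"table\":\"events\"}}";
    HttpRequest request = build("default", "purchaseDump", body);
    System.out.println(request.method() + " " + request.uri());
  }
}
```

Sending the request would then be a matter of passing it to HttpClient.send with a suitable body handler.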

3. Upgrading an Application

A bug is found in the application code. A new version of the artifact is created and deployed:

POST /namespaces/default/artifacts/myapp --data-binary @myapp-1.0.1.jar

The admin wants to see what applications are using the old version of the artifact:

GET /namespaces/default/apps?artifactName=myapp&artifactVersion=1.0.0
[
  {
    "name": "purchaseDump"
  }
]
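
The filtering behind this query can be sketched as follows, treating the artifact version as optional in line with the API table later on this page. The AppQuery class and the second app in the example are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AppQuery {
  static class App {
    final String name;
    final String artifactName;
    final String artifactVersion;

    App(String name, String artifactName, String artifactVersion) {
      this.name = name;
      this.artifactName = artifactName;
      this.artifactVersion = artifactVersion;
    }
  }

  // Returns the names of apps created from the given artifact.
  // artifactVersion may be null, in which case every version matches.
  static List<String> find(List<App> apps, String artifactName, String artifactVersion) {
    List<String> names = new ArrayList<>();
    for (App app : apps) {
      boolean nameMatches = app.artifactName.equals(artifactName);
      boolean versionMatches = artifactVersion == null || app.artifactVersion.equals(artifactVersion);
      if (nameMatches && versionMatches) {
        names.add(app.name);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    List<App> apps = Arrays.asList(
      new App("purchaseDump", "myapp", "1.0.0"),
      new App("clickDump", "myapp", "1.0.1")); // hypothetical second app
    System.out.println(find(apps, "myapp", "1.0.0"));
  }
}
```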

The admin stops all running programs for the app using existing APIs. A call is then made to upgrade the app to the latest artifact:

PUT /namespaces/default/apps/purchaseDump/properties -d '
{
  "artifact": {
    "name": "myapp",
    "version": "1.0.1"
  },
  "config": {
    "stream": "purchases",
    "table": "events",
    "flow": {
      "reader": { 
        "name": "default",
        "properties": { }
      }, 
      "writer": { ... }
    }
  }
}'

4. Rolling back an Application

Oops, it turns out v1.0.1 has a critical bug. The admin stops all running programs, then makes a call to roll back to the previous version:

PUT /namespaces/default/apps/purchaseDump/properties -d '
{
  "artifact": {
    "name": "myapp",
    "version": "1.0.0"
  },
  "config": {
    "stream": "purchases",
    "table": "events",
    "flow": {
      "reader": { 
        "name": "default",
        "properties": { }
      }, 
      "writer": { ... }
    }
  }
}'
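
Upgrade and rollback are the same operation with a different target version, and both require that no programs are running. A minimal in-memory sketch of that rule is shown below; the AppUpdate class and its fields are hypothetical and say nothing about how CDAP tracks running programs.

```java
class AppUpdate {
  static class App {
    String artifactVersion;
    int runningPrograms;

    App(String artifactVersion, int runningPrograms) {
      this.artifactVersion = artifactVersion;
      this.runningPrograms = runningPrograms;
    }
  }

  // Points the app at another artifact version; refuses while programs are running.
  static void setArtifactVersion(App app, String version) {
    if (app.runningPrograms > 0) {
      throw new IllegalStateException("Stop all running programs before updating the app");
    }
    app.artifactVersion = version;
  }

  public static void main(String[] args) {
    App app = new App("1.0.0", 0);
    setArtifactVersion(app, "1.0.1"); // upgrade
    setArtifactVersion(app, "1.0.0"); // rollback uses the same call
    System.out.println(app.artifactVersion);
  }
}
```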

5. System Artifacts

System artifacts are special artifacts that can be accessed in other namespaces. They cannot be deployed through the RESTful API. Instead, they are placed in a directory on the CDAP master host. When CDAP starts up, the directory will be scanned and those artifacts will be added to the system. Example uses for system artifacts are the ETLBatch and ETLRealtime applications that we want to include out of the box.
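
The startup scan could look roughly like the sketch below, which lists the jar files in a directory using java.nio.file. The directory location, the file names in demo(), and the SystemArtifactScanner class are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SystemArtifactScanner {
  // Returns the names of all jar files directly inside dir, sorted alphabetically.
  static List<String> scan(Path dir) {
    try (Stream<Path> files = Files.list(dir)) {
      return files
        .filter(Files::isRegularFile)
        .map(path -> path.getFileName().toString())
        .filter(name -> name.endsWith(".jar"))
        .sorted()
        .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Builds a throwaway directory with hypothetical file names and scans it.
  static List<String> demo() {
    try {
      Path dir = Files.createTempDirectory("system-artifacts");
      Files.createFile(dir.resolve("ETLRealtime-3.1.0.jar"));
      Files.createFile(dir.resolve("ETLBatch-3.1.0.jar"));
      Files.createFile(dir.resolve("README.txt"));
      return scan(dir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```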

 

System artifacts are included in results by default and are indicated with a special flag.

GET /namespaces/default/artifacts?includeSystem=true
[
  {
    "name": "ETLBatch",
    "version": "3.1.0",
    "isSystem": true
  },  
  {
    "name": "ETLRealtime",
    "version": "3.1.0",
    "isSystem": true
  },
  {
    "name": "ETLPlugins",
    "version": "3.1.0",
    "isSystem": true
  },
  {
    "name": "myapp",
    "version": "1.0.0",
    "isSystem": false
  },
  {
    "name": "myapp",
    "version": "1.0.1",
    "isSystem": false
  }
]

System artifacts can be excluded from results using a filter:

GET /namespaces/default/artifacts?includeSystem=false
[
  {
    "name": "myapp",
    "version": "1.0.0",
    "isSystem": false
  },
  {
    "name": "myapp",
    "version": "1.0.1",
    "isSystem": false
  }
]
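
The includeSystem filter, with system artifacts included when the parameter is absent, might be implemented along these lines. ArtifactListing and its Artifact class are hypothetical names for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArtifactListing {
  static class Artifact {
    final String name;
    final String version;
    final boolean isSystem;

    Artifact(String name, String version, boolean isSystem) {
      this.name = name;
      this.version = version;
      this.isSystem = isSystem;
    }
  }

  // includeSystem is null when the query parameter is absent;
  // system artifacts are then included, matching the default described above.
  static List<String> list(List<Artifact> all, Boolean includeSystem) {
    boolean include = includeSystem == null || includeSystem;
    List<String> result = new ArrayList<>();
    for (Artifact artifact : all) {
      if (include || !artifact.isSystem) {
        result.add(artifact.name + ":" + artifact.version);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Artifact> all = Arrays.asList(
      new Artifact("ETLBatch", "3.1.0", true),
      new Artifact("myapp", "1.0.0", false));
    System.out.println(list(all, false));
  }
}
```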

 

When a user wants to create an application from a system artifact, they make the same RESTful call as before, adding a flag to indicate that it is a system artifact:

 

PUT /namespaces/default/apps/somePipeline -d '
{ 
  "artifact": {
    "name":"ETLBatch",
    "version":"3.1.0",
    "isSystem": true
  },
  "config": { ... }
}'

6. Deleting an Artifact

Non-snapshot artifacts will be immutable. Advanced users can delete an existing artifact, but the assumption will be that they know exactly what they are doing. Deleting an artifact may cause programs that are using it to fail.
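
The immutability rule can be sketched as a check at deploy time. How a snapshot is recognized is not specified on this page; the sketch assumes the Maven convention of a "-SNAPSHOT" version suffix, and the ArtifactStore class is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class ArtifactStore {
  private final Set<String> deployed = new HashSet<>();

  // Assumption: snapshot versions follow the Maven "-SNAPSHOT" suffix convention.
  static boolean isSnapshot(String version) {
    return version.endsWith("-SNAPSHOT");
  }

  // Returns true if the artifact was stored, or false if it was rejected
  // because a non-snapshot artifact with the same name and version already exists.
  boolean deploy(String name, String version) {
    String id = name + ":" + version;
    if (deployed.contains(id) && !isSnapshot(version)) {
      return false;
    }
    deployed.add(id);
    return true;
  }

  public static void main(String[] args) {
    ArtifactStore store = new ArtifactStore();
    System.out.println(store.deploy("myapp", "1.0.0"));
    System.out.println(store.deploy("myapp", "1.0.0"));
  }
}
```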

7. CDAP Upgrade

The programmatic API changes are all backwards compatible, so existing apps will not need to be recompiled. They will, however, need to be added to the artifact repository.

Any existing adapters will need to be migrated. Ideally, the upgrade tool will create matching applications based on the adapter conf, but at a minimum we will simply delete existing adapters and templates.

RESTful API changes

Application APIs

| Type | Path | Body | Headers | Description |
|---|---|---|---|---|
| GET | /v3/namespaces/<namespace-id>/apps?artifactName=<name>[&artifactVersion=<version>] | | | Get all apps using the given artifact name and, optionally, version |
| POST | /v3/namespaces/<namespace-id>/apps | application jar contents | Application-Config: <json of config> | Same as the deploy API today, except it allows passing config as a header |
| PUT | /v3/namespaces/<namespace-id>/apps/<app-name> | application jar contents | Application-Config: <json of config> | Same as the deploy API today, except it allows passing config as a header |
| PUT | /v3/namespaces/<namespace-id>/apps/<app-name> | { 'artifact': {'name':<name>, 'version':<version>}, 'config': { ... } } | Content-Type: application/json | Create an application from an existing artifact. Note: edits an existing API; behavior differs based on Content-Type |
| PUT | /v3/namespaces/<namespace-id>/apps/<app-name>/properties | { 'artifact': {'name':<name>, 'version':<version>}, 'config': { ... } } | | Update an existing application. No programs can be running |

Artifact APIs

| Type | Path | Body | Headers | Description |
|---|---|---|---|---|
| GET | /v3/namespaces/<namespace-id>/artifacts | | | |
| GET | /v3/namespaces/<namespace-id>/artifacts/<artifact-name> | | | Get data about all artifact versions |
| POST | /v3/namespaces/<namespace-id>/artifacts/<artifact-name> | jar contents | Artifact-Version: <version> | Add a new artifact. The version header is only needed if Bundle-Version is not in the jar manifest. If both are present, the header wins |
| GET | /v3/namespaces/<namespace-id>/artifacts/<artifact-name>/versions/<version> | | | Get details about the artifact, such as what plugins and applications are in the artifact and the properties they support |
| PUT | /v3/namespaces/<namespace-id>/artifacts/<artifact-name>/versions/<version>/classes | list of classes contained in the jar | | Required for 3rd party jars, such as the mysql jdbc connector. It is the equivalent of the .json file we have in 3.0 |
| GET | /v3/namespaces/<namespace-id>/classes/plugintypes | | | |
| GET | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type> | | | |
| GET | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type>/plugins/<plugin-name> | | | Config properties can be nested now; see the example after this table |
| GET | /v3/namespaces/<namespace-id>/classes/apps | | | |
| GET | /v3/namespaces/<namespace-id>/classes/apps/<app-classname> | | | |

Config properties can be nested now. For example:

{
  "className": "co.cask.cdap.example.MyPlugin",
  "description": "My Plugin",
  "name": "MyPlugin",
  "properties": {
    "threshold": { "name": "thresh", "type": "int", "required": false },
    "user": { "name": "user", "type": "config", "required": true,
      "fields": {
        "id": { "name": "id", "type": "long", "required": true },
        "digits": { "name": "phoneNumber", "type": "string", "required": true }
      }
    }
  }
}
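
The version precedence described for the artifact POST call (the Artifact-Version header wins over the manifest's Bundle-Version) can be expressed as a small helper. Rejecting the request when neither source provides a version is an assumption of this sketch, as is the ArtifactVersionResolver name.

```java
class ArtifactVersionResolver {
  // Picks the artifact version: the header takes precedence over the manifest.
  // Assumption: a deployment with no version from either source is rejected.
  static String resolve(String headerVersion, String manifestVersion) {
    if (headerVersion != null && !headerVersion.isEmpty()) {
      return headerVersion;
    }
    if (manifestVersion != null && !manifestVersion.isEmpty()) {
      return manifestVersion;
    }
    throw new IllegalArgumentException(
      "No artifact version: set the Artifact-Version header or Bundle-Version in the manifest");
  }

  public static void main(String[] args) {
    System.out.println(resolve("2.0.0", "1.0.0")); // header wins
    System.out.println(resolve(null, "1.0.0"));    // falls back to the manifest
  }
}
```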

Template APIs (will be removed)

| Type | Path | Replaced By |
|---|---|---|
| GET | /v3/templates | |
| GET | /v3/templates/<template-name> | |
| GET | /v3/templates/<template-name>/extensions/<plugin-type> | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type> |
| GET | /v3/templates/<template-name>/extensions/<plugin-type>/plugins/<plugin-name> | /v3/namespaces/<namespace-id>/classes/plugintypes/<plugin-type>/plugins/<plugin-name> |
| PUT | /v3/namespaces/<namespace-id>/templates/<template-id> | |
| GET | /v3/namespaces/<namespace-id>/adapters | |
| GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name> | |
| POST | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/start | |
| POST | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/stop | |
| GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/status | |
| GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/runs | |
| GET | /v3/namespaces/<namespace-id>/adapters/<adapter-name>/runs/<run-id> | |
| DELETE | /v3/namespaces/<namespace-id>/adapters/<adapter-name> | |