...


Option #3 (based on discussion with Terence)

1) No new user-level apps are deployed. The config store is used to store user drafts of hydrator apps.

2) The REST endpoint 'configure' can accept a partial config and return a config response containing suggested values for the fields of a plugin, along with any exceptions raised while configuring the plugin.

  • The user can choose a value from the suggestions for a field and call configure again.
  • The user can look at the exception, fix the issue in either the script or the configuration, and call configure again.
  • When all the required configs are provided and there aren't any exceptions, completionStatus would be set to true for the plugin.
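The configure round trip described above can be sketched as a small loop. This is only an illustration under stated assumptions: ConfigureResult, ConfigureLoopSketch, the two required fields, and the duration pattern are hypothetical names and rules, and the configure method is an in-memory stand-in for the REST endpoint rather than its real implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical response shape: suggestions per field, an optional exception,
// and a completion flag.
class ConfigureResult {
  final Map<String, List<String>> suggestions = new LinkedHashMap<>();
  String exception;   // null when configure raised no error
  boolean complete;   // true once all required fields are set and valid
}

class ConfigureLoopSketch {
  // Stand-in for the 'configure' endpoint (assumption): a plugin that
  // requires "format" and "duration".
  static ConfigureResult configure(Map<String, String> partial) {
    ConfigureResult result = new ConfigureResult();
    if (!partial.containsKey("format")) {
      result.suggestions.put("format", Arrays.asList("syslog", "csv"));
    }
    if (!partial.containsKey("duration")) {
      result.suggestions.put("duration", Arrays.asList("1d"));
    } else if (!partial.get("duration").matches("\\d+[smhd]")) {
      result.exception = "Invalid duration: " + partial.get("duration");
    }
    result.complete = result.suggestions.isEmpty() && result.exception == null;
    return result;
  }

  // Client side: call configure repeatedly, accepting the first suggestion
  // for each missing field, until the plugin is complete or reports an error.
  static Map<String, String> completeConfig(Map<String, String> initial) {
    Map<String, String> config = new LinkedHashMap<>(initial);
    ConfigureResult result = configure(config);
    while (!result.complete && result.exception == null) {
      for (Map.Entry<String, List<String>> s : result.suggestions.entrySet()) {
        config.put(s.getKey(), s.getValue().get(0));
      }
      result = configure(config);
    }
    if (result.exception != null) {
      throw new IllegalArgumentException(result.exception);
    }
    return config;
  }
}
```

A real UI would let the user pick among the suggested values instead of always taking the first one, and would show the exception to the user rather than throwing.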


Story 1 - Drafts

 

HTTP Request Type: POST
Endpoint: /namespaces/{namespace-id}/configurations/{config-id}/
Request Body:
{
  "config": {...}
}
Response Status:
  200 OK: config saved successfully
  409 CONFLICT: draft-name already exists
  500 Error: while saving the draft

HTTP Request Type: PUT
Endpoint: /namespaces/{namespace-id}/configurations/{config-id}/
Request Body:
{
  "config": {...}
}
Response Status:
  200 OK: config updated successfully
  404 Not Found: config doesn't exist yet, so it cannot be updated
  500 Error: while updating the config

HTTP Request Type: GET
Endpoint: /namespaces/{namespace-id}/configurations/{config-id}/
Response Status:
  200: return all the versions of the config identified by the config-id
  404: config not found
  500: error while getting config
Response Body:
[
  {
    "timestamp" : "...",
    "config": {
      "source" : {
        ....
      },
      "transforms" : [...],
      "sinks" : [...],
      "connections" : [..]
    }
  },
  ...
]

HTTP Request Type: GET
Endpoint: /namespaces/{namespace-id}/configurations/{config-id}/versions/{version-number} (-1 -> latest version)
Response Status:
  200: return the version of the config identified by the config-id and version-number
  404: config with the given version not found
  500: error while getting config
Response Body:
{
  "timestamp" : "...",
  "config": {
    "source" : {
      ....
    },
    "transforms" : [...],
    "sinks" : [...],
    "connections" : [..]
  }
}

HTTP Request Type: GET
Endpoint: /namespaces/{namespace-id}/configurations/
Response Status:
  200: return the list of names of all saved configs
  500: error
Response Body:
[
  "streamToTPFS",
  "DBToHBase",
  ...
]

HTTP Request Type: DELETE
Endpoint: /namespaces/{namespace-id}/configurations/
Response Status:
  200: successfully deleted all configs
  500: error while deleting

HTTP Request Type: DELETE
Endpoint: /namespaces/{namespace-id}/configurations/{config-id}
Response Status:
  200: successfully deleted the specified config
  404: config does not exist
  500: error while deleting
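The create, update, and delete status codes for the draft endpoints can be summarised with a small in-memory sketch. DraftEndpointsSketch and its method names are hypothetical, the config payload is reduced to a String, and the 500 cases (storage failures) are left out; this is an illustration of the intended semantics, not the handler itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for the drafts of one namespace, keyed by config-id.
class DraftEndpointsSketch {
  private final Map<String, String> drafts = new TreeMap<>();

  // POST: create only; an existing config-id is a conflict.
  int post(String configId, String config) {
    if (drafts.containsKey(configId)) {
      return 409;
    }
    drafts.put(configId, config);
    return 200;
  }

  // PUT: update only; a missing config-id is not found.
  int put(String configId, String config) {
    if (!drafts.containsKey(configId)) {
      return 404;
    }
    drafts.put(configId, config);
    return 200;
  }

  // DELETE on a single config-id.
  int delete(String configId) {
    if (!drafts.containsKey(configId)) {
      return 404;
    }
    drafts.remove(configId);
    return 200;
  }

  // DELETE on the collection: succeeds even when nothing is stored.
  int deleteAll() {
    drafts.clear();
    return 200;
  }

  // GET on the collection: names of all saved configs, sorted by name.
  List<String> list() {
    return new ArrayList<>(drafts.keySet());
  }
}
```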

 

 

The ConsoleSettingsHttpHandler currently makes use of ConfigStore. However, it is not namespaced and has a few other issues; it can be fixed and improved to store configs.

Along with pipeline drafts, ConsoleSettingsHttpHandler currently also stores the following information:

Plugin Template Endpoints:
// get a plugin template
GET namespaces/{namespace-id}/plugin-templates/{plugin-template-id}/
// create a new plugin template
POST namespaces/{namespace-id}/plugin-templates/{plugin-template-id}/ -d '@plugin-template.json'
// update existing plugin template
PUT namespaces/{namespace-id}/plugin-templates/{plugin-template-id}/ -d '@plugin-template.json'
// delete the plugin template
DELETE namespaces/{namespace-id}/plugin-templates/{plugin-template-id}/

Defaults:
// create/update defaults; this includes the user's plugin version preferences, etc.
PUT namespaces/{namespace-id}/defaults -d '@default.json'
GET namespaces/{namespace-id}/defaults

 

JAVA API - Config Store:

Existing ConfigStore methods:
void create(String namespace, String type, Config config) throws ConfigExistsException;

void createOrUpdate(String namespace, String type, Config config);

void delete(String namespace, String type, String id) throws ConfigNotFoundException;

List<Config> list(String namespace, String type);

Config get(String namespace, String type, String id) throws ConfigNotFoundException;

void update(String namespace, String type, Config config) throws ConfigNotFoundException;

New ConfigStore methods:
// get a particular version of an entry.
Config get(String namespace, String type, String id, int version) throws ConfigNotFoundException;
// get all the versions of an entry.
List<Config> getAllVersions(String namespace, String type, String id) throws ConfigNotFoundException;
// delete all entries of the specified type.
void delete(String namespace, String type);
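To make the versioned methods concrete, here is a minimal in-memory sketch. It is not the ConfigStore implementation: VersionedConfigStoreSketch is a hypothetical class, configs are reduced to Strings, NoSuchElementException stands in for ConfigNotFoundException, and zero-based version numbers are an assumption. Only the "-1 means latest" convention comes from the REST table above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

class VersionedConfigStoreSketch {
  // "namespace/type/id" -> versions in write order (list index = version number).
  private final Map<String, List<String>> entries = new HashMap<>();

  private static String key(String namespace, String type, String id) {
    return namespace + "/" + type + "/" + id;
  }

  // Every write appends a new version and returns its version number.
  int createOrUpdate(String namespace, String type, String id, String config) {
    List<String> versions =
        entries.computeIfAbsent(key(namespace, type, id), k -> new ArrayList<>());
    versions.add(config);
    return versions.size() - 1;
  }

  // All versions of an entry, oldest first.
  List<String> getAllVersions(String namespace, String type, String id) {
    List<String> versions = entries.get(key(namespace, type, id));
    if (versions == null) {
      throw new NoSuchElementException("No config: " + key(namespace, type, id));
    }
    return Collections.unmodifiableList(versions);
  }

  // A particular version; -1 resolves to the latest one.
  String get(String namespace, String type, String id, int version) {
    List<String> versions = getAllVersions(namespace, type, id);
    int index = version == -1 ? versions.size() - 1 : version;
    if (index < 0 || index >= versions.size()) {
      throw new NoSuchElementException("No version " + version + " for " + id);
    }
    return versions.get(index);
  }

  // Delete all entries of the given type in the namespace.
  void delete(String namespace, String type) {
    String prefix = namespace + "/" + type + "/";
    entries.keySet().removeIf(k -> k.startsWith(prefix));
  }
}
```

The string key is a simplification that assumes namespace, type, and id never contain "/"; a table-backed store would use a structured row key instead.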

Open Questions:

1) ConfigStore stores the configs in "config.store.table". The table properties currently don't enable versioning, but drafts would need it. Would this also require a CDAP upgrade step to update the properties of the existing dataset?

2) Rename ConsoleSettingsHttpHandler to ConfigurationsHttpHandler?

3) Dependent UI changes.

Story 2 - Schema and field value suggestions

REST API:

Request-Method : POST
Request-Endpoint : /namespaces/{namespace-id}/apps/{app-id}/configure
Request-Body (request.json):
{
    "artifact": {
        "name": "cdap-etl-batch",
        "scope": "SYSTEM",
        "version": "3.4.0-SNAPSHOT"
    },
    "name": "pipeline",
    "config": {
        "source": {
            "name": "Stream",
            "plugin": {
                "name": "StreamSource",
                "artifact": {
                    "name": "core-plugins",
                    "version": "1.3.0-SNAPSHOT",
                    "scope": "SYSTEM"
                },
                "properties": {
                    "format": "syslog",
                    "name": "test",
                    "duration": "1d"
                }
            }
        },
        "sinks" : [{..}],
        "transform": [{..}, {...}]
    }
}
 
Response-Body (response.json):
{
    "artifact": {
        "name": "cdap-etl-batch",
        "scope": "SYSTEM",
        "version": "3.4.0-SNAPSHOT"
    },
    "name": "pipeline",
    "config": {
        "source": {
            "name": "Stream",
            "plugin": {
                "name": "StreamSource",
                "artifact": {
                    "name": "core-plugins",
                    "version": "1.3.0-SNAPSHOT",
                    "scope": "SYSTEM"
                },
                "properties": {
                    "format": "syslog",
                    "name": "test",
                    "duration": "1d",
                    "suggestions" : [{
                        "schema" : [
                            {
                                "ts" : "long",
                                "headers": "Map<String, String>",
                                "program": "string",
                                "message": "string",
                                "pid": "string"
                            }
                        ]
                    }],
                    "isComplete" : "false"
                }
            }
        },
        "sinks" : [{..}],
        "transform": [{..}, {...}]
    }
}

 

JAVA API 

PipelineConfigurable API Change
PipelineConfigurable:
@Beta
public interface PipelineConfigurable {
  // change in return type.
  ConfigResponse configurePipeline(PipelineConfigurer pipelineConfigurer) throws IllegalArgumentException;
}

ConfigResponse:
public class ConfigResponse {
  // list of suggestions for fields.
  List<Suggestion> suggestions;
  // the exception, if any was raised while executing configure.
  @Nullable
  String exception;
  // is the stage configuration complete?
  @DefaultValue("false")
  boolean isComplete;
}

Suggestion:
public class Suggestion {
  String fieldName;
  // list of possible values for the fieldName.
  List<String> fieldValues;
}

ApplicationContext:
@Beta
public interface ApplicationContext<T extends Config> {
  // existing
  T getConfig();
  // application will set a config response
  void setResponseConfig(T response);
  // get the response config
  T getResponseConfig();
}
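As an illustration of how a stage might populate a ConfigResponse, the sketch below redefines simplified copies of ConfigResponse and Suggestion so that it compiles on its own. StageConfigureSketch, its required properties, and the suggested values are hypothetical; the real configurePipeline takes a PipelineConfigurer rather than a property map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Simplified copies of the proposed classes, for illustration only.
class Suggestion {
  final String fieldName;
  final List<String> fieldValues;

  Suggestion(String fieldName, List<String> fieldValues) {
    this.fieldName = fieldName;
    this.fieldValues = fieldValues;
  }
}

class ConfigResponse {
  final List<Suggestion> suggestions = new ArrayList<>();
  String exception;    // stays null when configure succeeds
  boolean isComplete;  // defaults to false
}

class StageConfigureSketch {
  // Hypothetical stage that requires "name" and offers choices for "format".
  static ConfigResponse configure(Map<String, String> properties) {
    ConfigResponse response = new ConfigResponse();
    try {
      if (!properties.containsKey("name")) {
        throw new IllegalArgumentException("Property 'name' is required");
      }
      if (!properties.containsKey("format")) {
        response.suggestions.add(
            new Suggestion("format", Arrays.asList("syslog", "csv")));
      }
      // Complete only when nothing is left to suggest.
      response.isComplete = response.suggestions.isEmpty();
    } catch (IllegalArgumentException e) {
      // Report the failure in the response instead of failing the request.
      response.exception = e.getMessage();
    }
    return response;
  }
}
```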

Open Questions:

1) Would having setResponseConfig and getResponseConfig on ApplicationContext, alongside the input config, allow one CDAP program to set a config that other programs can read? Would that be an issue?

2) Databases have an information schema, which holds metadata about the column names and types of tables.

  • However, we have recently removed tableName from the DBSource plugin, so how would we figure out this information from just the query?
  • How do we get the schema for complex queries involving multiple tables? This would involve parsing the query to understand the fields and the tables they come from, and then querying the information schema for the types.

3) Incompatible API changes.
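One possible direction for question 2, offered as an assumption rather than a decided approach: JDBC can describe the result columns of a query through PreparedStatement.getMetaData(), which does not need a table name and covers multi-table queries. Whether this works without executing the query depends on the driver, and the method may return null. QuerySchemaSketch and its method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

class QuerySchemaSketch {
  // Column label -> database type name, in select-list order.
  static Map<String, String> describe(ResultSetMetaData meta) throws SQLException {
    Map<String, String> schema = new LinkedHashMap<>();
    for (int i = 1; i <= meta.getColumnCount(); i++) {  // JDBC columns are 1-based
      schema.put(meta.getColumnLabel(i), meta.getColumnTypeName(i));
    }
    return schema;
  }

  // Ask the driver to describe the query's result columns without running it.
  static Map<String, String> describeQuery(Connection connection, String query)
      throws SQLException {
    try (PreparedStatement statement = connection.prepareStatement(query)) {
      ResultSetMetaData meta = statement.getMetaData();
      if (meta == null) {
        // Some drivers cannot describe a statement before execution.
        throw new SQLException("Driver cannot describe the query without executing it");
      }
      return describe(meta);
    }
  }
}
```

The type names returned are database-specific, so mapping them to pipeline schema types would still need a separate step.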

 




User Stories (3.5.0)

For the hydrator use case, the backend app should be able to support the hydrator-related functionality listed below:

  1. Query for the plugins available for a given artifact and list them in the UI.
  2. Obtain the output schema of plugins, given the input configuration information.
  3. Deploy a pipeline and start/stop the pipeline.
  4. Query the status of a pipeline run, and the current status of execution if there are multiple stages.
  5. Get the next scheduled run, and query metrics and logs for the pipeline runs.
  6. Create and save pipeline drafts.
  7. Get the input/output streams/datasets of the pipeline run and list them in the UI.
  8. Explore the data of streams/datasets used in the pipeline if they are explorable.
  9. Add new metadata about a pipeline and retrieve metadata by pipeline run, etc.
  10. Delete a hydrator pipeline.
  11. The backend app's functionality should be limited to hydrator; it shouldn't act like a proxy for CDAP.

Having these abilities will remove the logic in CDAP-UI that makes the appropriate CDAP REST calls. This encapsulation will simplify the UI's interaction with the back end and also help in debugging potential issues faster. In the future, we could have more apps similar to the hydrator app, so our back-end app should define and implement generic cases that can be used across these apps, and it should also be extensible to support adding new features.

Generic Endpoints

...