...


Option #3 (based on discussion with Terence)

 No new user-level apps are deployed. The preference store is used to store user drafts of Hydrator apps.

'configurePipeline' can be changed to return partial results: it can return a pluginSpecification containing possible values for information missing from the plugin config. The pluginSpecification is serialized into the applicationSpecification and returned to the user.

Example:

  1. Hydrator calls the preference store to save a namespaced draft. To delete drafts, it calls the preference store's delete endpoint for the drafts. If the user deletes the namespace manually from the CDAP CLI, the preference store drops everything in that namespace, including the drafts.

  2. The plugin configure stage accepts an incomplete config and creates a PluginSpecification containing possible values for the incomplete fields.

    1. Example: the user is using a DBSource plugin and provides connectionString, userName, and password. The UI hits the /validate endpoint with this config, and DBSource's configurePlugin is called. It inspects the config, notices that the required field 'tableName' is missing, connects to the database to get the list of table names, writes this list into the PluginSpecification, and returns failure.

    2. The user notices the failure, reads the specification to get the list of tables, selects the table of interest, and makes the same call again. DBSource's configurePlugin notices that the schema and the 'import' field are missing. It then populates the schema information in the spec and returns failure.

    3. The user fills in the 'import' and 'count' queries, changes the schema as appropriate, and makes the same call. All necessary fields are now present and valid, so the DBSource plugin returns success for this stage, and the user proceeds to the next stage.
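
The three validation rounds above can be sketched as follows. This is only an illustration under stated assumptions, not the DBSource implementation: `DatabaseInspector`, `ValidationResult`, and `DbSourceValidator` are hypothetical names, the database is replaced by an interface, and the schema is carried as an opaque string.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the database lookups; not a CDAP API.
interface DatabaseInspector {
  List<String> listTables(String connectionString, String userName, String password);
  String schemaOf(String tableName);
}

// Hypothetical result of one validation round.
final class ValidationResult {
  final boolean complete;
  // field name -> possible values the user can choose from
  final Map<String, List<String>> suggestions;

  ValidationResult(boolean complete, Map<String, List<String>> suggestions) {
    this.complete = complete;
    this.suggestions = suggestions;
  }
}

final class DbSourceValidator {
  private final DatabaseInspector db;

  DbSourceValidator(DatabaseInspector db) {
    this.db = db;
  }

  ValidationResult validate(Map<String, String> config) {
    Map<String, List<String>> suggestions = new LinkedHashMap<>();
    if (!config.containsKey("tableName")) {
      // Round 1: the table is missing, so suggest the table names and fail.
      suggestions.put("tableName", db.listTables(
          config.get("connectionString"), config.get("userName"), config.get("password")));
      return new ValidationResult(false, suggestions);
    }
    if (!config.containsKey("schema") || !config.containsKey("import")) {
      // Round 2: suggest the schema of the selected table and fail again.
      if (!config.containsKey("schema")) {
        suggestions.put("schema",
            Collections.singletonList(db.schemaOf(config.get("tableName"))));
      }
      return new ValidationResult(false, suggestions);
    }
    // Round 3: every field checked by this sketch is present.
    return new ValidationResult(true, suggestions);
  }
}
```

The caller repeats the same `validate` call with a progressively more complete config until `complete` is true, which mirrors the repeated /validate calls described above.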


REST API DraftsHttpHandler:

 

 

POST /namespaces/{namespace-id}/drafts/{draft-id}/
  Request Body:
    {
      "config": {...}
    }
  Response Status:
    200 OK: draft created and saved successfully
    409 CONFLICT: draft-id already exists
    500 Error: error while creating the draft

PUT /namespaces/{namespace-id}/drafts/{draft-id}/
  Request Body:
    {
      "config": {...}
    }
  Response Status:
    200 OK: draft updated successfully
    404 Not Found: draft does not exist, so it cannot be updated
    500 Error: error while updating the draft

GET /namespaces/{namespace-id}/drafts/{draft-id}/
  Response Status:
    200: returns all the versions of the draft identified by draft-id
    404: draft not found
    500: error while getting the draft
  Response Body:
    [
      {
        "timestamp": "...",
        "config": {
          "source": {
            ....
          },
          "transforms": [...],
          "sinks": [...],
          "connections": [..]
        }
      },
      ...
    ]

GET /namespaces/{namespace-id}/drafts/{draft-id}/versions/{version-number}
  A version-number of -1 returns the latest version.
  Response Status:
    200: returns the version of the draft identified by draft-id and version-number
    404: draft with that version not found
    500: error while getting the draft
  Response Body:
    {
      "timestamp": "...",
      "config": {
        "source": {
          ....
        },
        "transforms": [...],
        "sinks": [...],
        "connections": [..]
      }
    }

GET /namespaces/{namespace-id}/drafts/
  Response Status:
    200: returns the list of names of all saved drafts
    500: error
  Response Body:
    [
      "streamToTPFS",
      "DBToHBase",
      ...
    ]

DELETE /namespaces/{namespace-id}/drafts/
  Response Status:
    200: successfully deleted all drafts
    500: error while deleting

DELETE /namespaces/{namespace-id}/drafts/{draft-id}
  Response Status:
    200: successfully deleted the specified draft
    404: draft does not exist
    500: error while deleting

 

The DraftsHttpHandler can use ConfigStore, following an approach similar to the one taken in PreferenceHttpHandler.

DraftsHttpHandler->DraftStore->ConfigStore.
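
As a rough illustration of the DraftStore layer, the sketch below keeps drafts in memory and returns the status codes the handler would send. `InMemoryDraftStore` and its method names are hypothetical, and two behaviours are assumptions not stated in this design: every PUT appends a new version, and version numbers start at 0 (with -1 meaning latest, as in the endpoint description).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for DraftStore; a real one would delegate to ConfigStore.
final class InMemoryDraftStore {
  static final int OK = 200;
  static final int NOT_FOUND = 404;
  static final int CONFLICT = 409;

  // namespace -> draft id -> config JSON per version, oldest first
  private final Map<String, Map<String, List<String>>> drafts = new HashMap<>();

  private Map<String, List<String>> ns(String namespace) {
    return drafts.computeIfAbsent(namespace, k -> new HashMap<>());
  }

  // POST: fails with 409 if the draft already exists.
  int create(String namespace, String draftId, String configJson) {
    Map<String, List<String>> inNamespace = ns(namespace);
    if (inNamespace.containsKey(draftId)) {
      return CONFLICT;
    }
    List<String> versions = new ArrayList<>();
    versions.add(configJson);
    inNamespace.put(draftId, versions);
    return OK;
  }

  // PUT: fails with 404 if the draft does not exist; otherwise appends a version (assumption).
  int update(String namespace, String draftId, String configJson) {
    List<String> versions = ns(namespace).get(draftId);
    if (versions == null) {
      return NOT_FOUND;
    }
    versions.add(configJson);
    return OK;
  }

  // GET one version: -1 means latest; null stands for a 404.
  String get(String namespace, String draftId, int version) {
    List<String> versions = ns(namespace).get(draftId);
    if (versions == null) {
      return null;
    }
    int index = version == -1 ? versions.size() - 1 : version;
    return index >= 0 && index < versions.size() ? versions.get(index) : null;
  }

  // GET drafts: names of all saved drafts, sorted for stable output.
  List<String> list(String namespace) {
    return new ArrayList<>(new TreeSet<>(ns(namespace).keySet()));
  }

  // DELETE one draft: 404 if it does not exist.
  int delete(String namespace, String draftId) {
    return ns(namespace).remove(draftId) != null ? OK : NOT_FOUND;
  }

  // DELETE all drafts in the namespace.
  int deleteAll(String namespace) {
    drafts.remove(namespace);
    return OK;
  }
}
```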

 

ConfigStore existing methods:

void create(String namespace, String type, Config config) throws ConfigExistsException;

void createOrUpdate(String namespace, String type, Config config);

void delete(String namespace, String type, String id) throws ConfigNotFoundException;

List<Config> list(String namespace, String type);

Config get(String namespace, String type, String id) throws ConfigNotFoundException; 

void update(String namespace, String type, Config config) throws ConfigNotFoundException;

ConfigStore new methods:

// Get a specific version of a draft.
Config get(String namespace, String type, String id, int version) throws ConfigNotFoundException;
// Get all the versions of a draft.
List<Config> getAllVersions(String namespace, String type, String id) throws ConfigNotFoundException;
// Delete all configs of the given type in the namespace (type -> drafts deletes all drafts).
void delete(String namespace, String type);

 

Existing Config class:

public final class Config {
 // draft-id
 private final String id;  
 // config -> json-config and other properties, example:timestamp -> currentTime.
 private final Map<String, String> properties; 


}
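
One way a draft could be packed into such an id-plus-properties object is sketched below. `DraftConfig` is a local stand-in that mirrors the shape of the Config class above rather than the CDAP class itself, and the property keys "config" and "timestamp" are assumptions based on the comment in that class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Local stand-in mirroring Config (id plus string properties); not the CDAP class.
final class DraftConfig {
  private final String id;
  private final Map<String, String> properties;

  DraftConfig(String id, Map<String, String> properties) {
    this.id = id;
    // Defensive copy so later changes to the caller's map do not leak in.
    this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
  }

  String getId() {
    return id;
  }

  Map<String, String> getProperties() {
    return properties;
  }

  // Hypothetical factory: stores the pipeline JSON and the save time as properties.
  static DraftConfig forDraft(String draftId, String configJson, long timestampMillis) {
    Map<String, String> props = new HashMap<>();
    props.put("config", configJson);
    props.put("timestamp", Long.toString(timestampMillis));
    return new DraftConfig(draftId, props);
  }
}
```

A handler could then call `DraftConfig.forDraft(draftId, body, System.currentTimeMillis())` before handing the object to the store.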


Questions :

1) ConfigStore stores the configs in "config.store.table". Currently the table properties don't include versioning, but drafts would need it. Would this affect the "preferences" stored by PreferenceStore? Would it also require a CDAP upgrade step to update the properties of the existing dataset?


REST API for configure suggestions - AppFabric :

Request-Method : POST
Request-Endpoint : /namespaces/{namespace-id}/apps/{app-id}/Configure:configure
Request-Body : config-JSON 
request.json:
{
    "artifact": {
        "name": "cdap-etl-batch",
        "scope": "SYSTEM",
        "version": "3.4.0-SNAPSHOT"
    },
    "name": "pipeline",
    "config": {
        "source": {
            "name": "Stream",
            "plugin": {
                "name": "StreamSource",
                "artifact": {
                    "name": "core-plugins",
                    "version": "1.3.0-SNAPSHOT",
                    "scope": "SYSTEM"
                },
                "properties": {
                    "format": "syslog",
                    "name": "test",
                    "duration": "1d"
                }
            }
        },
        "sinks": [{..}],
        "transform": [{..}, {...}]
    }
}
 
Response-Body : Config JSON
response.json:
{
    "artifact": {
        "name": "cdap-etl-batch",
        "scope": "SYSTEM",
        "version": "3.4.0-SNAPSHOT"
    },
    "name": "pipeline",
    "config": {
        "source": {
            "name": "Stream",
            "plugin": {
                "name": "StreamSource",
                "artifact": {
                    "name": "core-plugins",
                    "version": "1.3.0-SNAPSHOT",
                    "scope": "SYSTEM"
                },
                "properties": {
                    "format": "syslog",
                    "name": "test",
                    "duration": "1d",
                    "suggestions": [{
                        "schema": [
                            {
                                "ts": "long",
                                "headers": "Map<String, String>",
                                "program": "string",
                                "message": "string",
                                "pid": "string"
                            }
                        ]
                    }],
                    "isComplete": "false"
                }
            }
        },
        "sinks": [{..}],
        "transform": [{..}, {...}]
    }
}

 

Plugin API Change
PipelineConfigurable:
@Beta
public interface PipelineConfigurable {
  // Change in return type.
  ConfigResponse configurePipeline(PipelineConfigurer pipelineConfigurer) throws IllegalArgumentException;
}

ConfigResponse:
public class ConfigResponse extends Config {
  // List of suggestions for fields.
  List<Suggestion> suggestions;
  // Set if there was an exception while executing configure.
  @Nullable
  String exception;
  // Is the stage configuration complete?
  @DefaultValue("false")
  boolean isComplete;
}

Suggestion:
public class Suggestion {
  String fieldName;
  // List of possible values for fieldName.
  List<String> fieldValues;
}


ETLMapReduce can construct an ETLConfig response based on these individual config responses and can propagate the config through MR properties.
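
Combining the per-stage responses into one pipeline-level response might look like the sketch below. All names here (`FieldSuggestion`, `StageResponse`, `PipelineResponse`) are hypothetical mirrors of the classes above, and the rule that a pipeline is complete only when every stage is complete and raised no exception is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mirror of Suggestion.
final class FieldSuggestion {
  final String fieldName;
  final List<String> fieldValues;

  FieldSuggestion(String fieldName, List<String> fieldValues) {
    this.fieldName = fieldName;
    this.fieldValues = fieldValues;
  }
}

// Hypothetical mirror of ConfigResponse for a single stage.
final class StageResponse {
  final List<FieldSuggestion> suggestions;
  final String exception; // null when configure raised no exception
  final boolean isComplete;

  StageResponse(List<FieldSuggestion> suggestions, String exception, boolean isComplete) {
    this.suggestions = suggestions;
    this.exception = exception;
    this.isComplete = isComplete;
  }
}

// Hypothetical pipeline-level view built from the stage responses.
final class PipelineResponse {
  final Map<String, StageResponse> stages;
  final List<String> incompleteStages;
  final boolean isComplete;

  private PipelineResponse(Map<String, StageResponse> stages, List<String> incompleteStages) {
    this.stages = stages;
    this.incompleteStages = incompleteStages;
    this.isComplete = incompleteStages.isEmpty();
  }

  // Collects the names of stages that are incomplete or failed, in pipeline order.
  static PipelineResponse combine(Map<String, StageResponse> stageResponses) {
    List<String> incomplete = new ArrayList<>();
    for (Map.Entry<String, StageResponse> entry : stageResponses.entrySet()) {
      StageResponse response = entry.getValue();
      if (!response.isComplete || response.exception != null) {
        incomplete.add(entry.getKey());
      }
    }
    return new PipelineResponse(new LinkedHashMap<>(stageResponses), incomplete);
  }
}
```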

User Stories (3.5.0)

For the Hydrator use case, the backend app should support the Hydrator-related functionality listed below:

  1. Query the plugins available for a given artifact and list them in the UI.
  2. Obtain the output schema of plugins, given the input configuration information.
  3. Deploy a pipeline and start/stop it.
  4. Query the status of a pipeline run, and the current status of execution if there are multiple stages.
  5. Get the next scheduled run, and query metrics and logs for the pipeline runs.
  6. Create and save pipeline drafts.
  7. Get the input/output streams/datasets of the pipeline run and list them in the UI.
  8. Explore the data of streams/datasets used in the pipeline if they are explorable.
  9. Add new metadata about a pipeline and retrieve metadata by pipeline run, etc.
  10. Delete a Hydrator pipeline.

The backend app's functionality should be limited to Hydrator; it should not act as a proxy for CDAP.

Having these abilities removes the need for logic in CDAP-UI to make the appropriate CDAP REST calls. This encapsulation simplifies the UI's interaction with the back end and helps in debugging potential issues faster. In the future, we could have more apps similar to the Hydrator app, so the back-end app should define and implement generic cases that can be used across these apps, and it should be extensible to support new features.

Generic Endpoints

...