Checklist

  •  User Stories Documented
  •  User Stories Reviewed
  •  Design Reviewed
  •  APIs reviewed
  •  Release priorities assigned
  •  Test cases reviewed
  •  Blog post

Introduction 

A CDAP pipeline is composed of various plugins that users configure while developing the pipeline. While building a pipeline, a developer can provide invalid plugin configurations or schemas. For example, the BigQuery sink plugin can have an output schema that does not match the underlying BigQuery table. A pipeline developer can use the new validation endpoint to validate stages before deploying the pipeline. To fail fast and provide a better user experience, the validation endpoint should return all validation errors from a given stage when it is called.

The data pipeline app exposes various error types for plugin validation, and future releases can introduce new ones. With the current implementation, when plugins with new error types are pushed to the hub, the data pipeline artifacts need to be updated for every new error type, because the validation errors are defined in the data pipeline app itself. A better approach is to modify the data pipeline app so that app artifacts do not need to be replaced for every new error type.

Goals

  • To fail fast and provide a better user experience, introduce a new API to collect multiple validation error messages from a stage at configure time
  • Decouple validation error types from the data pipeline app
  • Instrument plugins to use this API to return multiple error messages from the validation endpoint

User Stories 

  • As a CDAP pipeline developer, when I validate a stage, I expect all invalid config properties and input/output schema fields to be highlighted on the CDAP UI with an appropriate error message and corrective action.
  • As a plugin developer, I should be able to capture all validation errors while configuring the plugin so that they can all be surfaced on the CDAP UI.
  • As a plugin developer, I should be able to use new validation error types without replacing data pipeline app artifacts.

API Changes for Plugin Validation

Collect Multiple errors from plugins

To collect multiple validation errors from a stage, StageConfigurer, MultiInputStageConfigurer and MultiOutputStageConfigurer can be modified as shown below. If any validation errors are added to the stage configurer, pipeline deployment will fail and all the errors will be returned in the response of the stage validation REST endpoint. The current implementation does not expose the stage name to the plugin in the configurePipeline method. Plugins need the stage name to create stage-specific errors, so it will be exposed to plugins through the stage configurer as shown below.

Code Block
languagejava
titleStageConfigurer.java
public interface StageConfigurer {

  ...

  /**
   * Gets the stage name.
   * @return stage name
   */
  String getStageName();

  /**
   * Adds a validation failure.
   * @param e failure to add
   */
  void addValidationFailure(ValidationFailure e);
}
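
The effect of these two methods can be illustrated with a minimal, self-contained sketch. It is not the CDAP implementation: CollectingStageConfigurer, the reduced ValidationFailure, and the validate helper are hypothetical stand-ins, assumed here only to show that independent checks each record a failure instead of stopping at the first one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class CollectingConfigurerSketch {

  /** Minimal stand-in for the proposed ValidationFailure; it only carries a message here. */
  static final class ValidationFailure {
    final String message;

    ValidationFailure(String message) {
      this.message = message;
    }
  }

  /** The two proposed methods, reduced to a standalone interface. */
  interface StageConfigurer {
    String getStageName();

    void addValidationFailure(ValidationFailure failure);
  }

  /** Hypothetical implementation that records failures instead of stopping at the first one. */
  static final class CollectingStageConfigurer implements StageConfigurer {
    private final String stageName;
    private final List<ValidationFailure> failures = new ArrayList<>();

    CollectingStageConfigurer(String stageName) {
      this.stageName = stageName;
    }

    @Override
    public String getStageName() {
      return stageName;
    }

    @Override
    public void addValidationFailure(ValidationFailure failure) {
      failures.add(failure);
    }

    List<ValidationFailure> getFailures() {
      return Collections.unmodifiableList(failures);
    }
  }

  /** Runs two independent checks and returns every failure that was recorded. */
  static List<ValidationFailure> validate(String stageName, String source, String destination, String regex) {
    CollectingStageConfigurer configurer = new CollectingStageConfigurer(stageName);
    try {
      Pattern.compile(regex);
    } catch (PatternSyntaxException e) {
      configurer.addValidationFailure(
        new ValidationFailure(configurer.getStageName() + ": invalid regex: " + e.getDescription()));
    }
    if (source.equals(destination)) {
      configurer.addValidationFailure(
        new ValidationFailure(configurer.getStageName() + ": source and destination must differ"));
    }
    return configurer.getFailures();
  }

  public static void main(String[] args) {
    // Both checks fail for these inputs, so two messages are printed.
    for (ValidationFailure failure : validate("fileMove", "data", "data", "[")) {
      System.out.println(failure.message);
    }
  }
}
```

The caller, which in the design is the data pipeline app, can then decide to fail deployment or return the whole list from the validation endpoint.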

Decouple plugin error types from data pipeline app

Approach - 1

To carry error information, a new ValidationFailure class is introduced to collect multiple validation failures in the stage configurer. This class is built using a ValidationFailure.Builder, which only allows string properties. The builder exposes methods to get the message, type and properties of a failure.

The validation failures are collected using ValidationException. Whenever a plugin has an invalid property that is tied to another invalid property, the plugin can throw a validation exception with all the errors collected so far. This keeps plugin validation code much simpler.

Code Block
languagejava
titleValidationFailure.java
/**
 * Represents a failure that occurred during validation.
 */
@Beta
public class ValidationFailure {
  private final String message;
  private final String type;
  private final Map<String, Object> properties;

  ValidationFailure(String message, String type, Map<String, Object> properties) {
    this.message = message;
    this.type = type;
    this.properties = properties;
  }

  public static ValidationFailure.Builder builder(String message, String type) {
    return new ValidationFailure.Builder(message, type);
  }

  public static class Builder {
    private final String message;
    private final String type;
    private final Map<String, Object> properties;

    private Builder(String message, String type) {
      this.message = message;
      this.type = type;
      this.properties = new HashMap<>();
    }

    public ValidationFailure.Builder addProperty(String property, String value) {
      this.properties.put(property, value);
      return this;
    }

    ....
    public ValidationFailure build() {
      return new ValidationFailure(message, type, properties);
    }
  }
}
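
To make the link between a failure's type, message and properties and the flat JSON objects of the Approach 1 response more concrete, here is a small sketch. It uses a reduced copy of the failure class, and toFlatMap is a hypothetical helper: the design does not specify how the data pipeline app serializes a failure, so the flattening shown here is an assumption for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FlatFailureSketch {

  /** Reduced copy of the Approach 1 failure: a message, a type, and string properties. */
  static final class ValidationFailure {
    final String message;
    final String type;
    final Map<String, String> properties;

    private ValidationFailure(String message, String type, Map<String, String> properties) {
      this.message = message;
      this.type = type;
      this.properties = properties;
    }

    static Builder builder(String message, String type) {
      return new Builder(message, type);
    }

    static final class Builder {
      private final String message;
      private final String type;
      // LinkedHashMap keeps properties in insertion order for predictable output.
      private final Map<String, String> properties = new LinkedHashMap<>();

      private Builder(String message, String type) {
        this.message = message;
        this.type = type;
      }

      Builder addProperty(String name, String value) {
        properties.put(name, value);
        return this;
      }

      ValidationFailure build() {
        return new ValidationFailure(message, type, new LinkedHashMap<>(properties));
      }
    }
  }

  /** Hypothetical flattening step: type and message first, then every property at the same level. */
  static Map<String, String> toFlatMap(ValidationFailure failure) {
    Map<String, String> flat = new LinkedHashMap<>();
    flat.put("type", failure.type);
    flat.put("message", failure.message);
    flat.putAll(failure.properties);
    return flat;
  }

  public static void main(String[] args) {
    ValidationFailure failure = ValidationFailure.builder("Can not specify both drop and keep.", "INVALID_PROPERTY")
      .addProperty("stage", "projection")
      .addProperty("property", "drop")
      .build();
    System.out.println(toFlatMap(failure));
  }
}
```

Because the type is just a string and the properties are just string pairs, a plugin can introduce a new failure type without any class in the data pipeline app having to know about it.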

Code Block
languagejava
titleValidationException.java
/**
 * Represents a validation exception.
 */
@Beta
public class ValidationException extends RuntimeException {
  private List<ValidationFailure> failures;

  ValidationException(List<ValidationFailure> failures) {
    super();
    this.failures = failures;
  }

  @Override
  public String getMessage() {
    ...
  }

  /**
   * Returns list of failures.
   */
  public List<ValidationFailure> getFailures() {
    return failures;
  }

  public static ValidationException.Builder builder() {
    return new ValidationException.Builder();
  }

  /**
   * Builder to build validation exception.
   */
  public static class Builder {
    private final List<ValidationFailure> failures;

    private Builder() {
      this.failures = new ArrayList<>();
    }

    public ValidationException.Builder addFailure(ValidationFailure failure) {
      failures.add(failure);
      return this;
    }

    /**
     * Util method to create and add stage validation failure to this exception.
     *
     * @param message validation failure message
     * @param stage stage name
     * @param correctiveAction suggested action
     */
    public ValidationException.Builder addStageValidationFailure(String message, String stage,
                                                                 String correctiveAction) {
      ValidationFailure.Builder builder = ValidationFailure.builder(message, "INVALID_STAGE");
      builder.addProperty("stage", stage);
      builder.addProperty("correctiveAction", correctiveAction);
      failures.add(builder.build());
      return this;
    }
    ...

    /**
     * Build and throw validation exception. This method can be used by plugins to throw an exception at any point
     * during validation. The method will build and throw ValidationException if there are any failures added while
     * building the exception
     *
     * @return Validation exception
     */
    public ValidationException buildAndThrow() {
      if (failures.isEmpty()) {
        return new ValidationException(failures);
      }
      throw new ValidationException(failures);
    }
  }
}

If needed, a plugin can create its own ValidationFailure and add it with addFailure. For convenience, the builder also provides utility methods such as addStageValidationFailure to build common validation failures.


API usage in plugins

Code Block
@Override
public void configurePipeline(PipelineConfigurer pipelineConfigurer) {
  pipelineConfigurer.createDataset(conf.destinationFileset, FileSet.class);
  StageConfigurer stageConfigurer = pipelineConfigurer.getStageConfigurer();
  // get the name of the stage
  String stageName = stageConfigurer.getStageName();
  ValidationException.Builder exceptionBuilder = ValidationException.builder();
  try {
    Pattern.compile(conf.filterRegex);
  } catch (Exception e) {
    // add a property-level validation failure to the exception builder
    exceptionBuilder.addFailure(
      ValidationFailure.builder(e.getMessage(), "INVALID_PROPERTY")
        .addProperty("stage", stageName)
        .addProperty("property", "filterRegex")
        .addProperty("correctiveAction", "Make sure the file regex is correct")
        .build());
  }
  if (conf.sourceFileset.equals(conf.destinationFileset)) {
    // add a stage-level validation failure to the exception builder
    exceptionBuilder.addStageValidationFailure("source and destination filesets must be different", stageName,
                                               "Provide different source and destination filesets");
  }
  // throws ValidationException if any failure was added
  exceptionBuilder.buildAndThrow();
}
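
The case where one check depends on another can be sketched as follows. This is a simplified, self-contained illustration under stated assumptions: the host and port properties are hypothetical, and the reduced ValidationException carries plain messages rather than ValidationFailure objects. It shows the pattern of throwing with everything collected so far once a dependent check can no longer run.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyThrowSketch {

  /** Reduced stand-in for the proposed ValidationException; it carries failure messages only. */
  static final class ValidationException extends RuntimeException {
    private final List<String> failures;

    ValidationException(List<String> failures) {
      super(String.join("; ", failures));
      this.failures = failures;
    }

    List<String> getFailures() {
      return failures;
    }
  }

  /** Validates two hypothetical properties, then a check that depends on the parsed port. */
  static void validate(String host, String portText) {
    List<String> failures = new ArrayList<>();
    if (host == null || host.isEmpty()) {
      failures.add("host must be specified");
    }
    int port = -1;
    try {
      port = Integer.parseInt(portText);
    } catch (NumberFormatException e) {
      failures.add("port must be a number");
    }
    // The range check below needs a parsed port, so throw now with everything collected so far.
    if (!failures.isEmpty()) {
      throw new ValidationException(failures);
    }
    if (port < 1 || port > 65535) {
      failures.add("port must be between 1 and 65535");
      throw new ValidationException(failures);
    }
  }

  public static void main(String[] args) {
    try {
      validate("", "abc");
    } catch (ValidationException e) {
      // Both independent failures are reported together.
      System.out.println(e.getFailures());
    }
  }
}
```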


Approach - 2

A validation error represents an error with one or more causes, each with its own attributes. For example, when the type of an input schema field does not match the underlying sink schema, the cause is an input field mismatch with attributes such as stage name, field name and suggested type. Each error message can be associated with more than one cause. This can happen for plugins such as joiner and splitter, where a stage has multiple input or output schemas. For example, when the input schemas for a joiner are not compatible, the causes include the mismatching fields from the input schemas of the incoming stages. A validation error can therefore be represented as a list of causes, where each cause is a map of cause attribute to value, as shown below.

Code Block
languagejava
titleValidationFailure.java
/**
 * Represents failure that occurred during validation.
 */
@Beta
public class ValidationFailure {
  private final String message;
  protected final List<Map<String, Object>> causes;

  /**
   * Creates a validation failure with a message and an empty list of causes
   * @param message failure message
   */
  public ValidationFailure(String message) {
    this.message = message;
    this.causes = new ArrayList<>();
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    ValidationFailure that = (ValidationFailure) o;
    return message.equals(that.message) && causes.equals(that.causes);
  }

  @Override
  public int hashCode() {
    return Objects.hash(message, causes);
  }
}

All the attributes of a cause can be tracked at central location as below: 

Code Block
languagejava
titleFailureAttributes.java
/**
 * Failure attributes.
 */
public enum FailureAttributes {
  STAGE("stage"), // represents stage being validated
  PROPERTY("property"), // represents stage property
  INPUT_FIELD("inputField"), // represents field in the input schema
  OUTPUT_FIELD("outputField"), // represents field in the output schema
  OUTPUT_PORT("outputPort"), // represents output port for plugins such as SplitterTransform where multiple output schemas are expected
  INPUT_STAGE("inputStage"), // represents input stage for plugins such as Joiner where multiple input schemas are expected
  ..

  private String name;

  FailureAttributes(String name) {
    this.name = name;
  }
}


Introduced Errors

With this approach, the following failure classes can be added to hydrator-common, each representing a specific type of error.

Code Block
languagejava
titleInvalidStageFailure.java
/**
 * Represents failure that occurred during stage validation.
 */
@Beta
public class InvalidStageFailure extends ValidationFailure {
  /**
   * Creates validation failure that occurred during stage validation.
   * @param message failure message
   * @param stage name of the stage that caused this validation failure
   */
  public InvalidStageFailure(String message, String stage) {
    super(message);
    causes.add(Collections.singletonMap("stage", stage));
  }
}
Code Block
languagejava
titleInvalidStagePropertyFailure.java
/**
 * Represents failure that occurred during stage config property validation.
 */
@Beta
public class InvalidStagePropertyFailure extends ValidationFailure {
  /**
   * Creates validation failure that occurred during stage property validation.
   * @param message failure message
   * @param stage name of the stage that caused this validation failure
   * @param property property that is invalid
   */
  public InvalidStagePropertyFailure(String message, String stage, String property) {
    super(message);
    Map<String, Object> map = new HashMap<>();
    map.put("stage", stage);
    map.put("property", property);
    causes.add(map);
  }

  /**
   * Creates validation failure that occurred during stage property validation.
   * @param message failure message
   * @param stage name of the stage that caused this validation failure
   * @param properties properties that caused this failure
   */
  public InvalidStagePropertyFailure(String message, String stage, String[] properties) {
    super(message);
    // add one cause per invalid property so that no property overwrites another
    for (String property : properties) {
      Map<String, Object> map = new HashMap<>();
      map.put("stage", stage);
      map.put("property", property);
      causes.add(map);
    }
  }
}
Code Block
languagejava
titleInvalidInputSchemaFailure.java
/**
 * Represents invalid input schema failure. 
 */
public class InvalidInputSchemaFailure extends ValidationFailure {

  /**
   * Creates invalid input schema failure.
   * @param message failure message
   * @param stage name of the stage 
   * @param map map of incoming stage name to field that is invalid.
   */
  public InvalidInputSchemaFailure(String message, String stage, Map<String, String> map) {
    super(message);
    for (Map.Entry<String, String> entry : map.entrySet()) {
      Map<String, Object> causeMap = new HashMap<>();
      causeMap.put("stage", stage);
      causeMap.put("inputStage", entry.getKey());
      causeMap.put("inputField", entry.getValue());
      causes.add(causeMap);
    }
  }
}
Code Block
languagejava
titleInvalidOutputSchemaFailure.java
/**
 * Represents invalid output schema failure.
 */
public class InvalidOutputSchemaFailure extends ValidationFailure {

  /**
   * Creates invalid output schema failure.
   * @param message failure message
   * @param stage name of the stage
   * @param map map of outgoing port name to field that is invalid
   */
  public InvalidOutputSchemaFailure(String message, String stage, Map<String, String> map) {
    super(message);
    for (Map.Entry<String, String> entry : map.entrySet()) {
      Map<String, Object> causeMap = new HashMap<>();
      causeMap.put("stage", stage);
      causeMap.put("outputPort", entry.getKey());
      causeMap.put("outputField", entry.getValue());
      causes.add(causeMap);
    }
  }
}


API usage in plugins

Code Block
@Override
public void configurePipeline(PipelineConfigurer pipelineConfigurer) {
  pipelineConfigurer.createDataset(conf.destinationFileset, FileSet.class);
  StageConfigurer stageConfigurer = pipelineConfigurer.getStageConfigurer();
  // get the name of the stage 
  String stageName = stageConfigurer.getStageName();
  try {
    Pattern.compile(conf.filterRegex);
  } catch (Exception e) {  
    // add validation error to stage configurer
    stageConfigurer.addValidationFailure(new InvalidStagePropertyFailure(e.getMessage(), stageName, "filterRegex"));
  }
  if (conf.sourceFileset.equals(conf.destinationFileset)) {
    // add validation error to stage configurer
    stageConfigurer.addValidationFailure(new InvalidStageFailure("source and destination filesets must be different", stageName));
  }
}
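
The usage above covers failures with a single cause. For multi-input plugins such as the joiner, one failure carries several causes. The following standalone sketch builds that list the same way InvalidInputSchemaFailure does; inputSchemaCauses is a hypothetical helper introduced only for illustration, and the stage and field names are sample values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JoinerCausesSketch {

  /** Builds one cause per (input stage, field) pair, using the attribute names from FailureAttributes. */
  static List<Map<String, Object>> inputSchemaCauses(String stage, Map<String, String> inputStageToField) {
    List<Map<String, Object>> causes = new ArrayList<>();
    for (Map.Entry<String, String> entry : inputStageToField.entrySet()) {
      Map<String, Object> cause = new LinkedHashMap<>();
      cause.put("stage", stage);
      cause.put("inputStage", entry.getKey());
      cause.put("inputField", entry.getValue());
      causes.add(cause);
    }
    return causes;
  }

  public static void main(String[] args) {
    // Join keys named "id" arrive from two input stages with incompatible types.
    Map<String, String> mismatches = new LinkedHashMap<>();
    mismatches.put("source1", "id");
    mismatches.put("source2", "id");
    for (Map<String, Object> cause : inputSchemaCauses("joiner", mismatches)) {
      System.out.println(cause);
    }
  }
}
```

Each map in the returned list corresponds to one entry of the "causes" array in the Approach 2 response.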

Impact on UI

Type: STAGE_ERROR
Description: Represents validation error while configuring the stage
Scenario: If there is any error while connecting to sink while getting actual schema
Approach - 1 - Json Response:
{
  "errors": [
    {
      "type": "STAGE_ERROR",
      "stage": "src",
      "message": "Could not load jdbc driver.",
      "correctiveAction": "Make sure correct driver is uploaded."
    }
  ]
}
Approach - 2 - Json Response:
{
  "errors": [
    {
      "message": "Could not load jdbc driver.",
      "correctiveAction": "Make sure correct driver is uploaded.",
      "causes": [
        {
          "stage": "src"
        }
      ]
    }
  ]
}

Type: INVALID_PROPERTY
Description: Represents invalid configuration property
Scenario: If config property value contains characters that are not allowed by underlying source or sink
Approach - 1 - Json Response:
{
  "errors": [
    {
      "type": "INVALID_PROPERTY",
      "stage": "projection",
      "message": "Can not specify both drop and keep.",
      "correctiveAction": "Either drop or keep should be empty",
      "property": "drop"
    },
    {
      "type": "INVALID_PROPERTY",
      "stage": "projection",
      "message": "Can not specify both drop and keep.",
      "correctiveAction": "Either drop or keep should be empty",
      "property": "keep"
    }
  ]
}
Approach - 2 - Json Response:
{
  "errors": [
    {
      "message": "Can not specify both drop and keep",
      "correctiveAction": "Either drop or keep should be empty",
      "causes": [
        {
          "stage": "projection",
          "property": "keep"
        },
        {
          "stage": "projection",
          "property": "drop"
        }
      ]
    }
  ]
}

Type: PLUGIN_NOT_FOUND
Description: Represents plugin not found error for a stage. This error will be added by the data pipeline app
Scenario: If the plugin was not found. This error will be thrown from the data pipeline app
Approach - 1 - Json Response:
{
  "errors": [
    {
      "stage": "src",
      "type": "PLUGIN_NOT_FOUND",
      "message": "Plugin named 'Mock' of type 'batchsource' not found.",
      "pluginType": "batchsource",
      "pluginName": "Mock",
      "pluginId": "Mock",
      "correctiveAction": "Please make sure the 'Mock' plugin is installed."
    }
  ]
}
Approach - 2 - Json Response:
{
  "errors": [
    {
      "message": "Plugin named 'Mock' of type 'batchsource' not found.",
      "correctiveAction": "Please make sure the 'Mock' plugin is installed.",
      "causes": [
        {
          "stage": "src",
          "pluginType": "batchsource",
          "pluginName": "Mock",
          "pluginId": "Mock"
        }
      ]
    }
  ]
}

Type: INVALID_INPUT_SCHEMA
Description: Represents invalid schema field in input schema
Scenario: If the input schemas for joiner plugin are of different types
Approach - 1 - Json Response:
{
  "errors": [
    {
      "type": "INVALID_INPUT_SCHEMA",
      "stage": "joiner",
      "message": "Invalid schema field 'id'. Different types of join keys found in source1 and source2.",
      "correctiveAction": "Type of join keys from source1 and source2 must be of same type string",
      "field": "id",
      "inputStage": "source1"
    },
    {
      "type": "INVALID_INPUT_SCHEMA",
      "stage": "joiner",
      "message": "Invalid schema field 'id'. Different types of join keys found in source1 and source2.",
      "correctiveAction": "Type of join keys from source1 and source2 must be of same type string",
      "field": "id",
      "inputStage": "source2"
    }
  ]
}
Approach - 2 - Json Response:
{
  "errors": [
    {
      "message": "Invalid schema field 'id'. Different types of join keys found.",
      "correctiveAction": "Type of join keys from source1 and source2 must be of same type string",
      "causes": [
        {
          "stage": "joiner",
          "inputStage": "source1",
          "inputField": "id"
        },
        {
          "stage": "joiner",
          "inputStage": "source2",
          "inputField": "id"
        }
      ]
    }
  ]
}

Type: INVALID_OUTPUT_SCHEMA
Description: Represents invalid schema field in output schema
Scenario: If the output schema for the plugin is not compatible with underlying sink
Approach - 1 - Json Response:
{
  "errors": [
    {
      "type": "INVALID_OUTPUT_SCHEMA",
      "stage": "splitter",
      "message": "Invalid schema field 'email'.",
      "correctiveAction": "Schema should be of type 'string' at output port 'port'",
      "field": "email",
      "outputPort": "port"
    }
  ]
}
Approach - 2 - Json Response:
{
  "errors": [
    {
      "message": "Invalid schema field 'email'.",
      "correctiveAction": "Schema should be of type 'string' at output port 'port'",
      "causes": [
        {
          "stage": "splitter",
          "outputPort": "port",
          "outputField": "email"
        }
      ]
    }
  ]
}

Conclusion

There are two contracts in this design: one between the data pipeline app and plugins, and another between the data pipeline app and the UI. Approach 1 provides a well-defined contract between plugins and the data pipeline app. With Approach 1, plugin validation code can throw an exception carrying all the errors collected so far at any point, for example when one of several dependent properties is invalid. This makes the plugin validation code much simpler. Hence, Approach 1 is suggested.

Related Jira

CDAP-15578 (Cask Community Issue Tracker)

Related Work

Releases

Release 6.1.0