Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
A CDAP pipeline is composed of various CDAP plugins. These plugins handle error situations caused by invalid inputs or configurations. While developing a pipeline, a pipeline developer can supply an invalid plugin configuration. For example, the BigQuery sink plugin can be configured with a temporary GCS file that does not match the underlying BigQuery table. In such situations, a clear error message helps guide the user in the right direction.
Wrangler provides an interactive way for users to apply directives to data. While applying these directives, a user may run into error situations. For example, a corrupted input JSON file can cause the parse-as-json directive to fail. In such cases, the user should be given a clear error message so that further action can be taken.
Goals
There are four goals that need to be achieved to improve error handling:
- Provide a guideline on how to formulate an error message so that the end user can easily interpret the error situation
- Instrument plugins to return multiple error messages from the validation endpoint
- Add a framework to standardize error messages in Wrangler and pipelines
- Add a framework to prefix error codes to user-facing error messages so that developers can identify the source of an error message
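As a rough illustration of the last goal, error codes could be kept in a single catalog that pairs each code with a standard message template. The sketch below is only an assumption about how that might look: the `ErrorCode` enum, the `WRANGLER-1001` and `PIPELINE-2001` codes, and the message templates are hypothetical and are not part of any existing CDAP API.

```java
// Hypothetical error-code catalog; codes, prefixes, and templates are illustrative only.
enum ErrorCode {
  WRANGLER_PARSE_FAILED("WRANGLER-1001", "Failed to parse column '%s' as JSON."),
  PIPELINE_INVALID_CONFIG("PIPELINE-2001", "Invalid value for property '%s'.");

  private final String code;      // prefix that identifies the source of the error
  private final String template;  // standard user-facing message template

  ErrorCode(String code, String template) {
    this.code = code;
    this.template = template;
  }

  // Builds the user-facing message with the error code prefixed.
  String format(Object... args) {
    return code + ": " + String.format(template, args);
  }
}
```

With this shape, `ErrorCode.WRANGLER_PARSE_FAILED.format("body")` yields a message that starts with the code, so a developer can trace the message back to the component that produced it.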
Scope
Plugins
- Plugin validation (has a separate design doc; this document focuses on the design of error codes and standard error messages)
- Provide a framework to collect multiple validation errors so that they can be highlighted by the UI when the validation endpoint is called
- Provide a framework to add new types of exceptions without replacing data pipeline artifacts
- Instrument plugins so that all invalid config and schema fields are reported to the user at once when a plugin is validated
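The idea of collecting every validation error instead of failing on the first one can be sketched as follows. This is a minimal illustration under stated assumptions, not the design from the separate plugin validation document: `ValidationError`, `ValidationErrorCollector`, and `ExampleSinkConfig` are hypothetical names, and the `table` and `bucket` properties are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical types for illustration; not the actual CDAP API.
final class ValidationError {
  final String field;    // config or schema field that failed validation
  final String message;  // user-facing description of the problem

  ValidationError(String field, String message) {
    this.field = field;
    this.message = message;
  }
}

final class ValidationErrorCollector {
  private final List<ValidationError> errors = new ArrayList<>();

  // Records a failure and lets validation continue instead of throwing.
  void add(String field, String message) {
    errors.add(new ValidationError(field, message));
  }

  // Read-only view of everything collected so far.
  List<ValidationError> getErrors() {
    return Collections.unmodifiableList(errors);
  }
}

// Hypothetical plugin config with two required properties.
final class ExampleSinkConfig {
  final String table;
  final String bucket;

  ExampleSinkConfig(String table, String bucket) {
    this.table = table;
    this.bucket = bucket;
  }

  // Checks every field so that all problems are reported in one validation call.
  void validate(ValidationErrorCollector collector) {
    if (table == null || table.isEmpty()) {
      collector.add("table", "Table name must be specified.");
    }
    if (bucket == null || bucket.isEmpty()) {
      collector.add("bucket", "Temporary bucket name must be specified.");
    }
  }
}
```

Because each error carries the name of the offending field, a caller such as the UI could highlight every invalid field after a single call to the validation endpoint.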
Dataprep
- Improve error messages in all Directives
- Remove object hashes from error messages. They appear because toString() is used when building the messages
- Standardize error messages
- Apply error codes to user facing error messages
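The object-hash problem mentioned above can be reproduced with a small example. When a class does not override `toString()`, Java falls back to `Object.toString()`, which returns the class name followed by `@` and a hexadecimal hash code. The `ColumnRef` and `DirectiveMessages` classes below are hypothetical and exist only to show the effect and one possible fix.

```java
// Hypothetical class that does not override toString().
final class ColumnRef {
  final String name;

  ColumnRef(String name) {
    this.name = name;
  }
}

final class DirectiveMessages {
  // Problematic: concatenating the object calls Object.toString(),
  // so the message contains the class name, '@', and a hex hash code.
  static String withObjectHash(ColumnRef column) {
    return "Column " + column + " does not exist.";
  }

  // Preferred: put the meaningful value into the message explicitly.
  static String withName(ColumnRef column) {
    return "Column '" + column.name + "' does not exist.";
  }
}
```

The first variant produces text that means nothing to an end user, while the second names the column the user actually referred to.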
Pipeline
- Standardize error messages
- Apply error codes to user facing error messages
User Stories
As a CDAP pipeline developer, if a pipeline contains invalid plugin configurations, I would like it to fail early with an appropriate error message.
As an ETL engineer, if I run into an error situation while applying directives, I would like to see an appropriate error message that clearly indicates the error.
Scenario 1: Error codes in Wrangler
Scenario 2: Standard Error messages in Wrangler
Scenario 3: Error codes in Pipeline
Scenario 4: Standard Error messages in Pipeline
Approach
Impact on UI
UI changes will be needed to handle invalid schema type errors returned from the validation endpoint.
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
Bug Fixes
Releases
Release 6.1.0
Related Work
Future work
- Add error code and standard error message capability to CDAP platform.