Batch Runs Endpoint
Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
If a client wants to get the last X runs for N different programs, they have to make N calls today. One example of this is the pipeline list UI, which gets the latest run for each pipeline in the list.
Goals
Make it possible for a client to get the last X runs for a list of programs in a single API call.
User Stories
- As a CDAP ops developer, I want to write a monitoring check that gets the latest run for multiple programs in a single call and alerts if any of them have not run as expected
- As a UI, I want to make a single call to fetch the latest run for multiple programs in order to display run information in a program list view
- As a CDAP client, if a program in the request does not exist, I want to be able to tell from the response
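The monitoring story could look roughly like the sketch below. It is only an illustration: it assumes the response shape described under Design, assumes that "starting" is a start time in epoch seconds, and the function name `find_stale_programs` and the `max_age_seconds` threshold are hypothetical.

```python
import time


def find_stale_programs(batch_response, max_age_seconds, now=None):
    """Return (appId, programId, reason) for every program that needs an alert.

    batch_response is the parsed JSON list returned by the batch runs call.
    """
    now = time.time() if now is None else now
    alerts = []
    for entry in batch_response:
        key = (entry["appId"], entry["programId"])
        code = entry.get("statusCode")
        if code != 200:
            # For example 404 when the program does not exist.
            alerts.append(key + (f"status code {code}",))
            continue
        runs = entry.get("runs", [])
        if not runs:
            alerts.append(key + ("no runs recorded",))
            continue
        # Assumption: "starting" is the run start time in epoch seconds.
        latest_start = max(run["starting"] for run in runs)
        if now - latest_start > max_age_seconds:
            alerts.append(key + ("latest run is too old",))
    return alerts
```

Passing `now` explicitly keeps the check deterministic in tests; a real check would leave it unset.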
Design
We will add a batch runs endpoint, modeled on the existing batch status endpoint. The request takes a list of programs, and the response returns each requested program along with its runs. If a program does not exist, its entry in the response indicates that. The scan for the latest runs happens in a separate transaction for each program, so the endpoint is functionally equivalent to making N separate calls to the runs endpoint.
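The per-program behavior described above can be sketched as follows. This is not the CDAP implementation: `store` is a hypothetical in-memory mapping standing in for the run records, and each dictionary lookup stands in for one per-program transaction.

```python
def batch_runs(store, namespace, requests):
    """Sketch of the batch runs logic over a hypothetical in-memory store.

    store maps (namespace, appId, programType, programId) to a list of run
    records ordered newest first. A missing key means the program does not
    exist.
    """
    results = []
    for req in requests:
        entry = {
            "appId": req["appId"],
            "programType": req["programType"],
            "programId": req["programId"],
        }
        key = (namespace, req["appId"], req["programType"], req["programId"])
        limit = req.get("limit", 1)  # the limit is optional and defaults to 1
        # Each program is looked up independently of the others.
        runs = store.get(key)
        if runs is None:
            entry["statusCode"] = 404
        else:
            entry["statusCode"] = 200
            entry["runs"] = runs[:limit]
        results.append(entry)
    return results
```

Because each program is handled on its own, one missing program does not affect the entries returned for the others.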
The request will be:
POST v3/namespaces/<namespace-id>/runs
[
  {
    "appId": "my-app",
    "programType": "WORKFLOW",
    "programId": "DataPipelineWorkflow",
    "limit": 5 // optional limit for the number of runs. Defaults to 1
  }
]
Note: this request format mirrors the POST v3/namespaces/<namespace-id>/status endpoint. I think it would actually be better if the request were an object with a 'programs' section that lists the programs, but I am keeping the array form for API consistency.
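A client might assemble the request along these lines. The helper name `build_batch_runs_request` and the tuple input format are assumptions made for illustration; only the path and the JSON field names come from the request shown above. Note that the `//` comment in the example request is annotation only and is not valid JSON, so it is not sent.

```python
import json


def build_batch_runs_request(namespace, programs):
    """Build the path and JSON body for a batch runs call.

    programs is a list of (app_id, program_type, program_id, limit) tuples.
    A limit of None omits the field so that the server default applies.
    """
    path = f"v3/namespaces/{namespace}/runs"
    body = []
    for app_id, program_type, program_id, limit in programs:
        item = {
            "appId": app_id,
            "programType": program_type,
            "programId": program_id,
        }
        if limit is not None:
            item["limit"] = limit
        body.append(item)
    return path, json.dumps(body)
```

The returned path and body can then be sent as a POST with whatever HTTP client the caller already uses.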
The response will be:
[
  {
    "appId": "my-app",
    "programType": "WORKFLOW",
    "programId": "DataPipelineWorkflow",
    "statusCode": 200,
    "runs": [
      {
        "runid": "<run-id>",
        "starting": 1234567890,
        ... // same content as the program specific runs endpoint
      },
      ...
    ]
  },
  {
    "appId": "my-app",
    "programType": "WORKFLOW",
    "programId": "DataPipelineWurkflu",
    "statusCode": 404 // if the program doesn't exist
  }
]
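On the client side, a response of this shape could be separated into programs that were found and programs that were not. The sketch below is one possible way to do it; the function name `split_batch_response` is hypothetical, and treating every status code other than 200 as "not found or failed" is an assumption, since only 200 and 404 appear in the example.

```python
def split_batch_response(batch_response):
    """Split a parsed batch runs response into found and failed programs.

    Returns (found, failed):
      found maps (appId, programType, programId) to that program's run list;
      failed maps the same kind of key to the per-program status code.
    """
    found = {}
    failed = {}
    for entry in batch_response:
        key = (entry["appId"], entry["programType"], entry["programId"])
        if entry["statusCode"] == 200:
            found[key] = entry.get("runs", [])
        else:
            # Assumption: any code other than 200 (such as 404) means
            # there are no runs to show for this program.
            failed[key] = entry["statusCode"]
    return found, failed
```

A list view can then render run information from `found` and show a "not found" marker for every key in `failed`.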
API changes
New Programmatic APIs
None
Deprecated Programmatic APIs
None
New REST APIs
Path | Method | Description | Response Code | Response |
---|---|---|---|---|
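/v3/namespaces/<namespace-id>/runs | POST | Returns the latest runs, up to the requested limit, for each program in the request | 200 - On success; 500 - Any internal errors | List of the requested programs, each with a status code and its runs (see Design) |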
Deprecated REST API
None
CLI Impact or Changes
- Could add a new command, but not planned
UI Impact or Changes
- UI can use this for the pipelines list view to improve performance
- A side effect is that the UI can no longer fill in run data incrementally as individual responses arrive. With a single call, the data for all pipelines arrives together or not at all.
Security Impact
None
Impact on Infrastructure Outages
None
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
1 | Get the last 5 runs for programs that all exist and have runs | All programs should be returned, with their last 5 runs |
3 | Get the last run for a mix of programs that exist and don't exist | Request should succeed, with programs that don't exist still present in the response with a not found status |
4 | Get the last 10 runs for a mix of programs that have more than 10 runs and fewer than 10 runs | Each program should have at most 10 runs |
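The expected results in the table can be expressed as checks on a request and response pair. The following is a minimal sketch of such a checker, not part of any existing test suite; the name `check_batch_response` is hypothetical, and it assumes the request and response shapes described under Design.

```python
def check_batch_response(requests, response):
    """Return a list of problems; an empty list means all checks passed.

    Checks that every requested program appears in the response and that
    no program has more runs than its requested limit (default 1).
    """
    def key_of(item):
        return (item["appId"], item["programType"], item["programId"])

    by_key = {key_of(entry): entry for entry in response}
    problems = []
    for req in requests:
        key = key_of(req)
        entry = by_key.get(key)
        if entry is None:
            problems.append(f"{key}: missing from response")
            continue
        limit = req.get("limit", 1)
        run_count = len(entry.get("runs", []))
        if run_count > limit:
            problems.append(f"{key}: {run_count} runs exceeds limit {limit}")
    return problems
```

A test for any scenario in the table would issue the request, parse the response, and assert that the returned list is empty.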
Releases
Release 5.1.0
Related Work
- UI to use this new endpoint for pipelines list view
Future Work
None