Schedule pipelines

This topic is published on the CDAP doc wiki and will be maintained here.

Pipelines can be set to run on a specified schedule and frequency, such as every 4 hours or weekly on Monday at 1:30 AM. The scheduling capability is currently only available in the Enterprise edition.

Creating a schedule

  1. After the pipeline is created and deployed, click the Schedule button on the pipeline detail page. To open the pipeline detail page, click the pipeline in the Control Center or on the Pipeline List page. For a newly created pipeline, the schedule can also be configured from within Pipeline Studio.

     

     

  2. The schedule can be configured using the Basic or Advanced interface. The Basic interface allows you to set:

    • Frequency

    • Start time (and date, if needed)

    • Max concurrent runs: Up to 10. If the maximum number of runs is already in progress when a scheduled run is due, that scheduled run is skipped.

    • Compute profile (optional): If no profiles have been created in the namespace, the default Dataproc profile is used.

       

  3. Once all options are selected, click Save and Start Schedule to save the schedule and start it, or click Save schedule to save it without starting it. A saved schedule can be started later by clicking Start Schedule from the same interface on the pipeline detail page. Alternatively, use the Advanced interface to define the schedule with cron syntax.
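As a rough illustration of cron syntax for the Advanced interface, the sketch below builds expressions for the two example schedules mentioned earlier (every 4 hours, and weekly on Monday at 1:30 AM). It assumes the common five-field cron layout (minute, hour, day of month, month, day of week, with 0 meaning Sunday); the exact dialect accepted by the Advanced interface should be confirmed in the product. The helper names `cron_every_n_hours` and `cron_weekly` are hypothetical and are not part of CDAP.

```python
# Hypothetical helpers (not part of CDAP) that build five-field cron strings.
# Assumed field order: minute hour day-of-month month day-of-week.


def cron_every_n_hours(n: int, minute: int = 0) -> str:
    """Return a cron expression that fires every n hours at the given minute.

    Note: the hour step restarts at midnight, so values of n that do not
    divide 24 evenly produce a shorter final interval each day.
    """
    if not 1 <= n <= 23:
        raise ValueError("n must be between 1 and 23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    return f"{minute} */{n} * * *"


def cron_weekly(day_of_week: int, hour: int, minute: int) -> str:
    """Return a cron expression for one run per week.

    day_of_week is assumed to use 0 = Sunday through 6 = Saturday.
    """
    if not 0 <= day_of_week <= 6:
        raise ValueError("day_of_week must be between 0 and 6")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    return f"{minute} {hour} * * {day_of_week}"


if __name__ == "__main__":
    print(cron_every_n_hours(4))   # every 4 hours, on the hour
    print(cron_weekly(1, 1, 30))   # Mondays at 1:30 AM
```

Under these assumptions, the two calls produce `0 */4 * * *` and `30 1 * * 1`, which could then be entered in the Advanced interface.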

 

Once a schedule is created, it can be modified, started, or suspended by clicking the Schedule button on the pipeline detail page.
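The Max concurrent runs setting in the Basic interface can be pictured with a small model. The sketch below is only an illustration of the documented behavior (a scheduled run is skipped when the configured maximum is already running); it is not CDAP's implementation, and the function name `should_start_scheduled_run` and the constant `MAX_CONCURRENT_LIMIT` are hypothetical.

```python
# Illustrative model only; names here are hypothetical, not CDAP APIs.

MAX_CONCURRENT_LIMIT = 10  # the Basic interface allows up to 10 concurrent runs


def should_start_scheduled_run(active_runs: int, max_concurrent_runs: int) -> bool:
    """Return True if a scheduled run may start, False if it is skipped.

    A run is skipped when the number of runs already in progress has
    reached the configured maximum.
    """
    if not 1 <= max_concurrent_runs <= MAX_CONCURRENT_LIMIT:
        raise ValueError("max_concurrent_runs must be between 1 and 10")
    if active_runs < 0:
        raise ValueError("active_runs cannot be negative")
    return active_runs < max_concurrent_runs


if __name__ == "__main__":
    print(should_start_scheduled_run(active_runs=2, max_concurrent_runs=3))
    print(should_start_scheduled_run(active_runs=3, max_concurrent_runs=3))
```

In this model, a schedule with a maximum of 3 starts a new run while 2 runs are active, and skips the run when 3 are already active.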
