...

The pipeline reads from a fileset. It writes to an Avro fileset and to a database. Before writing to the database, it needs to perform some lookups to add a couple of fields.
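A rough sketch of this flow in plain Python, just to make the data movement concrete. It is not the pipeline config; `run_fork`, `lookup`, `write_avro`, and `write_db` are hypothetical names, and records are assumed to be dicts.

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, object]


def run_fork(
    records: Iterable[Record],
    lookup: Callable[[Record], Record],      # hypothetical: returns the extra fields
    write_avro: Callable[[Record], None],    # hypothetical sink for the Avro fileset
    write_db: Callable[[Record], None],      # hypothetical sink for the database
) -> None:
    """Send every record down both branches; only the db branch is enriched."""
    for record in records:
        # Branch 1: the record goes to the Avro sink unchanged.
        write_avro(dict(record))
        # Branch 2: add the looked-up fields, then write to the database.
        write_db({**record, **lookup(record)})


# Usage with in-memory lists standing in for the two sinks.
avro_out, db_out = [], []
run_fork(
    [{"id": 1}],
    lambda r: {"region": "us", "tier": "gold"},
    avro_out.append,
    db_out.append,
)
```

Both sinks see every record; the lookup only runs on the database branch.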

Use case 2:

The pipeline reads from Twitter. If a tweet is in English, we want to run it through an English sentiment tagger transform. If not, we want to send it to a translate transform first and then on to the tagger transform.
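The routing could be sketched like this in Python. This only illustrates the condition; the `lang` and `text` field names and the `translate` and `tag_sentiment` callables are assumptions, not part of any real plugin.

```python
from typing import Callable, Dict

Tweet = Dict[str, str]


def route_tweet(
    tweet: Tweet,
    translate: Callable[[Tweet], Tweet],      # hypothetical translate transform
    tag_sentiment: Callable[[Tweet], Tweet],  # hypothetical English sentiment tagger
) -> Tweet:
    """Non-English tweets take a detour through translate; all tweets reach the tagger."""
    if tweet.get("lang") != "en":  # assumed field holding the tweet language
        tweet = translate(tweet)
    return tag_sentiment(tweet)


# Usage with stub transforms.
def fake_translate(t: Tweet) -> Tweet:
    return {**t, "lang": "en", "translated": "yes"}


def fake_tagger(t: Tweet) -> Tweet:
    return {**t, "sentiment": "neutral"}


english = route_tweet({"lang": "en", "text": "hello"}, fake_translate, fake_tagger)
french = route_tweet({"lang": "fr", "text": "bonjour"}, fake_translate, fake_tagger)
```

Unlike a fork, each tweet follows exactly one path to the tagger.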

Use case 3:

The pipeline reads purchase events from a stream. If the required userid field is null, the record should be written to an error table. The pipeline then runs a transform that adds the user's email to the records. If the email is invalid, that record should also be written to the same error table.
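As a sketch, the two error checks feeding one error table might look like the following in Python. The `lookup_email` callable, the `error` field, and the simple email regex are assumptions made for illustration; the sketch also assumes a record with a null userid does not continue to the email transform.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Dict[str, object]

# Deliberately loose check, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def process_purchases(
    events: Iterable[Record],
    lookup_email: Callable[[object], Optional[str]],  # hypothetical email lookup
) -> Tuple[List[Record], List[Record]]:
    """Return (valid, errors); both failure cases land in the same error list."""
    valid: List[Record] = []
    errors: List[Record] = []
    for event in events:
        # Check 1: the required userid field must be present.
        if event.get("userid") is None:
            errors.append({**event, "error": "missing userid"})
            continue
        # Transform: add the user's email to the record.
        email = lookup_email(event["userid"])
        enriched = {**event, "email": email}
        # Check 2: the added email must look valid.
        if not email or not EMAIL_RE.match(email):
            errors.append({**enriched, "error": "invalid email"})
            continue
        valid.append(enriched)
    return valid, errors


# Usage with a dict standing in for the email lookup.
emails = {1: "a@example.com", 2: "not-an-email"}
valid, errors = process_purchases(
    [{"userid": None, "item": "a"}, {"userid": 1, "item": "b"}, {"userid": 2, "item": "c"}],
    emails.get,
)
```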

Use case 4:

The pipeline reads from an employees database table. It wants to write salaries for different categories of employees, such as male/female, manager/IC, and age group. One employee may fall into multiple categories. If the employee is in a certain age group, we want to add information about retirement and health plans; the other categories don't need it.
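One way to picture this in Python: each employee is copied to every category it matches, and only the chosen age-group output gets the extra plan fields. All field names, the age threshold of 55, the category labels, and the `add_benefits` callable are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def categorize(employee: Record) -> List[str]:
    """Return every category an employee falls into (one employee, many categories)."""
    return [
        str(employee["gender"]),
        "manager" if employee["is_manager"] else "ic",
        "55_plus" if int(employee["age"]) >= 55 else "under_55",  # assumed threshold
    ]


def fan_out(
    employees: Iterable[Record],
    add_benefits: Callable[[Record], Record],  # hypothetical retirement/health transform
    benefit_group: str = "55_plus",
) -> Dict[str, List[Record]]:
    """Write a salary record to each matching category; enrich only benefit_group."""
    outputs: Dict[str, List[Record]] = defaultdict(list)
    for emp in employees:
        for category in categorize(emp):
            record: Record = {"name": emp["name"], "salary": emp["salary"]}
            if category == benefit_group:
                record = add_benefits(record)
            outputs[category].append(record)
    return outputs


# Usage with a stub benefits transform.
out = fan_out(
    [{"name": "A", "salary": 100, "gender": "female", "is_manager": True, "age": 60}],
    lambda r: {**r, "retirement_plan": "x", "health_plan": "y"},
)
```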

Forks are not conditional; the same data gets sent to every fork. This is represented in the config as:

...