...

For example, a CustomerDirectory dataset is maintained by organization X in an enterprise. X provides this dataset in its own namespace for applications in other namespaces.

 

This dataset is used by many applications to look up customers. This dataset has a custom type with various methods to maintain its data; however, most of the applications only need one API: CustomerInfo getCustomer(String id)
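
To make the split between the narrow API and the full dataset type concrete, here is a minimal sketch. Only CustomerDirectory, CustomerInfo and getCustomer(String id) come from the text above; the getters on CustomerInfo, the null-on-miss convention, and the classes InMemoryCustomerDirectory and CustomerLookup are hypothetical stand-ins for illustration, not CDAP APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Read-only customer record. The getters are assumptions for illustration. */
interface CustomerInfo {
  String getId();
  String getName();
}

/** The narrow API that most applications need. */
interface CustomerDirectory {
  /** Returns the customer with the given id, or null if unknown (assumed convention). */
  CustomerInfo getCustomer(String id);
}

/** Hypothetical stand-in for the full dataset type, which has extra maintenance methods. */
class InMemoryCustomerDirectory implements CustomerDirectory {
  private final Map<String, CustomerInfo> customers = new HashMap<>();

  /** Maintenance method: deliberately not part of the CustomerDirectory API. */
  void addCustomer(final String id, final String name) {
    customers.put(id, new CustomerInfo() {
      @Override public String getId() { return id; }
      @Override public String getName() { return name; }
    });
  }

  @Override
  public CustomerInfo getCustomer(String id) {
    return customers.get(id);
  }
}

/** Code in a consuming application depends only on the CustomerDirectory interface. */
class CustomerLookup {
  static String displayName(CustomerDirectory directory, String id) {
    CustomerInfo info = directory.getCustomer(id);
    return info == null ? "<unknown>" : info.getName();
  }
}
```

A consuming app would only ever call something like CustomerLookup.displayName(directory, "c1"); it has no compile-time knowledge of addCustomer or of the implementing class.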

...

.  

  • Applications that use this dataset need to include a dependency on customer-api-1.0 in their pom in order to compile and package. (See the discussion of scenario 2 for why this should be a Maven dependency.)
  • The actual dataset type implements the CustomerDirectory API, say using a class TableBasedCustomerDirectory in artifact customer-table-1.3.1.
  • At runtime, when the app calls getDataset(), CDAP determines that the dataset instance has that type and version, and loads the class from that artifact. 
  • The actual dataset type has more methods in its API, including one that allows adding new customers. Therefore, the app that maintains this dataset includes the implementing artifact in its pom file.
  • The implementation can be updated without changing the API. In this case, X deploys a new artifact customer-table-1.3.2 and upgrades the dataset to this version. The maintaining app must now pick up the new artifact version the next time it runs. How it does so depends on how it bundles the artifact: if it uses provided scope, it automatically picks up the new jar upon restart; if it uses included scope, it must be updated to the new version and redeployed. (Whether this requires recompiling/packaging the app is up for detailed design.) No change is needed for the other applications that use this dataset, because CDAP always injects the correct version of the dataset type.
  • The implementation can be updated with an interface change, for example, adding a new field to CustomerInfo. To make this update seamless, a new artifact customer-table-1.4.0 is deployed, and both the dataset and the maintaining app are upgraded to this version. Then a new version of the API, customer-api-1.1, is deployed, and apps may now upgrade to this version. If they don't, they will not see the new field, but that is fine for existing apps because their code does not use this field. Note that this requires CustomerInfo to be an interface (consisting mainly of getters) that has an implementation in the customer-table artifact. Similarly, a new method could be added to the interface, and applications that do not use the new method would not need to be recompiled and redeployed.
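
The additive interface change in the last bullet can be sketched as follows. In the real scenario both API versions would define the same CustomerInfo name in different artifact versions; to keep the sketch in a single file, the two versions are modelled as CustomerInfoV10 and CustomerInfoV11. These names, the getEmail() field, and the LegacyApp and UpgradedApp classes are all assumptions for illustration.

```java
/** CustomerInfo as published in customer-api-1.0 (hypothetical getters). */
interface CustomerInfoV10 {
  String getId();
  String getName();
}

/** CustomerInfo as published in customer-api-1.1: one added getter (hypothetical field). */
interface CustomerInfoV11 extends CustomerInfoV10 {
  String getEmail();
}

/** Implementation as it might ship in customer-table-1.4.0; it supports the newest API. */
class TableCustomerInfo implements CustomerInfoV11 {
  private final String id;
  private final String name;
  private final String email;

  TableCustomerInfo(String id, String name, String email) {
    this.id = id;
    this.name = name;
    this.email = email;
  }

  @Override public String getId() { return id; }
  @Override public String getName() { return name; }
  @Override public String getEmail() { return email; }
}

/** An app compiled against customer-api-1.0: it never references the new getter. */
class LegacyApp {
  static String describe(CustomerInfoV10 info) {
    return info.getId() + ":" + info.getName();
  }
}

/** An app that has upgraded to customer-api-1.1 and uses the new field. */
class UpgradedApp {
  static String describe(CustomerInfoV11 info) {
    return info.getId() + ":" + info.getName() + ":" + info.getEmail();
  }
}
```

Because the change only adds a getter, the same TableCustomerInfo object serves both apps: LegacyApp keeps working unchanged, and only apps that want the new field need to move to the newer API. This is why the text requires CustomerInfo to be an interface rather than a concrete class.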

This scenario is one of the most complex, but the complexity is limited to the app that maintains the dataset as a service for others, who only need to know the published interface. This scenario also poses some important questions:

  • what is the deployment mechanism for the two artifacts (customer-api and customer-table)?
  • how does CDAP know that customer-table implements customer-api? Does it have to know?
  • how can X migrate the dataset to a new data format without having control over the apps that consume it? Even after upgrading the dataset to a new version, X does not know when all apps have picked that up, because they may have long-running programs such as a flow or service that need to be restarted to pick up the new version.

 

...

  • version

...

  • .

...