EntityId

Issues with Id.*, ProgramType, and ElementType (cdap-cli)

  • Id.*
    • toString() is inconsistent
    • JSON serialization is inconsistent, because NamespacedIdCodec is used in some places but not in others
    • The JSON form is heavily nested and hard to read
    • The JSON codec (NamespacedIdCodec) takes a lot of extra code to implement
    • Id.Program vs Id.Flow: two classes that do the same thing
    • Constructing a child Id.* from its parent Id.* requires repetitive code
  • ProgramType
    • JSON form is inconsistent with other CDAP entities: it uses "Flow", "Mapreduce", etc. instead of "flow", "mapreduce", etc., to maintain backwards compatibility
    • SchedulableProgramType duplicates ProgramType
  • ElementType
    • Only in cdap-cli
    • Doesn't cover every type of entity

 

Changes

  • Additions
    • EntityType: enum that describes every type of entity in CDAP (namespace, application, program, program run, flowlet, flowlet queue, schedule, notification feed, artifact, system service, etc.)
    • EntityId: base class for entities in CDAP
      • No nesting
      • Consistent "static EntityId fromString(String)" and "String toString()"
      • Consistent JSON serialization and deserialization (a custom deserializer is still needed, but it should take much less code than NamespacedIdCodec and should support all EntityIds)
      • Easier way to construct child Ids (e.g. given a namespace id, construct app id)
  • Modifications
    • Give ProgramType a consistent JSON form and remove the duplication between SchedulableProgramType and ProgramType
    • Deprecate Id.*
    • Remove ElementType
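
The additions above could look roughly like the following Java sketch. It is illustrative only: the class names NamespaceId, ApplicationId and ProgramId, the parts() helper, and the trimmed-down EntityType and ProgramType enums are assumptions made for this example, not the final API. It shows flat (non-nested) ids, a single toString() implementation in the base class, and child construction that reuses the parent's fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: names follow the proposal, bodies are illustrative only.
enum EntityType { NAMESPACE, APPLICATION, PROGRAM }

enum ProgramType { FLOW, MAPREDUCE }

abstract class EntityId {
  private final EntityType entity;

  EntityId(EntityType entity) {
    this.entity = entity;
  }

  // Name parts ordered from the namespace down to this entity.
  abstract List<String> parts();

  // One shared implementation keeps the string form consistent for every id.
  @Override
  public String toString() {
    return entity.name().toLowerCase(Locale.ROOT) + ":" + String.join(".", parts());
  }
}

final class NamespaceId extends EntityId {
  private final String namespace;

  NamespaceId(String namespace) {
    super(EntityType.NAMESPACE);
    this.namespace = namespace;
  }

  // Child construction: the parent passes its own fields along.
  ApplicationId app(String application) {
    return new ApplicationId(namespace, application);
  }

  @Override
  List<String> parts() {
    return Arrays.asList(namespace);
  }
}

final class ApplicationId extends EntityId {
  private final String namespace;
  private final String application;

  ApplicationId(String namespace, String application) {
    super(EntityType.APPLICATION);
    this.namespace = namespace;
    this.application = application;
  }

  ProgramId program(ProgramType type, String program) {
    return new ProgramId(namespace, application, type, program);
  }

  @Override
  List<String> parts() {
    return Arrays.asList(namespace, application);
  }
}

final class ProgramId extends EntityId {
  private final String namespace;
  private final String application;
  private final ProgramType type;
  private final String program;

  ProgramId(String namespace, String application, ProgramType type, String program) {
    super(EntityType.PROGRAM);
    this.namespace = namespace;
    this.application = application;
    this.type = type;
    this.program = program;
  }

  @Override
  List<String> parts() {
    return Arrays.asList(namespace, application, type.name().toLowerCase(Locale.ROOT), program);
  }
}

// Entry point for the fluent construction style.
final class Ids {
  private Ids() { }

  static NamespaceId namespace(String namespace) {
    return new NamespaceId(namespace);
  }
}
```

Each id stores all of its fields directly instead of holding a reference to its parent, which is what keeps the serialized form flat.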

 

Examples

  • EntityId construction 

    Ids.namespace("foo")
    Ids.namespace("foo").artifact("art", "1.2.3")
    Ids.namespace("foo").dataset("zoo")
    Ids.namespace("foo").datasetModule("moo")
    Ids.namespace("foo").datasetType("typ")
    Ids.namespace("foo").stream("t")
    Ids.namespace("foo").app("app")
    Ids.namespace("foo").app("app").program(ProgramType.FLOW, "flo")
    Ids.namespace("foo").app("app").program(ProgramType.FLOW, "flo").run("run1")
    Ids.namespace("foo").app("app").program(ProgramType.FLOW, "flo").flowlet("flol")
    Ids.namespace("foo").app("app").program(ProgramType.FLOW, "flo").flowlet("flol").queue("q")
  • toString() 

    namespace:foo
    artifact:foo.art.1.2.3
    dataset:foo.zoo
    dataset_module:foo.moo
    dataset_type:foo.typ
    stream:foo.t
    application:foo.app
    program:foo.app.flow.flo
    program_run:foo.app.flow.flo.run1
    flowlet:foo.app.flo.flol
    flowlet_queue:foo.app.flo.flol.q
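
The string forms above can be read back with a small amount of code. The following Java sketch is one possible approach, not the proposed fromString() implementation: the EntityIdStrings class, the FIELDS table, and the decision to return a plain map are all assumptions made for this example. It also assumes that only the last field may contain dots, which is what lets an artifact version such as 1.2.3 survive the split.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: parses "<type>:<part>.<part>..." into named fields.
final class EntityIdStrings {
  // Field layout per entity type, taken from the toString() examples above.
  private static final Map<String, List<String>> FIELDS = new LinkedHashMap<>();

  static {
    FIELDS.put("namespace", Arrays.asList("namespace"));
    FIELDS.put("artifact", Arrays.asList("namespace", "artifact", "version"));
    FIELDS.put("application", Arrays.asList("namespace", "application"));
    FIELDS.put("program", Arrays.asList("namespace", "application", "type", "program"));
  }

  private EntityIdStrings() { }

  static Map<String, String> parse(String id) {
    int colon = id.indexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("Missing entity type: " + id);
    }
    String type = id.substring(0, colon);
    List<String> names = FIELDS.get(type);
    if (names == null) {
      throw new IllegalArgumentException("Unknown entity type: " + type);
    }
    // The split limit makes the last field absorb any remaining dots (e.g. "1.2.3").
    String[] values = id.substring(colon + 1).split("\\.", names.size());
    if (values.length != names.size()) {
      throw new IllegalArgumentException("Expected " + names.size() + " parts: " + id);
    }
    Map<String, String> fields = new LinkedHashMap<>();
    for (int i = 0; i < values.length; i++) {
      fields.put(names.get(i), values[i]);
    }
    fields.put("entity", type.toUpperCase(Locale.ROOT));
    return fields;
  }
}
```

Under these assumptions, EntityIdStrings.parse("artifact:foo.art.1.2.3") yields the fields namespace=foo, artifact=art, version=1.2.3, entity=ARTIFACT, in the same order as the JSON form.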
  • GSON.toJson()

    {"namespace":"foo","entity":"NAMESPACE"}
    {"namespace":"foo","artifact":"art","version":"1.2.3","entity":"ARTIFACT"}
    {"namespace":"foo","dataset":"zoo","entity":"DATASET"}
    {"namespace":"foo","module":"moo","entity":"DATASET_MODULE"}
    {"namespace":"foo","type":"typ","entity":"DATASET_TYPE"}
    {"namespace":"foo","stream":"t","entity":"STREAM"}
    {"namespace":"foo","application":"app","entity":"APPLICATION"}
    {"namespace":"foo","application":"app","type":"Flow","program":"flo","entity":"PROGRAM"}
    {"namespace":"foo","application":"app","type":"Flow","program":"flo","run":"run1","entity":"PROGRAM_RUN"}
    {"namespace":"foo","application":"app","flow":"flo","flowlet":"flol","entity":"FLOWLET"}
    {"namespace":"foo","application":"app","flow":"flo","flowlet":"flol","queue":"q","entity":"FLOWLET_QUEUE"}

 

 
