Edge-side data collection and aggregation (decentralized)
Edge-side anomaly detection and notification
Edge-side data cleansing and transformation
Collect data from local sensors, devices and systems
Transport aggregated data using MQTT, HTTP or TCP
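The edge-side aggregation and anomaly detection described above could look roughly like the following. This is a minimal sketch, not CDAP code: the function names, the `device_id` field, the payload layout, and the z-score threshold of 2.0 are all assumptions made for illustration. The transport step is left out; the JSON string produced here is what would be handed to an MQTT, HTTP or TCP client.

```python
import json
import statistics
from typing import Dict, List


def aggregate_window(readings: List[float]) -> Dict[str, float]:
    """Summarize one window of raw sensor readings (hypothetical helper)."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": statistics.mean(readings),
    }


def find_anomalies(readings: List[float], threshold: float = 2.0) -> List[float]:
    """Return readings whose z-score exceeds `threshold`.

    The z-score rule and the default threshold are illustrative assumptions;
    a real deployment would pick a detector suited to its sensors.
    """
    if len(readings) < 2:
        return []
    mean = statistics.mean(readings)
    stdev = statistics.pstdev(readings)
    if stdev == 0:
        return []  # all readings identical, nothing stands out
    return [r for r in readings if abs(r - mean) / stdev > threshold]


def build_payload(device_id: str, readings: List[float]) -> str:
    """Build the aggregated message that would be sent upstream as JSON."""
    return json.dumps(
        {
            "device_id": device_id,  # hypothetical field name
            "summary": aggregate_window(readings),
            "anomalies": find_anomalies(readings),
        }
    )
```

Sending one summary per window instead of every raw reading is what keeps the upstream traffic small; only the anomalies carry individual values.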
Goals
Use CDAP Standalone in production deployments in constrained environments
Lightweight, with only the minimal capabilities needed to run on the edge for IoT-type applications
Run "All" CDAP Applications
Run in environments that are constrained by memory, disk and compute
Self-healing capabilities
Integrate with central CDAP
Remote update or upgrade capabilities
Area of Focus
Resiliency and reliability of CDAP Standalone (which was not built to run in a production-like environment), with self-healing capabilities
Reduce CDAP Standalone footprint
Customize required components and programs
Integration with central CDAP Management
Reduce run-time footprint (remove unnecessary components and subsystems)
High Level Requirements
Support long running applications (CDAP Applications)
Support the ability to run in a constrained environment with 512 MB of memory (reducing the footprint)
Automatic clean-up or management of transient data and metadata
Remove Kafka and ZooKeeper dependencies
Support the ability to run reasonable number of applications
Remove User Interface and harden the REST API
Remove extension interfaces
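The automatic clean-up requirement above can be illustrated with a small sketch. It assumes transient data lives as files in a single directory, and that both an age limit and a total size budget apply; the function name, parameters and policy are hypothetical and do not describe how CDAP stores or prunes its data.

```python
import time
from pathlib import Path
from typing import List, Optional


def clean_transient_dir(
    directory: str,
    max_age_seconds: float,
    max_total_bytes: int,
    now: Optional[float] = None,
) -> List[str]:
    """Delete expired files, then the oldest remaining files until the
    directory fits within the size budget. Returns the removed file names."""
    now = time.time() if now is None else now
    removed = []

    # Oldest files first, so the size-budget pass below evicts them first.
    files = sorted(
        (p for p in Path(directory).iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )

    # Pass 1: drop everything older than the age limit.
    kept = []
    for path in files:
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
        else:
            kept.append(path)

    # Pass 2: evict oldest survivors until the total size is within budget.
    total = sum(p.stat().st_size for p in kept)
    for path in kept:
        if total <= max_total_bytes:
            break
        total -= path.stat().st_size
        path.unlink()
        removed.append(path.name)

    return removed
```

On a disk-constrained device, a routine like this would typically run on a timer; the size budget guards against a burst of data filling the disk before the age limit takes effect.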
Technical Breakdown
LITE-001 : Remove Kafka and ZooKeeper dependencies
Kafka and ZooKeeper add substantially to the footprint of CDAP Standalone, so replacing them with a reliable messaging service would reduce that footprint. An initiative to remove these dependencies is already under way. Find out more about this here.
LITE-002 : Remove Hive dependencies
LITE-003 : Disable or Remove Audit Log Capabilities
LITE-004 : Remove User Interface
LITE-005 : Cleanup, Log Rotation
LITE-006 : Self-healing capability
LITE-007 : Fix memory leaks in in-memory MapReduce (MR) and Spark
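As one possible reading of the self-healing item (LITE-006), the sketch below restarts a failed task with exponential backoff. It is an illustration under assumptions, not the planned CDAP mechanism: `run_with_restarts`, its parameters and the backoff policy are hypothetical, and `task` stands in for whatever starts a long-running program.

```python
import time
from typing import Callable


def run_with_restarts(
    task: Callable[[], None],
    max_restarts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `task`, restarting it after a failure with exponential backoff.

    Returns the number of restarts that were needed. Once `max_restarts`
    is exhausted, the last exception is re-raised so that an outer
    supervisor (or an operator) can take over.
    """
    restarts = 0
    while True:
        try:
            task()
            return restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            # Delay doubles on each restart: base, 2*base, 4*base, ... capped.
            delay = min(base_delay * (2 ** restarts), max_delay)
            sleep(delay)
            restarts += 1
```

The `sleep` parameter is injectable only so the backoff schedule can be checked without waiting; capping the delay keeps a persistently failing program from being retried either too aggressively or too rarely.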