
Checklist

  • User stories documented (Shankar)
  • User stories reviewed (Nitin)
  • Design documented (Shankar/Vinisha)
  • Design reviewed (Terence/Andreas)
  • Feature merged ()
  • Examples and guides ()
  • Integration tests () 
  • Documentation for feature ()
  • Blog post

Use Cases

  • User wants to group log messages at application level and write multiple separate log files for each application. Example: application-dir/{audit.log, metrics.log, debug.log}
  • User wants to write these log files to a configurable path in HDFS.
  • User also wants to be able to configure a rolling policy for these log files, similar to Logback.

User Stories

  1. For each application, user wants to collect the application's logs into multiple log files based on log level.

  2. For each application, user wants to configure a location in HDFS to be used to store the collected logs.
  3. For each application, user wants the application log files stored in text format.

Design

Introduce LogProcessor, FileWriter, and RotationPolicy interfaces, pluggable into the CDAP log saver.

Programmatic API

 

public interface LogProcessor {

  /**
   * Called once during startup, with the configuration properties for this log processor.
   *
   * @param properties configuration properties for the log processor
   */
  void initialize(Properties properties);

  /**
   * Called with an iterator of log messages, sorted by timestamp. Messages are delivered
   * starting at log.saver startup. This method should not throw any exceptions; if an
   * unchecked exception is thrown, log.saver will log an error and the processor will stop
   * receiving messages.
   *
   * @param events iterator of {@link LogEvent}
   */
  void process(Iterator<LogEvent> events);

  /**
   * Stops the log processor.
   */
  void destroy();
}
class LogEvent {
  /**
   * Logging event
   */
  ILoggingEvent iLoggingEvent;
 
  /**
   * CDAP program entity-id
   */
  EntityId entityId;
}
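For illustration, a minimal processor matching the use case above could buffer messages per log level before handing each group to a file writer. The sketch below is self-contained, so a hypothetical {level, message} pair stands in for the LogEvent class above; a real implementation would implement the LogProcessor interface and read the level from the ILoggingEvent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: groups log messages by level so each group can
// later be flushed to its own file (audit.log, debug.log, ...).
public class LevelGroupingProcessor {

  // level -> buffered messages for that level
  private final Map<String, List<String>> buffers = new HashMap<>();

  /** Each event is a hypothetical {level, message} pair standing in for LogEvent. */
  public void process(Iterator<String[]> events) {
    while (events.hasNext()) {
      String[] event = events.next();
      buffers.computeIfAbsent(event[0], k -> new ArrayList<>()).add(event[1]);
    }
  }

  /** Returns the messages buffered for a level (e.g. to flush to debug.log). */
  public List<String> buffered(String level) {
    return buffers.getOrDefault(level, Collections.emptyList());
  }
}
```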

 

Currently, log.saver has only an AvroFileWriter; we can create an interface that lets users configure the FileWriter if needed. This would provide the option to keep common logic (file rotation, tracking created files, etc.) in log.saver, while a custom file writer implements only the methods specific to its own logic.

Example: a custom FileWriter extension creates files in HDFS and maintains the size of events processed.

public interface FileWriter {
  /**
   * Appends events to the file.
   */
  void append(Iterator<LogEvent> events);

  /**
   * Creates and returns the file corresponding to the given entityId and timestamp.
   */
  File createFile(EntityId entityId, long timestamp);

  /**
   * Closes the file.
   */
  void close(File file, long timestamp);

  /**
   * Flushes the contents.
   */
  void flush();
}
public interface RotationPolicy {
  /**
   * For the given logEvent, decides whether the log file corresponding to this event should be rotated.
   */
  boolean shouldRotateFile(LogEvent logEvent);
}
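For example, a size-based policy in the spirit of Logback's SizeBasedTriggeringPolicy could be sketched as follows. The byte counting is simplified, and the class and parameter names are illustrative rather than decided design; the real interface would take a LogEvent.

```java
// Illustrative size-based rotation, analogous to Logback's
// SizeBasedTriggeringPolicy: rotate once the bytes written since the last
// rotation reach a configured maximum.
public class SizeBasedRotationPolicy {

  private final long maxBytes;
  private long bytesWritten;

  public SizeBasedRotationPolicy(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  /**
   * In the real interface this would take a LogEvent; here the serialized
   * size of the event is passed directly for simplicity.
   */
  public boolean shouldRotateFile(int eventSize) {
    bytesWritten += eventSize;
    if (bytesWritten >= maxBytes) {
      bytesWritten = 0; // a fresh file starts after rotation
      return true;
    }
    return false;
  }
}
```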

 

Approach

Option-1

Log processor and file writer extensions run in the same container as log.saver.

Lifecycle

1) Log saver will load and initialize the log processor plugin.

2) As log.saver processes messages, the log processor's process method will also be called with the logging events.

3) If the log processor extension throws an error:

  • we can stop the plugin, or
  • we can log an error, continue, and stop the plugin after an error threshold is reached.

4) FileWriterExtension will be used for file system operations (create, append, close) and RotationPolicyExtension will be used for deciding when to rotate the file.

5) Stop the log processor when log.saver stops.
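The error-handling choice in step 3 (log and continue, then stop after a threshold) could be driven by a small guard in log.saver, sketched below. The threshold mechanics and class name are assumptions for illustration, not decided design.

```java
// Sketch of the error-threshold behavior from step 3: log and continue on
// failure, but stop dispatching to the processor once errors reach a limit.
public class ProcessorGuard {

  private final int errorThreshold;
  private int errorCount;
  private boolean stopped;

  public ProcessorGuard(int errorThreshold) {
    this.errorThreshold = errorThreshold;
  }

  /** Runs one processing call; returns false once the processor has been stopped. */
  public boolean dispatch(Runnable processCall) {
    if (stopped) {
      return false;
    }
    try {
      processCall.run();
    } catch (RuntimeException e) {
      // log.saver would log the error here
      if (++errorCount >= errorThreshold) {
        stopped = true; // stop the plugin after the threshold is reached
      }
    }
    return !stopped;
  }
}
```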

 

Class-Loading Isolation

1) Should the log processor plugins have separate class loaders, or can they share the same ClassLoader as the log.saver system?

     Having isolation lets processor extensions depend on different libraries, but should we allow that?

2) If we create a separate class loader, we need to expose the following:

  • cdap-watchdog-api
  • cdap-proto
  • hadoop
  • logback-classic (we need ILoggingEvent)
  • should we expose more classes?
  • What if the user wants to write to a Kafka server or third-party storage such as S3 in the log processor logic? Having a separate class loader will help in these scenarios.
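One way to build such an isolating class loader is to expose only a whitelist of package prefixes from the log.saver class loader, as in this sketch; the exposed prefixes and class name are illustrative only.

```java
// Illustrative plugin class loader: only whitelisted package prefixes are
// visible from the parent (log.saver) class loader; anything else must come
// from the plugin's own jars (jar loading omitted from this sketch).
public class FilteringClassLoader extends ClassLoader {

  private final String[] allowedPrefixes;

  public FilteringClassLoader(ClassLoader parent, String... allowedPrefixes) {
    super(parent);
    this.allowedPrefixes = allowedPrefixes;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    // JDK classes must always be visible to loaded code
    if (name.startsWith("java.")) {
      return super.loadClass(name, resolve);
    }
    for (String prefix : allowedPrefixes) {
      if (name.startsWith(prefix)) {
        return super.loadClass(name, resolve);
      }
    }
    throw new ClassNotFoundException(name + " is not exposed to log processor plugins");
  }
}
```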

 

Sample Custom Log Plugin Implementation 

1) A log processor would typically want to process the ILoggingEvent, format it into a log message string (perhaps using a Logback layout), and write it to a destination.

2) However, the configuration for this log processor cannot be a logback.xml:

  • there can only be one logback.xml in a JVM, and Logback is already configured for the log.saver container.
  • Logback doesn't have an existing implementation for writing to HDFS.

3) The properties required for extensions could be provided through cdap-site.xml.

4) The log processor extension could provide implementations of the FileWriter and RotationPolicy interfaces for the HDFSFileWriter logic, for the events it has received from the LogProcessor.

5) Future implementations of other policies have to be implemented in the extensions and can be configured through cdap-site.xml.
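For example, the extension configuration in cdap-site.xml could look like the snippet below; all property names and values shown are hypothetical and would be defined by the extension itself, not existing CDAP configuration.

```xml
<!-- Hypothetical properties; actual names would be defined by the extension -->
<property>
  <name>log.processor.plugin.class</name>
  <value>com.example.logging.HDFSLogProcessor</value>
</property>
<property>
  <name>log.processor.hdfs.path</name>
  <value>/cdap/logs/applications</value>
</property>
<property>
  <name>log.processor.rotation.max.size.mb</name>
  <value>256</value>
</property>
```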

 

Pros

1) Leverages scalability of log.saver

2) Utilizes existing resources, logic, and processing; leverages log.saver's sorted-message capability for plugins.

3) Makes log.saver extensible, with the option to store logs in different formats and apply custom filtering logic.

 

Cons

1) As the number of extensions increases, a slow processor extension could degrade log.saver's throughput, affecting overall CDAP log-saving performance.

 

Option-2 (Improvement)

 

Configure and run a separate container for every log processor plugin.

Log.saver could have the capability to launch system and user plugin containers. The scalability of these plugin containers could then be managed separately.

 

 

 

 
