Goals:
- Improve operability in the Hydrator Studio (Improvements to logs, metrics, debuggability)
- Improve usability in the Hydrator Studio (Redesign of bottom panel, etc)
Checklist
- User stories documented (Bhooshan)
- User stories reviewed (Nitin/Sree)
- Design documented (Bhooshan/Brady)
- Design reviewed (Nitin/Sree)
Use Cases:
Use Case 1: Improve Log Viewer
Problems with current Log Viewer
- Doesn't cater to usual developer interactions with logs: tail'ing (with log file monitoring, -f), less'ing (viewing the log), or downloading
- Hard to distinguish between two log lines
- Exception stack traces are virtually unreadable
- Virtually no formatting in the UI: logs are rendered almost exactly as they arrive from the backend, which is not ideal for an end user
- No search (even at the UI level)
- No way to download logs
- No way to distinguish whether logs are live or past
User Stories:
- As a Hydrator/CDAP user, I want to be able to view my pipeline logs from both currently running pipelines as well as past pipelines to effectively debug the pipeline during failures
- As a Hydrator/CDAP user, I want to clearly know if the logs I am viewing are being updated live or are from a past run
- As a Hydrator/CDAP user, I want greater emphasis on the most important part of logs - the messages
- As a Hydrator user, I do not want logs to be flooded with stack traces. I want the ability to suppress them individually and as a whole
- As a Hydrator/CDAP user, I want the ability to download complete log files
- As a Hydrator/CDAP user, I want to view a summary of the logs I'm viewing (the number of messages, the number of errors, the number of warnings)
- As a Hydrator/CDAP user, I want to be able to filter logs by the lowest log level
- As a Hydrator/CDAP user, I want to be able to filter logs by keywords
- As a Hydrator/CDAP user, I want to be able to view a larger number of log events with a single-line summary for each, with the capability to drill down into particular events as desired
- As a Hydrator/CDAP user, I want to be able to view logs in the selected time range. I want to be able to dynamically change the time range for which I want to view logs, with context about how that time range maps to the duration of the program/service run.
- As a Hydrator/CDAP user, I want to be able to maximize the log viewer to full screen size and restore it to its original size as required.
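The summary and lowest-log-level stories above could be backed by two small helpers. The following is only a sketch: the `LogEvent` shape, its field names, and the function names are assumptions made for illustration, not the actual CDAP log API.

```typescript
// Hypothetical log event shape; field names are assumptions, not the CDAP log API.
type LogLevel = "DEBUG" | "INFO" | "WARN" | "ERROR";

interface LogEvent {
  timestamp: number; // epoch milliseconds
  level: LogLevel;
  message: string;
}

// Levels ordered from least to most severe.
const LEVEL_ORDER: LogLevel[] = ["DEBUG", "INFO", "WARN", "ERROR"];

// Keep only events at or above the chosen lowest level.
function filterByLowestLevel(events: LogEvent[], lowest: LogLevel): LogEvent[] {
  const threshold = LEVEL_ORDER.indexOf(lowest);
  return events.filter((e) => LEVEL_ORDER.indexOf(e.level) >= threshold);
}

interface LogSummary {
  messages: number;
  errors: number;
  warnings: number;
}

// Count messages, errors and warnings for the summary shown with the logs.
function summarizeLogs(events: LogEvent[]): LogSummary {
  const summary: LogSummary = { messages: events.length, errors: 0, warnings: 0 };
  for (const e of events) {
    if (e.level === "ERROR") summary.errors += 1;
    if (e.level === "WARN") summary.warnings += 1;
  }
  return summary;
}
```

Selecting WARN as the lowest level would keep WARN and ERROR events only. Whether the summary counts the filtered or the unfiltered set is a design choice this sketch leaves to the caller.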
Possible solutions
- Tabular view: Columns for date, Class Name/Thread Name, Log Level, Log Line
- Alternate row background colors
- Vertically expandable with scrolling
- Searchable (Filter-able) columns
- Clear demarcation of rows
- Snippet with expand - especially for stack traces
- Picking one or more log levels: INFO, DEBUG, WARN, ERROR, ALL
- Ability to view and download raw logs if required
- Ability to view and expand only the "content" column of a log line
Use Case 2: Bottom Panel
Problems with current Bottom Panel
- Constant back-and-forth between the DAG and the bottom panel (click on a node, then view the bottom panel) is not very intuitive
- Reserved real-estate for configurations that are not commonly updated
- Schema available in both bottom panel as well as the DAG
- Reduced "prominence" for both the DAG and the bottom panel, since you never use the full available space
- Restricted space in the bottom panel for logs, pipeline configuration, node configuration, etc
- Association between a DAG and its bottom panel is not always clear enough
User Stories:
- As a Hydrator Product Team, I want to better plan the Hydrator real-estate so it is not statically allocated for configurations/views that are not commonly used/mandatory to be updated for creating pipelines
- e.g. Pipeline configurations like post run actions, engine, schedule
- As a Hydrator Product Team, I want to better design the Hydrator UI to lay more emphasis on the DAG
- As a Hydrator user, I do not want to switch back-and-forth between the DAG and the bottom panel repeatedly for building my pipeline
- I should be able to provide node-level details right near the node
- I should be able to simultaneously view details for multiple nodes both while editing a pipeline as well as viewing it.
- As a Hydrator user, I want to be able to build my pipeline incrementally. I want mandatory information to be more obvious.
- Build the pipeline with mandatory fields only to start off
- Incrementally add schedule, post run actions, etc
- As a Hydrator Product Team, I want to remove the disparity between the pipeline detail view and the studio view. This will facilitate the move towards being able to edit a pipeline after publishing
- e.g. Reference is unavailable in the pipeline details view
- As a Hydrator user, I want the messaging regarding multiple runs from the Hydrator UI to be clearer.
- Does Hydrator always show only the last run?
- If so, what is the "History" view for?
- As a Hydrator Product Team, I want to reduce duplication
- The console is not very useful today, it just shows messages. Can it be reconciled with the notification center?
- As a Hydrator user, I want related actions to appear together.
- e.g. "Export" is available in the bottom panel, but other pipeline controls are in the top bar.
- As a Hydrator Product Team, I want to bring Jump buttons to Hydrator to make them the primary method of viewing entities in different contexts across CDAP, Hydrator and Tracker
- Jump from pipeline details view in Hydrator to program details view in CDAP
- Jump actions for source/sink in Hydrator:
- View in Dataset Details page in CDAP
- View in entity details page in Tracker
- Explore Dataset (if possible) in CDAP
Use Case 3: Debuggability/Testing
User Stories:
Design:
Proposed Log Viewer:
Composed of two main views:
- Viewing current logs along with monitoring (Live)
- Similar to tail -f
- Starts off with 50 lines
- Shows newer logs as they become available towards the end
- Users can see newer logs if they are 'scroll-positioned' at the last log line
- Scroll position is retained if users are at any position other than the last log line
- Previous button
- Viewing logs within a specified time range (Not Live)
- Similar in behavior to less, so it's not live, but allows the following capabilities:
- Time range selector
- Previous/Next buttons
- Download button
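The scroll rules for the Live view and the paging for the Not Live view can be sketched as a small view-model. Everything named here (`LiveLogBuffer`, `TimedLine`, `pageInRange`, the method names) is hypothetical and only illustrates the behavior listed above; the default of 50 lines comes from the design.

```typescript
// Hypothetical view-model for the Live view. It holds no DOM references,
// so the follow logic can be exercised without a browser.
class LiveLogBuffer {
  private lines: string[] = [];
  private following = true; // true while the user sits at the last log line
  private initialLines: number;

  constructor(initialLines: number = 50) {
    this.initialLines = initialLines;
  }

  // Start off with the most recent `initialLines` lines.
  load(all: string[]): void {
    this.lines = this.initialLines > 0 ? all.slice(-this.initialLines) : [];
    this.following = true;
  }

  // The view reports the index of the last visible line after each scroll.
  onScroll(lastVisibleIndex: number): void {
    this.following = lastVisibleIndex >= this.lines.length - 1;
  }

  // Append newer lines at the end. Returns the index to scroll to, or null
  // when the current scroll position should be retained.
  append(newLines: string[]): number | null {
    if (newLines.length === 0) return null;
    this.lines = this.lines.concat(newLines);
    return this.following ? this.lines.length - 1 : null;
  }

  // "Previous" button: older lines go in front of the current ones.
  prepend(olderLines: string[]): void {
    this.lines = olderLines.concat(this.lines);
  }

  size(): number {
    return this.lines.length;
  }
}

// Not Live view: one page of lines inside [start, end], for Previous/Next.
interface TimedLine {
  timestamp: number; // epoch milliseconds
  text: string;
}

function pageInRange(
  lines: TimedLine[],
  start: number,
  end: number,
  page: number,
  pageSize: number
): TimedLine[] {
  const inRange = lines.filter((l) => l.timestamp >= start && l.timestamp <= end);
  return inRange.slice(page * pageSize, (page + 1) * pageSize);
}
```

Keeping the follow state in one flag means the view only has to report scroll positions; it never decides on its own whether to jump to the newest line.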
Common to both views:
- Compact view: A log line is a single line, so you see more logs (even though they are partial) at once.
- Expanded view: A log line contains the entire content, including message and stack trace
- Suppress stack trace: The stack trace in a log line can be suppressed by clicking something
- The compact view, expanded view, and stack trace suppression can each be applied either to all logs or to an individual log line
- Logs are tabular, consisting of columns: Timestamp, Log Level, Origin (includes thread name, class name and line number - but these can be split into separate columns if there is a requirement), Message (contains stack trace too).
- Error/Warn level logs have some sort of highlighting (a symbol next to the log level?)
- Log level column has a drop-down with checkboxes to select only particular log levels: ALL, DEBUG, INFO, WARN, ERROR
- The message column can be expanded to the full width of the table, thereby hiding other columns. This operation can be reversed.
- Search box that allows filtering log lines with the search text
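A minimal sketch of how the common behaviors might fit together: table-wide display settings with per-row overrides, plus the log-level checkboxes and the search box. The `LogRow` and `DisplaySettings` shapes and all function names are assumptions for illustration, and treating an empty level selection as ALL is also an assumption.

```typescript
type RowLevel = "DEBUG" | "INFO" | "WARN" | "ERROR";

// Hypothetical row shape following the proposed columns.
interface LogRow {
  timestamp: string;
  level: RowLevel;
  origin: string; // thread name, class name and line number
  message: string;
  stackTrace?: string;
}

interface DisplaySettings {
  compact: boolean; // true: a log line is rendered as a single line
  suppressStackTrace: boolean; // true: hide the stack trace in expanded view
}

// A per-row override wins over the table-wide setting.
function effectiveSettings(
  tableWide: DisplaySettings,
  rowOverride?: Partial<DisplaySettings>
): DisplaySettings {
  const o: Partial<DisplaySettings> = rowOverride || {};
  return {
    compact: o.compact !== undefined ? o.compact : tableWide.compact,
    suppressStackTrace:
      o.suppressStackTrace !== undefined ? o.suppressStackTrace : tableWide.suppressStackTrace,
  };
}

// Text for the Message column under the given settings.
function renderMessage(row: LogRow, settings: DisplaySettings): string {
  if (settings.compact) {
    // Compact view: first line of the message only.
    const newline = row.message.indexOf("\n");
    return newline === -1 ? row.message : row.message.substring(0, newline);
  }
  if (settings.suppressStackTrace || row.stackTrace === undefined) {
    return row.message;
  }
  return row.message + "\n" + row.stackTrace;
}

// Rows that pass the log-level checkboxes and the search box.
// Assumption: an empty level selection means ALL.
function visibleRows(rows: LogRow[], selectedLevels: RowLevel[], searchText: string): LogRow[] {
  const needle = searchText.trim().toLowerCase();
  return rows.filter((row) => {
    const levelOk = selectedLevels.length === 0 || selectedLevels.indexOf(row.level) !== -1;
    const textOk = needle === "" || row.message.toLowerCase().indexOf(needle) !== -1;
    return levelOk && textOk;
  });
}
```

Resolving the settings per row keeps "for all logs" and "for an individual log line" on the same code path: the table-wide toggle changes the default, and a row toggle only stores an override.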
Backend support:
Work Streams:
Tech Debt
- Simplify Config Store
- Simplify DAG component ~ Ajai's hack
Moving hard-coding/logic to backend
- Drafts
- Default plugin version
- For a stage, define whether it can accept an Input, Output or both
- Single APIs for status/logs/metrics for hydrator pipelines
New features
- Preview
- Log Viewer