Goals:
- Improve operability in the Hydrator Studio (Improvements to logs, metrics, debuggability)
- Improve usability in the Hydrator Studio (Redesign of bottom panel, etc)
...
Use Case 1: Improve Log Viewer
Problems with current Log Viewer:
- Doesn't cater to usual developer interactions with logs - tail'ing (with log file monitoring, -f), less'ing (viewing the log) or downloading
- Hard to distinguish between two log lines
- Exception stack traces are virtually un-readable
- Virtually no formatting in the UI - logs are rendered almost exactly as they arrive from the backend, which is not ideal for an end user
- No search (even at the UI level)
- No way to download logs
- No way to distinguish whether logs are live or past
Jira (Cask Community Issue Tracker): CDAP-5733
...
- As a Hydrator/CDAP user, I want to be able to view my pipeline logs from both currently running pipelines as well as past pipelines to effectively debug the pipeline during failures
- As a Hydrator/CDAP user, I want to clearly know if the logs I am viewing are being updated live or are from a past run
- As a Hydrator/CDAP user, I want greater emphasis on the most important part of logs - the messages
- As a Hydrator user, I do not want logs to be flooded with stack traces. I want the ability to suppress them individually and as a whole
- As a Hydrator/CDAP user, I want the ability to download complete log files
- As a Hydrator/CDAP user, I want to view a summary of the logs I'm viewing (the number of messages, the number of errors, the number of warnings)
- As a Hydrator/CDAP user, I want to be able to filter logs by the lowest log level
- As a Hydrator/CDAP user, I want to be able to filter logs by keywords
- As a Hydrator/CDAP user, I want to be able to view a larger number of log events with a single-line summary for each, with the capability to drill down into particular events as desired
- As a Hydrator/CDAP user, I want to be able to view logs in the selected time range. I want to be able to dynamically change the time range for which I want to view logs, with context about how that time range maps to the duration of the program/service run.
- As a Hydrator/CDAP user, I want to be able to maximize the log viewer to full screen size and restore it to its original size as required.
Design:
- Timeline:
- Starts at the program/service start time. Ends at the program/service end time (past) or current.
- Time range indicated by two sliders on each side. Time range can be selected by sliding these sliders.
- Updating slider position causes a refresh of the log viewer to show logs in the selected range with the selected filters
- If program/service is still running, the right/bottom end of the slider indicates current time, and if the slider is at this position, logs are updated live. The timeline keeps updating to reflect that.
- Sliders must not cross each other
- Label on the selected time range indicates the selected time range
- The timeline is marked with time range with granularity that depends on the duration of the log (which is the duration of the program run).
- Filters:
- Filter by lowest log level:
- If ERROR is selected, then we show only ERROR
- If WARN is selected, then we show ERROR and WARN
- If INFO is selected, then we show ERROR, WARN and INFO
- If DEBUG is selected, then we show ERROR, WARN, INFO and DEBUG
- If TRACE is selected, then we show ERROR, WARN, INFO, DEBUG and TRACE
- Filter by search keywords:
- Search box that filters logs by the search text.
- This is a simple filter that applies on the message column
- Log viewer Table:
- Columns:
- Timestamp
- Lowest Log Level
- Source - Only in CDAP - This column should not be shown in Hydrator
- Message (also contains stack trace).
- Default view shows single line messages, with expand/collapse buttons on individual messages if they have more content
- Ability to suppress/show the stack trace with similar buttons.
- Ability to expand all messages
- Ability to only view the message column
- Top Bar:
- Shows information/summary of the log
- Indicates program/service name
- Summary of total messages with number of warnings and errors
- Download button to download entire log
- Search box for filtering.
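The top bar summary described above (total messages plus the number of warnings and errors) could be computed along these lines. This is only a sketch: the `LogEvent` shape and the `summarize` name are assumptions made for illustration and do not come from the existing CDAP log API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical log event shape; field names are assumptions."""
    timestamp_ms: int
    level: str      # one of TRACE, DEBUG, INFO, WARN, ERROR
    message: str


def summarize(events):
    """Return the counts shown in the top bar: total, errors, warnings."""
    counts = Counter(event.level for event in events)
    return {
        "total": len(events),
        "errors": counts["ERROR"],
        "warnings": counts["WARN"],
    }
```

Whether these counts cover the whole run or only the currently selected time range and filters is left open by the design; the function simply summarizes whatever list of events it is given.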
Backend support:
Use Case 2: Bottom Panel
Problems with current bottom panel:
- Constant back-and-forth between DAG and bottom panel - click on a node, then view the bottom panel - not very intuitive
- Reserved real-estate for configurations that are not commonly updated
- Schema available in both bottom panel as well as the DAG
- Reduced "prominence" for both the DAG and the bottom panel, since the full available space is never used
- Restricted space in the bottom panel for logs, pipeline configuration, node configuration, etc
- Association between a DAG and its bottom panel is not always clear enough
...
- As a Hydrator Product Team, I want to better plan the Hydrator real-estate so it is not statically allocated for configurations/views that are not commonly used/mandatory to be updated for creating pipelines
- e.g. Pipeline configurations like post run actions, engine, schedule
- As a Hydrator Product Team, I want to better design the Hydrator UI to lay more emphasis on the DAG
- As a Hydrator user, I do not want to switch back-and-forth between the DAG and the bottom panel repeatedly for building my pipeline
- I should be able to provide node-level details right near the node
- I should be able to simultaneously view details for multiple nodes both while editing a pipeline as well as viewing it.
- As a Hydrator user, I want to be able to build my pipeline incrementally. I want mandatory information to be more obvious.
- Build the pipeline with mandatory fields only to start off
- Incrementally add schedule, post run actions, etc
- As a Hydrator Product Team, I want to remove the disparity between the pipeline detail view and the studio view. This will facilitate the move towards being able to edit a pipeline after publishing
- e.g. Reference is unavailable in the pipeline details view
- As a Hydrator user, I want the messaging regarding multiple runs from the Hydrator UI to be clearer.
- Does Hydrator only always show the last run?
- If so, what is the "History" view for?
- As a Hydrator Product Team, I want to reduce duplication
- The console is not very useful today, it just shows messages. Can it be reconciled with the notification center?
- As a Hydrator user, I want related actions to appear together.
- e.g. "Export" is available in the bottom panel, but other pipeline controls are in the top bar.
- As a Hydrator Product team, I want to bring Jump buttons to Hydrator to make them the primary method of viewing entities in different contexts across CDAP, Hydrator and Tracker
- Jump from pipeline details view in Hydrator to program details view in CDAP
- Jump actions for source/sink in Hydrator:
- View in Dataset Details page in CDAP
- View in entity details page in Tracker
- Explore Dataset (if possible) in CDAP
Design:
Use Case 3: Debuggability/Testing
User Stories:
Design
...
Proposed Log Viewer:
...
- Starts at the program/service start time. Ends at the program/service end time (past) or current.
- Time range indicated by two sliders on each side. Time range can be selected by sliding these sliders.
- Updating slider position causes a refresh of the log viewer to show logs in the selected range with the selected filters
- If program/service is still running, the right/bottom end of the slider indicates current time, and if the slider is at this position, logs are updated live. The timeline keeps updating to reflect that.
- Sliders must not cross each other
- Label on the selected time range indicates the selected time range
- The timeline is marked with time range with granularity that depends on the duration of the log (which is the duration of the program run).
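The slider rules above (bounded by the run's start and end, sliders never crossing, live updates only when the right slider sits at the current time of a running program) can be sketched as a small state model. The `Timeline` class and its method names are hypothetical and exist only to illustrate the constraints; they are not part of any existing UI code.

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """Hypothetical model of the two-slider timeline (times in ms)."""
    start_ms: int    # program/service start time
    end_ms: int      # end time of a past run, or current time if running
    running: bool
    left_ms: int     # left slider position
    right_ms: int    # right slider position

    def move_left(self, t):
        # Clamp to the run start and never cross the right slider.
        self.left_ms = max(self.start_ms, min(t, self.right_ms))

    def move_right(self, t):
        # Clamp to the run end and never cross the left slider.
        self.right_ms = min(self.end_ms, max(t, self.left_ms))

    def is_live(self):
        # Logs update live only while running with the right slider at "now".
        return self.running and self.right_ms == self.end_ms

    def tick(self, now_ms):
        # While running, the timeline grows; a live right slider follows it.
        if not self.running:
            return
        live = self.is_live()
        self.end_ms = max(self.end_ms, now_ms)
        if live:
            self.right_ms = self.end_ms
```

In this sketch, any slider move would be followed by a refresh of the log viewer for `[left_ms, right_ms]` with the currently selected filters, as described in the list above.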
...
- Filters:
- Filter by lowest log level:
- If ERROR is selected, then we show only ERROR
- If WARN is selected, then we show ERROR and WARN
- If INFO is selected, then we show ERROR, WARN and INFO
- If DEBUG is selected, then we show ERROR, WARN, INFO and DEBUG
- If TRACE is selected, then we show ERROR, WARN, INFO, DEBUG and TRACE
- Filter by search keywords:
- Search box that filters logs by the search text.
- This is a simple filter that applies on the message column
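The two filters above can be illustrated with a short sketch. Selecting a lowest level shows that level and everything more severe, and the keyword filter applies to the message column only. The dictionary-based event shape and the case-insensitive match are assumptions made for this example; the design does not specify either.

```python
# Levels ordered from least to most severe.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR"]


def visible_levels(lowest):
    """Levels shown when `lowest` is selected, e.g. WARN -> {WARN, ERROR}."""
    return set(LEVELS[LEVELS.index(lowest):])


def filter_events(events, lowest="TRACE", keyword=""):
    """Keep events at or above `lowest` whose message contains `keyword`.

    `events` is assumed to be a list of dicts with "level" and "message"
    keys. Matching is case-insensitive here, which is an assumption.
    """
    allowed = visible_levels(lowest)
    needle = keyword.lower()
    return [
        event for event in events
        if event["level"] in allowed and needle in event["message"].lower()
    ]
```

For example, `filter_events(events, lowest="WARN", keyword="timeout")` would keep only WARN and ERROR events whose message mentions "timeout".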
...
- Similar to tail -f
- Starts off with 50 lines
- Shows newer logs as they become available towards the end
- Users can see newer logs if they are 'scroll-positioned' at the last log line
- Scroll position is retained if users are at any position other than the last log line
- Previous button
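The tail-style behaviour above (start with 50 lines, append newer logs at the end, follow only when the user is positioned at the last line) can be sketched as follows. `TailView` and its attributes are hypothetical names, and the scroll position is modelled simply as the index of the line the user is positioned at.

```python
class TailView:
    """Hypothetical model of the live, tail -f style log view."""

    INITIAL_LINES = 50  # the view starts off with the last 50 lines

    def __init__(self, history):
        self.lines = list(history[-self.INITIAL_LINES:])
        # Index of the line the user is positioned at; starts at the end.
        self.position = len(self.lines) - 1

    def at_end(self):
        return self.position == len(self.lines) - 1

    def scroll_to(self, index):
        if self.lines:
            self.position = max(0, min(index, len(self.lines) - 1))

    def append(self, new_lines):
        # Follow new output only if the user was at the last log line;
        # otherwise the scroll position is retained.
        follow = self.at_end()
        self.lines.extend(new_lines)
        if follow:
            self.position = len(self.lines) - 1
```

A "Previous" button would prepend older lines to `lines` and shift `position` by the number of lines added; that part is omitted here to keep the sketch short.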
...
- Similar in behavior to less, so it's not live, but allows the following capabilities:
- Time range selector
- Previous/Next buttons
- Download button
Common to both views:
- Compact view: A log line is a single line, so you see more logs (even though they are partial) at once.
- Expanded view: A log line contains the entire content, including message and stack trace
- Suppress stack trace: The stack trace in a log line can be suppressed by clicking something
- Compact view, expanded view and stack trace suppression can each be applied either to all logs or to an individual log line
- Logs are tabular, consisting of columns: Timestamp, Log Level, Origin (includes thread name, class name and line number - but these can be split into separate columns if there is a requirement), Message (contains stack trace too).
- Error/Warn level logs have some sort of highlighting (a symbol next to the log level?)
- Log level column has a dropdown with checkboxes to select only particular log levels - ALL, DEBUG, INFO, WARN, ERROR
- The message column can be expanded to the full width of the table, thereby hiding other columns. This operation can be reversed.
- Search box that allows filtering log lines with the search text
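The compact view, expanded view and stack trace suppression described above, applied either globally or per log line, could be modelled as in the sketch below. The function names, the `(message, stack_trace)` tuple shape and the `overrides` mapping are assumptions chosen for illustration.

```python
def render(message, stack_trace="", compact=True, show_stack=True):
    """Render the message cell for one log line.

    Compact view shows only the first line of the message; expanded view
    shows the whole message, plus the stack trace unless it is suppressed.
    """
    if compact:
        return message.splitlines()[0] if message else ""
    if show_stack and stack_trace:
        return message + "\n" + stack_trace
    return message


def render_all(events, compact=True, show_stack=True, overrides=None):
    """Apply view settings to all lines, with optional per-line overrides.

    `events` is a list of (message, stack_trace) tuples. `overrides` maps a
    line index to settings for that line, e.g. {3: {"compact": False}}.
    """
    overrides = overrides or {}
    rendered = []
    for index, (message, stack_trace) in enumerate(events):
        options = {"compact": compact, "show_stack": show_stack}
        options.update(overrides.get(index, {}))
        rendered.append(render(message, stack_trace, **options))
    return rendered
```

With this shape, "expand all" corresponds to `compact=False` as the global default, while clicking the expand button on a single row corresponds to one entry in `overrides`.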
Backend support:
Work Streams:
Tech Debt
- Simplify Config Store
- Simplify DAG component ~ Ajai's hack
...