Goals:
- Improve operability in the Hydrator Studio (Improvements to logs, metrics, debuggability)
- Improve usability in the Hydrator Studio (Redesign of bottom panel, etc)
Checklist
- User stories documented (Bhooshan)
- User stories reviewed (Nitin/Sree)
- Design documented (Bhooshan/Brady)
- Design reviewed (Nitin/Sree)
Use Cases:
Use Case 1: Improve Log Viewer
Problems with current Log Viewer:
- Doesn't cater to usual developer interactions with logs: tailing (monitoring the log file with -f), paging through the log with less, or downloading
- Hard to distinguish between two log lines
- Exception stack traces are virtually unreadable
- Virtually no formatting in the UI: logs are rendered almost exactly as they arrive from the backend, which is not ideal for an end user
- No search (even at the UI level)
- No way to download logs
- No way to distinguish whether logs are live or past
User Stories:
- As a Hydrator/CDAP user, I want to be able to view my pipeline logs from both currently running pipelines as well as past pipelines to effectively debug the pipeline during failures
- As a Hydrator/CDAP user, I want to clearly know if the logs I am viewing are being updated live or are from a past run
- As a Hydrator/CDAP user, I want greater emphasis on the most important part of logs - the messages
- As a Hydrator user, I do not want logs to be flooded with stack traces. I want the ability to suppress them individually and as a whole
- As a Hydrator/CDAP user, I want the ability to download complete log files
- As a Hydrator/CDAP user, I want to view a summary of the logs I'm viewing (the number of messages, the number of errors, the number of warnings)
- As a Hydrator/CDAP user, I want to be able to filter logs by the lowest log level
- As a Hydrator/CDAP user, I want to be able to filter logs by keywords
- As a Hydrator/CDAP user, I want to be able to view a larger number of log events with a single-line summary for each, with the capability to drill down into particular events as desired
- As a Hydrator/CDAP user, I want to be able to view logs in the selected time range. I want to be able to dynamically change the time range for which I want to view logs, with context about how that time range maps to the duration of the program/service run.
- As a Hydrator/CDAP user, I want to be able to maximize the log viewer to full screen size and restore it to its original size as required.
Design:
- Timeline:
- Starts at the program/service start time. Ends at the program/service end time (past runs) or the current time (running).
- Time range is indicated by two sliders, one on each side. The time range can be selected by moving these sliders.
- Updating slider position causes a refresh of the log viewer to show logs in the selected range with the selected filters
- If program/service is still running, the right/bottom end of the slider indicates current time, and if the slider is at this position, logs are updated live. The timeline keeps updating to reflect that.
- Sliders must not cross each other
- Label on the selected time range indicates the selected time range
- The timeline is marked with time range with granularity that depends on the duration of the log (which is the duration of the program run).
- Filters:
- Filter by lowest log level:
- If ERROR is selected, then we show only ERROR
- If WARN is selected, then we show ERROR and WARN
- If INFO is selected, then we show ERROR, WARN and INFO
- If DEBUG is selected, then we show ERROR, WARN, INFO and DEBUG
- If TRACE is selected, then we show ERROR, WARN, INFO, DEBUG and TRACE
- Filter by search keywords:
- Search box that filters logs by the search text.
- This is a simple filter that applies on the message column
- Log viewer Table:
- Columns:
- Timestamp
- Log Level
- Source - Only in CDAP - This column should not be shown in Hydrator
- Message (also contains stack trace).
- Default view shows single-line messages, with expand/collapse buttons to expand individual messages if they have more content
- Ability to suppress/show stack traces with similar expand/collapse buttons.
- Ability to expand all messages
- Ability to only view the message column
- Top Bar:
- Shows information/summary of the log
- Indicates program/service name
- Summary of total messages with number of warnings and errors
- Download button to download entire log
- Search box for filtering.
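The timeline and filter rules above can be sketched in a few lines. This is only an illustration of the intended behavior, not the Hydrator implementation: the `LogEvent` shape and the function names `clamp_range`, `is_live`, `filter_logs` and `summarize` are hypothetical, and case-insensitive keyword matching is an assumption the design does not state.

```python
from dataclasses import dataclass

# Severity order, least to most severe, using the levels named in the design.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR"]


@dataclass
class LogEvent:
    """Hypothetical log event shape, not the CDAP log API."""
    timestamp: int  # seconds since epoch
    level: str
    message: str


def clamp_range(start, end, run_start, run_end):
    """Keep both sliders inside the run and stop them from crossing."""
    start = max(run_start, min(start, run_end))
    end = max(start, min(end, run_end))
    return start, end


def is_live(end, run_end, running):
    """Logs update live only while the run is active and the right slider is at the end."""
    return running and end >= run_end


def filter_logs(events, start, end, min_level="INFO", keyword=""):
    """Apply the time range, lowest-log-level and keyword filters together."""
    threshold = LEVELS.index(min_level)
    needle = keyword.lower()  # assumption: keyword search ignores case
    return [
        e for e in events
        if start <= e.timestamp <= end
        and LEVELS.index(e.level) >= threshold
        and needle in e.message.lower()  # filter applies to the message column only
    ]


def summarize(events):
    """Counts for the top bar: total messages, warnings and errors."""
    return {
        "total": len(events),
        "warnings": sum(e.level == "WARN" for e in events),
        "errors": sum(e.level == "ERROR" for e in events),
    }
```

With this ordering, selecting WARN keeps WARN and ERROR, and selecting TRACE keeps everything, which matches the level table above. Whether filtering happens in the UI or the backend is left open here.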
Backend support:
Use Case 2: Bottom Panel
Problems with bottom panel:
- Constant back-and-forth between the DAG and the bottom panel (click on a node, then view the bottom panel) is not very intuitive
- Reserved real-estate for configurations that are not commonly updated
- Schema available in both bottom panel as well as the DAG
- Reduced prominence for both the DAG and the bottom panel, since neither ever uses the full available space
- Restricted space in the bottom panel for logs, pipeline configuration, node configuration, etc
- Association between a DAG and its bottom panel is not always clear enough
User Stories:
- As a Hydrator Product Team, I want to better plan the Hydrator real-estate so it is not statically allocated for configurations/views that are not commonly used/mandatory to be updated for creating pipelines
- e.g. Pipeline configurations like post run actions, engine, schedule
- As a Hydrator Product Team, I want to better design the Hydrator UI to lay more emphasis on the DAG
- As a Hydrator user, I do not want to switch back-and-forth between the DAG and the bottom panel repeatedly for building my pipeline
- I should be able to provide node-level details right near the node
- I should be able to simultaneously view details for multiple nodes both while editing a pipeline as well as viewing it.
- As a Hydrator user, I want to be able to build my pipeline incrementally. I want mandatory information to be more obvious.
- Build the pipeline with mandatory fields only to start off
- Incrementally add schedule, post run actions, etc
- As a Hydrator Product Team, I want to remove the disparity between the pipeline detail view and the studio view. This will facilitate the move towards being able to edit a pipeline after publishing
- e.g. Reference is unavailable in the pipeline details view
- As a Hydrator user, I want the messaging regarding multiple runs from the Hydrator UI to be clearer.
- Does Hydrator only always show the last run?
- If so, what is the "History" view for?
- As a Hydrator Product Team, I want to reduce duplication
- The console is not very useful today, it just shows messages. Can it be reconciled with the notification center?
- As a Hydrator user, I want related actions to appear together.
- e.g. "Export" is available in the bottom panel, but other pipeline controls are in the top bar.
- As a Hydrator Product Team, I want to bring Jump buttons to Hydrator to make them the primary method of viewing entities in different contexts across CDAP, Hydrator and Tracker
- Jump from pipeline details view in Hydrator to program details view in CDAP
- Jump actions for source/sink in Hydrator:
- View in Dataset Details page in CDAP
- View in entity details page in Tracker
- Explore Dataset (if possible) in CDAP
Design:
Use Case 3: Debuggability/Testing
User Stories:
Design:
Work Streams:
Tech Debt
- Simplify Config Store
- Simplify DAG component ~ Ajai's hack
Moving hard-coding/logic to backend
- Drafts
- Default plugin version
- For a stage, define whether it can accept an Input, Output or both
- Single APIs for status/logs/metrics for hydrator pipelines
New features
- Preview
- Log Viewer
Scratch Pad:
Possible solutions
- Tabular view: Columns for date, Class Name/Thread Name, Log Level, Log Line
- Alternate row background colors
- Vertically expandable with scrolling
- Searchable (Filter-able) columns
- Clear demarcation of rows
- Snippet with expand - especially for stack traces
- Picking one or more log levels: INFO, DEBUG, WARN, ERROR, ALL
- Ability to view and download raw logs if required
- Ability to view and expand only the "content" column of a log line
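The "snippet with expand" idea above could look roughly like the following. It is a minimal sketch under stated assumptions: the function names, the 80-character limit and the `[+]` marker are hypothetical placeholders, and the first line of a message is assumed to be the summary, with any stack trace on the lines after it.

```python
def split_message(raw):
    """Split a raw log message into its first line and the remaining lines."""
    first, _, rest = raw.partition("\n")
    return first, rest


def render_row(raw, expanded=False, max_len=80):
    """Return the text for one table row.

    Collapsed rows show a single-line snippet; expanded rows show the full
    message, including any stack trace that follows the first line.
    """
    if expanded:
        return raw
    first, rest = split_message(raw)
    truncated = len(first) > max_len
    snippet = first[: max_len - 3] + "..." if truncated else first
    # Hypothetical marker telling the user there is more content to expand.
    has_more = bool(rest) or truncated
    return snippet + (" [+]" if has_more else "")
```

For example, a message whose first line is `Sink failed` followed by stack frames renders as `Sink failed [+]` when collapsed and as the full text when expanded. "Expand all" would then amount to rendering every row with `expanded=True`.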