Checklist

  •  User Stories Documented
  •  User Stories Reviewed
  •  Design Reviewed
  •  APIs reviewed
  •  Release priorities assigned
  •  Test cases reviewed
  •  Blog post

Introduction 

The current metadata search does not allow users to search for entities by date. The feature presented in this design doc aims to let users query entities by their creation date as well as by accepted forms of user-defined date properties. A new search syntax will be introduced to facilitate this. The feature will be implemented through Elasticsearch.

Goals

Allow users to search for entities based on a date property.

User Stories 

  • As a pipeline developer, I want to be able to see entities that I made within the past hour.

  • As a pipeline developer, I want to be able to see entities that were created before or after a certain date.

  • As a pipeline developer, I want to be able to see entities that I made between one date and another.

  • As a pipeline developer, I want to be able to see entities that are set for deployment on a specific date.

  • As a pipeline developer, I defined my own property with a date value, X, and now I want to be able to get entities whose value X is after a specific date.

Design

  • To accommodate this type of search, a new syntax will be introduced. Relevant methods in the QueryParser class will be modified to detect this syntax, and the QueryTerm class will include a new field to indicate whether the term is a date search.

  • Modify the current search method so that, when a date query is supplied, entities that satisfy the given date-value condition are found.

  • In case there is a similarity between date-search syntax and other user-created metadata labels, the program will conduct a regular string search for the query in addition to the date search.

Implementation

  • Currently the QueryBuilder object used for searches is a BoolQueryBuilder, which is used to distinguish which terms are required and which are optional in the search results. Elasticsearch also has a RangeQueryBuilder, which will be applied to search for creation times as a range over an hour, day, or multiple days.

  • A date query will be assumed to be a required query, so the RangeQueryBuilder object can be added to the main BoolQueryBuilder’s required searches. Since the RangeQueryBuilder will be embedded in the BoolQueryBuilder, there will still be only one search object for the entirety of the user's query.

  • The user’s search date / time will be converted from the predetermined UI syntax into a long representing the milliseconds since the epoch (Unix time). This is the format currently used to store the creation time of entities, and it is supported by Elasticsearch.

  • The metadata Property class will now include an additional field called date, which will be a Long. If a user defines a metadata property with a date-format value, the value will be parsed into Unix time and stored as date. This will allow for a date search over user-defined dates.
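The conversion from a date string to Unix time described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the class name DateToEpoch and the method toEpochMillis are hypothetical, and it assumes that a date is interpreted as midnight UTC, which the design does not specify.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

final class DateToEpoch {
  private DateToEpoch() { }

  // Hypothetical helper: parses a YYYY-MM-DD string and returns the
  // milliseconds since the epoch at the start of that day.
  // Assumption: the day starts at 00:00 UTC.
  static long toEpochMillis(String date) {
    return LocalDate.parse(date)          // ISO format, i.e. YYYY-MM-DD
        .atStartOfDay(ZoneOffset.UTC)
        .toInstant()
        .toEpochMilli();
  }
}
```

For example, DateToEpoch.toEpochMillis("2019-07-01") yields 1561939200000. A malformed date makes LocalDate.parse throw a DateTimeParseException, which a real implementation would need to handle, for instance by falling back to a regular string search.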

Approach

Approach #1

  • User writes their query with the new syntax
  • Once we detect that it is a date query, we indicate it in the QueryTerm class with an enum: SearchType
    • But we keep the term as it is originally
  • Edge case: term is indicated as both required and date by user
    • Assumption: all date queries are required, whether indicated or not
  • Modify createMainQuery() in ElasticsearchMetadata so that it creates a date subquery when necessary
    • When iterating over the QueryTerms, if it is a DATE SearchType, call dateQuery = createDateQuery() to create a new range query builder.
    • Continue to call termQuery = createTermQuery() as earlier, for possible conflict reasons as mentioned in the design section.
    • Place both objects (dateQuery and termQuery) into the boolQuery search, checking, as in the current implementation, to see if the query is required or optional.
  • createDateQuery()
    • This method will create a range query from the supplied QueryTerm. The QueryTerm’s term (string) will be parsed to extract the date range that is meant to be searched for. The extracted range will be in the form of a long representing Unix time.
    • The method will consider whether a user-defined field is provided for the search. If so, only that field will be searched for the given date values. Otherwise, the search covers creation times and all user-defined dates, with creation-time results presented first.
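One way createDateQuery() might turn a condition and a date into a Unix-time range is sketched below. It is illustrative only: DateRangeSketch and forCondition are hypothetical names, the comparison operators are the ones listed under UI Impact or Changes, and the sketch assumes days begin at 00:00 UTC. It deliberately avoids the Elasticsearch classes so that it stays self-contained. In the real method the two bounds would presumably be passed to the RangeQueryBuilder as its lower and upper limits.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

final class DateRangeSketch {
  // Half-open range [from, to) in epoch millis; null means unbounded.
  final Long from;
  final Long to;

  DateRangeSketch(Long from, Long to) {
    this.from = from;
    this.to = to;
  }

  // Hypothetical helper: maps an operator and a YYYY-MM-DD date to a range.
  // Assumption: a day spans 00:00 UTC (inclusive) to the next 00:00 UTC (exclusive).
  static DateRangeSketch forCondition(String op, String date) {
    LocalDate day = LocalDate.parse(date);
    long start = day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    long nextStart = day.plusDays(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    switch (op) {
      case "<":  return new DateRangeSketch(null, start);       // before the day
      case "<=": return new DateRangeSketch(null, nextStart);   // on or before
      case ">":  return new DateRangeSketch(nextStart, null);   // after the day
      case ">=": return new DateRangeSketch(start, null);       // on or after
      case "==":
      case "":   return new DateRangeSketch(start, nextStart);  // on the day (default)
      default:   throw new IllegalArgumentException("Unknown operator: " + op);
    }
  }

  // True when the given timestamp falls inside the range.
  boolean contains(long epochMillis) {
    return (from == null || epochMillis >= from) && (to == null || epochMillis < to);
  }
}
```

Treating "on a date" as a whole-day range rather than a single millisecond value matters because creation times are stored with millisecond precision, so an exact-match comparison against midnight would almost never hit.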

    API changes

    QueryTerm

    Introduce the following fields:

    public enum SearchType { STANDARD, DATE }

    private SearchType searchType;

    private Long date;

    Accommodate a new constructor that takes a SearchType as a parameter. The existing constructor will default the searchType to SearchType.STANDARD.

    QueryParser.parseQueryTerm()

    Create a check for the new keywords (using an if-statement, similar to checking for required terms). If the term contains the new keywords, return a QueryTerm indicating SearchType.DATE in its construction; otherwise SearchType should be STANDARD.
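The API changes above might be sketched as follows. QueryTermSketch is a hypothetical stand-in for the real QueryTerm class: it leaves out the required/optional handling and the date field, and it assumes that the keyword check is a case-sensitive prefix match on "DATE:".

```java
final class QueryTermSketch {
  enum SearchType { STANDARD, DATE }

  private final String term;
  private final SearchType searchType;

  // Existing-style constructor: defaults to a standard search.
  QueryTermSketch(String term) {
    this(term, SearchType.STANDARD);
  }

  // New constructor that takes the search type explicitly.
  QueryTermSketch(String term, SearchType searchType) {
    this.term = term;
    this.searchType = searchType;
  }

  String getTerm() {
    return term;
  }

  SearchType getSearchType() {
    return searchType;
  }

  // Stand-in for QueryParser.parseQueryTerm(): the term itself is kept
  // unchanged, only the search type differs.
  // Assumption: a date query is recognized by the exact prefix "DATE:".
  static QueryTermSketch parseQueryTerm(String term) {
    if (term.startsWith("DATE:")) {
      return new QueryTermSketch(term, SearchType.DATE);
    }
    return new QueryTermSketch(term);
  }
}
```

Keeping the original term string on a DATE term is what allows the regular string search to run alongside the date search, as described in the design section.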

    UI Impact or Changes

    • Users create their date query starting with one of the following keywords

      • before 

      • after 

      • on?

      • between 

      • newer_than 

      • older_than 

    • Keywords are followed by a colon : and either a date or a time measurement 

      • before, after, and on take a date

      • between takes two dates which are also separated by : 

      • newer_than and older_than take a time measurement 

    • Date representation

      • MM/DD

      • MM/DD/YY

    • Time measurement representation

      • #m for minutes

      • #h for hours

      • #d for days

    • Example searches and their results: 

    • “before:04/15”

      • Returns entities created before April 15th of this year
    • “after:04/15/18”

      • Returns entities created after April 15th, 2018

    • “newer_than:3d”

      • Returns entities that are less than 3 days old

    Choices to be made:
    • Should the user input the date in MM/DD or DD/MM format?
    • Should there be a main keyword for date searches, such as DATE? That way we only need to check for one keyword instead of (currently) six to determine whether it is a date search. Once we know it is a date search, we can determine what kind.
    • Users start their date query with the keyword "DATE:"

    • The colon can be followed by a date-field name to search over, followed by another colon

    • The supported date format (for both defining and querying dates) is:
      • YYYY-MM-DD
    • The date can immediately follow "DATE:" or it can follow the field name and colon

    • To indicate conditions such as before or after a date, use the following comparison operators immediately before the date:
      • > (after)
      • >= (on or after)
      • <= (on or before)
      • < (before)
      • == (on - if no condition is specified this is the assumed condition)
    • Example searches and their results: 

      • “DATE:2019-07-01” or “DATE:==2019-07-01”

        • Returns entities created on July 1st, 2019, or with a date property value equal to that date.

      • “DATE:<2019-07-01”

        • Returns entities created before July 1st, 2019, or with a date property value before that date.

      • “DATE:<=2019-07-01”

        • Returns entities created on or before July 1st, 2019, or with a date property value on or before that date.

      • “DATE:>2019-07-01”

        • Returns entities created after July 1st, 2019, or with a date property value after that date.

      • “DATE:>=2019-07-01”

        • Returns entities created on or after July 1st, 2019, or with a date property value on or after that date.

      • “DATE:my_date:2019-07-01”

        • Returns entities that have a property called “my_date” whose value is equal to the queried date, July 1st, 2019.
        • All other comparison operators work similarly with this type of field search.
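The DATE: syntax shown in the examples above can be split into its parts with a single regular expression. The sketch below is only one possible reading of the syntax: DateQuerySyntax and its fields are hypothetical names, and it assumes that field names contain no colon and that the keyword is written in upper case.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DateQuerySyntax {
  // DATE:[field:][operator]YYYY-MM-DD
  // Assumption: a field name is any non-empty run of characters without ':'.
  private static final Pattern DATE_QUERY =
      Pattern.compile("^DATE:(?:([^:]+):)?(==|<=|>=|<|>)?(\\d{4}-\\d{2}-\\d{2})$");

  final String field;     // null when no field name was given
  final String operator;  // "==" when no operator was given
  final String date;      // still in YYYY-MM-DD form

  private DateQuerySyntax(String field, String operator, String date) {
    this.field = field;
    this.operator = operator;
    this.date = date;
  }

  // Returns the parsed parts, or empty if the term is not a date query.
  static Optional<DateQuerySyntax> parse(String term) {
    Matcher m = DATE_QUERY.matcher(term);
    if (!m.matches()) {
      return Optional.empty();
    }
    String op = m.group(2) == null ? "==" : m.group(2);
    return Optional.of(new DateQuerySyntax(m.group(1), op, m.group(3)));
  }
}
```

With this sketch, "DATE:my_date:>=2019-07-01" parses to the field my_date, the operator >=, and the date 2019-07-01, while "DATE:2019-07-01" parses to no field and the default operator ==. A term that does not match, such as a plain metadata label, yields an empty result and can be handled as a regular string search.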

    Related Work

    • Work done on the Required Search Fields feature was the first work that allowed for new syntax to be easily defined for the UI

    Future Work

    • Currently only the YYYY-MM-DD format is supported for date search and definition. Support for other date syntaxes should be implemented in the future.