...

  • Users can choose and install Neo4j source and sink plugins.
  • Users should see the Neo4j logo on the plugin configuration page for a better experience.
  • Users should get relevant information from the tool tip:
    • The tool tip should describe accurately what each field is used for.
  • Users should not have to specify any redundant configuration.
  • Users should get field level lineage for the source and sink that is being used.
  • Reference documentation should be updated to account for the changes.
  • The source code for the Neo4j database plugin should be placed in a repo under data-integrations.org.
  • The data pipeline using source and sink plugins should run on both MapReduce and Spark engines.

Fraud detection use case

Source: Neo4j website

Traditional fraud prevention measures focus on discrete data points such as specific accounts, individuals, devices or IP addresses. However, today’s sophisticated fraudsters escape detection by forming fraud rings comprised of stolen and synthetic identities. To uncover such fraud rings, it is essential to look beyond individual data points to the connections that link them.

No fraud prevention measures are perfect, but by looking beyond individual data points to the connections that link them your efforts significantly improve. Neo4j uncovers difficult-to-detect patterns that far outstrip the power of a relational database.

Enterprise organizations use Neo4j to augment their existing fraud detection capabilities to combat a variety of financial crimes including first-party bank fraud, credit card fraud, ecommerce fraud, insurance fraud and money laundering – and all in real time.

User Stories

  • Users should be able to install Neo4j-specific database source and sink plugins from the Hub.
  • Users should have each tool tip accurately describe what each field does.
  • Users should get field level lineage information for the Neo4j source and sink.
  • Users should be able to set up a pipeline without specifying redundant information.
  • Users should get an updated reference document for the Neo4j source and sink.
  • Users should be able to read all the DB types.

...

Section | User Facing Name | Widget Type | Description | Constraints
General | Label | textbox | Label for UI. |
General | Reference Name | textbox | Name that uniquely identifies this source for lineage. | Required
General | Neo4j Host | textbox | Neo4j database host. | Required
General | Neo4j Port | textbox | Neo4j database port. | Required
General | Input Query | textbox | The query to use to import data from the Neo4j database. Query example: 'MATCH (n:Label) RETURN n.property_1, n.property_2'. | Required
Credentials | Username | textbox | User identity for connecting to Neo4j. | Required
Credentials | Password | password | Password to use to connect to Neo4j. | Required
Advanced | Splits Number | number | The number of splits to generate. If set to one, Order By (orderBy) is not needed. |
Advanced | Order By | textbox | Name of the field used for ordering during splits generation. Required unless Splits Number (numSplits) is set to one. |
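The relationship between Splits Number and Order By can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the `Neo4jSourceConfig` class, the `split_queries` helper, the `total_rows` argument and the ORDER BY / SKIP / LIMIT strategy are all assumptions made for the example. The only rule taken from the table is that Order By is required unless a single split is requested.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Neo4jSourceConfig:
    """Hypothetical holder for the source properties listed in the table."""
    reference_name: str
    host: str
    port: int
    input_query: str
    username: str
    password: str
    splits_number: int = 1
    order_by: Optional[str] = None

    def validate(self) -> None:
        if self.splits_number < 1:
            raise ValueError("Splits Number must be at least 1")
        # Order By may be omitted only when a single split is requested.
        if self.splits_number > 1 and not self.order_by:
            raise ValueError("Order By is required when Splits Number is greater than 1")


def split_queries(config: Neo4jSourceConfig, total_rows: int) -> List[str]:
    """Build one Cypher query per split (assumed ORDER BY / SKIP / LIMIT strategy)."""
    config.validate()
    if config.splits_number == 1:
        return [config.input_query]
    # Spread the rows as evenly as possible; the first `remainder` splits get one extra row.
    base, remainder = divmod(total_rows, config.splits_number)
    queries = []
    offset = 0
    for i in range(config.splits_number):
        size = base + (1 if i < remainder else 0)
        queries.append(
            f"{config.input_query} ORDER BY {config.order_by} SKIP {offset} LIMIT {size}"
        )
        offset += size
    return queries
```

A stable ordering is what makes the SKIP / LIMIT windows disjoint, which is one plausible reason the ordering field becomes mandatory as soon as more than one split is generated.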


...

Section | User Facing Name | Widget Type | Description | Constraints
General | Label | textbox | Label for UI. |
General | Reference Name | textbox | Name that uniquely identifies this sink for lineage. | Required
General | Neo4j Host | textbox | Neo4j database host. | Required
General | Neo4j Port | textbox | Neo4j database port. | Required
General | Output Query | textbox | The query to use to export data to the Neo4j database. Query example: 'CREATE (n:<label_field> $(property_1, property_2))' or 'CREATE (n:<label_field> $(*))'. | Required
Credentials | Username | textbox | User identity for connecting to Neo4j. | Required
Credentials | Password | password | Password to use to connect to Neo4j. | Required
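One possible reading of the `$(...)` placeholder in the Output Query is that it expands to a Cypher property map whose values are bound as query parameters, with `$(*)` standing for every field of the input record. The sketch below illustrates that reading only. The `expand_output_query` function and its behaviour are assumptions, not the plugin's documented semantics, and the example uses a literal label in place of `<label_field>`, whose resolution is not covered here.

```python
import re
from typing import Any, Dict, Tuple

# Matches $(a, b) or $(*) in the output query.
_PLACEHOLDER = re.compile(r"\$\(([^)]*)\)")


def expand_output_query(query: str, record: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Replace $(...) with a Cypher property map and collect the bound parameters.

    Assumed semantics: $(f1, f2) selects the named record fields, $(*) selects all.
    """
    params: Dict[str, Any] = {}

    def replace(match: "re.Match[str]") -> str:
        spec = match.group(1).strip()
        if spec == "*":
            names = list(record)
        else:
            names = [name.strip() for name in spec.split(",") if name.strip()]
        missing = [name for name in names if name not in record]
        if missing:
            raise KeyError(f"Fields not present in record: {missing}")
        for name in names:
            params[name] = record[name]
        # Cypher map literal with $-prefixed parameters, e.g. {name: $name}.
        return "{" + ", ".join(f"{name}: ${name}" for name in names) + "}"

    return _PLACEHOLDER.sub(replace, query), params
```

For example, with the record `{"name": "Ann", "age": 30}` the query `CREATE (n:Person $(name, age))` would become `CREATE (n:Person {name: $name, age: $age})`, with both values passed separately as parameters rather than spliced into the query text.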

...

CDAP Schema Data Types | Neo4j Data Types
null | null
array | List
boolean | Boolean
long | Integer
double | Float
string | String
bytes | ByteArray
date | Date
time-micros | Time
timestamp-micros | DateTime

record

Code Block
{
  "type": "record",
  "name": "duration",
  "fields": [
    {"name": "duration", "type": "string"},
    {"name": "seconds", "type": "long"},
    {"name": "months", "type": "long"},
    {"name": "days", "type": "long"},
    {"name": "nanoseconds", "type": "int"}
  ]
}
Neo4j data type for the record above: Duration

record
point 2D

Code Block
{
  "type": "record",
  "name": "point_2d",
  "fields": [
    {"name": "crs", "type": "string"},
    {"name": "x", "type": "double"},
    {"name": "y", "type": "double"},
    {"name": "srid", "type": "int"}
  ]
}

point 3D

Code Block
{
  "type": "record",
  "name": "point_3d",
  "fields": [
    {"name": "crs", "type": "string"},
    {"name": "x", "type": "double"},
    {"name": "y", "type": "double"},
    {"name": "z", "type": "double"},
    {"name": "srid", "type": "int"}
  ]
}

geo point 2D

Code Block
{
  "type": "record",
  "name": "geo_2d",
  "fields": [
    {"name": "crs", "type": "string"},
    {"name": "latitude", "type": "double"},
    {"name": "x", "type": "double"},
    {"name": "y", "type": "double"},
    {"name": "srid", "type": "int"},
    {"name": "longitude", "type": "double"}
  ]
}

geo point 3D

Code Block
{
  "type": "record",
  "name": "geo_3d",
  "fields": [
    {"name": "crs", "type": "string"},
    {"name": "latitude", "type": "double"},
    {"name": "x", "type": "double"},
    {"name": "y", "type": "double"},
    {"name": "z", "type": "double"},
    {"name": "srid", "type": "int"},
    {"name": "longitude", "type": "double"},
    {"name": "height", "type": "double"}
  ]
}
Neo4j data type for the four point records above: Point
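The mapping above can be sketched as a lookup table plus a helper that picks the point record for a given Neo4j point. This is an illustration under stated assumptions: the `point_record` function and the shape of its return value are hypothetical, and choosing the record by SRID is one possible approach rather than the plugin's confirmed behaviour. The SRID values and CRS names are the ones Neo4j documents for its four coordinate reference systems, where x is the longitude, y the latitude and z the height of a geographic point.

```python
from typing import Any, Dict, Optional

# Simple type pairs copied from the mapping table (CDAP schema type -> Neo4j type).
CDAP_TO_NEO4J = {
    "null": "null",
    "array": "List",
    "boolean": "Boolean",
    "long": "Integer",
    "double": "Float",
    "string": "String",
    "bytes": "ByteArray",
    "date": "Date",
    "time-micros": "Time",
    "timestamp-micros": "DateTime",
}

# Neo4j SRID -> (CRS name, record name used in the schemas above).
_POINT_KINDS = {
    7203: ("cartesian", "point_2d"),
    9157: ("cartesian-3d", "point_3d"),
    4326: ("wgs-84", "geo_2d"),
    4979: ("wgs-84-3d", "geo_3d"),
}


def point_record(srid: int, x: float, y: float, z: Optional[float] = None) -> Dict[str, Any]:
    """Build the field values for the point record that matches the SRID.

    The returned dict layout ({"record": ..., "fields": ...}) is hypothetical.
    """
    if srid not in _POINT_KINDS:
        raise ValueError(f"Unsupported SRID: {srid}")
    crs, record_name = _POINT_KINDS[srid]
    is_3d = record_name.endswith("3d")
    if is_3d and z is None:
        raise ValueError("z is required for 3D points")
    if not is_3d and z is not None:
        raise ValueError("z is not allowed for 2D points")

    fields: Dict[str, Any] = {"crs": crs, "x": x, "y": y, "srid": srid}
    if is_3d:
        fields["z"] = z
    if record_name.startswith("geo"):
        # Geographic points expose the same coordinates under geographic names.
        fields["longitude"] = x
        fields["latitude"] = y
        if is_3d:
            fields["height"] = z
    return {"record": record_name, "fields": fields}
```

Keeping x, y and z alongside longitude, latitude and height mirrors the geo_2d and geo_3d schemas, which carry both sets of fields.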

Approach

Create a new Maven project in its own repository (or a neo4j-plugin module in the database-plugins project), reusing existing database-plugins code where possible. Add Neo4j-specific properties to the configuration and add support for Neo4j-specific data types. Update the UI widgets JSON definitions.

Pipeline Samples

Please attach one or more sample pipeline(s) and associated data. 

...