Introduction
Marketo provides marketing automation software that allows companies to streamline, automate, and measure marketing tasks and workflows, so they can increase operational efficiency.
Use case(s)
- Read and replicate all of Marketo's entity data tables, specifically leads, activities, programs, and campaigns. This would allow users to extract and load all of their Marketo data into a specified multi-table or multi-file sink.
- Retrieve data from one of Marketo's entity data tables, allowing users to transform and enrich the data.
User Story(s)
- As a data pipeline developer, I should be able to import all of Marketo's entity data tables so that I can analyze Marketo data
- As a data pipeline developer, I should be able to specify a Marketo entity (e.g. Leads) whose data I can retrieve and transform
- As a data pipeline developer, I should be able to specify the Client ID and Client Secret so that I can extract the entity datasets from Marketo
- As a data pipeline developer, I should be able to see any errors from Marketo API calls so that I can fix those issues
- As a data pipeline developer, I should be
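The error-visibility story above could be met by inspecting each API response body before records are emitted. The following is a minimal sketch, assuming the Marketo REST response envelope carries a `success` flag and an `errors` list of `code`/`message` pairs. The `MarketoApiError` class and `check_response` helper are hypothetical names used for illustration, not part of any existing plugin.

```python
import json


class MarketoApiError(Exception):
    """Hypothetical exception carrying the errors reported in a response body."""

    def __init__(self, errors):
        self.errors = errors
        details = "; ".join(f"{e.get('code')}: {e.get('message')}" for e in errors)
        super().__init__(f"Marketo API call failed: {details}")


def check_response(body: str) -> dict:
    """Parse a JSON response body and raise if it reports a failure.

    Assumes an envelope such as:
    {"success": false, "errors": [{"code": "...", "message": "..."}]}
    """
    payload = json.loads(body)
    if not payload.get("success", False):
        raise MarketoApiError(payload.get("errors", []))
    return payload
```

Raising a dedicated exception keeps the error codes and messages together, so the pipeline log can show the developer exactly what the API reported.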
Plugin Type
- Batch Source
- Batch Sink
- Real-time Source
- Real-time Sink
- Action
- Post-Run Action
- Aggregate
- Join
- Spark Model
- Spark Compute
Configuration
The Marketo source will be provided as two separate plugins:
- Marketo Reporting Plugin - This plugin retrieves all of Marketo's entities (e.g. Leads, Activities, Campaigns, Companies, etc.)
- Marketo Entity Plugin - This plugin retrieves only the data associated with a specified entity (e.g. Leads)
Marketo Reporting plugin
This plugin should be used when the user would like to retrieve all of the Marketo datasets.
User Facing Name | Type | Description | Constraints |
---|---|---|---|
App ID | string | Application ID | |
Client ID | string | Marketo Client ID | |
Client Secret | string | Marketo Client secret | |
Report ID | string | The report ID to fetch the data for | Only shown if "Use Existing Report" is set to true. |
Report type | Select | One of Instant, Standard, Floodlight, Path to Conversion (P2C), Reach, Cross-Dimension Reach or GRP. Defaults to Standard. | Only shown if "Use Existing Report" is set to false. |
Date Range | Select | One of "Last 14 days", "Last 24 months", "Last 30 days", "Last 365 days", "Last 60 days", "Last 7 days", "Last 90 days", "Month to date", "Previous Month", "Previous quarter", "Previous week", "Previous year", "Quarter to date", "Today", "Week to date", "Year to date", "Yesterday". Defaults to "Last 30 days". | Only shown if "Use Existing Report" is set to false. |
Dimensions | Select | A list of dimensions based on the report type. Defaults to all. Full list here | Only shown if "Use Existing Report" is set to false. |
Metrics | Select | A list of metrics based on the report type. Defaults to all. Full list here | Only shown if "Use Existing Report" is set to false. |
Advanced properties | Text | A set of advanced properties to include in the report criteria, based on the selected report type. Full list is here | Only shown if "Use Existing Report" is set to false. |
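The conditional constraints in the table above ("Only shown if Use Existing Report is set to true/false") imply a validation step at configure time. The following is a minimal sketch of that check. The `ReportingConfig` class, its field names, and the assumption that Client ID and Client Secret are required are all illustrative and are not defined by this document.

```python
from dataclasses import dataclass
from typing import List, Optional

# Report types as listed in the table above.
REPORT_TYPES = {
    "Instant", "Standard", "Floodlight", "Path to Conversion (P2C)",
    "Reach", "Cross-Dimension Reach", "GRP",
}


@dataclass
class ReportingConfig:
    """Hypothetical holder for the reporting plugin properties."""
    client_id: str
    client_secret: str
    use_existing_report: bool = False
    report_id: Optional[str] = None
    report_type: str = "Standard"       # default taken from the table
    date_range: str = "Last 30 days"    # default taken from the table

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; an empty list means valid."""
        problems = []
        # Assumption: both credentials are mandatory.
        if not self.client_id:
            problems.append("Client ID is required")
        if not self.client_secret:
            problems.append("Client Secret is required")
        if self.use_existing_report:
            # Report ID only applies when an existing report is reused.
            if not self.report_id:
                problems.append("Report ID is required when Use Existing Report is true")
        elif self.report_type not in REPORT_TYPES:
            problems.append(f"Unknown report type: {self.report_type}")
        return problems
```

Collecting all problems in a list, rather than failing on the first one, lets the user fix every misconfigured property in a single pass.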
Marketo Entity plugin
This plugin should be used when the user would like to extract the data associated with a single entity (e.g. Leads).
Pro: No upfront work involved in generating a new report
Con: Need to work with Google to get DCM files uploaded to GCS
Note: This plugin should be a wrapper over the existing GCS source, that hides some details from the user, but also auto-populates schemas.
User Facing Name | Type | Description | Constraints |
---|---|---|---|
GCS Bucket Name | string | Name of GCS bucket where DCM data is stored | |
File Name pattern | string | Optional prefix for filenames | Optional |
CM ID | string | DoubleClick Campaign Manager ID | |
Design / Implementation Tips
- The flow for the reporting plugin appears to be:
  - Once the pipeline starts, send a request to create a report - https://developers.google.com/doubleclick-advertisers/v3.2/reports/insert
  - Once the report is created, make an asynchronous request to run it - https://developers.google.com/doubleclick-advertisers/v3.2/reports/run
  - Keep polling until the report has been generated, then download it as described in https://developers.google.com/doubleclick-advertisers/guides/download_reports
  - Once downloaded, read the CSV report and convert it into StructuredRecords for parallel execution.
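The poll-then-parse part of the flow above can be sketched as follows. This is a rough illustration under stated assumptions, not the plugin implementation: `wait_for_report` and `csv_to_records` are hypothetical helpers, the status strings are placeholders for whatever the reporting API actually returns, and plain dictionaries stand in for StructuredRecords. The API calls themselves are passed in as callables, so no particular client library is assumed.

```python
import csv
import io
import time
from typing import Callable, Dict, Iterator


def wait_for_report(get_status: Callable[[], str],
                    poll_seconds: float = 5.0,
                    max_polls: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll until the report is available, or fail on error or timeout.

    The status values below are assumed placeholders.
    """
    for _ in range(max_polls):
        status = get_status()
        if status == "REPORT_AVAILABLE":
            return
        if status in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Report run ended with status {status}")
        sleep(poll_seconds)
    raise TimeoutError("Report was not ready after polling")


def csv_to_records(csv_text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row, keyed by the header row."""
    yield from csv.DictReader(io.StringIO(csv_text))
```

Bounding the number of polls keeps a stuck report from blocking the pipeline indefinitely, and injecting `sleep` makes the loop easy to exercise in tests without real waiting.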
Design
Approach(s)
Properties
Security
Limitation(s)
Future Work
Test Case(s)
- Test case #1
- Test case #2
Sample Pipeline
Please attach one or more sample pipeline(s) and associated data.
Pipeline #1
Pipeline #2
References
- Marketo Developers Guide: https://developers.marketo.com/getting-started/