NOTE: This is a working draft, to replace another page. Do not edit or delete.
This document summarizes basic CDAP use-cases. This basic testing is to be performed in the CDAP UI before creating a PR.
Updated as of CDAP version 3.4.0.
Functional Tests
Use Case 1: How a Purchase is tracked and processed
This use case skims through the developer section of the CDAP UI to test how a purchase history app is supposed to be used.
These tests check that Apps, Flows, MapReduce and Spark Programs, Services, Workflows, Datasets, Streams, and Explorer work fine for the base use-cases.
Objective: In summary, what we are testing: we have a flow through which we can inject events, which are then written to a dataset; a Workflow/MapReduce reads from that dataset, processes it, and writes the results to another dataset, while a Service helps us view the data (we could do the same thing with Explore, too). Here, the purchases dataset stores all purchases made by the user, and the history dataset stores the history of purchases made by the user.
Testing a Flow:
- Deploy the PurchaseHistory app
- Go to the app's detailed view
- Go to PurchaseFlow
- Start the flow
- Inject events into the stream of the flow from the CDAP UI: it should show the count of events on the stream flowlet (a CLI alternative is sketched after these steps)
- See if the events flow through all the flowlets and reach the collector.
- Stop the flow
- Go to the Datasets tab: should show the datasets
- Go to History: should show the run we just started
- Go to the purchases dataset: the schema page should show storage as a few bytes, as we just added some events to the stream
- Go to the Explore tab and execute the default "select *" query: should show the results (the events we injected) in the bottom section table
GIF demonstrating these steps: TestingFlow
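If you prefer the command line to the UI, here is a minimal sketch of the same event injection using the CDAP CLI. It assumes a standalone CDAP instance and the Purchase example's stream name, purchaseStream; verify both against your build (the event bodies are hypothetical).

# Send a couple of purchase events to the stream (assumed stream name and event format)
./bin/cdap-cli.sh send stream purchaseStream "'Alice bought 3 apples for \$30'"
./bin/cdap-cli.sh send stream purchaseStream "'Bob bought 2 oranges for \$12'"

Refresh the flow's detail page afterwards; the stream flowlet's event count should increase accordingly.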
Testing a Workflow/MapReduce:
- Go to PurchaseHistoryWorkflow
- Start the workflow: this should pick up the events injected into the stream of the flow (a CLI alternative is sketched after these steps)
- The MapReduce should run fine: it initially has a green border and, once completed, is shaded green to indicate success
- Click on the MapReduce program PurchaseHistoryBuilder to go to the program and check its status: should show the status as completed, and switching between mappers and reducers should show proper metrics (Distributed CDAP only)
- Hit back; it should return to the workflow run view
- Go to the history dataset: the status page should show storage as a few bytes
- Exploring the dataset should show the history of purchases made by the user (Explore tab, execute a query on the dataset)
GIF demonstrating these steps: TestingWorkflowMR.gif
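A command-line alternative for the start step, sketched with the CDAP CLI under the assumption that the app id is PurchaseHistory and the workflow id is PurchaseHistoryWorkflow (adjust the ids if your build differs):

# Start the workflow, then poll its status until the run completes
./bin/cdap-cli.sh start workflow PurchaseHistory.PurchaseHistoryWorkflow
./bin/cdap-cli.sh get workflow status PurchaseHistory.PurchaseHistoryWorkflow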
Testing a Service:
Service Use Case 1:
- Go to the PurchaseHistoryService and start it
- Make a request to the "/history/{customer}" endpoint, using the same customer that we referred to in our stream injection (see the curl sketch below)
- Should show the list of purchases the user has made
GIF demonstrating these steps:
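The same request can also be made outside the UI. A minimal curl sketch, assuming a standalone router on localhost:10000, the default namespace, and the customer "Alice" from the injected events:

# GET the purchase history for customer "Alice" via the service endpoint (assumed host/port and app id)
curl -w"\n" "http://localhost:10000/v3/namespaces/default/apps/PurchaseHistory/services/PurchaseHistoryService/methods/history/Alice"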
Service Use Case 2:
- Go to the UserProfileService and start it
- Make a POST call to the "/user/{id}" endpoint with the following JSON (see the curl sketch after this list):
{
"id":"Alice",
"firstName":"Alice",
"lastName":"Bernard",
"categories":["fruits"]
}
- Go to the flow and inject events in the name of Alice
- Go to the PurchaseHistoryWorkflow, start it, and wait until it completes successfully
- Go to the PurchaseHistoryService again and make the same GET request as we did above ("/history/{customer}"), using the customer "Alice"
- We should be able to see the user profile in addition to the purchase history information in the response
GIF demonstrating these steps: TestingService.gif
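A curl sketch of this round trip, under the same assumptions as the sketch above (standalone router on localhost:10000, default namespace, app id PurchaseHistory):

# POST the user profile JSON to the UserProfileService (endpoint path taken from the steps above)
curl -X POST "http://localhost:10000/v3/namespaces/default/apps/PurchaseHistory/services/UserProfileService/methods/user/Alice" \
  -d '{"id":"Alice","firstName":"Alice","lastName":"Bernard","categories":["fruits"]}'
# After the workflow completes, re-query the history; the response should now include the profile
curl -w"\n" "http://localhost:10000/v3/namespaces/default/apps/PurchaseHistory/services/PurchaseHistoryService/methods/history/Alice"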
Testing a Spark Program:
- Deploy the SparkPageRank app
- Start SparkPageRankService
- Inject data by running ./bin/cdap-cli.sh load stream backlinkURLStream examples/SparkPageRank/resources/urlpairs.txt
- Start RanksService and TotalPagesPR
- Click PageRankWorkflow to get to the workflow detail page, set the runtime arguments using spark.SparkPageRankProgram.args as the key and 3 as the value, then click the Start button (a CLI alternative is sketched after these steps)
- Go to the PageRankSpark program
- You should see the metrics ("Storage", "Stages") being updated on the page
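A CLI alternative for starting the workflow with the same runtime argument, assuming the app id is SparkPageRank and the workflow id is PageRankWorkflow (names may differ in your build):

# Start PageRankWorkflow, passing the iteration count as a runtime argument
./bin/cdap-cli.sh start workflow SparkPageRank.PageRankWorkflow "spark.SparkPageRankProgram.args=3"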
TODO:
...
Use Case 2: How a Pipeline works
These base cases should work. If not, something is wrong, and the UI should say what the error is.
Objective: See if a pipeline can convert a stream in CSV format to a TPFSAvro dataset that we can then use internally anywhere.
Testing Pipeline Creation:
- Click "Add Application" in CDAP UI Home page and select "Hydrator Pipeline"
- Choose the pipeline type, "Batch" or "Realtime"
- "Batch" pipeline
- Give pipeline a name: "BatchTest"
- Set up a Source: a Stream source; click it in the left sidebar
- Give Stream a name: "BatchTestStream"
- Set Duration to 1m
- Set Delay to "0"
- Set Format to "text"
- Set Schema:
- Remove all existing fields
- Add a "body" of type string
- Set up a Transform: a Projection transform
- Fields to Drop:
- headers
- Set up a Sink: a TPFSAvro sink
- Give Dataset a name: "BatchTestDataset"
- Set Schema:
- ts (type long)
- body (type string)
- Schedule it for every 5 mins: in the "Pipeline Configuration", under "Cron Expression", enter "0/5" for "Min"
- Save, Validate, and then Publish the pipeline
- Once the pipeline is created, send one or more events to the stream using the CDAP UI
- Either start the pipeline manually or wait until the pipeline runs on its schedule
- Every 5 mins, the dataset associated with the pipeline should be injected with the data you sent to the stream
- Wait for 5 minutes
- Explore the dataset sink: you should see the events you sent to the stream (see the CLI sketch below)
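To check the sink from the command line instead of the Explore tab, a sketch using the CDAP CLI. It assumes the dataset name BatchTestDataset from above and the usual Explore convention that table names are the dataset name lowercased with a dataset_ prefix:

# Run an ad-hoc Explore query against the sink dataset's Hive table
./bin/cdap-cli.sh execute "'SELECT * FROM dataset_batchtestdataset LIMIT 10'"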
- "Realtime" pipeline
GIFs demonstrating the above steps: AdapterTest1.gif and TestingAdapter2.gif
TODO: For metrics, we need a basic test case.
Once the above-mentioned steps work, push the code to two different clusters, a "secure" and a "non-secure" cluster (beamer software install cluster_id cdap-ui: should take about 5 minutes to beam the code to a cluster).
Once the cluster is up and running, we should provide the cluster URL and a GIF of our test. This helps the reviewer confirm that the feature or bug fix works, and they can then start reviewing the code.
Behavioral Tests
This is more of an open-ended section, which depends on the user/developer to test their UI extensively. This needs more thought, and automated tests to run.