Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Multiple connections into a stage become a union of all DStreams coming into that stage.

Code Block

Iterator<JavaDStream<Object>> previousStreamsIter = previousStreams.iterator();
JavaDStream<Object> stageStream = previousStreamsIter.next();
while (previousStreamsIter.hasNext()) {
  stageStream = stageStream.union(previousStreamsIter.next());
}

...

A Transform stage will become a mapflatMap() call on its input DStream.

Code Block
   final Transform<Object, Object> transform = pluginContext.newPluginInstance(stageName);
  transform.initialize(transformContext);
  stageStream.flatMap(new FlatMapFunction<Object, Object>() {
    @Override
    public Iterable<Object> call(Object o) throws Exception {
      DefaultEmitter<Object> emitter = new DefaultEmitter<>();
      transform.transform(o, emitter);
      return emitter.getEntries();
    }
  });

...

A SparkCompute stage will become a transform() call on its input DStream

Code Block

  final SparkCompute sparkCompute = pluginContext.newPluginInstance(stageName);
  stageStream = stageStream.transform(new Function<JavaRDD<Object>, JavaRDD<Object>>() {
    @Override
    public JavaRDD<Object> call(JavaRDD<Object> v1) throws Exception {
      return sparkCompute.transform(computeContext, v1);
    }
  });

...