...

New APIs (in MapReduceContext, used in beforeSubmit):

Code Block
languagejava
// specify a Dataset and arguments, to be used as an output for the MapReduce job:
context.addOutput(String datasetName);
context.addOutput(String datasetName, Map<String, String> arguments);
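
For instance, the two-argument form allows runtime arguments to be passed along when the output Dataset is registered. The sketch below is illustrative only; the dataset name and argument key are made up:

Code Block
languagejava
// hypothetical beforeSubmit showing the two-argument form of addOutput
@Override
public void beforeSubmit(MapReduceContext context) throws Exception {
  Map<String, String> args = new HashMap<>();   // java.util.Map / java.util.HashMap
  args.put("output.partition", "2015-01-31");   // illustrative runtime argument
  context.addOutput("cleanCounts", args);
}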

 

New APIs - note that these will be custom mapper, reducer, and context classes that override the Hadoop classes, providing the additional functionality of writing to multiple outputs:
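
One possible shape for such a context class, assuming it delegates named writes to Hadoop's MultipleOutputs (the outline below is a sketch, not the actual implementation):

Code Block
languagejava
import java.io.IOException;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// illustrative sketch only; the real classes wrap the Hadoop Mapper/Reducer context
public class MapReduceTaskContext<KEY, VALUE> {
  private final MultipleOutputs<KEY, VALUE> multipleOutputs;

  public MapReduceTaskContext(MultipleOutputs<KEY, VALUE> multipleOutputs) {
    this.multipleOutputs = multipleOutputs;
  }

  // write the key/value pair to the named output Dataset
  public void write(String datasetName, KEY key, VALUE value) throws IOException, InterruptedException {
    multipleOutputs.write(datasetName, key, value);
  }
}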

...

New APIs (in BatchSinkContext, used in prepareRun of the BatchSink):

Code Block
languagejava
// specify a Dataset and arguments, to be used as an output for the Adapter job:
context.addOutput(String datasetName);
context.addOutput(String datasetName, Map<String, String> arguments);
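
A BatchSink could then register its output Dataset in prepareRun. The sketch below is illustrative; the dataset name and argument shown are made up:

Code Block
languagejava
// hypothetical prepareRun of a BatchSink registering its output Dataset
@Override
public void prepareRun(BatchSinkContext context) {
  Map<String, String> args = new HashMap<>();   // java.util.Map / java.util.HashMap
  args.put("output.partition", "2015-01-31");   // illustrative runtime argument
  context.addOutput("sinkCounts", args);
}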

Example Usage:

Code Block
languagejava
public void beforeSubmit(MapReduceContext context) throws Exception {
  context.addOutput("cleanCounts");
  context.addOutput("invalidCounts");
  // ...
}

public static class Counter extends AbstractReducer<Text, IntWritable, byte[], Long> {

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, MapReduceTaskContext context) {
    // do computation and output to the desired dataset
    if ( ... ) {
      context.write("cleanCounts", key.getBytes(), val);
    } else {
      context.write("invalidCounts", key.getBytes(), val);
    }
  }

...