...
New APIs (in MapReduceContext, used in beforeSubmit):
Code Block | ||
---|---|---|
| ||
// specify a Dataset and arguments, to be used as an output for the MapReduce job: context.addOutput(String datasetName); context.addOutput(String datasetName, Map<String, String> arguments); |
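
As a rough illustration of how the two overloads might relate, the sketch below models the registration step in plain Java. `OutputRegistry` is a hypothetical stand-in, not the actual MapReduceContext. Two behaviours are assumptions made only for this sketch: the one-argument form is treated as shorthand for empty arguments, and adding the same name twice is rejected.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the context; it only records which outputs were registered.
class OutputRegistry {
  // Output name -> arguments, kept in registration order.
  private final Map<String, Map<String, String>> outputs = new LinkedHashMap<>();

  // Assumption: the one-argument form is shorthand for "no arguments".
  void addOutput(String datasetName) {
    addOutput(datasetName, Collections.<String, String>emptyMap());
  }

  void addOutput(String datasetName, Map<String, String> arguments) {
    // Assumption: registering the same name twice is an error in this sketch.
    if (outputs.containsKey(datasetName)) {
      throw new IllegalArgumentException("Output already added: " + datasetName);
    }
    // Copy the arguments defensively so later changes by the caller have no effect.
    Map<String, String> copy = new LinkedHashMap<>(arguments);
    outputs.put(datasetName, Collections.unmodifiableMap(copy));
  }

  // Read-only view of everything registered so far.
  Map<String, Map<String, String>> getOutputs() {
    return Collections.unmodifiableMap(outputs);
  }
}
```

In this model, `addOutput("cleanCounts")` and `addOutput("cleanCounts", Collections.emptyMap())` are equivalent; whether the real API behaves the same way is not stated here.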
New APIs - note that these will be custom mapper, reducer, and context classes that override the Hadoop classes, providing the additional functionality of writing to multiple outputs:
...
New APIs (in BatchSinkContext, used in prepareRun of the BatchSink):
Code Block | ||
---|---|---|
| ||
// specify a Dataset and arguments, to be used as an output for the Adapter job: context.addOutput(String datasetName); context.addOutput(String datasetName, Map<String, String> arguments); |
Example Usage:
Code Block | ||
---|---|---|
| ||
public void beforeSubmit(MapReduceContext context) throws Exception { context.addOutput("cleanCounts"); context.addOutput("invalidCounts"); // ... } public static class Counter extends AbstractReducer<Text, IntWritable, byte[], Long> { private MultipleOutputs mos; @Override public void reduce(Text key, Iterable<IntWritable> values, Context context) { // do computation and output to the desired dataset if ( ... ) { context.write("cleanCounts", key.getBytes(), val); } else { context.write("invalidCounts", key.getBytes(), val); } } } |
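
To make the routing in the example concrete, here is a minimal sketch in plain Java with no Hadoop dependency. `NamedOutputContext` and `CountRouter` are hypothetical stand-ins for the custom context and reducer, whose real implementations are not shown on this page. The validity check (a non-empty key) is only a placeholder for the condition the example leaves out, the keys are plain strings rather than `byte[]` for brevity, and failing on a write to an unregistered output is an assumption.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the custom context: collects records per named output in memory.
class NamedOutputContext<K, V> {
  private final Map<String, List<Map.Entry<K, V>>> records = new LinkedHashMap<>();

  // Mirrors context.addOutput(datasetName) from beforeSubmit.
  void addOutput(String datasetName) {
    records.put(datasetName, new ArrayList<Map.Entry<K, V>>());
  }

  // Mirrors context.write(datasetName, key, value) from the reducer example.
  void write(String datasetName, K key, V value) {
    List<Map.Entry<K, V>> sink = records.get(datasetName);
    if (sink == null) {
      // Assumption: writing to an output that was never added fails fast.
      throw new IllegalArgumentException("Unknown output: " + datasetName);
    }
    sink.add(new AbstractMap.SimpleImmutableEntry<K, V>(key, value));
  }

  // Returns the records written to one output, or null if it was never added.
  List<Map.Entry<K, V>> recordsFor(String datasetName) {
    return records.get(datasetName);
  }
}

// Hypothetical reducer logic: sums the counts for a key and routes the total by output name.
class CountRouter {
  static void reduce(String key, Iterable<Integer> values,
                     NamedOutputContext<String, Long> context) {
    long sum = 0;
    for (int value : values) {
      sum += value;
    }
    // Placeholder rule (an assumption): a non-empty key counts as "clean".
    if (!key.isEmpty()) {
      context.write("cleanCounts", key, sum);
    } else {
      context.write("invalidCounts", key, sum);
    }
  }
}
```

The point of the sketch is that the reducer chooses an output by name on every write, so a single reduce call can feed either dataset without holding a separate writer for each.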
...