In a MapReduce job, you want each input file to be processed by a single map task. How do you configure the job so that a single map task processes each input file, regardless of how many blocks the file occupies?
A) Increase the parameter that controls minimum split size in the job configuration.
B) Write a custom MapRunner that iterates over all key-value pairs in the entire file.
C) Set the number of mappers equal to the number of input files you want to process.
D) Write a custom FileInputFormat and override the isSplitable() method so that it always returns false.
Correct Answer: D
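When isSplitable() returns false, FileInputFormat creates exactly one input split per file, so each file is consumed by a single map task no matter how many HDFS blocks it spans; tuning split size (A) or mapper counts (C) only hints at split sizing, and a custom MapRunner (B) still receives whatever splits the input format already produced. Below is a minimal sketch of option D using the new MapReduce API; the class name WholeFileTextInputFormat is illustrative, not part of Hadoop:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Input format whose files are never split: each input file becomes
// exactly one split, and therefore exactly one map task.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // Returning false makes getSplits() emit a single split per file,
        // regardless of how many HDFS blocks the file occupies.
        return false;
    }
}

Register it on the job with job.setInputFormatClass(WholeFileTextInputFormat.class); the mapper logic itself does not need to change.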