Point out the wrong statement.

[amp_mcq option1=”the mapper outputs are sorted and then partitioned per reducer” option2=”the total number of partitions is the same as the number of reduce tasks for the job” option3=”the intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format” option4=”none of the mentioned” correct=”option4″]

The correct answer is: D. none of the mentioned.

Options A, B, and C all restate the Mapper section of the Hadoop MapReduce tutorial, so none of them is wrong. In particular, the intermediate, sorted outputs are stored as length-prefixed records in a (key-len, key, value-len, value) layout. Hadoop writes them in its internal IFile format, not as SequenceFiles. Applications can control whether and how these outputs are compressed through the job configuration, but that does not change the record layout.

The following is a brief explanation of each option:

A. the mapper outputs are sorted and then partitioned per reducer: This is correct. Each map task's output is split into one partition per reducer, and the records in each partition are sorted by key.
B. the total number of partitions is the same as the number of reduce tasks for the job: This is correct. Each reduce task processes exactly one partition, and a custom Partitioner can control which keys go to which reducer.
C. the intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format: This is correct. Compression of the intermediate output is configurable, but the record layout is fixed.
D. none of the mentioned: This is the right choice, because none of options A, B, and C is a wrong statement.
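
The map-side behaviour described in the options can be sketched in a few lines of Python. This is an illustration only, not Hadoop code. The function names are hypothetical, and the 4-byte big-endian length fields are a simplifying assumption (the real on-disk encoding of the lengths is not reproduced here). The partition arithmetic mirrors Hadoop's default HashPartitioner, (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, under the assumption that the key hashes like a java.lang.String. Hadoop's Text type computes its hash differently.

```python
import struct


def java_string_hash(s: str) -> int:
    """Mimic java.lang.String.hashCode() as a signed 32-bit integer.

    Assumes every character is in the Basic Multilingual Plane.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def partition(key: str, num_reduce_tasks: int) -> int:
    """Same arithmetic as the default HashPartitioner (hypothetical helper)."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


def encode_record(key: bytes, value: bytes) -> bytes:
    """Lay out one record as (key-len, key, value-len, value).

    Assumption: fixed 4-byte big-endian lengths, chosen for simplicity.
    """
    return struct.pack(">I", len(key)) + key + struct.pack(">I", len(value)) + value


def decode_records(buf: bytes) -> list:
    """Read back every (key, value) pair written by encode_record."""
    records, pos = [], 0
    while pos < len(buf):
        (key_len,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        key = buf[pos:pos + key_len]
        pos += key_len
        (value_len,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        value = buf[pos:pos + value_len]
        pos += value_len
        records.append((key, value))
    return records


def map_side_output(pairs, num_reduce_tasks: int) -> list:
    """Return one encoded byte string per partition, sorted by key inside each."""
    buckets = [[] for _ in range(num_reduce_tasks)]
    for key, value in pairs:
        buckets[partition(key, num_reduce_tasks)].append((key, value))
    return [
        b"".join(
            encode_record(k.encode("utf-8"), v.encode("utf-8"))
            for k, v in sorted(bucket, key=lambda kv: kv[0])
        )
        for bucket in buckets
    ]


if __name__ == "__main__":
    pairs = [("pear", "1"), ("apple", "1"), ("fig", "1"), ("apple", "2")]
    for index, blob in enumerate(map_side_output(pairs, 3)):
        print(index, decode_records(blob))
```

The list returned by map_side_output always has exactly num_reduce_tasks entries, which is statement B in miniature: one partition per reduce task, even when a partition is empty.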
