Large advances have been made in hardware and at every level of the software stack since the virtualized Hadoop tests published in April 2013. This paper shows how to take advantage of these advances to achieve maximum performance. The cluster size remains at 32 two-processor 2U hosts; however, the processor, memory, network, and storage capabilities have all roughly doubled relative to those reported in the earlier paper. The performance of native and several VMware vSphere® 6 virtualized configurations was compared using the same TeraSort application suite as before.
The more powerful hosts give a larger advantage to configurations with multiple VMs per host: virtualized TeraSort is now up to 12% faster than the optimized native configuration. The apples-to-apples case of a single virtual machine per host again shows performance close to that of native Linux. The origins of the improvements are examined, and recommendations for optimal hardware and software configurations are given.