Should a PHP developer learn Java or Python to use Hadoop effectively?

Hadoop Streaming offers a very generic interface for writing Hadoop jobs: each mapper or reducer is simply a program that reads from stdin and writes to stdout. You can therefore write Hadoop mappers and reducers in any scripting language, including PHP (or even bash).
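As a rough illustration of that stdin/stdout contract, here is a minimal word-count sketch in Python. The file name `wordcount.py`, the function names, and the `map`/`reduce` command-line argument are assumptions made for this example and are not part of Hadoop itself. The sketch relies on the framework sorting mapper output by key before it reaches the reducer.

```python
#!/usr/bin/env python3
"""Word count for Hadoop Streaming (illustrative sketch).

Hypothetical usage: `wordcount.py map` or `wordcount.py reduce`.
"""
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def main(argv):
    """Run the requested stage over stdin; do nothing if no stage is given."""
    if len(argv) != 2 or argv[1] not in ("map", "reduce"):
        return
    step = map_lines if argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)


if __name__ == "__main__":
    main(sys.argv)
```

You can try the pipeline locally without a cluster with `cat input.txt | python3 wordcount.py map | sort | python3 wordcount.py reduce`. On a cluster, the two stages would typically be passed to the streaming jar through its `-mapper` and `-reducer` options; the jar's location depends on your installation. The same structure carries over to a PHP script that reads from `STDIN`.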

1.  Projects like dumbo make it slightly easier to launch jobs, chain mappers together, and so on. However, they are not strictly necessary, and what they offer is not too hard to replicate in the language of your choice. Python is a great language, and if you are looking to learn another one, it is a reasonable choice.

3.  However, some of the more complicated things in Apache Hadoop are done through the Java API. To work well with the platform and understand the code inside it, you will need to learn Java too.

5.   You can also choose Java, since that is what Hadoop is written in and where you will find the most support. Python, however, is closer to PHP than Java is.

2.  Python works perfectly well for MapReduce programming. There are wrappers for it, such as:
- mrjob, from Yelp

4.   PHP itself is a workable language for Hadoop. PHP does not have a built-in threading mechanism, but there are a few good libraries for it.

6.    Hadoop is written in Java, so anyone who eventually wants to work at the lowest level will have to learn Java. Python can get a lot of work done. PHP continues to be the language of the web, and the PHP community will probably become Hadoop-friendly sooner rather than later to avoid losing its core user base.
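Point 1 suggests that chaining mappers is not hard to replicate yourself. The sketch below shows one way that might look in plain Python. It is not dumbo's API, and `chain`, `tokenize`, `lowercase`, and `drop_short` are hypothetical names chosen for this example. Each stage is a generator over a stream of records, and the stages are composed left to right into a single mapper.

```python
from functools import reduce


def chain(*stages):
    """Compose record-stream stages, left to right, into a single stage."""
    def run(records):
        return reduce(lambda stream, stage: stage(stream), stages, records)
    return run


def tokenize(lines):
    """Split each input line into whitespace-separated words."""
    for line in lines:
        yield from line.split()


def lowercase(words):
    """Normalize every word to lower case."""
    for word in words:
        yield word.lower()


def drop_short(words, min_len=3):
    """Keep only words with at least min_len characters."""
    for word in words:
        if len(word) >= min_len:
            yield word


# One combined mapper built from three small ones.
pipeline = chain(tokenize, lowercase, drop_short)
```

Because every stage is lazy, `pipeline(sys.stdin)` processes one record at a time, which suits a streaming mapper that may see very large inputs.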

So, PHP developers can learn Java or Python, depending on their work requirements, to use Hadoop effectively. For any related queries, you may contact our customer helpdesk team at Laitkor.

