Using Hadoop for historical sales data analysis
Use this dataset to analyze various aspects of the transactions:
1) Average unit_price by country for a given item type in a certain year
2) Total units_sold by year for a given country and a given item type
3) Find the max and min units_sold in any single order for each year by country for a given item type. Use a custom partitioner class instead of the default hash-based partitioner.
4) What are the top 10 order IDs for a given year by total_profit?
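As a starting point for analysis 1, here is a minimal Python sketch of the map and reduce steps, run locally rather than on a cluster. The dictionary keys (country, item_type, year, unit_price) are assumptions about how the rows look after parsing, and run_local is a hypothetical helper that imitates the shuffle by sorting on the key.

```python
from itertools import groupby


def map_avg_price(records, item_type, year):
    """Map step: emit (country, unit_price) for rows matching the filter.

    Each record is assumed to be a dict with the hypothetical keys
    'country', 'item_type', 'year' and 'unit_price'.
    """
    for rec in records:
        if rec["item_type"] == item_type and rec["year"] == year:
            yield rec["country"], float(rec["unit_price"])


def reduce_avg_price(pairs):
    """Reduce step: average the prices per country.

    The pairs must arrive sorted by key, as they would after the shuffle.
    """
    for country, group in groupby(pairs, key=lambda kv: kv[0]):
        prices = [price for _, price in group]
        yield country, sum(prices) / len(prices)


def run_local(records, item_type, year):
    """Hypothetical local driver: sort stands in for the Hadoop shuffle."""
    shuffled = sorted(map_avg_price(records, item_type, year), key=lambda kv: kv[0])
    return dict(reduce_avg_price(shuffled))
```

Analyses 2 and 3 could follow the same shape by changing the emitted key and swapping the average for a sum or a min/max pair.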
Please help with the above analysis on a Hadoop system using MapReduce code written in either Java or Python. Data preparation steps can be performed as required before running a MapReduce job to answer the questions above.
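One possible data preparation step is to derive the year from the order date once, so that every later job can filter or group on it directly. The sketch below is only an illustration: the column order (country, item type, order date, order ID, units sold, unit price, total profit) and the month/day/year date format are assumptions, not taken from the dataset, and prepare_row is a hypothetical name.

```python
import csv
from datetime import datetime


def prepare_row(line, date_format="%m/%d/%Y"):
    """Turn one CSV line into a tab-separated line that starts with the year.

    The seven-column layout unpacked below is an assumed layout; adjust it
    to match the real file before use.
    """
    # csv.reader handles quoted fields such as "Korea, South" correctly.
    fields = next(csv.reader([line]))
    country, item_type, order_date, order_id, units_sold, unit_price, total_profit = fields
    year = datetime.strptime(order_date, date_format).year
    return "\t".join(
        [str(year), country, item_type, order_id, units_sold, unit_price, total_profit]
    )
```

Tab-separated output is convenient for Hadoop Streaming, which by default treats the text before the first tab as the key.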
Dataset: