Using Hadoop for historical sales data analysis 

Use this dataset to analyze the following aspects of the transactions:

1) Average unit_price by country for a given item type in a certain year

2) Total units_sold by year for a given country and a given item type

3) Find the max and min units_sold in any single order, for each year and country, for a given item type. Use a custom partitioner class instead of the default hash-based partitioner.

4) What are the top 10 order IDs for a given year, ranked by total_profit?
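As a starting point for question 1, here is a minimal Python sketch of the map and reduce logic, written in the style used with Hadoop Streaming. The dataset is not shown here, so the column order in `COLUMNS`, the `M/D/YYYY` date format, and the function names `map_unit_price` and `reduce_average` are all assumptions made for illustration. The `sorted` call only stands in for Hadoop's shuffle-and-sort phase so the logic can be tried locally.

```python
import csv
from itertools import groupby
from operator import itemgetter

# Hypothetical column order; adjust it to match the real dataset header.
COLUMNS = ["country", "item_type", "order_date", "order_id",
           "units_sold", "unit_price", "total_profit"]


def map_unit_price(lines, item_type, year):
    """Map step: emit (country, unit_price) for rows matching the filters."""
    for fields in csv.reader(lines):
        if len(fields) != len(COLUMNS):
            continue  # skip malformed rows
        row = dict(zip(COLUMNS, fields))
        # Assumes order_date ends with a four-digit year, e.g. 5/28/2010.
        if row["item_type"] != item_type:
            continue
        if not row["order_date"].endswith("/" + str(year)):
            continue
        yield row["country"], float(row["unit_price"])


def reduce_average(pairs):
    """Reduce step: average the values seen for each country key."""
    # Sorting by key imitates the grouping Hadoop does between map and reduce.
    ordered = sorted(pairs, key=itemgetter(0))
    for country, group in groupby(ordered, key=itemgetter(0)):
        prices = [price for _, price in group]
        yield country, sum(prices) / len(prices)
```

Question 2 could follow the same shape by emitting `(year, units_sold)` in the map step and summing instead of averaging in the reduce step.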

Please help with the above analysis on a Hadoop system using MapReduce code written in either Java or Python. Data preparation steps can be done as required before running the MapReduce jobs that answer the questions above.
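One possible data preparation step is to normalize the raw file into tab-separated records with an explicit year field, so that each MapReduce job can filter and group without re-parsing dates. The sketch below assumes a CSV file with a header row containing the hypothetical column names `country`, `item_type`, `order_date`, `order_id`, `units_sold`, `unit_price`, and `total_profit`, and a date format of `M/D/YYYY`. None of these are confirmed by the dataset, and `prepare_rows` is an illustrative name.

```python
import csv


def prepare_rows(lines):
    """Yield tab-separated records: year, country, item_type, order_id,
    units_sold, unit_price, total_profit. Rows that fail to parse are dropped."""
    for row in csv.DictReader(lines):
        try:
            # Assumes the year is the last "/"-separated part of order_date.
            year = int(row["order_date"].rsplit("/", 1)[-1])
            units = int(row["units_sold"])
            price = float(row["unit_price"])
            profit = float(row["total_profit"])
        except (KeyError, ValueError, TypeError, AttributeError):
            continue  # missing column, short row, or non-numeric value
        yield "\t".join([
            str(year), row["country"], row["item_type"], row["order_id"],
            str(units), str(price), str(profit),
        ])
```

Writing the output of this step back to HDFS would let the later jobs split each line on tabs and read the year directly from the first field.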

Dataset:
