ASSIGNMENT 1 – BUSI650 Business Analytics

Weightage: 10% of the final grade

DATA WRANGLING
Submission deadline: Friday, Nov 05, 2021 @ 11:59PM (PST)

Submission Instructions: Please submit an Excel file that includes your work (functions, computations, formulas, etc.) to answer all the questions in this assignment. Remember to label each answer with its question number and part (1. a, 1. b, …). Save your file as FirstName-LastName-Assignment1.xlsx

Retail Customer Data

Catherine is a marketing manager at Organic Food Superstore. She would like to use the company’s marketing dollars to market a new line of Asian-inspired meals to college-educated millennials. She has acquired a representative sample and has spent days compiling it into a small dataset. The attached Excel file shows the state of the data she collected.

1) Question 1: Primary Analysis [50% marks]

a) Find the percentage of missing values for each feature (variable/column).

b) Based on your answer to a), which features have the most and the fewest missing values (ignore the first column)?

c) What portion of the data (200 subjects in total) is complete and can be used for analysis immediately, without further processing? (Use a method that would still work if the dataset contained 200,000 subjects.)

d) How many customers have an income greater than 50K but placed fewer than 10 orders in total?
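The submission itself must be an Excel file, but the logic behind parts a) to d) can be sanity-checked with a short Python sketch. The rows below are made-up toy values, not the attached data, and the column name "TotalOrders" is an assumption (only "ZipCode" and "Income" are named in this handout). Each function loops over all rows, so the approach does not depend on the dataset size.

```python
# Toy rows standing in for the attached sheet. All values are invented,
# and "TotalOrders" is a hypothetical column name used for illustration.
rows = [
    {"ZipCode": "98101", "Income": 62000, "TotalOrders": 4},
    {"ZipCode": None,    "Income": 48000, "TotalOrders": 12},
    {"ZipCode": "98052", "Income": None,  "TotalOrders": 7},
    {"ZipCode": "98004", "Income": 75000, "TotalOrders": 15},
    {"ZipCode": None,    "Income": 51000, "TotalOrders": None},
]


def is_missing(value):
    """Treat None and blank strings as missing cells."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def percent_missing(rows):
    """Percentage of missing cells per column (parts a and b)."""
    columns = rows[0].keys()
    return {
        col: 100.0 * sum(is_missing(r[col]) for r in rows) / len(rows)
        for col in columns
    }


def complete_fraction(rows):
    """Share of rows with no missing cell at all (part c)."""
    complete = sum(
        1 for r in rows if not any(is_missing(v) for v in r.values())
    )
    return complete / len(rows)


def count_high_income_low_orders(rows, income_min=50_000, orders_max=10):
    """Rows with Income > income_min and TotalOrders < orders_max (part d).

    Rows where either value is missing are skipped, not counted.
    """
    return sum(
        1
        for r in rows
        if not is_missing(r["Income"])
        and not is_missing(r["TotalOrders"])
        and r["Income"] > income_min
        and r["TotalOrders"] < orders_max
    )


missing = percent_missing(rows)
most = max(missing, key=missing.get)  # column with the largest missing share
print(missing)
print("most missing:", most)
print("complete fraction:", complete_fraction(rows))
print("income > 50K and orders < 10:", count_high_income_low_orders(rows))
```

In Excel, functions such as COUNTBLANK and COUNTIFS play comparable roles. Note the choice made in part d): a customer with a missing income or order count is excluded from the count, which is one reasonable convention but not the only one.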

2) Question 2: Improving the Data [50% marks]

a) In your opinion, what is the best strategy to deal with the missing values in the “ZipCode” column?

b) In your opinion, what is the best strategy to deal with the missing values in the “Income” column?

c) Apply the strategy you mentioned in b) to the “Income” column.

d) Compare the average value of the “Income” column before and after applying your strategy.
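As an illustration of what parts c) and d) involve, the sketch below fills missing incomes with the median of the observed values and compares the averages. Median imputation is only one possible strategy, chosen here for the example and not the expected answer, and the income figures are invented.

```python
from statistics import mean, median

# Made-up Income values; None marks a missing cell.
incomes = [62000, 48000, None, 75000, 51000, None]


def impute_with_median(values):
    """Replace missing entries with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Average before imputation uses only the observed (non-missing) values.
observed = [v for v in incomes if v is not None]
mean_before = mean(observed)

# Average after imputation uses every row, including the filled-in ones.
imputed = impute_with_median(incomes)
mean_after = mean(imputed)

print("mean before:", mean_before)
print("mean after:", mean_after)
```

Filling with the median pulls the average toward the median, so the two averages differ whenever the observed mean and median differ. Filling with the observed mean would leave the average unchanged by construction.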

Good luck!

