  

Needed: a discussion post, with one response, for the following prompt.

Kirk (2016) tells us that all requirements and restrictions of a project must be identified. Select one key factor below and discuss why Kirk (2016) states it will impact your critical thinking and shape your ambitions:

People: stakeholders, audience.

Constraints: pressures, rules.

Consumption: frequency, setting.

Deliverables: quantity, format.

Resources: skills, technology.

Please make sure you write an initial post (about 200 words) and a comment on one of your classmates' posts.

Homework 3

Answer the following questions (10 points each):

1. The following table summarizes a data set with three attributes, A, B, and C, and two class labels, + and -. Build a two-level decision tree.

A   B   C   Number of Instances
            +       -
T   T   T   5       0
F   T   T   0       10
T   F   T   10      0
F   F   T   0       5
T   T   F   0       10
F   T   F   25      0
T   F   F   10      0
F   F   F   0       25

a. According to the classification error rate, which attribute would be chosen as the first splitting attribute? For each attribute, show the contingency table and the gains in classification error rate.

b. Repeat for the two children of the root node.

c. How many instances are misclassified by the resulting decision tree?

d. Repeat parts (a), (b), and (c) using C as the splitting attribute.

e. Use the results in parts (c) and (d) to conclude about the greedy nature of the decision tree induction algorithm.
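As a sanity check on part (a), the gain in classification error for each candidate split can be computed directly from the counts. A minimal Python sketch, assuming the instance counts as reconstructed in the table above (this is an aid for checking your arithmetic, not the required contingency-table working):

```python
from collections import Counter

# (A, B, C, class label, number of instances), taken from the table above.
data = [
    ('T', 'T', 'T', '+', 5),  ('F', 'T', 'T', '-', 10),
    ('T', 'F', 'T', '+', 10), ('F', 'F', 'T', '-', 5),
    ('T', 'T', 'F', '-', 10), ('F', 'T', 'F', '+', 25),
    ('T', 'F', 'F', '+', 10), ('F', 'F', 'F', '-', 25),
]

def error_rate(counts):
    """Classification error of a node: 1 - (fraction of majority class)."""
    total = sum(counts.values())
    return 0.0 if total == 0 else 1 - max(counts.values()) / total

def split_gain(attr_index):
    """Parent error minus the weighted error of the children after splitting."""
    n = sum(c for *_, c in data)
    parent = Counter()
    children = {}
    for row in data:
        label, count = row[3], row[4]
        val = row[attr_index]          # 0 -> A, 1 -> B, 2 -> C
        parent[label] += count
        children.setdefault(val, Counter())[label] += count
    weighted = sum(sum(ch.values()) / n * error_rate(ch)
                   for ch in children.values())
    return error_rate(parent) - weighted

for i, name in enumerate('ABC'):
    print(f"gain({name}) = {split_gain(i):.4f}")
```

The attribute with the largest gain is the one the greedy algorithm would pick at the root; note in part (e) how that local choice compares with forcing C first.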

2. Classify the following attributes as binary, discrete, or continuous. Also classify them as qualitative (nominal or ordinal) or quantitative (interval or ratio). Some cases may have more than one interpretation, so briefly indicate your reasoning if you think there may be some ambiguity. Example: Age in years. Answer: Discrete, quantitative, ratio

a. Gender in terms of M or F.

b. Temperature as measured by people’s judgments.

c. Height as measured by people’s height.

d. Body Mass Index (BMI) as an index of weight-for-height that is commonly used to classify underweight, overweight and obesity in adults.

e. States of matter are solid, liquid, and gas.

3. For the following vectors, x and y, calculate the indicated similarity or distance measures.

a. x = (1, 0, 0, 1), y = (2, 1, 1, 2): cosine, correlation, Euclidean.

b. x = (1, 1, 0, 0), y = (1, 1, 1, 0): cosine, correlation, Euclidean, Jaccard.
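Each of these measures has a short closed-form definition, so hand calculations can be checked with a few lines of Python. A minimal sketch using only the standard library (the vectors are the ones given above):

```python
import math

def cosine(x, y):
    """Cosine similarity: dot(x, y) / (|x| * |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))

def correlation(x, y):
    """Pearson correlation: covariance / (std_x * std_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    return cov / (sx * sy)

def euclidean(x, y):
    """Euclidean (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def jaccard(x, y):
    """Jaccard coefficient for binary vectors: f11 / (f01 + f10 + f11)."""
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    mismatches = sum(1 for a, b in zip(x, y) if a != b)
    return f11 / (f11 + mismatches)

xa, ya = (1, 0, 0, 1), (2, 1, 1, 2)
print(cosine(xa, ya), correlation(xa, ya), euclidean(xa, ya))

xb, yb = (1, 1, 0, 0), (1, 1, 1, 0)
print(cosine(xb, yb), correlation(xb, yb), euclidean(xb, yb), jaccard(xb, yb))
```

Jaccard is only defined here because the vectors in part (b) are binary; it ignores 0–0 matches, unlike cosine.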

4. Construct a data cube from the Fact Table below. Is this a dense or a sparse data cube? If it is sparse, identify the cells that are empty. The resulting data cube is shown in the Data Cube Table.

Fact Table

Product ID   Location ID   Number Sold
1            1             10
1            3             6
2            1             5
2            2             22
3            2             2

Data Cube Table

             Location ID
Product ID   1     2     3     Total
1            10    0     6     16
2            5     22    0     27
3            0     2     0     2
Total        15    24    6     45
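The cross-tabulation step can be mechanized: materialize every (product, location) cell, fill in the facts, and see which cells stay at zero. A minimal Python sketch using the fact rows as reconstructed above:

```python
# Fact table rows: (product_id, location_id, number_sold), from the table above.
facts = [(1, 1, 10), (1, 3, 6), (2, 1, 5), (2, 2, 22), (3, 2, 2)]

products = sorted({p for p, _, _ in facts})
locations = sorted({l for _, l, _ in facts})

# Materialize every (product, location) cell, defaulting to 0,
# then add the fact rows into their cells.
cube = {(p, l): 0 for p in products for l in locations}
for p, l, n in facts:
    cube[(p, l)] += n

# A sparse cube is one where some cells received no facts.
empty = sorted(cell for cell, n in cube.items() if n == 0)
print("empty cells (product, location):", empty)

# Print the cube with row and column totals, matching the Data Cube Table.
print("P\\L  " + "".join(f"{l:>5}" for l in locations) + "  Total")
for p in products:
    row = [cube[(p, l)] for l in locations]
    print(f"{p:>3}  " + "".join(f"{n:>5}" for n in row) + f"  {sum(row):>5}")
col_tot = [sum(cube[(p, l)] for p in products) for l in locations]
print("Tot  " + "".join(f"{n:>5}" for n in col_tot) + f"  {sum(col_tot):>5}")
```

Comparing the printed grid against the Data Cube Table is a quick way to verify both the totals and which cells are empty.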

5. Consider the decision tree shown below:

a. Compute the generalization error rate of the tree using the optimistic approach.

b. Compute the generalization error rate of the tree using the pessimistic approach. (For simplicity, use the strategy of adding a factor of 0.5 to each leaf node.)

c. Compute the generalization error rate of the tree using the validation set shown below. This approach is known as reduced-error pruning.

[Figure: the decision tree for Question 5 — nodes A, C, and B with 0/1 branch labels and +/- leaves — is not recoverable from the text.]

Training:

Instance   A   B   C   Class
1          0   0   0   +
2          0   0   1   +
3          0   1   0   +
4          0   1   1   -
5          1   0   0   +
6          1   0   0   +
7          1   1   0   -
8          1   0   1   +
9          1   1   0   +
10         1   1   0   +
Validation:

Instance   A   B   C   Class
11         0   0   0   +
12         0   1   1   +
13         1   1   0   +
14         1   0   1   -
15         1   0   0   +
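The two estimators in parts (a) and (b) are simple functions of three counts: training errors, number of leaves, and training-set size. A minimal sketch of those formulas — the counts in the example call are hypothetical, since the tree figure itself is not recoverable here, and must be read off the tree in the original question:

```python
def optimistic_error(training_errors, n_train):
    """Optimistic estimate: generalization error taken to equal the
    resubstitution (training) error rate."""
    return training_errors / n_train

def pessimistic_error(training_errors, n_leaves, n_train, penalty=0.5):
    """Pessimistic estimate: training errors plus a complexity penalty
    of `penalty` per leaf node, divided by the training-set size."""
    return (training_errors + penalty * n_leaves) / n_train

# Hypothetical example (NOT the counts from the tree in the question):
# a tree with 4 leaves misclassifying 2 of 10 training instances.
print(optimistic_error(2, 10))      # -> 0.2
print(pessimistic_error(2, 4, 10))  # -> (2 + 0.5*4) / 10 = 0.4
```

For part (c), the reduced-error-pruning estimate is just the fraction of the 5 validation instances the tree misclassifies, computed the same way as the optimistic estimate but on the validation set.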
