(Chapter 5): What is the relationship between Naïve Bayes and Bayesian networks? What is the process of developing a Bayesian network model?
(Chapter 6): List and briefly describe the nine-step process in conducting a neural network project.
Analytics, Data Science and A I: Systems for Decision Support Eleventh Edition
Chapter 6
Deep Learning and Cognitive
Computing
Copyright © 2020, 2015, 2011 Pearson Education, Inc. All Rights Reserved
Slides in this Presentation Contain Hyperlinks.
JAWS users should be able to get a list of links
by using INSERT+F7
Introduction to Deep Learning
• The placement of Deep Learning within the overarching
A I-based learning methods
Introduction to Deep Learning
• Differences between Classic Machine-Learning Methods
and Representation Learning/Deep Learning
Process of Developing Neural-Network Based
Systems
• A process with constant feedback for changes and improvements
Backpropagation for A N N Training
1. Initialize the weights with random values
2. Read in the input vector and the desired output
3. Compute the actual output via forward calculation through the layers
4. Compute the error (desired output minus actual output)
5. Change the weights by working backward from the output layer through the hidden layers
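The five steps above can be sketched in plain Python. This is not from the book; it is a minimal illustration using a tiny 2-2-1 sigmoid network trained on the AND function, with made-up learning rate and layer sizes.

```python
import math
import random

random.seed(1)
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

# Step 1: initialize weights with small random values (2 inputs -> 2 hidden -> 1 output)
w_h = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]  # last entry is the bias
w_o = [random.uniform(-0.5, 0.5) for _ in range(3)]                      # last entry is the bias

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

def train_step(x, target, lr=0.5):
    h, y = forward(x)              # Steps 2-3: read input/desired output, compute actual output
    err = target - y               # Step 4: compute the error
    delta_o = err * y * (1 - y)    # Step 5: work backward, output layer first
    for j in range(2):             # hidden-layer deltas reuse the (old) output weights
        delta_h = delta_o * w_o[j] * h[j] * (1 - h[j])
        w_h[j][0] += lr * delta_h * x[0]
        w_h[j][1] += lr * delta_h * x[1]
        w_h[j][2] += lr * delta_h
    w_o[0] += lr * delta_o * h[0]
    w_o[1] += lr * delta_o * h[1]
    w_o[2] += lr * delta_o
    return err ** 2

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # AND gate
before = sum(train_step(x, t, lr=0) for x, t in data)  # lr=0: measure error only
for _ in range(2000):                                  # constant feedback for improvement
    for x, t in data:
        train_step(x, t)
after = sum((t - forward(x)[1]) ** 2 for x, t in data)
```

Repeating steps 2–5 over the training data shrinks the total squared error, which is exactly the "constant feedback" idea in the development process.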
Deep Neural Networks
• Deep: more hidden layers and more neurons per layer
• Uses Graphics Processing Units (G P U)
– With programming platforms like C U D A by N V I D I A
• Process larger datasets
• There are different types and capabilities of Deep Neural
Networks for different tasks/purposes
Convolutional “Deep” Neural
Networks
• Most popular M L P-based deep learning method
• Used for image/video processing, text recognition
• Has at least one convolution weight function
– Convolutional layer
• Convolutional layers are typically followed by Pooling (sub-sampling)
– Consolidating large vectors into a smaller size
– Reducing the number of model parameters
– Keeping only the important features
– There can be different types of pooling layers
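The pooling bullets above can be made concrete with a short sketch (not from the book): 2×2 max pooling over a small feature map, which shrinks the representation while keeping the largest (most salient) value in each block. The feature-map values are made up for illustration.

```python
def max_pool(matrix, size=2):
    """2-D max pooling: keep the largest value in each size x size block."""
    rows, cols = len(matrix), len(matrix[0])
    return [[max(matrix[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, cols, size)]
            for r in range(0, rows, size)]

feature_map = [
    [1, 3, 2, 0],
    [4, 8, 1, 1],
    [0, 2, 9, 5],
    [1, 0, 3, 7],
]
pooled = max_pool(feature_map)  # 4x4 -> 2x2: fewer parameters, key features kept
```

Average pooling is the other common variant; it replaces `max` with the block mean.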
Recurrent Neural Networks (R N N) &
Long Short-Term Memory (L S T M)
• L S T M is a variant of R N N
– In a dynamic network, the weights are called the long-
term memory while the feedbacks role is the short-
term memory
Typical Long Short-Term Memory (L S T M) Network Architecture
Conceptual Framework for Cognitive
Computing and Its Promises
Cognitive Search
• Can handle a variety of data types
• Can contextualize the search space
• Can employ advanced A I technologies
• Can enable developers to build enterprise-specific search applications
Analytics, Data Science and A I: Systems for Decision Support Eleventh Edition
Chapter 5
Machine-Learning Techniques for
Predictive Analytics
Processing Information in Artificial
Neural Networks
• A single neuron (processing element – P E) with inputs
and outputs
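A single processing element can be written in a few lines. This sketch is not from the book; the input values, weights, and bias are made up, and the sigmoid is just one common choice of transfer function.

```python
import math

def processing_element(inputs, weights, bias):
    """A single P E: weighted sum of inputs, then a transfer (activation) function."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias  # summation function
    return 1.0 / (1.0 + math.exp(-net))                       # sigmoid transfer function

y = processing_element([1.0, 0.5], [0.2, -0.4], bias=0.1)
```

A full network is just many of these elements wired together, with one element's output feeding the next layer's inputs.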
Elements of A N N
• Processing element (P E)
• Network information processing
– Inputs
– Outputs
– Hidden layers
– Connection weights
Neural Network Architectures
• Architecture of a neural network is driven by the task it is
intended to address
– Classification, regression, clustering, general
optimization, association
• Feedforward, multi-layered perceptron with backpropagation learning algorithm
– Most popular architecture
– This A N N architecture will be covered in Chapter 6
• Other A N N architectures – recurrent networks, self-organizing feature maps, Hopfield networks, …
Support Vector Machines (S V M)
• S V M are among the most popular machine-learning
techniques.
• S V M belong to the family of generalized linear models (capable of representing non-linear relationships in a linear fashion)
• S V M achieve a classification or regression decision based
on the value of the linear combination of input features.
• Because of their architectural similarities, S V M are also
closely associated with A N N.
Support Vector Machines (S V M)
• Many linear classifiers (hyperplanes) may separate the
data
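The point of this slide — many hyperplanes separate the data, and S V M picks the one with the widest margin — can be illustrated with a small sketch (not from the book). The 2-D points and the three candidate lines are made up; real S V M training finds the maximum-margin line by optimization rather than by comparing a fixed list.

```python
# Two linearly separable classes in 2-D (illustrative data)
pos = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0)]
neg = [(0.0, 0.5), (1.0, 0.0), (0.5, 1.0)]

def margin(w, b):
    """Smallest distance from any point to the line w.x + b = 0,
    or 0 if the line fails to separate the two classes."""
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    scores_pos = [w[0] * x + w[1] * y + b for x, y in pos]
    scores_neg = [w[0] * x + w[1] * y + b for x, y in neg]
    if min(scores_pos) <= 0 or max(scores_neg) >= 0:
        return 0.0  # not a separating hyperplane
    return min(abs(s) for s in scores_pos + scores_neg) / norm

# Three candidate separating lines; S V M would pick the widest-margin one
candidates = {"A": ((1.0, 1.0), -3.0), "B": ((0.0, 1.0), -2.0), "C": ((1.0, 2.0), -6.0)}
margins = {name: margin(w, b) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
```

All three candidates separate the classes, but their margins differ; the maximum-margin choice is what gives S V M its good generalization.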
Support Vector Machines (S V M)
Rohrer, B. (2017). How SVMs work. https://www.youtube.com/watch?v=-Z4aojJ-pdg (9:00 & 10:00)
The Process of Building a S V M
k-Nearest Neighbor Method (k-N N)
• A N Ns and S V Ms → time-demanding, computationally intensive iterative derivations
• k-N N is a simple and intuitive prediction method that produces very competitive results
• k-N N is a prediction method for classification as well as regression tasks (similar to A N N & S V M)
• k-N N is a type of instance-based learning (or lazy learning) – most of the work takes place at the time of prediction (not at modeling)
• k : the number of neighbors used in the model
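The bullets above can be sketched in a few lines of Python (not from the book — the 2-D points and labels are made up). Note how all the "work" happens inside `knn_classify` at prediction time, which is the lazy-learning point, and how changing k changes the answer.

```python
from collections import Counter

def knn_classify(train, query, k):
    """Classify query by majority vote among its k nearest training points."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    neighbors = sorted(train, key=lambda pt: dist(pt[0], query))[:k]  # lazy: done at prediction
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((2, 2.5), "A"), ((0, 0), "A"), ((0, 1), "A"),
         ((3, 2), "B"), ((3, 3), "B"), ((2, 3.5), "B")]
query = (2, 2)
pred_k1 = knn_classify(train, query, k=1)  # nearest single point is class A
pred_k3 = knn_classify(train, query, k=3)  # two of the three nearest are class B
```

The same query point gets a different class for k=1 than for k=3 — exactly the "answer depends on the value of k" situation pictured on the next slide.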
k-Nearest Neighbor Method (k-N N)
• The answer to "which class does a data point belong to?" depends on the value of k
Naïve Bayes Method for
Classification
• Naïve Bayes is a simple probability-based classification
method
– Naïve – assumption of independence among the input
variables
• Output variable must be nominal
– Can use both numeric and nominal input variables
• Used for classification tasks (not regression, since the output must be nominal)
• Naïve Bayes models can be developed very efficiently and effectively
– Using the maximum likelihood method
Naïve Bayes Method for
Classification
• Process of Developing a Naïve Bayes Classifier
• Training Phase
1. Obtain and pre-process the data
2. Discretize the numeric variables
3. Calculate the prior probabilities of all class labels
4. Calculate the likelihood for all predictor
variables/values
• Testing Phase
– Using the outputs of Steps 3 and 4 above, classify the
new samples
▪ See the numerical example in the book…
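The training and testing phases above can be sketched from scratch (this is not the book's numerical example — the tiny weather-style data set is made up). Steps 3 and 4 are just counting; the testing phase multiplies the prior by the per-feature likelihoods.

```python
from collections import Counter, defaultdict

# Toy training data, already pre-processed/discretized (Steps 1-2): (outlook, temp) -> play
train = [
    ("sunny", "hot", "no"), ("sunny", "mild", "no"), ("overcast", "hot", "yes"),
    ("rain", "mild", "yes"), ("rain", "cool", "yes"), ("overcast", "cool", "yes"),
    ("sunny", "cool", "no"), ("rain", "hot", "no"),
]

classes = Counter(row[-1] for row in train)
priors = {c: n / len(train) for c, n in classes.items()}   # Step 3: prior probabilities

likelihood = defaultdict(Counter)                          # Step 4: likelihoods by counting
for *features, label in train:
    for i, v in enumerate(features):
        likelihood[label][(i, v)] += 1

def classify(features):
    """Testing phase: pick the class with the largest prior x likelihoods product.
    (Real implementations add Laplace smoothing to avoid zero counts.)"""
    scores = {}
    for c in priors:
        p = priors[c]
        for i, v in enumerate(features):
            p *= likelihood[c][(i, v)] / classes[c]
        scores[c] = p
    return max(scores, key=scores.get)

prediction = classify(("overcast", "mild"))
```

Because "overcast" never appears with "no" in the training data, the "no" score collapses to zero — which is why practical Naïve Bayes implementations smooth the counts.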
Ensemble Modeling
• Ensemble – combination of models (or model outcomes)
for better results
• Why do we need to use ensembles?
– Better accuracy
– More stable/robust/consistent/reliable outcomes
Types of Ensemble Modeling
Figure 5.20 Simple Taxonomy for Model Ensembles.
Types of Ensemble Modeling
Figure 5.20 Bagging-Type Decision Tree Ensembles.
Types of Ensemble Modeling
Figure 5.20 Boosting-Type Decision Tree Ensembles.
Ensemble Modeling
• Variants of Bagging & Boosting (homogeneous model types – decision trees)
– Decision Trees Ensembles
– Random Forest
– Stochastic Gradient Boosting
• Stacking
– Stacked generalization or super learners
• Information Fusion
– Any number of any model types
– Simple/weighted combining
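A minimal bagging sketch ties the ideas together (not from the book — the 1-D data, the noise points, and the choice of 25 decision stumps are all made up for illustration): each base learner is trained on a bootstrap sample, and predictions are combined by simple majority vote.

```python
import random
from collections import Counter

random.seed(7)

# 1-D training data: true label is 1 when x > 5, with two noisy points injected
data = [(x, int(x > 5)) for x in range(11)]
data[3] = (3, 1)   # noise
data[8] = (8, 0)   # noise

def train_stump(sample):
    """Base learner (decision stump): pick the threshold that best splits the sample."""
    best_t, best_acc = 0, -1.0
    for t in range(11):
        acc = sum((x > t) == bool(y) for x, y in sample) / len(sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Bagging: each stump sees a different bootstrap sample of the training data
stumps = [train_stump(random.choices(data, k=len(data))) for _ in range(25)]

def ensemble_predict(x):
    """Simple combining: majority vote over the homogeneous base models."""
    votes = Counter(int(x > t) for t in stumps)
    return votes.most_common(1)[0][0]
```

Individual stumps can be fooled by the noisy points, but the majority vote washes their mistakes out — the variance-reduction benefit listed in the pros/cons table that follows. Boosting differs by training the learners sequentially, reweighting toward previously misclassified points.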
Ensembles – Pros and Cons
Table 5.9 Brief List of Pros and Cons of Model Ensembles Compared to Individual Models.

PROS (Advantages)
• Accuracy – Model ensembles usually result in more accurate models than individual models.
• Robustness – Model ensembles tend to be more robust against outliers and noise in the data set than individual models.
• Reliability (stability) – Because of the variance reduction, model ensembles tend to produce more stable, reliable, and believable results than individual models.
• Coverage – Model ensembles tend to have better coverage of the hidden complex patterns in the data set than individual models.

CONS (Shortcomings)
• Complexity – Model ensembles are much more complex than individual models.
• Computationally expensive – Compared to individual models, ensembles require more time and computational power to build.
• Lack of transparency (explainability) – Because of their complexity, it is more difficult to understand the inner structure of model ensembles (how they do what they do) than individual models.
• Harder to deploy – Model ensembles are much more difficult to deploy in an analytics-based managerial decision-support system than single models.