SAS Visual Data Mining and Machine Learning Demo


LORRY HARDT: I’m Lorry Hardt, the
global product marketing manager for Advanced
Analytics, and joining me today is Jonathan Wexler,
the principal product manager for SAS’s
Machine Learning suite. Jonathan, thanks for
talking with me today. JONATHAN WEXLER: Glad
to be here, Lorry, to discuss our new release. LORRY HARDT:
Justifiably, there’s a lot of hype that
surrounds supplying more advanced
modeling techniques, such as machine learning. This is coming from C-level
enterprise architecture leaders to chief analytics officers. This has really
become a hot topic. What are your thoughts on this? Why now? JONATHAN WEXLER:
Lorry, to be frank, machine learning is
not a new concept. Many of these techniques,
such as neural networks, have been around for many years. It’s just that the
technology has finally caught up with the
computational need for analyzing large, complex data sources,
with use cases ranging from suspicious trading activity and
recommendation engines to complex image analysis. LORRY HARDT: One
of the key areas when we discuss
machine learning is providing non-expert
data scientists or even business analysts
with the ability to rapidly explore
and model data. In SAS’s latest release of SAS
Visual Data Mining and Machine Learning, how do you see
this product addressing those types of tasks? JONATHAN WEXLER: Lorry,
SAS Visual Data Mining and Machine Learning
enables users to rapidly expand
their models in a very visual interactive way. This solution enables users to
analyze data in the experience that they feel comfortable with. They can quickly go from data
prep to interactive modeling to advanced
pipelining, and finally and most importantly,
to deployment all within this one solution. Let’s see SAS Visual Data Mining
and Machine Learning in action. For my particular
use case, I’m trying to analyze which
chemical factors help me predict the quality of wine. I’ve already prepared my
data using the data preparation area of SAS Visual Data
Mining and Machine Learning, and now I’m in the process
of exploring and expanding my analysis by using advanced
machine learning techniques. So I can very quickly
add in a neural network, and before I build
my neural network, the system gives
me an indication of what I’m about to build. So if I’m not familiar
with neural networks, it shows me that it’s about
to add in a network diagram and various assessment metrics. And it’s very easy to
build this neural network. I can just select my attributes
and the neural network will build itself
in near real-time. I can interact with
the neural network by adding additional
hidden layers. I can change any
of the activation functions of the hidden
layers themselves. I can also edit any of
the additional options. I can even take advantage of
SAS’s advanced auto-tuning where the model
will automatically find the best performing
attributes for me. I can also interpret
my results by looking at a relative importance plot. Typically, machine learning
models are black boxes, and you’re not
given an indication of the most important factors
that drive your model. LORRY HARDT: Jonathan,
you mentioned in this latest release users can
rapidly expand their analyses. Can you give an example of that
type of analytic expansion? JONATHAN WEXLER: Sure, Lorry,
within SAS Visual Data Mining and Machine Learning, we
give users the ability to enhance their interactive
models by mixing and matching different techniques, applying
more advanced methods, and even integrating
programming into their analyses. Let’s go back to
that neural network. What SAS Visual Data
Mining and Machine Learning allows you to do is
expand this neural network and build upon it by
creating a pipeline. And by creating a
pipeline, this enables me to try out different advanced
machine learning techniques. And you’ll see here that
the pipeline represents all of the steps
that I took inside the exploratory environment. But now that I’m in
this more advanced view, I have more advanced
techniques available to me. So for example, I can
try anomaly detection to either remove outliers
or model them separately. I can also add in other
advanced techniques. I could add in a
decision tree node. I can also add in, for example,
a gradient boosting node. And the system
will automatically pick the best
performing model for me. Let’s take a look
at the results. In this case, it
picked the gradient boosting model with the
highest KS statistic, but it also compared itself to
the interactive neural network that I built and a decision
tree that I added in. The system automatically
picked the best model for me. I can also customize this
pipeline by changing options. I have complete flexibility
to enhance this analysis. LORRY HARDT: Jonathan,
you mentioned that users can quickly
collaborate and share their analyses. How can users take advantage of
these best practice templates? JONATHAN WEXLER:
Out of the box, SAS will provide a series of
best practice templates. These are stored
in the SAS toolbox. Data scientists can also
create their own pipelines and share them with
the community, which in turn empowers business
analysts to produce rapid results. If we go back to the pipeline
that I’ve already built, I can access the
SAS toolbox, which allows me to take advantage
of either SAS best practice pipelines, or even
add in a pipeline from one of my coworkers,
a proven best practice. Now this best practice
pipeline has a mix and match of different advanced
techniques, from gradient boosting to our
auto-tuning to even a deep neural network. It even has the ability
to enable SAS code. Let’s run it. And you can see it still
picked the gradient boosting model, which had the
highest KS statistic. It also had an ensemble
model in there. It also did auto-tuning
for my decision tree, and it also added in
a logistic regression and a deep neural network,
but the gradient boosting was still the champion model. Now I mentioned this
pipeline also enables users to write their own code. With this system, you can not
only use pre-built SAS nodes, but you can also
add in SAS code, and you have the
complete SAS language available at your
fingertips to be able to add in procedures,
macros, data step, and so forth. So you can really
build the best model that makes sense for your data. LORRY HARDT: One other
thought from you. The analytics lifecycle
is no longer up for debate. It must be supported. No model’s complete unless
you get it into action, enable businesses
to respond quickly, and take action if needed. Likewise, we live
in an open world. Can you talk about
the ability to get these models in production
internally and externally within the organization? JONATHAN WEXLER:
Absolutely, Lorry. As the scale and
demand increase for these properly
tuned analytic models, having an effective deployment
strategy and architecture is a key component in
the analytics lifecycle. Let me take you
through a few steps. So if I go back to
my pipeline, let’s take a look at how we can deploy
these models very quickly. So if I go over to the
pipeline comparison area, this screen will
compare the models across the different
pipelines that I’ve built. It will automatically pick
the best model for me. Now I can override this
pipeline comparison, I can even add in
challengers, but once I have decided which model
or models I want to deploy, the system enables you to
deploy these in the method that you see fit. You can publish these models
to databases or to Hadoop with one click with no rewrite. I can instantly
score these models once they’re pushed
to the database. This allows for tremendous
flexibility in building and testing models instantly. I can also download
a scoring API. This API call is given to
a user in multiple ways. It’s given to them
in a SAS wrapper, but it’s also
wrapped with Python. So users have the
ability to call these methods and the
score code in the method that they see fit. And if you had a
web application, we’ll even give you
the REST API call. So you have the ability to
call these models in the way that you see fit. LORRY HARDT: Jonathan,
thanks so much for sharing SAS Visual Data Mining and
Machine Learning with us today. JONATHAN WEXLER: You’re
very welcome, Lorry. LORRY HARDT: Thanks
for watching.
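
Editor's sketch: in the demo, the pipeline picks its champion as the model with the highest KS statistic. The Python below is a minimal illustration of that idea only, not SAS's implementation. It assumes a binary target and one list of predicted scores per model; the function names (ks_statistic, pick_champion) and the model names in the usage example are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Kolmogorov-Smirnov statistic for a binary classifier.

    Returns the largest gap between the cumulative share of positives (label 1)
    and the cumulative share of negatives (label 0) as the score threshold
    sweeps from the highest score to the lowest.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    ordered = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cum_pos = cum_neg = 0
    best_gap = 0.0
    i = 0
    while i < len(ordered):
        # Consume all rows that share the same score before measuring the gap,
        # so tied scores are treated as a single threshold.
        j = i
        while j < len(ordered) and ordered[j][0] == ordered[i][0]:
            if ordered[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best_gap = max(best_gap, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best_gap


def pick_champion(
    model_scores: Dict[str, Sequence[float]], labels: Sequence[int]
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the model with the highest KS, plus every model's KS."""
    ks_by_model = {name: ks_statistic(s, labels) for name, s in model_scores.items()}
    champion = max(ks_by_model, key=lambda name: ks_by_model[name])
    return champion, ks_by_model


if __name__ == "__main__":
    # Made-up holdout labels and scores, purely for illustration.
    holdout_labels = [1, 1, 0, 0]
    candidates = {
        "gradient_boosting": [0.9, 0.8, 0.2, 0.1],
        "decision_tree": [0.9, 0.2, 0.8, 0.1],
    }
    champion, ks_values = pick_champion(candidates, holdout_labels)
    print(champion, ks_values)
```

A higher KS means the model's scores separate the two classes more cleanly, which is why it can serve as a single number for ranking a neural network, a decision tree, and a gradient boosting model against one another.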

3 Replies to “SAS Visual Data Mining and Machine Learning Demo”

  1. Great overview of a very powerful analytical tool for industry. Manufacturing companies would really benefit from this type of approach for many different business challenges.
