End-to-End Data Science with SAS
181 pages
English

Description

Learn data science concepts with real-world examples in SAS!


End-to-End Data Science with SAS: A Hands-On Programming Guide provides clear and practical explanations of the data science environment, machine learning techniques, and the SAS programming knowledge necessary to develop machine learning models in any industry. The book covers concepts including understanding the business need, creating a modeling data set, linear regression, parametric classification models, and non-parametric classification models. Real-world business examples and example code are used to demonstrate each process step-by-step.


Although a significant amount of background information and supporting mathematics are presented, the book is not structured as a textbook, but rather it is a user’s guide for the application of data science and machine learning in a business environment. Readers will learn how to think like a data scientist, wrangle messy data, choose a model, and evaluate the model’s effectiveness. New data scientists or professionals who want more experience with SAS will find this book to be an invaluable reference. Take your data science career to the next level by mastering SAS programming for machine learning models.

Information

Publication date: June 26, 2020
EAN13: 9781642958065
Language: English
File size: 5 MB

Legal information: rental price per page €0.0075. This information is provided for reference only, in accordance with applicable law.

Excerpt

The correct bibliographic citation for this manual is as follows: Gearhart, James. 2020. End-to-End Data Science with SAS®: A Hands-On Programming Guide. Cary, NC: SAS Institute Inc.
End-to-End Data Science with SAS®: A Hands-On Programming Guide
Copyright © 2020, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-64295-808-9 (Hard cover) ISBN 978-1-64295-804-1 (Paperback) ISBN 978-1-64295-805-8 (Web PDF) ISBN 978-1-64295-806-5 (Epub) ISBN 978-1-64295-807-2 (Kindle)
All Rights Reserved. Produced in the United States of America.
For a hard copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated.
U.S. Government License Rights; Restricted Rights: The Software and its documentation is commercial computer software developed at private expense and is provided with RESTRICTED RIGHTS to the United States Government. Use, duplication, or disclosure of the Software by the United States Government is subject to the license terms of this Agreement pursuant to, as applicable, FAR 12.212, DFAR 227.7202-1(a), DFAR 227.7202-3(a), and DFAR 227.7202-4, and, to the extent required under U.S. federal law, the minimum restricted rights as set out in FAR 52.227-19 (DEC 2007). If FAR 52.227-19 is applicable, this provision serves as notice under clause (c) thereof and no other notice is required to be affixed to the Software or documentation. The Government’s rights in Software and documentation shall be only those set forth in this Agreement.
SAS Institute Inc., SAS Campus Drive, Cary, NC 27513-2414
June 2020
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
SAS software may be provided with certain third-party software, including but not limited to open-source software, which is licensed under its applicable third-party software license agreement. For license information about third-party software distributed with SAS software, refer to http://support.sas.com/thirdpartylicenses.


Contents

About This Book
About The Author
Chapter 1: Data Science Overview
Introduction to This Book
The Current Data Science Landscape
Introduction to Data Science Concepts
Chapter Review
Chapter 2: Example Step-by-Step Data Science Project
Overview
Business Opportunity
Initial Questions
Get the Data
Select a Performance Measure
Train / Test Split
Target Variable Analysis
Predictor Variable Analysis
Adjusting the TEST Data Set
Building a Predictive Model
Decision Time
Implementation
Chapter Review
Chapter 3: SAS Coding
Overview
Get Data
Explore Data
Manipulate Data
Export Data
Chapter 4: Advanced SAS Coding
Overview
DO Loop
ARRAY Statements
SCAN Function
FIND Function
PUT Function
FIRST. and LAST. Statements
Macros Overview
Macro Variables
Macros
Defining and Calling Macros
Chapter Review
Chapter 5: Create a Modeling Data Set
Overview
ETL
Extract
Data Set
Transform
Load
Chapter Review
Chapter 6: Linear Regression Models
Overview
Regression Structure
Gradient Descent
Linear Regression Assumptions
Linear Regression
Simple Linear Regression
Multiple Linear Regression
Regularization Models
Chapter Review
Chapter 7: Parametric Classification Models
Overview
Classification Overview
Logistic Regression
Visualization
Logistic Regression Model
Linear Discriminant Analysis
Chapter Review
Chapter 8: Non-Parametric Models
Overview
Modeling Data Set
K-Nearest Neighbor Model
Tree-Based Models
Random Forest
Gradient Boosting
Support Vector Machine (SVM)
Neural Networks
Chapter Review
Chapter 9: Model Evaluation Metrics
Overview
General Information
Model Output
Accuracy Statistics
Black-Box Evaluation Tools
Chapter Review


About This Book
What Does This Book Cover?
Hello, my name is James, and I’m an addict. I’m addicted to data science books, web courses, instructional videos, blogs, data science podcasts, predictive modeling competitions, and coding. This addiction takes up the majority of my mental energy. From the time that I wake up until I fall asleep (and all through my dreams), I’m generally thinking about data science concepts and coding. I’m going to bet that many of you are in a similar situation. If so, I’m sure that you have been as frustrated as I have been about the massive hole in the instructional data science market.

The market is overrun with data science books for Python, R, and Hadoop. These books provide an overview of data science and in-depth instructions on the various machine learning models, and they provide the associated development code for those particular programming languages. Although these books are great resources for data scientists, they do not offer direct programming instruction in the most popular programming language in the business community. SAS is used by 95% of Fortune 100 companies, and these companies are the leading employers of data scientists. There is an incredible opportunity to fill the need of professional data scientists for hands-on machine learning training with real-world examples.
The unfortunate reality for many SAS programmers is that we often do not have access to the latest and greatest SAS products. SAS Enterprise Miner, SAS Visual Analytics, SAS Forecast Server, and SAS Viya are all incredible products, but they are not universally available to all SAS programmers. It is essential that a data scientist who is working in a SAS environment be able to develop and implement machine learning models in any SAS environment. Even if data scientists have access to SAS Viya, it is incredibly beneficial for them to have a solid understanding of the programming code that drives the models that they develop in SAS Viya.
This book, End-to-End Data Science with SAS®, provides all SAS programmers insight into the models, methodology, and SAS coding required to develop machine learning models in any industry. It also serves as a reference for programmers of any language who either want to expand their knowledge base or who have just been hired into a data scientist position where SAS is the preferred language.
The goal of this book is to provide clear and practical explanations of the data science environment, machine learning techniques, and the SAS code necessary for the proper development and evaluation of these highly desired techniques. These explanations are demonstrated with real-world business applications across a variety of industries. All code and data sets are publicly available in a dedicated GitHub repository.
Is This Book for You?
If you are interested in this book, then you (or most likely the organization that you work for) have SAS installed on your computer. However, not all SAS installations are created equal. Some programmers work in Base SAS (also called PC SAS). Others have a variety of SAS software available to them:
● SAS Enterprise Guide
● SAS Enterprise Miner
● SAS Visual Analytics
● SAS Studio
● SAS Viya

This list is just a sample of the many SAS products available. In addition to these products, there are several software components that SAS offers:
● SAS/ACCESS software
● SAS/ETS software
● SAS/IML software
● SAS ODS Graphics Editor
● SAS/OR software
● SAS/STAT software
Your company’s IT department generally dictates the SAS products that you have and the software components that are available to you. If you desperately want SAS Viya or SAS/ETS, you will often have to “fight the power” to get it. I sincerely hope that you can access one or many of these SAS products because they are awesome, and they will make your life as a data scientist much easier and much more productive. However, if you are like me and you have to develop predictive models without the benefit of all the toys that SAS has to offer, then this book is what you have been waiting for.
SAS Software Requirements
The minimum requirement for the majority of procedures detailed in this book is SAS 9.2 with SAS/STAT installed. This requirement should cover most SAS users. With this minimum requirement, we will be able to develop:
● Linear regressions
● Logistic regressions
● Clustering
● Decision trees
Some of the more advanced procedures will require SAS Enterpr
