Applied Data Mining for Forecasting Using SAS, ebook

SAS Institute - Tim Rey, Arthur Kordon, Chip Wells

336 pages

English

Description

Applied Data Mining for Forecasting Using SAS, by Tim Rey, Arthur Kordon, and Chip Wells, introduces and describes approaches for mining large time series data sets. Written for forecasting practitioners, engineers, statisticians, and economists, the book details how to select useful candidate input variables for time series regression models when the number of candidates is large, and how to identify the correlation structure between the selected candidate inputs and the forecast variable.

Topics

Business intelligence

Finance

Time series

Information

Published by	SAS Institute
Publication date	July 31, 2012
Number of reads	1
EAN13	9781629597997
Language	English

Legal information: rental price per page €0.0145. This figure is given for reference only, in accordance with applicable law.

Excerpt

The correct bibliographic citation for this manual is as follows: Rey, Tim, Arthur Kordon, and Chip Wells. 2012. Applied Data Mining for Forecasting Using SAS. Cary, NC: SAS Institute Inc.
Applied Data Mining for Forecasting Using SAS
Copyright 2012, SAS Institute Inc., Cary, NC, USA ISBN 978-1-60764-662-4 (Hardcopy) ISBN 978-1-62959-799-7 (EPUB) ISBN 978-1-62959-800-0 (MOBI) ISBN 978-1-61290-093-3 (PDF)
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others' rights is appreciated.
U.S. Government License Rights; Restricted Rights: The Software and its documentation is commercial computer software developed at private expense and is provided with RESTRICTED RIGHTS to the United States Government. Use, duplication or disclosure of the Software by the United States Government is subject to the license terms of this Agreement pursuant to, as applicable, FAR 12.212, DFAR 227.7202-1(a), DFAR 227.7202-3(a) and DFAR 227.7202-4 and, to the extent required under U.S. federal law, the minimum restricted rights as set out in FAR 52.227-19 (DEC 2007). If FAR 52.227-19 is applicable, this provision serves as notice under clause (c) thereof and no other notice is required to be affixed to the Software or documentation. The Government's rights in Software and documentation shall be only those set forth in this Agreement.
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513-2414.
July 2012
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
Contents
Preface
Chapter 1 Why Industry Needs Data Mining for Forecasting
1.1 Overview
1.2 Forecasting Capabilities as a Competitive Advantage
1.3 The Explosion of Available Time Series Data
1.4 Some Background on Forecasting
1.5 The Limitations of Classical Univariate Forecasting
1.6 What is a Time Series Database?
1.7 What is Data Mining for Forecasting?
1.8 Advantages of Integrating Data Mining and Forecasting
1.9 Remaining Chapters
Chapter 2 Data Mining for Forecasting Work Process
2.1 Introduction
2.2 Work Process Description
2.2.1 Generic Flowchart
2.2.2 Key Steps
2.3 Work Process with SAS Tools
2.3.1 Data Preparation Steps with SAS Tools
2.3.2 Variable Reduction and Selection Steps with SAS Tools
2.3.3 Forecasting Steps with SAS Tools
2.3.4 Model Deployment Steps with SAS Tools
2.3.5 Model Maintenance Steps with SAS Tools
2.3.6 Guidance for SAS Tool Selection Related to Data Mining in Forecasting
2.4 Work Process Integration in Six Sigma
2.4.1 Six Sigma in Industry
2.4.2 The DMAIC Process
2.4.3 Integration with the DMAIC Process
Appendix: Project Charter
Chapter 3 Data Mining for Forecasting Infrastructure
3.1 Introduction
3.2 Hardware Infrastructure
3.2.1 Personal Computers Network Infrastructure
3.2.2 Client/Server Infrastructure
3.2.3 Cloud Computing Infrastructure
3.3 Software Infrastructure
3.3.1 Data Collection Software
3.3.2 Data Preparation Software
3.3.3 Data Mining Software
3.3.4 Forecasting Software
3.3.5 Software Selection Criteria
3.4 Data Infrastructure
3.4.1 Internal Data Infrastructure
3.4.2 External Data Infrastructure
3.5 Organizational Infrastructure
3.5.1 Developers Infrastructure
3.5.2 Users Infrastructure
3.5.3 Work Process Implementation
3.5.4 Integration with IT
Chapter 4 Issues with Data Mining for Forecasting Application
4.1 Introduction
4.2 Technical Issues
4.2.1 Data Quality Issues
4.2.2 Data Mining Methods Limitations
4.2.3 Forecasting Methods Limitations
4.3 Nontechnical Issues
4.3.1 Managing Forecasting Expectations
4.3.2 Handling Politics of Forecasting
4.3.3 Avoiding Bad Practices
4.3.4 Forecasting Aphorisms
4.4 Checklist: Are We Ready?
Chapter 5 Data Collection
5.1 Introduction
5.2 System Structure and Data Identification
5.2.1 Mind-Mapping
5.2.2 System Structure Knowledge Acquisition
5.2.3 Data Structure Identification
5.3 Data Definition
5.3.1 Data Sources
5.3.2 Metadata
5.4 Data Extraction
5.4.1 Internal Data Extraction
5.4.2 External Data Extraction
5.5 Data Alignment
5.5.1 Data Alignment to a Business Structure
5.5.2 Data Alignment to Time
5.6 Data Collection Automation for Model Deployment
5.6.1 Differences between Data Collection for Model Development and Deployment
5.6.2 Data Collection Automation for Model Deployment
Chapter 6 Data Preparation
6.1 Overview
6.2 Transactional Data Versus Time Series Data
6.3 Matching Frequencies
6.3.1 Contracting
6.3.2 Expanding
6.4 Merging
6.5 Imputation
6.6 Outliers
6.7 Transformations
6.8 Summary
Chapter 7 A Practitioner's Guide of DMM Methods for Forecasting
7.1 Overview
7.2 Methods for Variable Reduction
Traditional Data Mining
Time Series Approach
7.3 Methods for Variable Selection
Traditional Data Mining
Example for Variable Selection
Variable Selection Based on Pearson Product-Moment Correlation Coefficient
Variable Selection Based on Stepwise Regression
Variable Selection Based on the SAS Enterprise Miner Variable Selection Node
Variable Selection Based on the SAS Enterprise Miner Partial Least Squares Node
Variable Selection Based on Decision Trees
Variable Selection Based on Genetic Programming
Comparison of Data Mining Variable Selection Results
7.4 Time Series Approach
7.5 Summary
Chapter 8 Model Building: ARMA Models
Introduction
8.1 ARMA Models
8.1.1 AR Models: Concepts and Application
8.1.2 Moving Average Models: Concepts and Application
8.1.3 Auto Regressive Moving Average (ARMA) Models
Appendix 1: Useful Technical Details
Appendix 2: The I in ARIMA
Chapter 9 Model Building: ARIMAX or Dynamic Regression Models
Introduction
9.1 ARIMAX Concepts
9.2 ARIMAX Applications
Appendix: Prewhitening and Other Topics Associated with Interval-Valued Input Variables
Chapter 10 Model Building: Further Modeling Topics
Introduction
10.1 Creating Time Series Data and Data Hierarchies Using Accumulation and Aggregation Methods
Introduction
Creating Time Series Data Using Accumulation Methods
Creating Data Hierarchies Using Aggregation Methods
10.2 Statistical Forecast Reconciliation
10.3 Intermittent Demand
10.4 High-Frequency Data and Mixed-Frequency Forecasting
High-Frequency Data
Mixed-Interval Forecasting
10.5 Holdout Samples and Forecast Model Selection in Time Series
Introduction
10.6 Planning Versus Forecasting and Manual Overrides
10.7 Scenario-Based Forecasting
10.8 New Product Forecasting
Chapter 11 Model Building: Alternative Modeling Approaches
11.1 Nonlinear Forecasting Models
11.1.1 Nonlinear Modeling Features
11.1.2 Forecasting Models Based on Neural Networks
11.1.3 Forecasting Models Based on Support Vector Machines
11.1.4 Forecasting Models Based on Evolutionary Computation
11.2 More Modeling Alternatives
11.2.1 Multivariate Models
11.2.2 Unobserved Component Models (UCM)
Chapter 12 An Example of Data Mining for Forecasting
12.1 The Business Problem
12.2 The Charter
12.3 The Mind Map
12.4 Data Sources
12.5 Data Prep
12.6 Exploratory Analysis and Data Preprocessing
12.7 X Variable Imputation
12.8 Variable Reduction and Selection
12.9 Modeling
12.10 Summary
Appendix A
Appendix B
References
Index
Preface
It is utterly impossible that a mathematical formula should make the future known to us, and those who think it can would once have believed in witchcraft.
Jacob Bernoulli, in Ars Conjectandi, 1713
Curiosity about what will happen next is part of human nature, and thus the first attempts at forecasting are rooted deep in history. In ancient and medieval times, prophets like the Oracle of Delphi or Nostradamus had the status of demigods. The situation is significantly different in the 21st century, though, when predicting the future is no longer divine magic but a necessity in contemporary business. Thousands of professionals are building forecasts in almost all areas of human activity. Since the global recession of 2008-2009, it has been much more widely understood that reliable forecasting is necessary.
The increased demand for f