Published by | ASQ Quality Press |
Publication date | March 18, 2019 |
EAN13 | 9781951058685 |
Language | English |
File size | 1 MB |
Data Quality
Also available from ASQ Quality Press:
Quality Experience Telemetry: How to Effectively Use Telemetry for Improved Customer Success
Alka Jarvis, Luis Morales, and Johnson Jose
Linear Regression Analysis with JMP and R
Rachel T. Silvestrini and Sarah E. Burke
Navigating the Minefield: A Practical KM Companion
Patricia Lee Eng and Paul J. Corney
The Certified Software Quality Engineer Handbook, Second Edition
Linda Westfall
Introduction to 8D Problem Solving: Including Practical Applications and Examples
Ali Zarghami and Don Benbow
The Quality Toolbox, Second Edition
Nancy R. Tague
Root Cause Analysis: Simplified Tools and Techniques, Second Edition
Bjørn Andersen and Tom Fagerhaug
The Certified Six Sigma Green Belt Handbook, Second Edition
Roderick A. Munro, Govindarajan Ramu, and Daniel J. Zrymiak
The Certified Manager of Quality/Organizational Excellence Handbook, Fourth Edition
Russell T. Westcott, editor
The Certified Six Sigma Black Belt Handbook, Third Edition
T. M. Kubiak and Donald W. Benbow
The ASQ Auditing Handbook, Fourth Edition
J.P. Russell, editor
The ASQ Quality Improvement Pocket Guide: Basic History, Concepts, Tools, and Relationships
Grace L. Duffy, editor
To request a complimentary catalog of ASQ Quality Press publications, call 800-248-1946, or visit our website at http://www.asq.org/quality-press.
Data Quality
Dimensions, Measurement, Strategy, Management, and Governance
Dr. Rupa Mahanti
ASQ Quality Press
Milwaukee, Wisconsin
American Society for Quality, Quality Press, Milwaukee 53203
© 2018 by ASQ
All rights reserved. Published 2018
Library of Congress Cataloging-in-Publication Data
Names: Mahanti, Rupa, author.
Title: Data quality : dimensions, measurement, strategy, management, and
governance / Dr. Rupa Mahanti.
Description: Milwaukee, Wisconsin : ASQ Quality Press, [2019] | Includes
bibliographical references and index.
Identifiers: LCCN 2018050766 | ISBN 9780873899772 (hard cover : alk. paper)
Subjects: LCSH: Database management—Quality control.
Classification: LCC QA76.9.D3 M2848 2019 | DDC 005.74—dc23
LC record available at https://lccn.loc.gov/2018050766
ISBN: 978-0-87389-977-2
No part of this book may be reproduced in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.
Publisher: Seiche Sanders
Sr. Creative Services Specialist: Randy L. Benson
ASQ Mission: The American Society for Quality advances individual, organizational, and community excellence worldwide through learning, quality improvement, and knowledge exchange.
Attention Bookstores, Wholesalers, Schools, and Corporations: ASQ Quality Press books, video, audio, and software are available at quantity discounts with bulk purchases for business, educational, or instructional use. For information, please contact ASQ Quality Press at 800-248-1946, or write to ASQ Quality Press, P.O. Box 3005, Milwaukee, WI 53201-3005.
To place orders or to request ASQ membership information, call 800-248-1946. Visit our website at http://www.asq.org/quality-press.
List of Figures and Tables
Figure 1.1 Categories of data.
Figure 1.2 Metadata categories.
Table 1.1 Characteristics of data that make them fit for use.
Figure 1.3 The data life cycle.
Figure 1.4 Causes of bad data quality.
Figure 1.5 Data migration/conversion process.
Figure 1.6 Data integration process.
Figure 1.7 Bad data quality impacts.
Figure 1.8 Prevention cost:Correction cost:Failure cost.
Figure 1.9 Butterfly effect on data quality.
Figure 2.1a Layout of a relational table.
Figure 2.1b Table containing customer data.
Figure 2.2 Customer and order tables.
Figure 2.3a Data model—basic styles.
Figure 2.3b Conceptual, logical, and physical versions of a single data model.
Table 2.1 Comparison of conceptual, logical, and physical model.
Figure 2.4 Possible sources of data for data warehousing.
Figure 2.5 Star schema design.
Figure 2.6 Star schema example.
Figure 2.7 Snowflake schema design.
Figure 2.8 Snowflake schema example.
Figure 2.9 Data warehouse structure.
Figure 2.10 Data hierarchy in a database.
Table 2.2 Common terminologies.
Figure 3.1 Data hierarchy and data quality metrics.
Figure 3.2 Commonly cited data quality dimensions.
Figure 3.3 Data quality dimensions.
Figure 3.4 Customer contact data set completeness.
Figure 3.5 Incompleteness illustrated through a data set containing product IDs and product names.
Figure 3.6 Residential address data set having incomplete ZIP code data.
Figure 3.7 Customer data—applicable and inapplicable attributes.
Figure 3.8 Different representations of an individual’s name.
Figure 3.9 Name format.
Table 3.1 Valid and invalid values for employee ID.
Figure 3.10 Standards/formats defined for the customer data set in Figure 3.11.
Figure 3.11 Customer data set—conformity as defined in Figure 3.10.
Figure 3.12 Customer data set—uniqueness.
Figure 3.13 Employee data set to illustrate uniqueness.
Figure 3.14 Data set in database DB1 compared to data set in database DB2.
Table 3.2 Individual customer name formatting guidelines for databases DB1, DB2, DB3, and DB4.
Figure 3.15 Customer name data set from database DB1.
Figure 3.16 Customer name data set from database DB2.
Figure 3.17 Customer name data set from database DB3.
Figure 3.18 Customer name data set from database DB4.
Figure 3.19 Name data set to illustrate intra-record consistency.
Figure 3.20 Full Name field values and values after concatenating First Name, Middle Name, and Last Name.
Figure 3.21 Name data set as per January 2, 2016.
Figure 3.22 Name data set as per October 15, 2016.
Figure 3.23 Customer table and order table relationships and integrity.
Figure 3.24 Employee data set illustrating data integrity.
Figure 3.25 Name granularity.
Table 3.3 Coarse granularity versus fine granularity for name.
Figure 3.26 Address granularity.
Table 3.4 Postal address at different levels of granularity.
Figure 3.27 Employee data set with experience in years recorded values having less precision.
Figure 3.28 Employee data set with experience in years recorded values having greater precision.
Figure 3.29 Address data set in database DB1 and database DB2.
Figure 3.30 Organizational data flow.
Table 3.5 Data quality dimensions—summary table.
Table 4.1 Data quality dimensions and measurement.
Table 4.2 Statistics for annual income column in the customer database.
Table 4.3 Employee data set for Example 4.1.
Table 4.4 Social security number occurrences for Example 4.1.
Figure 4.1 Customer data set for Example 4.2.
Figure 4.2 Business rules for date of birth completeness for Example 4.2.
Table 4.5 “Customer type” counts for Example 4.2.
Figure 4.3 Employee data set—incomplete records for Example 4.3.
Table 4.6 Employee reference data set.
Table 4.7 Employee data set showing duplication of social security number (highlighted in the same shade) for Example 4.5.
Table 4.8 Number of occurrences of employee ID values for Example 4.5.
Table 4.9 Number of occurrences of social security number values for Example 4.5.
Table 4.10 Employee reference data set for Example 4.6.
Table 4.11 Employee data set for Example 4.6.
Figure 4.4 Metadata for data elements Employee ID, Employee Name, and Social Security Number for Example 4.7.
Table 4.12 Employee data set for Example 4.7.
Table 4.13 Valid and invalid records for Example 4.8.
Table 4.14 Reference employee data set for Example 4.9.
Table 4.15 Employee data set for Example 4.9.
Table 4.16 Employee reference data set for Example 4.10.
Table 4.17 Accurate versus inaccurate records for Example 4.10.
Table 4.18 Sample customer data set for Example 4.11.
Figure 4.5 Customer data—data definitions for Example 4.11.
Table 4.19 Title and gender mappings for Example 4.11.
Table 4.20 Title and gender—inconsistent and consistent values for Example 4.11.
Table 4.21 Consistent and inconsistent values (date of birth and customer start date combination) for Example 4.11.
Table 4.22 Consistent and inconsistent values (customer start date and customer end date combination) for Example 4.11.
Table 4.23 Consistent and inconsistent values (date of birth and customer end date combination) for Example 4.11.
Table 4.24 Consistent and inconsistent values (full name, first name, middle name, and last name data element combination) for Example 4.11.
Table 4.25 Consistency results for different data element combinations for Example 4.11.
Table 4.26 Record level consistency/inconsistency for Example 4.12.
Table 4.27a Customer data table for Example 4.13.
Table 4.27b Claim data table for Example 4.13.
Table 4.28a Customer data and claim data inconsistency/consistency for Example 4.13.
Table 4.28b Customer data and claim data inconsistency/consistency for Example 4.13.
Table 4.29 Customer sample data set for Example 4.17.
Table 4.30 Order sample data set for Example 4.17.
Table 4.31 Customer–Order relationship–integrity for Example 4.17.
Table 4.32 Address data set for Example 4.18.
Table 4.33 Customers who have lived in multiple addresses for Example 4.18.
Table 4.34 Difference in time between old address and current address for Example 4.18.
Figure 4.6 Data flow through systems where data are captured after the occurrence of the event.
Figure 4.7 Data flow through systems where data are captured at