Published by | eBookIt.com |
Publication date | 18 February 2021 |
EAN13 | 9781456637033 |
Language | English |
NetOps 2.0 Transformation
The DIRE Methodology
By
Ray Belleville
© Copyright 2021 by Ray Belleville - All rights reserved.
It is not legal to reproduce, duplicate, or transmit any part of this document in electronic means or printed format. Recording of this publication is strictly prohibited.
CONTENTS
Introduction
Part I: Goals Of A NetOps Organization
Chapter One: Network Availability
Influencing Network Availability
Redundancy
Resiliency
Useful Network Availability Tools
Chapter Summary/Key Takeaways
Chapter Two: Mean Time To Repair
Influencing Mean Time To Repair
Useful Mean Time To Repair Tools
Chapter Summary/Key Takeaways
Part II: Why We’ve Reached The Limits Of NetOps 1.0
Chapter Three: Organizational Structure
Chapter Summary/Key Takeaways
Chapter Four: Documentation
Documentation Challenges
Tools For Documentation
Chapter Summary/Key Takeaways
Chapter Five: Isolation
Limitations Of CLI Consoles
Limitations Of Traceroute
Limitations Of GUIs And A Single Pane Of Glass
Tools For Isolation
Chapter Summary/Key Takeaways
Chapter Six: Repair Challenges
Tools For Repair
Chapter Summary/Key Takeaways
Chapter Seven: Escalation Challenges
Tools For Escalation
Chapter Summary/Key Takeaways
Part III: DIRE NetOps Transformation
Chapter Eight (Bonus): Network Hygiene
The Results?
Chapter Nine (Bonus): 5 Questions You May Not Be Able To Answer Today
Conclusion
Acknowledgments
About The Author
Introduction
Congratulations on purchasing this book. You are on the road to understanding network operations (NetOps) through the DIRE NetOps Methodology lens.
This book is for anyone who has a role in Network Operations. It doesn’t matter if you’re in your first year or your 30th; there will be value for you.
Somewhere around 1996, when I was working on a consulting opportunity with Canada's government, I needed to classify the types of work an operator did every day. These classifications have not changed in almost 30 years, but the needs have.
The results of my analysis provided four buckets: documentation, isolation, repair, and escalation. I hadn’t written them down in that order yet but quickly realized they spelled DIRE, and the acronym was born, evolving ever since.
We are at a time in networking where engineers must know the CLI syntax of multiple vendors, hundreds of protocols and technologies, and dozens of tools. However, each new “monitoring tool” we deploy does not address yesterday’s problems. It only exacerbates them.
While there are many ways that NetOps can impact a business, including preserving the company’s reputation and increasing satisfaction for the network users, the focus of most companies is on the financial implications of outages.
A Ponemon Institute study [1] highlights the problems experienced by NetOps organizations around the globe:
Operator error is the third leading root cause of unplanned outages.
63% of organizations feel they do not have the resources to deal with unplanned outages.
22% of outages are considered avoidable, an improvement of only 2% in 6 years.
The average cost of an outage is $9,000 per minute, or $570,000 per hour.
While these data points are alarming, they are often not enough to influence a change that will cost some upfront dollars to improve the overall health of the NetOps team.
Since the tech bubble burst in the early 2000s, companies have been running their NetOps organizations as lean as possible, even as their networks have grown in complexity, scale, and mission-critical status.
The challenge has been “planning for the unexpected,” which is where the NetOps teams have struggled. Once activities are outside a small deviation from the norm, structures begin to break down, overwhelming the team to a point where they cannot keep up.
“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.”
DONALD RUMSFELD [2]
Being prepared for the unexpected is where we need to leverage our company’s domain knowledge and the “how-to” expertise to plan accordingly.
We’ve reached the limit of NetOps 1.0. Companies that don’t act fast will be left behind and will remain at risk, especially given the pandemic we are living through right now.
The companies that transform will see fewer incoming tickets, faster resolution times, improved service delivery, and reduced friction between teams.
Using the NetOps understanding, techniques, and network automation tools identified in this book:
One company reduced P1/P2 incidents by an average of 34% each month.
Another reduced their network audit time from 3 months to 3 weeks.
And yet another reduced their escalation rate by 22%, freeing up their senior network engineers.
I’ve worked on some pretty unique projects with some even more incredible people, who were all excellent teachers in their own right. This book aims to pass on the lessons I’ve learned from them, and from personal experience, in a way that is simple to remember and provides quick returns.
Below is a brief list of some of the projects I’ve worked on in my networking career. When you contemplate the scale of the team it took to make these things happen, remember that this is the team I was privileged to learn from, and in many cases, we learned through collaboration.
Creating the first National Internet Service Provider in Canada (iSTAR Internet)
Conducting the first trans-Atlantic 10GE testing with CERN and Carleton University
Architecting a 35-building Research and Education campus network upgrade, including its network operations center, from the ground up
Deploying and supporting the largest IPTV network in the USA (2016-2017)
Creating a CCIE-level vendor/technology certification program, including all training materials, student guides, labs, and certification exams (written and practical)
Creating a Network Automation product that addresses the many needs of network operators
In 2020, I saw that many companies were still struggling with Network Automation. They had tools, but they were not able to extract the full value.
This book will provide you with a deep understanding of NetOps and guide you with a methodology to transform your network operations into an efficient machine, resilient to change and increasing scale.
Let’s begin, shall we?
PART I
Goals Of A NetOps Organization
Virtually any discussion surrounding NetOps will have one of two goals in mind:
Increase Network Availability: Network availability is measured from the user’s perspective and defines how available the service they subscribed to was over a specific period.
Reduce MTTR: MTTR is the mean time it takes to resolve an incident. Typically, this is the average time between when the customer agrees the service went down and when it was restored.
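The MTTR definition above amounts to averaging outage durations. A minimal sketch in Python, assuming each incident is recorded as a pair of timestamps (when service went down, when it was restored); the function name and the sample incidents are hypothetical and are not taken from the book:

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Return the mean outage duration for (down_at, restored_at) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum((restored - down for down, restored in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents, for illustration only.
incidents = [
    (datetime(2021, 1, 4, 9, 0), datetime(2021, 1, 4, 9, 30)),      # 30 minutes
    (datetime(2021, 1, 12, 14, 0), datetime(2021, 1, 12, 15, 30)),  # 90 minutes
]
mttr = mean_time_to_repair(incidents)
print(mttr)  # 1:00:00
```

In practice the hard part is agreeing on the two timestamps, since the definition depends on when the customer agrees service was down and restored.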
It doesn’t matter to which organization you belong. If you have a network, someone is tracking these numbers as a KPI. On rare occasions, they may be calling them by a different name. But the reality is that companies rely heavily on the network to do everything.
When the network is down, they are losing business. IT leadership might track the symptoms by another name, but the root is always “how available was that service to the user?” and “when it broke, how long did it take to fix it?”
Let’s dive a little deeper into these ideas to understand better what they mean and how to influence them.
Chapter One
Network Availability
There is a widely accepted formula for determining network availability.
Availability = Total Uptime / Total Time
The first thing you’ve probably noticed is that Uptime is the only thing that matters when it comes to availability. So fundamentally, MTTR (the mean time it takes to restore) and Availability are virtually the same. Where they differ is in perspective when you are looking at improving the KPIs. We’ll get to that in a little bit.
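As a worked illustration of the formula, here is a minimal Python sketch. It assumes uptime and total time are measured in the same unit (minutes here) over a 365-day year; the function name and the 120 minutes of downtime are hypothetical values chosen for the example:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a 365-day year


def availability(total_uptime, total_time):
    """Availability as a fraction: total uptime divided by total time."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    if not 0 <= total_uptime <= total_time:
        raise ValueError("total_uptime must be between 0 and total_time")
    return total_uptime / total_time


# Hypothetical year with 120 minutes of accumulated downtime.
downtime_minutes = 120
a = availability(MINUTES_PER_YEAR - downtime_minutes, MINUTES_PER_YEAR)
print(f"{a:.4%}")
```

Because downtime is just total time minus uptime, every minute shaved off a repair shows up directly in this ratio, which is why the two KPIs move together.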
Since we know that availability is critical, what sort of targets are companies setting for their networks in 2020? Table 1 shows the three most common grades of network availability that you will find in discussions today.
Availability | Outage Time Allowed per Year | Grade
99.999% | 5.26 minutes | Carrier
99.99% | 52.56 minutes | Service
99.9% | 8.76 hours | Standard
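The outage allowances in Table 1 follow directly from the availability formula. The sketch below reproduces them in Python, assuming a 365-day year; the function name is hypothetical, and the grade labels are the ones used in the table:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a 365-day year


def allowed_downtime_minutes(availability_percent):
    """Minutes of outage per year permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for grade, target in [("Carrier", 99.999), ("Service", 99.99), ("Standard", 99.9)]:
    minutes = allowed_downtime_minutes(target)
    # Report in hours once the budget exceeds one hour, as Table 1 does.
    if minutes >= 60:
        print(f"{grade}: {target}% allows {minutes / 60:.2f} hours per year")
    else:
        print(f"{grade}: {target}% allows {minutes:.2f} minutes per year")
```

Each additional "nine" cuts the yearly outage budget by a factor of ten, which is why the jump from Standard to Carrier grade is so demanding.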