What Is Resilient IT and How Do You Build It
We didn’t coin the term “resilient IT.” We just perfected it.
Companies with Resilient IT approach technology strategically and proactively. Resilient IT reduces the frequency, severity, and duration of fiascoes. More importantly, it gives you an evergreen lens through which to make important IT decisions for competitive advantage.
Since our inception, Waident has attracted business owners who demand a strategic, pragmatic approach to IT. They often come to us after some IT fiasco. Y2K, the internet “boom and bust,” the cloud, and COVID-19 are just a few on a long list of major fiascoes. Others include major hardware meltdowns, a damaging hack, broken onboarding processes, an office move, an inattentive MSP vendor dropping the ball, or some other issue that breaks vertebrae in their IT backbone. Firms that rely on IT for business viability (and who doesn’t?) need an approach that keeps their employees productive, their enterprises running, and their company data safe. Pragmatic businesspeople understand that IT is a powerful business enabler, but it is not a panacea and is far from perfect. Their goal is to reduce the frequency, severity, and duration of fiascoes. We call our approach to achieving these outcomes Resilient IT.
What is Resilient IT?
Resilient IT refers to the ability of an organization’s information technology (IT) systems to withstand and recover from disruptions, such as cyberattacks, natural disasters, hardware failures, or human error, while maintaining essential functions and services.
Resilient IT typically has the following characteristics:
- Redundancy: Duplicating critical components of the IT infrastructure, such as servers, storage, and network connections, so that backups are in place to maintain operations.
- Backup and Recovery: Regularly backing up data and implementing robust recovery processes for restoring systems and data in the event of a disruption.
- Security Measures: Implementing strong cybersecurity measures to protect IT systems from cyber threats like malware, ransomware, and unauthorized access.
- Scalability and Flexibility: Systems are designed to quickly scale resources up or down as needed and integrate new technologies seamlessly.
- Monitoring and Alerting: Continuous monitoring of IT systems to detect potential issues early and respond proactively.
- Disaster Recovery Planning: Developing and regularly updating a comprehensive disaster recovery plan that outlines procedures for responding to various scenarios and restoring systems and data.
Why is Resilient IT Important?
The modern enterprise runs on IT. No IT, no business. Resilient IT ensures that you keep your people productive, your enterprise running, and your data safe. Maintaining resilient IT:
- Ensures that critical business functions can continue operating even in the face of disruptions, minimizing downtime and financial losses.
- Protects valuable data from loss or corruption, safeguarding sensitive information and maintaining regulatory compliance.
- Mitigates risks associated with cyber threats, natural disasters, and other disruptions, reducing the likelihood and impact of potential incidents.
- Builds customer trust by demonstrating a commitment to maintaining uptime and protecting data.
How We Think about and Build Resilient IT
Resilient IT is built upon five simple principles that I developed as a corporate CIO:
- A People-first, Tech-second Mindset
- Comprehensive Understanding of System Interdependencies
- Extensive Documentation
- Systematic Preventive Testing
- Proven Troubleshooting Protocols and Processes
Combined, these principles create an adaptive but disciplined approach to IT that is aligned with business outcomes and anticipates the imperfections of both technology and humans. More importantly today, Resilient IT gives you an evergreen lens through which to make important IT decisions.
Resilient IT Attribute 1: A People-first, Tech-second Mindset
The problem with technology is not stupid people. Instead, “It’s all about people, stupid.” When making IT choices, it’s much easier to worry about a server or a computer than the actual end-users. Servers don’t make demands, have opinions, or need education. At Waident, we want our tech people to think like business people. For example, business people worry about how production or delivery changes will impact customers. We want the first thing that IT thinks about when starting a task to be “How will this impact the user?” How will it impact their job, the time their tasks take, and the effort to learn a new approach or maintain a new piece of tech? Thinking this way ensures that our tech people don’t do the “typical tech-guy” thing and reduce their focus to the technology sitting on a desk.
How you can build resilient IT people
You can begin implementing a “people first, tech second” mindset in your organization today. First, rewrite your IT job descriptions to recruit and hire IT people with business experience and a passion for technology, instead of tech geeks with long lists of certifications and no business acumen. Second, hire support people who think about people first, communicate effectively, and have a sense of responsive urgency. You can take someone with great people skills and get them up to speed on IT more quickly than the opposite. An additional bonus of this approach is avoiding the effort of unlearning bad IT habits (e.g., IT-focused support staff who demand to reboot a server in the middle of the business day).
Resilient IT Attribute 2: A Comprehensive Understanding of System Interdependencies
A strong virtual private network (VPN) enabled many companies to keep their employees productive remotely, the enterprise running with few on-site personnel, and company data safe from opportunistic hackers during the pandemic fiasco. Resilient VPNs enable employees to easily get to servers and essential applications remotely while reducing risk to both mission-critical and sensitive data. Companies that neither understood nor tested the interdependencies between their VPN and important applications were surprised by unexpected and unnecessary delays, downtime, irate customers, and reputational risk. They assumed that because an application worked fine in the office, it would work just as well remotely through the VPN. That assumption neglected an obvious and critical weak link in company IT. Resilient IT understands these complex relationships and tests the functionality beforehand. This foreknowledge allows a firm to pivot and explore alternative remote technology options to enable key applications.
Managing IT interdependence is simple but hard. It is simple to know that IT must think big picture about business objectives, risks, operating scenarios, IT functionality, and software compatibility. All IT people think they are “big picture” thinkers. What separates resilient thinking from conventional thinking is obtaining that big-picture perspective and then digging deeply into the details from multiple perspectives. Resilient thinking asks questions like:
- How will this affect the business and users?
- How will this fit within the current infrastructure?
- What absolutely has to happen for us to achieve our vision?
- What is the worst thing that could happen to jeopardize our success?
- What do we not know, not understand, or not have complete information on?
- How will this work years from now as the business changes and grows?
- How do we implement in each situation and permutation to minimize disruption and maximize value to customers, employees, and the enterprise?
How you can build resilient IT system interdependencies
You can build more Resilient IT today by testing applications under various business and operating scenarios to confirm that they work as expected before pushing them out to users. Then, document the protocols and processes for each scenario, so that you are not chaotically trying to solve the issue in a crisis. That way, you’re prepared when a fiasco hits.
Resilient IT Attribute 3: Extensive Documentation
To many companies, documentation means having a secret list of basic passwords and IP addresses for users and applications. To us, that definition does not even scratch the surface of documentation. Resilient documentation covers areas like processes and procedures, cybersecurity policies, how-to’s for business applications, new computer checkoffs, new hire and termination protocols, and more. Even the smallest company should have hundreds of pages of documentation. Yes, hundreds!
Here is a typical result of not having sufficient documentation. During the pandemic, one company took more than a week to get its remote users up to full speed. The company’s problem was the result of IT not having documentation on its VPN and other infrastructure. IT assumed that because they had a VPN, everything needed was in place and would work fine. It didn’t. First, the firm had only purchased licenses for 10 users. It now needed 60 to accommodate remote access through the VPN. IT didn’t know this because they had no record of it. It took days just to figure that out. With Resilient IT, it would have taken hours. Second, after buying the user licenses, they discovered network configuration issues created by the new VPN volume. Simple documentation could have avoided all of this. (See Principles 1 and 2 above.)
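The license shortfall in this story is the kind of gap a simple documented inventory exposes. As a rough sketch only (the record format and function name are hypothetical, not a Waident tool), comparing documented seats against expected remote users takes a few lines:

```python
def license_shortfall(licensed_seats, remote_users):
    """Return how many additional seats are needed (0 if capacity is enough)."""
    return max(0, remote_users - licensed_seats)


# Hypothetical documentation record, using the figures from the example above:
# 10 VPN licenses on file, 60 users who now need remote access.
vpn_record = {"system": "VPN", "licensed_seats": 10}

needed = license_shortfall(vpn_record["licensed_seats"], remote_users=60)
print(f"{vpn_record['system']}: {needed} additional licenses needed")
```

The point is not the code itself but that the number of licensed seats is written down somewhere it can be checked in minutes rather than discovered over days.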
How you can build resilient IT with extensive documentation
Want to know how resilient your organization’s documentation is today? Ask your IT for all of your documentation. It’s a simple and powerful way to determine if you, the business person, are in good shape or not. It is a straightforward request and response. After you ask, time how quickly IT returns with the documentation. If you get hundreds of documents in real time, within hours, or even within a day, you are in good shape. If it takes days or weeks and you only receive a few sheets with people and passwords, you’re in trouble. Building resiliency starts with setting the expectation that documentation is in place, understood, and actionable in the event of a fiasco.
Resilient IT Attribute 4: Systematic Preventive Testing
Real life shows that you have to be proactive and prepared for the inevitable fiasco. Hiding your head in the sand, feigning confidence, or trusting a warranty are common but unviable approaches. The best way to be prepared is to develop a big-picture perspective, understand interdependencies, create threat scenarios, and do as much testing as you can to ensure everything works as it should. Sitting with a flat along the highway on a dark, rainy night is not the best time to learn how to change a tire. Learn how in your driveway when it is sunny and dry.
Waident has a redundant internet connection in place, and we KNOW it works. It worked fine when we tested it last year, and we know that if it isn’t working, we just need to reboot it to get it going. How do we know this? We tested it. What percentage of companies do you think have our level of confidence and knowledge? Did that approach work for you during the pandemic? Did your corporate internet fail? Did your backup kick in when there was no one in the office to reboot the equipment?
How you can build resilient IT with systematic testing
Today, you may not know what should be tested or what is being tested currently. That’s OK. If you don’t, you can take the simple action of sitting down, listing all of your key systems, and detailing the scenarios that keep you up at night. Share that list with your IT people to determine what proactive steps and testing protocols can be implemented to keep these systems operational. A weak IT team will hate this exercise. A Resilient IT team will love this challenge and probably have much of it in place. Rest assured that everything on your “keep you up” list can be proactively addressed. Test, test, test. Test anything that can be tested and develop procedures to test it regularly.
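One way to make the “list your key systems and test them regularly” exercise concrete is a small test register that flags anything overdue. The sketch below is illustrative only: the system names, intervals, and dates are hypothetical, and a real register would live wherever your IT team keeps its documentation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class SystemTest:
    system: str                   # key system from your "keeps you up at night" list
    scenario: str                 # what the test is meant to prove
    interval_days: int            # how often the test should be repeated
    last_tested: Optional[date]   # None means the test has never been run


def overdue_tests(register, today):
    """Return tests that were never run or whose interval has elapsed."""
    overdue = []
    for test in register:
        never_run = test.last_tested is None
        if never_run or today - test.last_tested > timedelta(days=test.interval_days):
            overdue.append(test)
    return overdue


# Hypothetical entries for illustration.
REGISTER = [
    SystemTest("Backup internet link", "Primary link unplugged, failover takes over", 90, date(2024, 1, 10)),
    SystemTest("VPN", "All remote users connected at once", 180, None),
    SystemTest("File server backup", "Restore a sample folder", 30, date(2024, 3, 1)),
]

for test in overdue_tests(REGISTER, today=date(2024, 3, 15)):
    print(f"OVERDUE: {test.system} ({test.scenario})")
```

Reviewing a list like this on a schedule turns “test, test, test” into a routine rather than a good intention.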
Resilient IT Attribute 5: Proven Troubleshooting Protocols and Processes
Effective troubleshooting starts with a systematic approach to problem-solving. I’m not a big fan of “IT cowboys,” and my bet is you are not either. The IT cowboy is the tech geek who sees himself as a “hero” and believes he is smart enough to jump into any situation, solve any problem, forego all documentation, and never call for backup. You don’t get much more UNsystematic than that. Applying a systematic approach means starting with user impact (see Principle 1), then the systems affected, and then how those systems interact with the rest of the network (see Principle 2). Normally, troubleshooting requires a rifle, not a shotgun. The Resilient IT troubleshooter intuitively accepts Occam’s Razor: the simplest answer is usually the correct one.
In a common scenario, a user working remotely cannot access his corporate systems from his home wifi. He assumes his personal computer is fine because he has accessed the network fine from other non-corporate wifi connections. With “IT cowboy” tech support, techs will randomly experiment with their best “ideas” to solve the problem. Predictably, the cowboy will lob some idea like the user’s Internet setup is the culprit and needs to be changed. The change is often wrong, breaks more things, and turns a small problem into a big one. Often, the solution to this common scenario simply requires a small change to a setting that could have been found quickly through documented troubleshooting protocols.
How you can build resilient IT with protocols and processes
If your firm’s troubleshooting seems a little excessive and your IT tech starts with something drastic (e.g., replacing all of your equipment), it may be time for you to push back. While you may not understand some of the technical details during the troubleshooting process, the overall troubleshooting process should make sense and feel good to you as a business person. If it doesn’t, stop your IT cowboy and make him think through his troubleshooting steps, starting with the simple solutions.
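The “simple solutions first” rule can be sketched as an ordered checklist. This is a minimal illustration under stated assumptions: the step names, disruption ratings, and check functions below are hypothetical placeholders, and in practice each check would be a documented procedure carried out by a technician.

```python
def troubleshoot(steps):
    """Try steps from least to most disruptive; stop at the first one that works.

    Each step is a tuple: (name, disruption_rating, check), where check()
    returns True if the step resolved the problem.
    """
    tried = []
    for name, _rating, check in sorted(steps, key=lambda step: step[1]):
        tried.append(name)
        if check():
            return name, tried
    return None, tried


# Hypothetical steps for the remote-access scenario; lambdas stand in for real checks.
STEPS = [
    ("Replace the home router", 5, lambda: True),
    ("Confirm the home wifi connection", 1, lambda: False),
    ("Correct the VPN client setting", 2, lambda: True),
]

fixed_by, tried = troubleshoot(STEPS)
print(f"Fixed by: {fixed_by}; steps tried: {tried}")
```

Because the drastic step is rated most disruptive, it is never reached when a small change solves the problem.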
Conclusion
Resilient IT is focused on the continuous strengthening of IT health with knowledge, discipline, and process to reduce the frequency, duration, severity, and cost of an inevitable fiasco. I trust that you now understand how important a Resilient IT approach is to the success of your business.
Fiascoes may throw a wrench into your IT and business operations. On the positive side, they also afford you an opportunity to think about your current approach to business operations and how your IT is enabling or hindering your performance.
I encourage you to start taking advantage of the five principles. Apply them to the technology you already have in place. Implement them to build stronger IT and greater efficiencies. It can be hard to get your employees to adopt something new. Fiascoes, like the pandemic, create momentum for change. Use that momentum to implement new IT systems or processes that can make your enterprise more resilient to the constant redefinition of “normal.”
The right strategic direction is Resilient IT and the goal is keeping your people productive, the enterprise running, and your data safe no matter what fiasco comes your way—be it a natural disaster, economic meltdown, IT vendor collapse, supply chain breakdown, or zombie apocalypse.
About the Author
John Ahlberg
CEO, Waident
CIO in the corporate world and now for Waident clients. John injects order and technology into business process to keep employees productive, enterprises running, and data safe.