The Science of Troubleshooting Supremacy
In my years of customer support experience, I have efficiently resolved extremely complex, production-down customer issues. Yet in these highly technical environments, with industrial-strength data center demands, it is my troubleshooting skill set and not my technical knowledge that leads to success.
“Half of root cause analysis is being able to find the word Error or Warning” is a rule of thumb that I employ daily. I execute as a solid generalist. I admit that there are realms of deep technology that are not part of my arsenal. Yet I have been able to execute at a very high level, resolving the most critical customer cases. How is that possible?
The scientific method and years of support experience are my guides. As a hiring manager, I hired individuals who were strong troubleshooters rather than experts in a specific technology. The world of tech moves at the speed of light. What was true just yesterday may be very old news today. Did you get Microsoft Certified? Great, but now Microsoft is struggling not just in the consumer space but also in the enterprise. Give me great troubleshooting and someone who can learn, and that is the talent I need to deliver world-class support. Everywhere in the modern economy we see that highly specialized knowledge is quickly replaced, and that workers who are unable to learn the new, or to apply their skills to a newer reality, fall behind.
From DOS to Windows to Web to SOA to SOAP to ReST to VMware to Solid State Storage, I have been able to apply my ability to learn rapidly and apply my generalist skill set.
What are those skills?
I started learning the scientific method when I was very young, and fell in love with the single most powerful problem solving system in human history. What does a technical support agent do? They isolate variables. They record observations. They measure and instrument. They ask to make single changes to a customer environment and observe the results. That’s science.
As in any strong scientific experiment, it is best practice to get the lay of the land before making any changes to the environment. For a customer support representative, that means acquiring data about the customer environment.
One of the best analogies I have heard for this is to think of the process as a visit to the doctor. At the doctor's office, they first weigh you, check your blood pressure, and ask you a series of high-level diagnostic questions about things such as current medications and current symptoms.
Customers can make massive changes in their environment that have a great impact on their experience with your product, and the only way to know is to be practiced in asking the right questions. You can build a checkup like this into your practice at a frequency that is appropriate to your customer base. Perhaps it is daily, perhaps it is weekly. But whatever the frequency, it must be done, and done with care and attention to detail.
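To make the checkup idea concrete, here is a minimal Python sketch that compares two environment snapshots and reports what changed between them. The snapshot keys and values are invented for illustration; in practice they would be whatever your own diagnostic questions collect.

```python
def diff_snapshots(previous, current):
    """Return {setting: (old, new)} for every setting added, removed, or changed."""
    changes = {}
    for key in sorted(set(previous) | set(current)):
        old = previous.get(key)  # None means the setting was not recorded before
        new = current.get(key)   # None means the setting is no longer present
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots taken at two consecutive checkups.
last_checkup = {"product_version": "4.1", "os": "Windows Server 2008", "storage": "SAN"}
this_checkup = {
    "product_version": "4.1",
    "os": "Windows Server 2012",
    "storage": "SAN",
    "hypervisor": "VMware",
}

changes = diff_snapshots(last_checkup, this_checkup)
for setting, (old, new) in changes.items():
    print(f"{setting}: {old!r} -> {new!r}")
```

Anything this comparison reports is a change the customer may not have thought to mention, and a natural place to start asking questions.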
The best scientist and troubleshooter is a lazy one. They work very hard at simplifying. They throw out what does not matter so as to focus on that which does matter. After gathering environmental detail, there are usually many root cause suspects that can be addressed immediately. One simple example would be defects in an older product. The customer is using an older version of the product. A newer version has been improved and no longer suffers from the particular problem the customer is experiencing. This is rapidly identified by noting the details from the doctor visit and comparing those to knowledge of the improvements in the current product set. The more variables that can be eliminated, the more efficient root cause analysis can be, because the focus stays on the variables that matter.
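The older-version check is easy to sketch in Python. The defect list below, with its IDs, symptoms, and version numbers, is entirely hypothetical and stands in for whatever release notes or defect tracker your product has. Versions are assumed to be dotted numbers such as "4.1".

```python
def parse_version(text):
    """Turn '4.10' into (4, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical known defects and the release that fixed each one.
KNOWN_DEFECTS = [
    {"id": "DEF-101", "symptom": "slow login", "fixed_in": "4.2"},
    {"id": "DEF-205", "symptom": "report export fails", "fixed_in": "3.9"},
]


def suspects_for(customer_version, symptom):
    """Return IDs of known defects that match the symptom and are not yet fixed
    in the version the customer is running."""
    installed = parse_version(customer_version)
    return [
        defect["id"]
        for defect in KNOWN_DEFECTS
        if defect["symptom"] == symptom
        and installed < parse_version(defect["fixed_in"])
    ]


print(suspects_for("4.1", "slow login"))  # customer is behind the fix
print(suspects_for("4.2", "slow login"))  # customer already has the fix
```

An empty result removes a whole class of suspects in one step, which is exactly the kind of laziness that pays off.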
Change one thing and only one thing
Chaos is anathema to a customer's experience of your product. Don’t ask the customer to introduce more chaos. If more than one adjustment is made during the experimentation phase of root cause analysis, that serves only to confuse the results. One variable at a time. Not only is it best practice to change one variable at a time, it is also valuable to allow a sensible amount of time to pass before judging the impact of the change.
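The discipline described above can be sketched as a small Python experiment log that refuses a second change while one is still being evaluated, and refuses a verdict before a waiting period has passed. The class name, the waiting period, and the sample changes are all assumptions made for illustration. Timestamps are passed in as plain seconds so the sketch stays self-contained.

```python
class ExperimentLog:
    """Tracks one change at a time and the observation that followed it."""

    def __init__(self, soak_seconds):
        self.soak_seconds = soak_seconds  # minimum wait before judging a change
        self.pending = None               # (description, applied_at) or None
        self.entries = []                 # completed (description, observation) pairs

    def apply_change(self, description, now):
        if self.pending is not None:
            raise RuntimeError(
                f"Still evaluating {self.pending[0]!r}; record its result first."
            )
        self.pending = (description, now)

    def record_result(self, observation, now):
        if self.pending is None:
            raise RuntimeError("No change is pending.")
        description, applied_at = self.pending
        if now - applied_at < self.soak_seconds:
            raise RuntimeError("Too early to judge the impact of this change.")
        self.entries.append((description, observation))
        self.pending = None


# Hypothetical session: wait at least one hour after each change.
log = ExperimentLog(soak_seconds=3600)
log.apply_change("increase connection pool size", now=0)
log.record_result("timeouts unchanged", now=4000)
log.apply_change("disable antivirus scan on data directory", now=4100)
log.record_result("timeouts stopped", now=8000)
print(log.entries)
```

Because every observation is tied to exactly one change, the log reads as a clean series of experiments rather than a pile of simultaneous adjustments.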
A serious offense in root cause analysis is to ignore what is already known. It is imperative that the customer support engineer consult all available knowledge material that applies to the problem at hand. You would not choose to parachute out of an airplane without consulting the parachuting 101 guide. So don’t jump into a customer issue without having your tools at your side and ready to consult. Your brain can and will fail. Your knowledge is limited. The smartest engineers are very lazy. They create value by leveraging the value already available that was created by the hard work of others.
Steal others’ work, be a lazy scientist