Friday, January 22, 2010

The art of troubleshooting

Since I joined TAC, I've been the top case solver for 16 consecutive quarters, no matter which technology group I worked in. I'd like to share some tips on troubleshooting.

Understand the user's expectation.

Remember the old joke about the user who called IT support and said the "cup holder" on his computer had stopped working? It turned out to be the CD drive. He insisted he'd been using it that way for years.

Understanding the user's expectation helps you determine whether the case calls for customer education or for troubleshooting.

Keep it simple

Which is easier: troubleshooting a light switch or troubleshooting a space shuttle?

Multiple-system integration adds complexity to the problem. You should try to simplify it as much as possible.

For example: when a PSTN call comes in, it hits the Unity Auto Attendant. The caller presses 2 to transfer to the sales queue, which is a CTI route point handled by UCCX. If no one answers the call, it should go to a voicemail box dedicated to the sales department. Instead of reaching voicemail, the caller just hears "transferring..." repeated over and over.

In this case, we have too many elements in the picture: Unity, UCCX, CUCM, the voice gateway, and the service provider. Instead of troubleshooting from end to end, we should troubleshoot segment by segment:

1) What if we call the sales agent directly? If he doesn't answer, does the call go to voicemail? (gets Unity Auto Attendant and UCCX out of the picture)
2) What if we bypass the Unity Auto Attendant and call the UCCX route point directly? Does it work properly? (gets Unity Auto Attendant out of the picture)
3) What if we make a test call from an internal phone? Is the problem the same? (gets the PSTN and voice gateway out of the picture)
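
The segment-by-segment idea above can be sketched in a few lines of Python. This is only an illustration, not a Cisco tool: the function name narrow_suspects, the component labels, and the pass/fail results are all made up for the example. The rule it applies is a rough heuristic: a component is a suspect if it appears in every failing test call and in no passing one.

```python
def narrow_suspects(tests):
    """Return components present in every failing test and in no passing test.

    `tests` is a list of (components, failed) pairs. This is a hypothetical
    helper for illustration; the heuristic is rough, because a passing test
    only clears a component for the path that test exercised.
    """
    failing = [set(components) for components, failed in tests if failed]
    passing = [set(components) for components, failed in tests if not failed]
    if not failing:
        return set()
    suspects = set.intersection(*failing)
    for components in passing:
        suspects -= components
    return suspects


# Made-up results for the sales queue example (assumed, not real data).
tests = [
    # Original complaint: PSTN call through the whole chain fails.
    ({"service provider", "voice gateway", "CUCM",
      "Unity Auto Attendant", "UCCX", "voicemail"}, True),
    # Test 1: call the agent directly, no answer, voicemail picks up.
    ({"CUCM", "voicemail"}, False),
    # Test 2: internal call straight to the UCCX route point still fails.
    ({"CUCM", "UCCX", "voicemail"}, True),
]

print(sorted(narrow_suspects(tests)))
```

With these assumed results, the only component left is UCCX, so that is where to look first. Real cases are rarely this clean, but each extra test call shrinks the list the same way.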

Other tips to make things simple during troubleshooting:
1) Use default settings. For example, use a "vanilla Windows" (freshly installed from the Microsoft CD) instead of a "corporate customized" image.
2) Test on the LAN instead of over VPN (again, fewer elements)
3) Always assume the system is case sensitive (err on the safe side)

Find a reference point

If a piece of software doesn't work for one user but works for another, use the working one as a reference point and find the difference.

Of course, there are many differences between two users, such as their wives and kids. :) But we should look at the most relevant ones.

Most software nowadays follows the "client-server" model, so the most relevant differences are the account and the computer. Swap the computers (or swap the accounts) and see whether the problem follows the computer or the account. If it follows the computer, it is probably a network or computer settings issue (client side). If it follows the account, it is probably a configuration issue (most likely server side).
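
The swap test boils down to a small decision table. Here is a minimal Python sketch of it; the function name interpret_swap and the wording of the returned hints are mine. The two clear-cut cases come straight from the rule above, while the handling of the two "inconclusive" cases is my own assumption about what to do when the result isn't clean.

```python
def interpret_swap(follows_computer, follows_account):
    """Map the outcome of a swap test to a place to start looking.

    Hypothetical helper for illustration only. The two inconclusive
    branches are an assumption, not part of the original rule.
    """
    if follows_computer and not follows_account:
        return "client side: check network or computer settings"
    if follows_account and not follows_computer:
        return "most likely server side: check the account's configuration"
    if follows_computer and follows_account:
        return "inconclusive: there may be more than one problem, retest each swap"
    return "inconclusive: the problem may be intermittent or tied to something else"


# The good user logs in on the bad user's computer and hits the same problem.
print(interpret_swap(follows_computer=True, follows_account=False))
```

The point is not the code itself but the habit: change one thing at a time, and write down which way the problem moved.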

Understand the positive and negative results of a test

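For example:

"Dad, I couldn't find any Easter eggs in the backyard!" Does that mean there are no eggs there?

"Dad, I found some Easter eggs in the backyard!" That means there are some eggs there.

A positive result proves something. A negative result only tells you that this particular search didn't find it.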
