Suez Canal and bugs in production
In the software development world, nothing is scarier than bugs on a production server. They are scary because we know they are there even though we can’t see them yet, and we know they will show up when we least expect them.
Sometimes we can’t even reproduce them in our test or development environments. In these situations every minute counts, yet not everyone can help. These bugs are a risk to businesses, both our own companies and our clients. That is exactly what happened to the Ever Given cargo ship in the Suez Canal recently: a production server bug.
Each time I saw a picture of the digger beside the quarter-mile-long Ever Given, I remembered my team and me working late to fix a bug on a Friday afternoon. This time it was not me, and no one has ever seen one of my bugs from space.
So I tried to look at this incident from a software development perspective. It’s not just about preventing such an incident, but more about what to do when it happens.
1- It doesn’t matter if your piece of software, method or service has been working for the last 150 years; there is always a bug. There is always something to improve. “It works, is not enough” - Así es la Vida.
So refactor your code, your solution, your architecture and your methodologies, and do it frequently. Frequency is the key. If you don’t, you create more bugs than you prevent. Refactoring must be part of your organization’s habits and culture.
2- Your small step matters. Do not listen to the crowd; they make jokes. Focus on what you are doing well. Focus on what you think works, and do it. If you are as small as the digger and your problem is as big as the cargo ship, you need to focus on the issue, not on the report your manager asked for, and not on the people who are trying to find someone to blame.
3- Fix the bug, but do not forget to maintain the fix. Most of the time, we fix the production bug and then leave the fix untouched, sometimes just because it is scary to change. But any fix made under that kind of pressure needs to be refactored later, sometimes by other developers.
4- “Fix it in an hour or never”: there is no such thing. We hear this a lot: “We need this in an hour or never. We need to fix it by the end of this week, or else it doesn’t matter anymore.” This is the most useless, unproductive instruction you can ever follow. When there is a bug in your production system, you do your best to fix it. If you can’t fix it in the time you were given, you keep working on it. You don’t stop, simply because you cannot. You keep working until you resolve the issue. If it were otherwise, that ship would still be there, in the very same spot. So ignore the buzz. You have all the time in the world as long as you do not give up.
5- Never think less of your bug. As typical software developers, we prioritize features by their complexity, but users prioritize them by their needs, and the two lists do not always align. So it doesn’t matter whether the bug is simple or complex. Try your best. Be committed to your users, even when you don’t have many of them. In less than 150 years, one afternoon there will be 369 ships stuck in a tailback, waiting because you thought it was not a big deal. The moment you say “no one does this”, you create a production server bug.