Question:
Have you ever broken your production code?
2014-06-17 16:09:42 UTC
We froze development on version 2 of our system and started from scratch with a new system.

Our development environment, staging environment, etc., were eventually taken over by the new system. Clients have been using the old system for a long time, and we just weren't going to make any changes to it, so we didn't think we needed these other environments.

Well, today we needed to make a change, and so I had to do it to live code. I made a change and deployed. Everything broke; something was wrong with the deploy process. After struggling to fix it, I got help and we figured it out.

After that I was so stressed that I quickly tested my change and moved on. Later on I got a message about a feature not working, and I immediately knew that I had broken it. It took me about 10 minutes to fix, but the feature was down for a few hours because I hadn't tested it very well: I had spent 15 minutes freaking out about taking the entire system down and wasn't in my right mind.

Have any of you ever done anything like this?
Four answers:
Ratchetr
2014-06-17 17:28:56 UTC
I've certainly introduced bugs into production code, but I can't say I've ever totally broken the production system. (I'm pretty sure I would remember that ;-)



I totally hosed our development system once. Copy-pasted a DELETE FROM statement, went through and changed all the table names I wanted to delete from, then went to lunch. Came back and totally forgot I needed to add a WHERE clause to each of those. Just ran it... But it was only a dev server. (It was a fun fire drill, though.)
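For what it's worth, the missing-WHERE mistake is something you can guard against mechanically. Here is a minimal sketch in Python using the standard sqlite3 module. The helper names (guarded_delete, make_demo_db), the users table, and the max_rows threshold are all made up for illustration; a real system would need a check suited to its own database.

```python
import sqlite3


def guarded_delete(conn, sql, params=(), max_rows=10):
    """Run a DELETE, but refuse or roll back when it looks like a mistake.

    Hypothetical helper: the WHERE check is a crude text match and
    max_rows is an arbitrary safety threshold chosen for this sketch.
    """
    if "where" not in sql.lower():
        raise ValueError("refusing to run DELETE without a WHERE clause")
    # sqlite3 opens a transaction implicitly before a DELETE,
    # so nothing is permanent until commit() is called.
    cur = conn.execute(sql, params)
    if cur.rowcount > max_rows:
        conn.rollback()
        raise RuntimeError(
            f"DELETE touched {cur.rowcount} rows, expected at most "
            f"{max_rows}; rolled back"
        )
    conn.commit()
    return cur.rowcount


def make_demo_db():
    """Build a throwaway in-memory database with five rows (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(i, f"user{i}") for i in range(1, 6)],
    )
    conn.commit()
    return conn
```

With something like this, a bare "DELETE FROM users" is rejected before it runs, and a DELETE that matches far more rows than expected is rolled back instead of committed.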



I think 4 things went wrong for you:



1) Someone decided you didn't need a dev or staging system.

2) "needed to make a change, and so I had to do it to live code"

3) "I quickly tested my change and moved on"

4) You were allowed to do #2 and #3, because of #1.



If you skip the dev and staging systems, and go directly to production, then, yeah...you're gonna break things there. It's not like you can write perfect code (see your question for proof, but don't worry, none of us can).



You use your dev system to do your quick test and move on (although as you get smarter, you might learn that your quick tests are costing you time down the road when you have to revisit code that is no longer fresh in your mind).



You use your staging system to verify you can deploy your changes without breaking things, and to test the new features/bug fixes. Somebody *else* tests your code on the staging system (because you suck at testing, but don't worry, most programmers do).



Then everyone involved gets together and reviews the results. Did the deploy work? Did the code work? What issues arose? Then you talk about what the rollback plan is. Did you have a rollback plan for today's change? I'm thinking not, or you might have used it. If you don't have a rollback plan, then you really don't have a deployment plan.
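To make the rollback point concrete, here is a rough sketch in Python of a deploy step that remembers the previous release and switches back if a health check fails. Everything here is an assumption for illustration: the "pointer file" that names the active release, the activate and deploy helpers, and the health_check callback are hypothetical, not how your system actually deploys.

```python
import os
from pathlib import Path


def activate(pointer: Path, release: str) -> None:
    """Point the system at a release by rewriting a small pointer file."""
    tmp = pointer.with_suffix(".tmp")
    tmp.write_text(release)
    # os.replace swaps the file in one step on the same filesystem,
    # so readers never see a half-written pointer.
    os.replace(tmp, pointer)


def deploy(pointer: Path, new_release: str, health_check) -> bool:
    """Activate new_release; roll back to the previous one if the check fails.

    health_check is a hypothetical callable returning True when the
    freshly deployed system looks healthy.
    """
    previous = pointer.read_text() if pointer.exists() else None
    activate(pointer, new_release)
    if health_check():
        return True
    # Rollback plan: restore whatever was live before this deploy.
    if previous is not None:
        activate(pointer, previous)
    else:
        pointer.unlink()  # nothing was live before, so leave nothing live
    return False
```

The part that matters is that the previous release is recorded before anything changes, so the way back is decided in advance rather than improvised while the system is down.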



Use this to your advantage to argue for proper dev and staging machines. Yeah, they cost money. You need them (you just proved it). How much does it cost to have the entire system down, then to have bugs in the system once it is back up?
2014-06-17 23:00:37 UTC
I haven't totally killed a system. I have broken a feature or two. But that's what dev and version control are for.
Pipsy
2014-06-17 16:31:16 UTC
I thought this was a normal thing to do haha
?
2014-06-17 16:23:35 UTC
I've done similar things and lived to tell about it. This is how you learn. Take your lumps, learn from them and move on, brother.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.