Saturday, June 23, 2007

Been there... Done that...

I was reading Pete Finnigan's blog this morning and ran across "a rant".

A rant I very much can relate to.

One that I myself experience from time to time.

Read this - and especially the last paragraph.

Note to Pete - if this is the first time someone has ever written to you in this fashion - you are very lucky :) I love it when they use sarcasm in "thanking you" as this individual did.

Wednesday, June 20, 2007

The outage...

There were a few questions about my recent outage on asktom.oracle.com - such as:
  • aren't you clustered
  • don't you have redundant hardware
  • aren't there redundant network links
  • do you use an ISP - wouldn't you provide your own connectivity
Well, truth be told, asktom is sort of "skunk works" in nature. It isn't in an official data center - the availability is "pretty darn good". It runs on a bit of hardware that cost about the same as my laptop (just a little more, but truthfully - not a lot more).

There is no redundant hardware - unless you count the raid array. A single computer. Single network.

It is not clustered. It has no real availability requirements beyond "pretty darn available". It is done without a budget. It just takes care of itself. If it isn't available - the world doesn't stop, people still work, life goes on.

We use an ISP - at the end of the day, everyone does (except for perhaps the ISPs themselves).

Given that asktom has been running for seven years now - and this was the first major 'incident', the tag I use of "pretty darn available" fits quite well. It takes about 1% of a DBA, 1% of a System Administrator to run. It is very low maintenance, by design. APEX based - as few moving pieces as possible. Single purpose machine. Low budget.

All of those things drive me to "pretty darn available" - and it is.

So, no major changes in the works to "harden" it. The world doesn't stop when it is unavailable (truth be told - I was in class on Monday and Tuesday - not having a large backlog of 'reviews' every night afterwards was sort of nice). We might move it into a data center - whereby the availability of replacement bits and some level of network redundancy would be present - but that is a "maybe".

Pretty darn available...

Tuesday, June 19, 2007

Woo-Hoo

Asktom is alive again...

Now, for the infamous "root cause analysis" phase with the ISP...

Any problems reaching asktom - please comment here and we'll look into it. Thanks (and sorry for the outage).

Monday, June 18, 2007

Asktom...

Yes, asktom.oracle.com is painfully slow to access from many places (not all, many).

Hardware failure, networking related, problem is being looked at. Only about 10% of the packets are making it in right now.

Will be fixed as soon as possible, sorry but no estimated times available...

Thursday, June 14, 2007

Beware...

Reading this might break your monitor.

Monitors and coffee do not mix, so be careful out there.

Abject Programming - I love it.  My favorite bit:

Documentation

It’s said that code should be written for people to read, so it follows that documentation is written for no one to read. Documentation should be written for every new module and then maintained as changes are put into production, or at least the next time there’s a lull in the workload.

A good time to write documentation is when someone in the department gives two-weeks notice: use that time to make sure the departing team member documents all of their code.

Cluttering up the source code with lots of comments explaining what the code is trying to do is distracting and slows down the compiler. That’s why abject shops that follow “best practices” keep documentation in a document management system where programmers can’t accidentally delete it.

That article deserves a bookmark...