A recent ‘glitch’ in CityLink’s computer system
caused Melbourne road users serious frustration. The Burnley and Domain tunnels
were both shut down, and CityLink was unable to communicate with its incident
detection and safety systems. The suspected cause has been narrowed down to
network connectivity, with engineers working diligently since 5:30 AM to find
the root cause.
Most software is tested heavily before being released into
production, a process often known as ‘QA’ (Quality Assurance). Software developers
rely on test engineers, who try to break the software by emulating real-life
scenarios such as overloading, typical user behaviour, power user behaviour
and validation. Test engineers also account for the risk of
external factors that can cause failure: for example, a power
source interruption or a network link outage.
In many cases, software has built-in capabilities and protection
mechanisms to safeguard the data. If an ATM (Automated Teller Machine) loses
its connection to the primary mainframe, it will not alert users to the
outage or error. It will simply continue to dispense money and queue the
transactions until it reconnects to the bank’s database. Customers would
not be able to print receipts, and online banking logs would not reflect the
transactions until a later date. Banks have assessed this as an acceptable
amount of risk in exchange for not inconveniencing the customer, and that’s
great for you and me.
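
To make the queueing idea concrete, here is a minimal sketch in Python. It is
not how any real bank or ATM vendor implements this; the class
OfflineTolerantATM, its fields and the returned status strings are all
hypothetical names made up for illustration, and the `posted` list simply
stands in for the bank’s database.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Withdrawal:
    account: str
    amount: int  # whole dollars


@dataclass
class OfflineTolerantATM:
    """Hypothetical ATM that queues withdrawals while the bank link is down."""

    online: bool = True
    pending: deque = field(default_factory=deque)  # transactions queued while offline
    posted: list = field(default_factory=list)     # stand-in for the bank's database

    def withdraw(self, account: str, amount: int) -> str:
        txn = Withdrawal(account, amount)
        if self.online:
            self.posted.append(txn)       # link is up: record immediately
            return "dispensed, posted"
        self.pending.append(txn)          # link is down: dispense anyway, record later
        return "dispensed, queued"

    def reconnect(self) -> int:
        """Restore the link and replay queued transactions in their original order."""
        self.online = True
        replayed = 0
        while self.pending:
            self.posted.append(self.pending.popleft())
            replayed += 1
        return replayed
```

The customer gets their cash either way; the only difference is when the
bank’s records catch up, which is exactly the risk the bank has chosen to
accept.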
On the other hand, there are aspects of society in which
there must be a 0% chance of failure. Life support systems and air traffic
control towers are examples of sophisticated technology driven by defect-free
software design. A failure in either of these can lead to significant loss of
human life, and at the end of the day that is the ultimate price no one is
willing to pay. Organisations invest heavily in ‘defect-free software’, and
the level of engineering involved in developing such software takes approximately
double the time.
For software developers to ensure that there is a 0%
chance of failure, each state (or scenario) in the state space the software
may execute under is tested for a predictable outcome. Reducing the number of
unpredictable outcomes to zero almost eliminates any chance of failure, and by
‘almost’ I mean external factors still need to be considered (backup power,
natural disaster recovery etc.).
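
As a rough illustration of testing every state, the Python sketch below walks
through all combinations of three yes/no inputs for a made-up tunnel
controller and checks that each one gives a predictable, safe answer. The
function tunnel_action and its three inputs are assumptions invented for this
example, not a description of CityLink’s actual systems.

```python
from itertools import product


def tunnel_action(link_up: bool, sensors_ok: bool, incident: bool) -> str:
    """Hypothetical controller: decide whether a tunnel stays open."""
    if not link_up or not sensors_ok:
        return "close"  # fail safe: the tunnel cannot be monitored
    if incident:
        return "close"
    return "open"


def check_all_states() -> int:
    """Test every input combination and return how many states were checked."""
    checked = 0
    for link_up, sensors_ok, incident in product([True, False], repeat=3):
        action = tunnel_action(link_up, sensors_ok, incident)
        # Every state must produce one of the known outcomes.
        assert action in {"open", "close"}
        # The tunnel may only be open when everything is healthy.
        if action == "open":
            assert link_up and sensors_ok and not incident
        checked += 1
    return checked
```

With three inputs there are only eight states to cover. Real systems have far
more, which is a big part of why this kind of engineering takes so much
longer.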
Software developers and test engineers are often under
pressure and strict timelines to complete modules, and this leads to more
unpredictable outcomes (glitches). I strongly believe today’s gridlock and
traffic chaos in Melbourne should be a lesson for organisations to invest
more in defect-free software methodology.