A bridge too far – a view on the TSB IT migration based on the Slaughter & May report

At the end of the film A Bridge Too Far, the generals reflect on the root causes of the disaster of Operation Market Garden. Many causes are offered (‘it was Nijmegen’, ‘it was the single road getting to Nijmegen’, ‘it was after Nijmegen’, ‘it was the fog in England’). In reality, these all contributed to the magnitude of the failure, but they ignore the real root cause: a culture from the top that led to an overly ambitious operation, characterised by one of the generals as ‘a bridge too far’.

Slaughter & May likewise call out the board decision to go for a big bang solution (single event migration) as their bridge too far moment.

Unfortunately, the report doesn’t really get behind why that call was made. The terms of reference for the review include the requirement to determine the role that financial commitments may have played in the decisions taken. The report is silent on this point in its conclusions, indicating only that there was a business case for the acquisition of TSB that seemed heavily predicated on migrating to a common Sabadell platform within a predetermined timetable, one which proved inflexible to change until very late in the programme. However, the report does point to a number of cultural challenges: a reported over-reliance on previous experience, a systemic lack of challenge, and ‘the culture of presenting a confident face’ to fellow executives and non-executives.

It is quite possible that some of the actions and behaviours observed in the Slaughter & May report are not unique to TSB, Sabadell or its ‘in-house’ IT supplier, SABIS. So, the event offers a learning opportunity. Indeed, it is likely to serve as an operational resilience case study for many years to come, just as its authors intended. It is not one control that fails in these events: it is the pyramid of controls that collapses from top to bottom. This case study can therefore support the development of severe but plausible scenarios to assess one’s own resilience in similar situations.

To this end, there are naturally some operational learnings to take away from this incident as well:

  • The programme ran out of time but the main migration event went ahead. The new platform was not ready to support TSB’s full customer base and SABIS was not ready to operate the new platform. The report highlights the pitfalls of ‘right-to-left planning’, which was used for both the original plan and the re-plan without left-to-right validation. Shortcuts were therefore taken, and decisions made to get the platform live and then fix forward.
  • The importance of non-functional requirements (NFRs) was not as well understood at senior levels as that of functional requirements, which can be explained quite readily in customer terms. The report calls out some specific NFR failings: the test environment was not like the production environment, and capacity management thresholds for channels were changed to pass the test.
  • Supplier management was lacking. SABIS, the in-house IT supplier, was not subject to the due diligence and governance applied to an external supplier, specifically the assessment of its capacity and capability. Independent control testing was available but not reviewed. Monitoring of the quality of the early cutovers would have served as an early warning indicator. SABIS pursued a traditional supplier-customer relationship with its own supply chain rather than a more collaborative, shared-outcome approach.
  • Pre-planned contingency arrangements proved inadequate. TSB rightly anticipated increased customer enquiries following the migration and increased BAU resources to this end. However, the magnitude of the disruption had not been anticipated, even though the 1st line had essentially identified a scenario where they would struggle with multiple major incidents and multiple emergency changes happening at the same time in the period immediately post go-live. During the incident itself, key employees transferred within Sabadell to SABIS and TSB received additional external support, including from IBM, while the rest of the group continued to operate.
  • Risk management lacked cohesion and coherence. The report’s authors noted that the 2nd and 3rd lines did not co-ordinate to cover the breadth of risks and depth of assurance required. The ‘15 Guiding Principles’ model was conceptually a good decision support tool; in practice, however, it was not independently assured. In addition, the approach of generating packs of ca. 600 pages for board members shortly ahead of meetings did not lend itself to thorough review and created a dependency on executive narrative.

The post-migration comments by executives in the TSB annual report highlight the tensions inherent in the operational resilience challenge specifically, and in risk management more broadly. While deep regret is expressed at the harm caused to their customers, executives note that the new platform now gives the business a competitive footing for the future and that customers will benefit from an improved product offering and experience. While the executives are perhaps not seeking vindication, this does remind us that the lower-risk option of remaining in part or in whole on the older Lloyds platform would have restricted achievement of TSB’s business objectives. Over time this incident may well get rationalised as ‘right strategy, poor execution’ with an associated £300M unplanned cost. After all, in A Bridge Too Far, Operation Market Garden was seen as ‘90% successful’ by the field marshal.
