Tuesday 17th August 2021

Mobile service [SEWAN] Unexpected disturbances - 17.08.2021

Beginning: 17.08.2021 – 9.02 AM

Ending: 17.08.2021 – 2.47 PM

Impact: Degradation of mobile service


  UPDATE ***** 24.08.2021 – 5.00 PM

Please find below the full incident report received by Proximus.

 

Executive summary

  • On Tuesday 17/08/2021 at 08h55, Sewan MVNO's end users became unable to make calls from Belgium to Belgium or while roaming abroad.

Customer impact

  • From 08h55 until 14h45, end users were not able to make outgoing calls from Belgium to Belgium or outgoing calls while roaming

Usage type not impacted by the issue:

  • Outgoing calls from Belgium to international destination
  • Outgoing calls to the voice mail
  • Call forward to the Voice mail
  • Incoming calls
  • SMS traffic
  • DATA traffic

Details of affected customers

  • In total, 3684 call attempts failed

Detailed Timing

  • 08h55: First unsuccessful call detected on the platform
  • 09h00: Email sent by Sewan to our Operational support -> quick reaction by the partner
  • 10h09: Email escalated to the SUPPLIER support team and Proximus Mobile Engineering SPOC
  • 10h17: Email answered by SUPPLIER support team -> quick reaction
  • 11h45: Service usage monitoring for Voice disabled for all contacts, but this did not solve the issue
  • 13h30: Fall-back plan discussed with the technical team to unblock the situation and provide a plan B allowing calls from Sewan end users while waiting for the final fix on the SUPPLIER platform
  • 13h38: Service usage monitoring Voice was completely removed, and all rules still present in the table were cleaned
  • 14h18: Fall-back plan analyzed by the technical SPOC in view of possible implementation. The difficult part was applying the fall-back only to the Sewan sub IMSI range.
  • 14h45: Unused regions were fully cleaned on SUPPLIER side and alarms stopped
  • 14h54: SUPPLIER support confirmed that the changes had been applied and that the issue should be solved -> last alarm raised at 14h49; from 14h50 onwards, the impact was cleared
  • 14h58: Fall-back plan ready for implementation by the Ops team (but in the meantime the issue had been solved on the SUPPLIER platform, so implementation was stopped)

Root cause

  • The root cause was the deletion of a region that was still in use by a specific Service Usage Monitoring (SUM) plan. With Service Usage Monitoring enabled, and full authorization for the session in question, the SUPPLIER platform performs the SUM check to verify whether the subscriber requesting services needs to be monitored.
  • This SUM check happens before the connection with the other party is established.
  • The service usage monitoring rule with region 16 still existed in the table.
  • Because the region itself was missing, the authorization of the session could not be completed and the session was therefore rejected.
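
The failure mode described above can be illustrated with a minimal Python sketch. It is not the SUPPLIER platform's code: the names SumRule and authorize_session, the region labels, and the rule layout are all hypothetical, chosen only to show how a rule pointing at a deleted region causes every session to be rejected before the call is set up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumRule:
    """Hypothetical monitoring rule that refers to a region by its id."""
    rule_id: int
    region_id: int


def authorize_session(rules, regions):
    """Return True only if every rule resolves to an existing region.

    Assumption: the check runs before the call is connected, so a
    dangling region reference rejects the whole session.
    """
    for rule in rules:
        if rule.region_id not in regions:
            return False  # authorization cannot complete -> session rejected
    return True


# Illustrative data: region 16 is referenced by a rule that is still present.
regions = {15: "hypothetical region A", 16: "hypothetical region B"}
rules = [SumRule(rule_id=1, region_id=16)]

before_delete = authorize_session(rules, regions)  # region exists
del regions[16]                                    # region removed, rule left behind
after_delete = authorize_session(rules, regions)   # dangling reference
```

In this sketch the session is authorized before the deletion and rejected afterwards, even though the rule itself was never changed.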

Future actions

  • Improve ticket creation and prioritization. In case of a service interruption impacting all customers, an L1 ticket should be created. The Light MVNO can call the ICT desk after creating the ticket to have it prioritized as L1, high priority. This process improvement will be taken up by a process manager.
  • When a big update is needed on the setup/config, a ChangeRequest will be created on SUPPLIER side to analyze the changes and their possible impact on the platform => small impact on implementation time.
  • At SUPPLIER, a change request has been raised to modify the logic so that the deletion of a region is blocked if the region is in use in any configuration (database modification: Foreign Key Constraint).
  • A complete fall-back plan description will be created so that everybody knows how to implement it very quickly (Sub IMSI only).
  • The feedback loop towards the partner has been inefficient or even absent. A process will be set up with one main escalation/operational manager to handle the communication towards external partners. This will allow the operational team to focus on resolution while providing live status to the operational manager for communication purposes.
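
The foreign key constraint mentioned in the future actions can be sketched as follows, using Python's built-in sqlite3 module with an in-memory database. The table and column names (region, sum_rule, region_id) and the helper try_delete_region are assumptions made for illustration; the actual SUPPLIER schema and database engine are not described in the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each monitoring rule must point at an existing region.
conn.execute("CREATE TABLE region (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE sum_rule ("
    " id INTEGER PRIMARY KEY,"
    " region_id INTEGER NOT NULL"
    " REFERENCES region(id) ON DELETE RESTRICT)"
)
conn.execute("INSERT INTO region (id) VALUES (16)")
conn.execute("INSERT INTO sum_rule (id, region_id) VALUES (1, 16)")
conn.commit()


def try_delete_region(region_id):
    """Attempt to delete a region; return False if the database refuses."""
    try:
        conn.execute("DELETE FROM region WHERE id = ?", (region_id,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


# The region is still referenced by rule 1, so the delete is refused.
delete_succeeded = try_delete_region(16)
region_count = conn.execute(
    "SELECT COUNT(*) FROM region WHERE id = 16"
).fetchone()[0]
```

With such a constraint in place, the region can only be removed after every rule that uses it has been removed or repointed, which would prevent the dangling reference described in the root cause.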

 

  UPDATE ***** 17.08.2021 – 3.30 PM

Please note that the mobile service has been successfully restored by Proximus at 2.47PM today.

The situation is still under investigation.


Dear customer, dear partner,

 

Since 9.02 AM today, we have been experiencing a partial outage on our mobile service.

As a result, outbound calls to Belgian numbers are not working. 3G, 4G, 5G, SMS/MMS, inbound calls and international calls are working fine.

After investigation, we can confirm that Proximus is responsible for this outage and is putting all necessary efforts into solving the issue. At this moment, our whole team is mobilized to support them towards a fast resolution.

 

No ETA has been communicated at this stage.

Our apologies for any inconvenience this may have caused.