Resolved
Cognite engineering has monitored the system since several improvements were deployed between October 10th and October 13th. The service is now performing as expected, and customers should no longer experience processing halts. This incident is considered resolved.
Monitoring
The engineering team has deployed changes that have improved the situation substantially. The incident remains open because the severity of the issue is tightly coupled to the load on the system. The team has been monitoring performance over the last few days and has decided not to make further changes until the system's behavior has been observed under high load again.
Identified
Cognite engineering has identified the failure patterns leading to the degraded processing capacity of the engineering diagram contextualization service. The team is monitoring the system and applying workarounds when the bottleneck appears and work builds up. The team will continue monitoring the state of the service over the weekend and has capacity dedicated to working on a permanent fix. Customers may observe degraded processing speed but should submit support tickets if they experience a full stop or performance levels that cause production problems.
Investigating
Cognite engineering is working to resolve an incident in which throughput in the engineering diagram contextualization service is significantly reduced. The impact is currently seen only on the USA-E1 cluster.