National telecom operators face a recurring challenge: the data needed to launch AI projects is fragmented, inconsistent, and incomplete, and there are no unified reference directories. AI models trained on such data perform poorly, which makes it impossible to deploy intelligent systems that improve customer service efficiency or optimize the network.
Reason: Architectural Chaos and Lack of Data Governance
This problem arises from an OSS/BSS ecosystem that has grown over time and that, for large telecom operators, can consist of 15–25 systems of different generations. Each system (CRM, billing, network management, customer support) was built to solve its own narrow task, often without regard for a unified customer profile or shared directories. As a result, the same customer may have multiple records with different addresses, contact details, or even names, while data about services and tariffs is spread across separate billing systems. The Customer 360 concept (a unified, comprehensive view of the customer) then does not work, and the time-to-market for new tariff plans is limited by manual data reconciliation between legacy systems.
Solution Options: From MDM to Data Mesh
Solving this problem requires a systematic approach to data management. One option is MDM (Master Data Management) in the registry style, where the MDM hub (the central repository of master data) only matches and indexes records from the various sources, while the data itself remains in the source systems. This approach is relatively quick to implement but does not guarantee integrity during updates, because the sources can continue to create duplicates. A more reliable but more complex option is MDM in the coexistence or centralized style, which maintains a central master data model and exposes an API to synchronize changes back to the sources. This gives a higher level of consistency but requires significant integration effort and process restructuring. For large telecom operators with decentralized teams and highly autonomous business domains, Data Mesh is a promising approach. It treats data as a product: each domain is responsible for the quality and availability of its own data, which avoids a single point of failure and lets data management scale.
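As a minimal sketch of the first option, the hub below only matches records and keeps pointers to them, while the customer data stays in the source systems. The match rule (same normalized phone and email), the field names, and the sample records are illustrative assumptions, not a production matching algorithm.

```python
import re
from collections import defaultdict


def normalize_phone(phone: str) -> str:
    """Keep digits only, so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", phone)


def match_key(record: dict) -> tuple:
    """Hypothetical match rule: same normalized phone and same lowercased email."""
    return (normalize_phone(record["phone"]), record["email"].strip().lower())


def build_registry(records: list) -> dict:
    """Group source records by match key; store only (system, local_id) pointers."""
    registry = defaultdict(list)
    for rec in records:
        registry[match_key(rec)].append((rec["system"], rec["local_id"]))
    return dict(registry)


# Hypothetical records for the same customer held in different systems.
records = [
    {"system": "crm", "local_id": "C-1001", "name": "Olena Kovalenko",
     "phone": "+380 (44) 123-45-67", "email": "Olena.K@example.com"},
    {"system": "billing", "local_id": "B-77", "name": "O. Kovalenko",
     "phone": "380441234567", "email": "olena.k@example.com "},
    {"system": "support", "local_id": "S-5", "name": "Ivan Petrenko",
     "phone": "380501112233", "email": "ivan@example.com"},
]

registry = build_registry(records)
for key, pointers in registry.items():
    print(key, pointers)
```

The CRM and billing records collapse into one registry entry even though the names differ. Nothing stops a source system from creating another duplicate tomorrow, which is exactly the weakness of this option.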
A Typical Mistake
Often, companies, especially in the telecom sector, rush into launching AI projects, focusing exclusively on technologies and the accuracy of AI models. They spend significant resources developing complex algorithms without defining clear business metrics of success at the start. Consequently, a project might show high model accuracy but fail to deliver real business value, because it is unclear how this accuracy translates into reduced operational costs, increased customer satisfaction, or profit growth.
The right path is to define business metrics first: for example, the percentage of time saved by contact center operators, the percentage reduction in request processing errors, or a specific ROI from the AI implementation. Model accuracy is an important engineering metric, but it is secondary. Business value must come first; only then are investments in AI justified.
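The arithmetic behind such metrics is simple enough to fix before any model is trained. The sketch below uses the standard definitions of percentage reduction and ROI; all input figures are hypothetical and serve only to show the calculation.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction of a metric, in percent of its baseline value."""
    return 100 * (before - after) / before


def roi_percent(annual_benefit: float, annual_cost: float) -> float:
    """Return on investment: net benefit as a percentage of cost."""
    return 100 * (annual_benefit - annual_cost) / annual_cost


# Hypothetical targets agreed with the business before the project starts.
handling_time_saved = percent_reduction(before=300, after=240)  # seconds per request
error_reduction = percent_reduction(before=50, after=40)        # errors per 1,000 requests
roi = roi_percent(annual_benefit=150_000, annual_cost=100_000)  # currency units per year

print(f"Handling time saved: {handling_time_saved:.1f}%")
print(f"Processing errors reduced: {error_reduction:.1f}%")
print(f"ROI: {roi:.1f}%")
```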
Consider a typical scenario: a national telecom operator with millions of subscribers that provides both B2C and B2B services. The AI Act (which came into force in the EU in 2024 and already affects companies working with European partners) creates new challenges for VoIP and contact centers. Its requirements for transparency, explainability, and non-discrimination of AI systems mean that an operator cannot simply deploy a voice bot for call automation without making its operation transparent and its decisions auditable. The same applies to LCR (Least Cost Routing) systems, which may use AI for traffic optimization. Without proper data management and clear data governance policies, these requirements cannot be met.
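One practical building block for auditability is an append-only log with one record per automated decision. The sketch below is an illustration only: the field names, the model name, and the routing decision are assumptions, and the AI Act does not prescribe this schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One auditable decision of a voice bot (hypothetical schema)."""
    call_id: str
    model_name: str
    model_version: str
    decision: str
    confidence: float
    top_factors: tuple      # human-readable reasons behind the decision
    disclosed_as_ai: bool   # whether the caller was told they are talking to a bot
    timestamp_utc: str


def log_decision(record: DecisionRecord, sink: list) -> str:
    """Serialize the record as one JSON line and append it to the sink."""
    line = json.dumps(asdict(record), sort_keys=True)
    sink.append(line)
    return line


audit_log = []  # stands in for append-only storage
record = DecisionRecord(
    call_id="call-0001",
    model_name="intent-classifier",  # hypothetical model name
    model_version="1.4.2",
    decision="route_to_billing_queue",
    confidence=0.91,
    top_factors=("keyword: invoice", "open billing ticket"),
    disclosed_as_ai=True,
    timestamp_utc=datetime.now(timezone.utc).isoformat(),
)
print(log_decision(record, audit_log))
```

Recording the model version next to each decision is what later allows an auditor to tie a disputed outcome to a specific model release.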
Technologies for AI in Communications
For the successful integration of AI into telecom infrastructure and compliance with the AI Act, using proven technological solutions is crucial:
- Apache Kafka: a distributed event streaming platform used for the asynchronous delivery of data change events between OSS/BSS systems, for example between CRM and billing (topics such as customer.created and account.updated). Without Kafka, synchronization relies on point-to-point calls, which with 15+ systems turn into a web of dependencies. A change in one system can then trigger cascading failures in several others, and the integration time for a new system grows from weeks to months.
- API Gateway: a single entry point for all API calls that provides centralized access management, routing, and monitoring. In our scenario, the API Gateway unifies access to customer data from the various OSS/BSS systems and gives AI services a single interface. Without it, AI model developers would have to integrate with dozens of different APIs, which significantly complicates development, support, and security.
- Kubernetes: an open-source system for automating the deployment, scaling, and management of containerized applications. Kubernetes scales integration microservices and AI models during peak loads, such as mass mailing days. Without it, peak hours would result in either API timeouts or excessive infrastructure provisioning “just in case,” which is expensive and inefficient.
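To make the Kafka item concrete, the sketch below builds a data change event for the customer.created topic using only the Python standard library. The envelope fields, the make_event helper, and the customer ID are assumptions for illustration; an actual Kafka client would take the resulting topic, key, and value and send them.

```python
import json
import uuid
from datetime import datetime, timezone


def make_event(topic: str, customer_id: str, payload: dict, source: str):
    """Build (topic, key, value) for a data change event (hypothetical envelope)."""
    envelope = {
        "event_id": str(uuid.uuid4()),  # lets consumers detect redelivered events
        "event_type": topic,
        "source": source,               # system that produced the change
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
    # Using the customer ID as the message key keeps all events for one
    # customer in the same partition, so consumers see them in order.
    key = customer_id.encode("utf-8")
    value = json.dumps(envelope).encode("utf-8")
    return topic, key, value


topic, key, value = make_event(
    topic="customer.created",
    customer_id="C-1001",
    payload={"name": "Olena Kovalenko", "segment": "B2C"},
    source="crm",
)
print(topic, key, value.decode("utf-8"))
```

Billing, support, and AI services can each subscribe to the same topic independently, so adding a new consumer does not require changing the CRM.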
In practice, the DooxSwitch Platform (a VoIP platform for telecom operators by DooxSwitch) can be used to manage VoIP traffic and integrate AI models for voice data analysis. It includes a softswitch, multi-tenant PBX, SIP routing, billing, and WebRTC. This makes it possible, for example, to detect fraud or determine subscriber intent automatically in real time, while supporting the transparency and auditing of AI systems in voice channels that the AI Act requires.
Risks
Despite the advantages, some risks can derail projects that integrate AI into telecom communications. The biggest is the lack of a clearly defined master data owner at the start of the project. If no single party is responsible for defining data quality standards and update policies, the MDM initiative is likely to fail, and AI models will keep running on poor-quality data. Another risk is underestimating the complexity of integrating legacy systems. Even with Kafka and an API Gateway, synchronizing data from systems that are 20+ years old can turn out to be far more expensive and time-consuming than expected if their architecture is not audited in depth first.
Implementing data governance and MDM practices step by step, before launching AI projects, turns data from an “unmanaged asset” into a resource that is ready for AI models and management analytics. This reduces garbage-in/garbage-out surprises and allows telecom operators not only to meet the AI Act requirements but also to derive real business value from intelligent systems. For example, Softengi develops custom AI models for detecting traffic anomalies and potential fraud in VoIP networks, while the Softline and Data Management IG teams have experience implementing complex solutions for a stable omnichannel architecture.