Research & Vision

AI-Driven Network Management with Large Language Models

April 11^th 2025, Würzburg

Since ChatGPT was introduced at the end of 2022, generative AI (genAI) has gained significant attention. Every day, developers are creating novel and powerful applications and use cases that showcase the potential and capabilities of genAI. The technology behind ChatGPT, known as Large Language Models (LLMs), has received significant attention. With the increased availability of information and computing power, numerous new models have emerged. The application possibilities of genAI and LLMs have steadily expanded in recent years. So it’s no wonder the demand for AI-driven network management with genAI is also growing. Let’s discuss how genAI and LLM can be beneficial to network management and operations.

Large Language Models vs. generative AI

Before we explore how genAI applies to network operations and management, let’s define some important terms.

Definition of generative AI

GenerativeAI (genAI) refers to a category of AI algorithms that generate new content. In contrast to predictive AI, where existing data is analyzed to make predictions or classifications, generative AI models create new sequences (text, images, code, and more) based on patterns learned from previous data. The new created sequences were not present in the training data but are similar or relevant to it. This includes a variety of techniques, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs).

Definition of Large Language Models

Large Language Models (LLMs) are a subset of genAI specifically designed to understand, generate, and transform human language. They are trained on large amounts of input data and excel at a variety of language tasks, including translation, summarization, Q&A, and content creation.

LLM models such as GPT (Generative Pretrained Transformer) and BERT (Bidirectional Encoder Representations from Transformers) are examples of natural language processing (NLP) techniques.

NLP is a field that intersects between AI and linguistics, focusing on the interaction between computers and human language. It includes various models and techniques, not limited to genAI, for understanding and manipulating human language. NLP has a wide range of applications, including text classification, sentiment analysis, language translation, and speech recognition. It involves not only generating language but also understanding, interpreting, and analyzing it.

Applications of genAI in the field of network operations and management

From a network perspective, genAI has significant potential for a wide range of applications and activities. These include report generation, data analysis, network automation, resolving network outages, optimizing business processes, and many more. You can already implement some applications out of the box with the existing and pre-trained models.

Co-pilot for communication

In simple cases, users can interact with it to respond to outage tickets and connect to commonly experienced problems. For this case, genAI becomes the backbone of an auto-response system, communicating with users on one side and the operations team on the other side.

Deeper data analysis

GenAI can be used for data analysis tasks such as anomaly detection in time series data (e.g. interface utilization) or to predict future values for capacity estimations. It can also be used for log file analysis to find performance issues, security threats or failures in devices or services, which can be very time consuming using traditional analysis methods. Over time, genAI can detect faults, gather customer feedback, create reports, and take action on those reports with minimal human intervention.

Independent operation

Thinking even further, genAI can run a network autonomously by diagnosing faults or responding to customer requests for bandwidth. It can execute the DevOps process by creating its own patches or code snippets to resolve issues, making the network more efficient, resilient, and restorative. It can optimize networks by simulate various scenarios to identify the most efficient configurations under different conditions. This can help in optimizing bandwidth allocation, reducing latency, and improving overall performance.

Predictive maintenance & forecasting

In predictive maintenance, genAI algorithms analyze network data to predict potential failures or identify maintenance needs before they escalate into critical issues. This proactive approach minimizes downtime and enhances network reliability.

Identifying Cyber Threats

In the realm of security, genAI offers novel solutions via anomaly detection for identifying and responding to cyber threats. By generating realistic network attack scenarios, genAI can also help in training systems to detect and counteract sophisticated cyber threats more effectively.

(Image created with ChatGPT)

Improving genAI for better data quality

However, some applications and use cases require further development and adaption to their respective requirements.

If pre-trained models do not perform as expected, you can manually input additional information with the input or system prompt to provide details or give context. This relatively basic approach can already yield significant improvements with minimal effort.

Retrieval-Augmented Generation (RAG) can also help to improve the output quality. RAG combines a retriever system, which searches for text snippets in a provided database, with an LLM, which generates an answer using the information from these snippets. Critical to the performance of a RAG system is the embedding process, i.e. how information is broken down and stored within the database, and how this information is retrieved (e.g. similarity search).

On the other hand, fine-tuning is the process of taking a pre-trained foundation model and re-train it with additional specific data to adapt the model’s internal parameters and weights. This process typically requires appropriate hardware and is therefore more complex to implement.

Advantages of genAI

The integration of generative AI in network operations can bring significant benefits.

AI-driven systems markedly enhance efficiency by automating routine tasks and continuously optimizing network performance.

Like other AI algorithms, genAI can identify patterns and anomalies that human operators might miss, providing another key advantage of accuracy.

Additionally, generative AI facilitates proactive network management, allowing operators to anticipate and mitigate potential issues before they impact network performance. This not only reduces the response time but also frees up valuable human resources for more complex tasks.

Challenges of genAI

However, the adoption of generative AI in network operations is not without challenges.

Data integrity, privacy and security are paramount concerns, as the use of AI requires access to large volumes of sensitive network data.

The technical complexity of integrating AI into existing network infrastructures can also be daunting, requiring significant investment and expertise.

“Hallucinations” refer to the model generating outputs that are factually incorrect, nonsensical, or not supported by the input data. The problem is, that the LLM confidently presents this incorrect information as truth. Possible approaches to reduce this are the integration of more true and valid data using RAG or finetuning, assign a confidence score to the response or integrate a human-in-the-loop.

The integration and operation of LLM is very costly at various levels. Although there are open-source models that do not charge any license fees, high-performance operation requires massive computing power, especially from GPUs. In addition to the cost of purchase and power consumption, in-house operation involves high maintenance costs in order to keep up to date with the rapid progress in this field. Another cost that should not be underestimated is the cost of collecting, processing and storing of data for fine-tuning or RAG.

Furthermore, there is a growing need for skilled professionals who can effectively manage and interpret the outputs of AI systems, ensuring that they align with organizational goals and industry standards.

StableChat – GenAI Integration in StableNet^®

We at Infosim^® have also recognized that genAI is not just hype, but that the technology can revolutionize applications and entire industries. That’s why we’ve been looking into this topic for some time now and exploring its potential and how we can use it for ourselves and our customers.

“StableChat” is an interactive, low-barrier chat bot for support and navigation within the StableNet^® environment, for which we have already successfully created a PoC. It aims to help new users to navigate through the system more easily and experienced users to get to the desired views more quickly. The StableChat could be a possible add-on for our network and service management solution StableNet^® in the future.

The Chat-bot can run on your own server, without connecting to a cloud or a third-party provider so you don’t need to worry about data security.

We have already shown the StableChat in more detail on our StableNet^® Youtube Channel.

GenAI in particular is a dynamic field in which new developments are published almost daily. Therefore, we keep ourselves constantly informed about interesting developments in AI and their potential use cases with a focus on network management.

What to expect from generative AI in the next years

Looking ahead, generative AI is poised to become even more integral to network operations. Advancements in architecture, algorithms and computing power will likely lead to more sophisticated solutions, also for autonomous network management. It’s clear that generative AI holds immense potential for revolutionizing network operations. By embracing this technology, organizations can not only enhance their operational efficiency and security but also position themselves at the forefront of the digital transformation era. StableNet^® is leading the charge by exploring and implementing cutting-edge solutions that have real-world value. We already implemented first demos and PoCs in this field and will intensify our research activities in this area. If you are interested feel free to checkout our StableNet^® Innovation Lab. We provide regular exchange with partners and customers to find more use cases where AI can enhance operational efficiency.

Dr. Stefan Kremling

Senior R&D Engeneers @ Infosim^® GmbH & Co. KG

As a R&D Engineer, Stefan is working on several national and international research projects with the focus on the future of automated network & service management. Before joining Infosim^® in 2022, he was working as a R&D Engineer at several positions in research institutes as well as industry. With his strong background in Physics, where he holds a PhD from University of Würzburg, he is mainly responsible for all topics relating quantum information processing and technology.