The role of mathematics in Artificial Intelligence models
Salim Bouzebda is a tenured university professor and Director of UTC-LMAC (Applied Maths). He describes the different mathematical tools used in artificial intelligence (AI) and their role.
Mathematics lies at the heart of AI technologies. The first mathematical model of neural networks was developed in the 1940s, but it was not until 1956 that the term AI was first used. Since then, a variety of AI technologies have been developed. The explosion of Big Data, in particular since 2010, has changed the game with so-called “generative” AI, which relies on complex algorithms capable of processing large amounts of data to mimic real-world situations and behaviours.
These algorithms require specific mathematical tools, depending on the AI models developed and their fields of application. First of all, there’s linear algebra. “This is an essential branch for the calculations performed by neural networks. Input data, connection points between neurons and the transformations carried out in the network layers are generally represented in matrix or vector form. These tools are used, for example, in image recognition, where each pixel is represented by a number and the image as a whole is then represented by a matrix or a vector”, explains Salim Bouzebda.
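To make the idea concrete, here is a minimal sketch in Python (using NumPy, with an invented 2×2 “image” purely for illustration): a grayscale image is stored as a matrix, flattened into a vector, and passed through a single linear layer, which is essentially a matrix-vector product.

```python
import numpy as np

# A tiny 2x2 grayscale "image": each pixel is a number (its intensity).
image = np.array([[0.0, 0.5],
                  [0.8, 1.0]])

# Flatten the matrix into a vector, as is commonly done before a dense layer.
x = image.reshape(-1)            # shape (4,)

# One layer of a neural network amounts to a matrix-vector product plus a bias.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))      # weights connecting 4 inputs to 3 neurons
b = np.zeros(3)                  # biases

activations = W @ x + b          # the linear algebra doing the work
print(activations)
```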
AI also draws on other branches of mathematics, including differential calculus, random models based on probability and statistics, graph theory and search algorithms, and information theory and data compression. The first of these enables the parameters of AI models to be adjusted, particularly in supervised learning, so that models can be optimized. “In this case, we know the input data and we know the results obtained. After repeating the operation many times across several experiments with large amounts of historical data, we optimize the model by minimizing a cost function that reduces, as far as possible, the distance between reality and what we are estimating”, he points out.
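As an illustration of this adjustment, here is a minimal sketch (with invented toy data and a single-parameter linear model, not the models discussed above): the derivative of a squared-error cost function is used to nudge the parameter so that the model’s estimates move closer to the observed results, i.e. gradient descent.

```python
import numpy as np

# Toy supervised data: known inputs x and known outputs y (roughly y = 2 * x).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

w = 0.0       # the single model parameter to be learned
lr = 0.01     # learning rate

for _ in range(500):
    pred = w * x                     # model estimate
    error = pred - y                 # distance between estimate and reality
    cost = np.mean(error ** 2)       # squared-error cost function
    grad = 2 * np.mean(error * x)    # derivative of the cost with respect to w
    w -= lr * grad                   # adjust the parameter against the gradient

print(round(w, 3), round(cost, 4))   # w ends close to 2 and the cost is small
```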
The second, random models based on probability and statistics, are used in situations of uncertainty. “These models enable us to measure the uncertainty associated with decisions made by AI systems. If you have a huge number of parameters to manage, you’ll have a real problem interpreting and classifying data. To get around this problem while retaining as much information as possible, we perform an encoding step that projects the data into a new space of smaller dimension, which makes it more manageable to study. Once we’ve classified the initial data, we decode it. This gives us a new signal very similar to the original one, but with the corresponding class,” he explains.
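One standard way to carry out this kind of encoding and decoding is principal component analysis; the sketch below, using scikit-learn on synthetic data, illustrates the principle rather than the specific method used in his models: the data are projected into a space of smaller dimension and then mapped back with little loss.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 200 samples with 10 correlated features (a "large" space).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))          # hidden low-dimensional structure
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))

# "Encode": project into a new space of smaller dimension.
pca = PCA(n_components=2)
Z = pca.fit_transform(X)                    # shape (200, 2)

# "Decode": map back to the original space.
X_reconstructed = pca.inverse_transform(Z)

# The reconstruction is very close to the original signal.
print(np.mean((X - X_reconstructed) ** 2))  # small reconstruction error
```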
As for graphs, they apply in particular to relationships between objects. “Using the available data, we need to find a graph compatible with that data. In this case, the nodes represent the ‘individuals’ and the edges the ‘relationships’ between them. This method was applied during the Covid-19 pandemic to trace possible contamination, but it is also used in a large number of consumer applications: social networks, mobility or, more anecdotally, dating sites…”, says Salim Bouzebda.
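A minimal sketch of the contact-tracing example (with invented names and contacts): people are the nodes, reported contacts are the edges, and a breadth-first search over the graph lists everyone connected to a confirmed case.

```python
from collections import deque

# Invented contact graph: nodes are people, edges are reported contacts.
contacts = {
    "Ana": ["Bob", "Carla"],
    "Bob": ["Ana", "David"],
    "Carla": ["Ana"],
    "David": ["Bob"],
    "Eve": [],                     # no reported contacts
}

def possibly_exposed(graph, case):
    """Breadth-first search: every person connected to a confirmed case."""
    seen, queue = {case}, deque([case])
    while queue:
        person = queue.popleft()
        for neighbour in graph[person]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {case}

print(possibly_exposed(contacts, "Ana"))   # Bob, Carla and David; Eve is not reachable
```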
Finally, there is information theory and data compression. “This is used in particular to transmit and store data at the right level of quality. Generally speaking, data has a certain size and, if stored as such, will not only consume a lot of memory but will also be harder to retrieve. To avoid this pitfall, data is compressed to a much smaller size while retaining almost all the original information. This method is used in computer vision tools in particular,” he concludes.
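The sketch below uses lossless compression from Python’s standard library (zlib) to show the principle of storing data at a much smaller size; vision pipelines typically rely on lossy codecs such as JPEG, which keep “almost all” of the information rather than all of it.

```python
import zlib

# Highly redundant data (a repeated pattern), encoded as bytes.
original = ("sensor reading: 42; " * 500).encode("utf-8")

compressed = zlib.compress(original)      # store at a much smaller size
restored = zlib.decompress(compressed)    # retrieve the data later

print(len(original), "->", len(compressed), "bytes")
print(restored == original)               # True: nothing was lost in this case
```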
MSD