Viniit Mehta (00:35) Hello, Mérouane. A very warm welcome to you, my friend.

Merouane Debbah (00:39) Great to see you, Viniit. Very happy to be here.

Viniit Mehta (00:41) Awesome. So for those who might not be very familiar, Mérouane is a leading light in the AI world of Abu Dhabi. He's a professor at Khalifa University and the founding director of the Khalifa University 6G Research Center, and a frequent keynote speaker at international events in the fields of telecommunications and AI. His research lies at the interface of fundamental mathematics, algorithms, statistics, and information and communication sciences, with a special focus on random matrix theory and learning algorithms. Wow, that's a mouthful. But in the AI field, he is known for his work on LLMs, distributed AI systems for networks, and semantic communications. And it's important to mention that most recently he was the lead researcher behind the launch of the Arab world's first LLMs, Noor and Falcon. I hope I did justice, Mérouane. I could go on and on, because it's such an illustrious career and there's lots to talk about in terms of background. But a very warm welcome. You know that Agora is doing a lot of work at the intersection of AI and voice specifically. Maybe I'll start here: why don't you share with us your background and your journey into the Middle East, for a start?

Merouane Debbah (02:05) Yeah, so that's a very good basic introduction. Thank you again. I think the evolution of how I've been going from AI to telecommunications, and then from telecommunications back to AI, is quite important to highlight for the people listening to us. I came here to the Middle East as a chief researcher to build an AI center at the Technology Innovation Institute. That was roughly in 2021, with the aim of building models. And quite rapidly, I realized that there were some missing spots here in the Middle East. One was building a large language model in Arabic, and we were the first to release one, called Noor, already at the end of 2021. That was a 10-billion-parameter model, and it already had some very good results, with Chat Noor behind it. Roughly after that, we decided to kickstart the journey of building the foundational models that everybody is aware of, with a model called Falcon, which came out in 2023. And I think it's very important to showcase the capability we have here in the Middle East: it turned out to be the top-ranked open-source model in the world, and that put a lot of spotlight on the UAE, which was great. Quite rapidly, when we started building that, I realized a couple of things. The first is that the value of AI is not so much in the foundational models, but in the verticals: how you use them, how you put them in production. This, of course, pushed me to work more and more on verticals where we could extract value by fine-tuning those models and putting them into production. And this is where Agora, I think, plays a very important role, helping with the conversational part. On my side, we decided to build models fine-tuned to the telecom domain, and we released a couple of them: Telecom GPT and, more recently, when I was at Khalifa, Telecom GPT Arabic.
The second thing I realized, even more, is that the future is not so much in building the models, but in building the infrastructure that connects those models. And I think this is also where Agora comes in, and I'm happy to speak with you about that. When you look at AI today, the bottleneck is no longer the computation and the results, but the interconnect: how these AIs talk to each other. The future we're seeing is more or less agents talking to each other, and you need to build the infrastructure to connect those agents together, meaning how a ChatGPT will talk to a Falcon, typically, or to another model. How they talk, which language they use, how fast they can do it, how reliable it is: that is exactly the work I've been doing for the past two years within the 6G Center. That is one of the things that pushed me to work on 6G, because I strongly believe in mobile agents. The future is about agents that will be moving, roaming, going from point to point, and the question is how you build the foundation layer that makes all these AIs talk to each other. The distributed AI part that you mentioned and the semantic communication part that you mentioned are within that realm.

Viniit Mehta (05:21) Yeah, exactly. Wow. So there's a lot of work that you've done on the LLM side, and then there's the whole multimodal aspect of the AI world, which will enable conversations, whether agent-to-agent or agent-to-human and so on. Maybe I'll spend some time trying to understand the 6G world, given that Khalifa also has a 6G research center. Maybe you can share with us what that entails. What is the 6G universe going to look like when it comes to communication and specifically AI?

Merouane Debbah (05:54) Yeah. I'm not sure how specialized the people listening to us right now are, but telecommunications has gone through what we call several generations. I think people are familiar with 2G, 3G, 4G, the terminologies you see on your phone next to the signal logo. 2G was mostly built for mobile voice; everybody knew the success of GSM and made some calls. 3G was mostly built for mobile data; those are the dongles you put on your PC to start downloading files, movies, whatever you want, in a seamless manner. 4G was built for the mobile internet, and you've all been browsing and going on Facebook and these kinds of social platforms and seeing the benefit. 5G, although we're not seeing all the benefits yet, was built for mobile things, mobile IoT, meaning the massive capability of having all these devices capturing information at any point, in oil and gas fields and everywhere else. Today, of course, a lot of the people listening to us use 5G more from a mobile broadband perspective; one of the big features still coming, which they should be aware of, is this mobile-for-IoT capability. 6G is built for agents. It's mobile for agents, or mobile for intelligence.

Viniit Mehta (06:59) Right.
Merouane Debbah (07:12) And what I mean by that is that the latency requirements, the reliability requirements, and the kinds of communication mechanisms you can have between agents are totally different from how humans talk. The future internet we're building is no longer you and me behind a computer. It's your assistant talking to another assistant. Why? Because you'll delegate a lot of tasks on your phone: buying, booking, many things. You're just going to converse with your phone, and your phone is going to take on those responsibilities and do the job for you: booking entertainment, flights, and so on. And I think that's very important. Now, let me give you one example that highlights why the current infrastructure is not capable of handling this kind of mobile agent: sending a file. Suppose you want to send a picture from your phone to another person. What you do today is take the picture or the video, compress it with a zip or some other compressor, send the compressed file, and on the other side the person decompresses it and gets it. In the realm of AI, you don't need to do that anymore. What you would do is image-to-text: you send the text, or what we call in the language of AI a prompt, and on the other side you do text-to-image. So in the realm of AI, where agents are talking, data does not need to transit through the network. What needs to transit are only instructions, because you can regenerate anything you want from a description: the model is there. You have a model on the other side that could recreate even a movie, so you just send the description: movie-to-text, then text-to-movie. But if you do that, you have a lot of issues, and this is why you need a new kind of infrastructure. The first issue is that you're not sure the models are the same, so you may not get the same movie on the other side. You need to be sure that at least the reconstruction is the same, or at least that what you each see, you understand the same way. This is called semantic communication, which we're trying to build here at Khalifa within the 6G Center. The second issue, which is more important, is that this has to go fast. If you do text-to-video or text-to-image today, you've seen how long it takes. It's not a couple of seconds; it may take minutes or hours, and there are a lot of use cases where you're not willing to wait for that. If agents are bottlenecked by the platform, then we're not using AI to its full capability, because we have models that can understand fast but cannot communicate fast. So we need to give them an interaction capability at the speed of what we had before, or in fact even faster, because in the end these AIs will do millions of transactions at a time. And this requires some kind of mobile platform.
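To make the image-to-text, text-to-image idea concrete, here is a minimal sketch of semantic transmission. `caption` and `generate` are hypothetical placeholders for whatever image-to-text and text-to-image models the two endpoints agree on; the point is that only the prompt crosses the network.

```python
# Minimal sketch of semantic transmission: send a description, not the data.
# `caption()` and `generate()` are hypothetical stand-ins for any
# image-to-text and text-to-image model pair the two endpoints agree on.

def caption(image_bytes: bytes) -> str:
    """Image-to-text model on the sender's side (placeholder)."""
    ...

def generate(prompt: str) -> bytes:
    """Text-to-image model on the receiver's side (placeholder)."""
    ...

def send_semantic(image_bytes: bytes, channel) -> None:
    # Instead of compressing and sending megabytes of pixels,
    # transmit a few hundred bytes of prompt text.
    prompt = caption(image_bytes)
    channel.send(prompt.encode("utf-8"))

def receive_semantic(channel) -> bytes:
    # The receiver regenerates the content from the description.
    # Unless both sides share (or standardize) the model, the result is
    # semantically similar but not bit-identical, which is why semantic
    # communication needs reconstruction guarantees.
    prompt = channel.recv().decode("utf-8")
    return generate(prompt)
```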
Merouane Debbah (10:33) And of course, these things are also mobile, because you're talking about situations where you're moving: you're in your car, you're not in a fixed spot. A lot of things are happening at a certain speed of mobility, and this requires that kind of infrastructure. This is what 6G is about: mobile for intelligence, or mobile for agents, if you consider that intelligence can be mapped to agents.

Viniit Mehta (10:57) Wow, lovely. And how far away are we from the 6G world being a reality? Is it already here? Are we talking about months? What's the timeline?

Merouane Debbah (11:06) Yeah, so it takes a bit of time. This technology takes a bit of time. People listening to us may not be familiar with this, but these Gs that we're talking about, 2G, 3G, 4G, 5G, run on ten-year timeframes. It takes ten years to build these things, and I'll explain why. 2G started in 1990, 3G started in 2000, 4G started in 2010, 5G started in 2020, and 6G is planned for 2030. So that gives you a bit of time. Why does it take time? Because the stakeholders, the players working on this, are many and varied, and you need a process called standardization. You need to standardize the protocols. To make a worldwide standard where my agent can talk to a Chinese agent, where my agent based on Falcon can talk to an agent based on DeepSeek, and so on, you need to standardize. Standardization, for people who aren't familiar, means making sure we have a universal language, a sort of translator, so that everyone is capable of understanding each other. Going back to the image-generation example, I want to be sure that when I send a prompt to someone whose Samsung phone runs one model, while a different model was chosen for Apple, things still work. Imagine Apple chooses to work with a well-known company called Mistral, and Samsung decides to work with ChatGPT. We need to make sure the models are able to understand each other and interact, and this requires standardization in terms of language. Going back to the example I gave you before: when you do image-to-text and then text-to-image, the text language you use is very important. Why should it be English? Why can't it be Chinese? Why can't it be Russian? You see what I mean? The moment you transform a certain kind of data into a description or a prompt, you need to define what the prompt language will be. Is it a universal standard language that we create from scratch, or are we going to use English? Today, of course, the standard is English. As you all know, I'm talking to you in English even though we come from different backgrounds. And this is why it takes a bit of time.
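As a hedged illustration of what such standardization would have to pin down: no inter-agent standard of this kind exists today, and the envelope below, including all of its field names and model identifiers, is invented purely for illustration.

```python
import json

# Hypothetical inter-agent message envelope. No such standard exists yet;
# these fields just illustrate what a standard would have to fix: the
# prompt language, the payload modality, the model identities on each
# side, and the quality of service the network must honor.
message = {
    "schema": "agent-interchange/0.1",   # version of the agreed protocol
    "prompt_lang": "en",                 # today's de facto choice: English
    "modality": "text",                  # what the payload describes
    "payload": "A wedding scene, outdoors, traditional dress",
    "sender_model": "falcon-180b",       # illustrative identifiers only
    "receiver_model": "deepseek-v3",
    "max_latency_ms": 200,               # QoS target for the interconnect
}
wire = json.dumps(message).encode("utf-8")  # what actually transits
```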
Viniit Mehta (13:36) Going back to the conversational AI element of all of this: when you say that a human is giving an instruction to an agent, increasingly people will use voice to speak to the agent rather than typing prompts into a chat. So we're going to move soon to voice-based prompts, if we're not already there. So in terms of your semantic work, what role do you currently see semantic communications frameworks playing in this whole future of agentic AI?

Merouane Debbah (14:11) That's a very good question. The first thing I think people are not seeing is that the phones we have now will change drastically. Today the phone ecosystem is based on apps. When you think about LLMs, they're like operating systems. Why do you need an app on a phone if you are capable of saying what you want and it executes? If you want to book a car today, you download Careem, Uber, whatever, and then you go and click. But in a world where you have these large language models, you just need one interface, which is the chat corresponding to the world of LLMs, and then you can say anything you want. This is converted into what we call an action, through what are called large action models, which are those agents you use. So in the future, the phones we're seeing today are going to be app-free. You don't need an app, because you just say what you want and it executes the task; you don't have to have zillions of apps on your phone. And since saying what you want by writing a description is, as we all know from a user-experience perspective, going to be very hard, voice is going to be the major way of interacting with the device. You will say, with your voice, what you want, and a large action model will convert that into the action we're talking about, and take it a step further. Now, the semantic part sits behind this, because you have to understand what was said, extract from it the right way of prompting the large language model, and make sure that on the communication side the other party can reconstruct it, get what you want, and send back the result. That's where the semantic part is going. But I think it's very important for people to understand that this is going to be a big game-changer. The second big game-changer is that since your phone, this interface into the AI world, will be on your device, more and more of the AI will be on the edge. The notion of the edge, and I think this matters for many people, is quite diverse. The edge means, of course, what it means in English, but in terms of infrastructure it could be the phone itself, or the nearest point of the infrastructure, which is the base station, the nearest antenna, for people who don't know what a base station is, or the router, or the Wi-Fi. This has two consequences. The first is a big shift in the realm of AI towards what we call small language models, not large language models. Today these models are described by their parameters, and the number of parameters is huge; we talk about billions of parameters, and the parameter count corresponds to the data size of the model. Embedding that on a mobile phone is extremely hard. And so all the people working on this, you guys at Agora included, know that to port these models efficiently, you need to be smart.
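A back-of-the-envelope calculation, using only the rule of thumb that model memory is roughly parameter count times bytes per parameter, shows why a 10-billion-parameter model strains a phone and why quantization and small language models follow:

```python
# Back-of-the-envelope: on-device footprint of a language model.
# bytes_per_param: 2 for fp16, 1 for int8, 0.5 for 4-bit quantization.

def model_size_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1e9

for label, bpp in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"10B params @ {label}: {model_size_gb(10e9, bpp):.0f} GB")

# 10B params @ fp16: 20 GB  -> far beyond a phone's RAM
# 10B params @ int8: 10 GB
# 10B params @ int4: 5 GB   -> still heavy; hence the shift toward
#                              small language models in the 1-3B range
```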
Merouane Debbah (17:51) We have some techniques, but you need to make sure these things can run on the device efficiently. And we haven't talked about efficiency yet: you have to know that the models we're building today consume a lot. A request to an AI, compared to a search request on Google, carries roughly a factor of 100 in energy consumption. And nobody wants their phone depleted within a couple of seconds; as you know, a major requirement is that your phone lasts at least a day. So this, of course, puts a lot of pressure on model size. But it also creates an opportunity, and I think this is also where Agora plays an important role: it gives the telecom operators a chance to come back into the game. Because things are moving to the edge, and because operators master the edge, they are the ones managing the base stations and the interaction with the phones, they can leverage that position to start serving AI, which is a new capability. They were serving voice, they were serving data, they were serving internet, and now they have the capability of serving AI. Serving AI can become a huge opportunity for the telecommunications market. And it's not only an opportunity; it also brings good features to users. Today you buy a subscription where you pay for phone and data; you'll pay for a third service, which could be called AI inference, or serving AI. You'll have a bundled package with unlimited prompts, high quality, super reliability, security, all the features you want, and your assistant or agent can run on the infrastructure of the telecom operator. So from that perspective, for me the telecom operators are back. They're back in the game. It's great. The momentum is great. Huge opportunity, huge things we can build with new features on top of that. From the user's perspective, it's also a big game-changer, because the user experience is going to improve dramatically. The moment you bring more and more to the edge, your conversational solution, and we can talk about that, will improve a lot, because latency is important. Latency is at the heart of conversation. Having to wait two or three seconds every time you ask a question is totally different from getting a result almost immediately. That immediate feeling has to come from being near, and the edge makes it near.
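To see why edge placement changes the feel of a conversation, here is a rough round-trip budget for a voice agent. All of the figures are assumptions for illustration, not measurements:

```python
# Illustrative round-trip budget for a voice agent (all figures assumed).
# The point: the network term is large and, with edge inference, avoidable.

budget_ms = {
    "speech-to-text (streaming)": 150,
    "LLM first token":            300,
    "text-to-speech first audio": 100,
}
network_ms = {"cloud round trip": 250, "edge round trip": 20}

compute = sum(budget_ms.values())  # 550 ms of model work either way
for where, rtt in network_ms.items():
    print(f"{where}: ~{compute + rtt} ms to first audible response")
# cloud: ~800 ms, edge: ~570 ms. Moving inference to the base station
# removes most of the network term and keeps the exchange conversational.
```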
Viniit Mehta (20:15) Yeah. Fascinating, fascinating. That's some fantastic insight. You speak about a new world with, let's say, a mobile phone with no apps, and the telecom operators back in action. And you cannot think about this world without thinking about the fact that the LLMs, the AI world, and the telecom world need to be multilingual. That's where your experience with Noor and Falcon comes into play with Arabic LLMs. Arabic is inherently right-to-left; that's a fundamental change in the way the AI has to think and operate. So tell me: with Noor and your Telecom GPT Arabic universes colliding, what unique challenges arise when you are creating natural, real-time dialogue for Arabic speakers? This will be fascinating for the listeners to understand, because this region has some unique challenges but also opportunities: you've got to think about dialect diversity, code-switching, and voice accessibility. Those would be some interesting notes to share, Mérouane.

Merouane Debbah (21:45) Definitely. Before talking about the challenges, maybe a word on why we and others are building regional large language models. I think this is quite important for people to understand, because many people come to me and say: look, I have ChatGPT, which as of today was trained mostly on English data, and it happens to be quite good at translation. So if I want something in Arabic, it's very simple: I take my Arabic text, translate it into English, put it into ChatGPT, get my answer or my report in English, and then translate it back to Arabic. And that would work; it effectively works, and the translation today is quite good. But if you do that, you do not understand at all what LLMs are and how they work. Why? You will get an answer, but most probably it will be the answer of an English speaker speaking in Arabic. What you want from an LLM is to feel as if you're talking to a person who is an Arab. Otherwise, the references it gives you in a report will be about, say, Mark Twain, and there's no connection between Mark Twain and my culture. Typically, if you ask for a generated image of a marriage, for example, most probably a church appears in the picture, which is also disconnected from your environment. So you want a large language model that preserves that culture and that kind of interaction, and in the end you need to train it on a dataset that corresponds to your culture. That's quite important, and it's why you need LLMs trained on the data of that region and that culture, rather than relying on this translation mechanism. Now, to do that, there are challenges, and one of the biggest is data. Data, either written or spoken, spoken being the most complicated. On the written side, there is a natural bias on the internet: most of the data on the internet is English-based, and the same goes for images. It's not by design; it's historical. There's a cost for people to upload data. To upload any kind of text or images to the internet, you need an internet connection.
And we know that the internet is not ubiquitous on earth; in many countries in Africa, getting connected costs you real money. So we have this challenge of getting datasets in a given language. Spoken data, which you guys are working on, is even more complicated, and the reason is that spoken language is something we don't have easy access to, because you need to record it, and recording is another step. A good example, by the way, is India. India has some of the smartest people, as we all know; many of the key CEOs in the world are Indian. Yet India has been very late coming into the LLM market in terms of building its own models, and one of the reasons is exactly digitalization: you need a lot of data that is already digitalized in the country to start working, and gathering it requires real effort. So, as you said, one of the challenges is the acquisition of data, in its various forms. And India is also a good example of the other challenge: dialects. Dialects make your head hurt even more, because the diversity of dialects is huge, and the amount of dialect data we can get is quite small. So this requires some smart techniques in how you build these models: you're not going to build foundational models from scratch in Arabic or other languages; you're going to take a foundational model, do the right fine-tuning, and work it out from there. Conversation is yet another step, meaning interaction through voice, which is a different modality; so far I've been talking mostly about text. Once you go into voice, and we can talk about that, other challenges come in: intonation, and everything that adds up on top of the real-time constraint. Some cultures speak slowly. I have a lot of Finnish friends, and you can't imagine how slowly they speak. And some cultures are very speedy in how they speak. So this also creates a lot of challenges.
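As a concrete sketch of the "take a foundational model and do the right fine-tuning" route for a low-resource dialect, here is a minimal LoRA setup using the Hugging Face peft library. The base model choice is illustrative and the dialect corpus is a placeholder:

```python
# Sketch of dialect adaptation via LoRA, one common parameter-efficient
# fine-tuning technique. Model name is an illustrative choice; any causal
# LM plus a corpus of, say, Gulf-dialect text would slot in the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "tiiuae/falcon-7b"  # illustrative foundation model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Train a small set of adapter weights instead of all 7B parameters,
# which is what makes scarce dialect data usable at all.
config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["query_key_value"],  # Falcon's fused attention proj
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the base
# ... then run a standard Trainer loop over the dialect corpus.
```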
Viniit Mehta (26:39) Yeah, no, absolutely, I can imagine. From our point of view in the Middle East, I think we are among the few conversational AI platforms that can handle the multi-dialect Arabic world. The Saudi dialect is different from the UAE's, which is different from, let's say, Egypt's or Jordan's, et cetera. We see the challenges day in and day out, and it's constantly improving and evolving as more data comes into the picture, because you have to train it with voice, with the right tonality and the right pace. So, you spoke about conversational AI as the next step: first is obviously text, and then it moves on to multimodal. In terms of evaluation, the way you at Khalifa would look at a conversational AI platform, are there any metrics that you consider meaningful? You spoke about 6G and a world of ultra-low latency, where you want the pace to be fast. What kind of benchmarking, if any, do you consider for conversational AI agents? Any thoughts on that?

Merouane Debbah (27:54) Yeah, that's a very good point, and I think it's one of the most important points in building these models. It's worth giving the people listening to us a bit of background. Today, when models are shared, they are ranked, and in general everybody says: I'm the best, I'm number one, I'm doing better. How is this done? Through what we call evaluation benchmarks. There are today some more or less neutral entities, one that some people may know is called Hugging Face, where you submit your model and it's ranked on a set of benchmarks. Today a lot of those benchmarks are based on rather academic criteria, especially for text: how successful you are at solving this exercise, how successful you are at answering a set of questions. Based on your ability to answer those questions, you're ranked number one, number two, and so on. The first problem, even outside the realm of conversation, where it's harder still, is that the benchmarks these models are ranked on do not reflect how they are actually used. Take ChatGPT: you can benchmark any model on knowing how to solve Maxwell's equations, on PhD-level problems. But when you look at how people actually use models like ChatGPT, it's mostly taking their email, pasting it in, and saying: hey, can you make my email more professional? The question is: how do you assess or evaluate that an email is professional? That is the issue we already have today. Doing standardized evaluations that say you're good at solving one plus one or five plus six is academic. On the practical side, we're a bit lost, because either you do what we call human evaluation, which costs a lot of money, you need lots of human evaluators grading each output, saying this email is professional, this one isn't; or you have your model evaluated by another model, a teacher: a model better than yours that can grade it. Those are the standard techniques. But we're a bit lost today, because the way we evaluate and benchmark these models is very far from how they are used in production. And that, I think, is very important for people to understand.
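A minimal sketch of the "evaluated by another model, the teacher" approach, often called LLM-as-judge, applied to Merouane's "is this email professional?" example. The rubric and the judge model here are illustrative assumptions, using the OpenAI chat completions API:

```python
# Sketch of LLM-as-judge: a stronger model grades an output on a rubric
# that academic benchmarks don't capture. Any capable judge model would do.
from openai import OpenAI

client = OpenAI()

def judge_professional(email: str) -> str:
    rubric = (
        "Rate the following email for professionalism on a 1-5 scale. "
        "Reply with the number and one sentence of justification.\n\n"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # the 'teacher' model acting as evaluator
        messages=[{"role": "user", "content": rubric + email}],
        temperature=0,   # deterministic grading for repeatability
    )
    return resp.choices[0].message.content
```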
Merouane Debbah (30:31) But this is how life has always worked, by the way. Look at how we rank students. We give them an entrance exam at MIT, and when you look at that exam, it's mostly about how good you are at solving problems in mathematics, physics, chemistry. But what is the probability that anybody graduating from MIT uses that in practical life? In practical life, what you do is PowerPoints and interactions in meetings. That's what real life is. So, in the end, after university you get fine-tuned. You're fine-tuned, in some sense, because you start learning how life is in production. The skills you have are embedded, but then you need fine-tuning: you're evaluated on real things, you start working on real cases, and you're assessed by your mentors during your internship. That's how we do it in practical life. With voice, we are in the same situation. If I want to evaluate a voice model, the evaluation certainly has to cover the sensitivity to latency that I need, and the accents involved. And this turns out to be costly to evaluate, because I need human evaluators: I need to bring in a group of people from that country who spend time interacting and giving marks. So this is what we're doing at the moment. You should know that within the work we did at the 6G Center, we started working with the GSMA, which is a big organization in telecommunications, not on the voice conversational part, but at least for the telecom domain: to define what telecom people consider important for models working in production, and how to evaluate it. This was announced at the last Mobile World Congress in Barcelona, last March, where we presented that benchmark and the evaluation behind it. By the way, these kinds of initiatives are very important, because this is how you push a field to get better. Competition is good. When you tell people they will be evaluated, it pushes the field to come up with better and better products.

Viniit Mehta (32:31) Wow, nice. That's super interesting. I was thinking about this yesterday: a lot of the folks on the LLM side keep saying they're better with this or that model. And of course they have different models for image and video and the multimodal world, and there's a ranking for each one, and it keeps changing, and everyone's pushing. One of the things I'm noticing is just the pace of change we go through today, and you are living that day in and day out at a research university. So let me flip to a more personal question. There's so much action going on in the world of AI, and you're in the middle of all this tremendous change. In general, how does Mérouane unwind?

Merouane Debbah (33:28) That's a very good question. Before answering it, I just want to go back to your previous point on evaluation, to finish on that. It came to my mind that I'm not aware today of a good evaluation benchmark platform for voice. I think that's a good initiative you could take on your side, by the way: leveraging the community and all the people working on this to start building that kind of benchmark and making sure people are ranked. What we did with the GSMA is something you could certainly build on, and it would create very good momentum for the community. Coming back to how you live with this pace: it is very intense, to be honest. The acceleration of how things are going is totally crazy.
And just following the field is a full-time job. Not working on it, just following it, just reading the news, because the volume of announcements being made and the number of things being done come at a crazy pace. So you certainly need to be subscribed to some good newsletters, and you need to be connected to some good people, you're one of them, who can give you reliable information and separate the noise from the facts. Because in every field, when the hype gets too high, there's a lot of noise behind it. The good thing is that the community is becoming more and more open about how things are built. One of the big platforms today providing a lot of information, where people want to be very transparent about results, is called arXiv. arXiv is a website that was used many years ago by academics: whenever they published a paper, they would also post it openly, where everybody could read and comment. Within AI, it has become a ubiquitous place where people publish their papers, often even before submitting them to peer-reviewed venues. arXiv.org is where people release papers the moment they're ready, so you get the earliest access. You just need to subscribe to keyword alerts, LLM, voice, conversation, whatever your domain is, and you generally get the latest papers to dig into. The third thing, beyond the hype, is to have interactions with people. The research community is very important to be connected to, because what people don't understand is that whatever comes out was work happening behind the scenes one or two years before. And the only way to get access to what is happening two years ahead is to be in the lab, with the researchers. This is why collaboration between universities and industry, which is well understood in many countries and which they're also trying to push here, which is a good thing, matters so much: you don't collaborate with researchers because you saw what they did. Once you've seen what they did, it's already too late. You collaborate to start an interaction with these researchers and to start understanding what's coming next, because what's coming next will define your roadmap. You'll start understanding: in one year, this is what we'll be able to do. It also gives you a trigger, because you know certain people have started working on something and their results will come in 18 months. If you go to the standard GITEX-type events, which are excellent places, you see what has already come out; you don't see what will come out. And what will come out is what gives you competitive advantage. This is how you realize beforehand what you should do: you already know what the smartest people are working on and where they're going, and that is quite important. So be always inside the lab, not outside the lab. Because inside the lab, you start learning what people are beginning to think about.
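For listeners who want to set up the kind of arXiv watch Merouane describes, here is a minimal sketch using arXiv's public export API, which returns an Atom feed; the query phrase is just an example keyword:

```python
# Pull the latest arXiv papers matching a keyword via the public
# export API (http://export.arxiv.org/api/query, an Atom feed).
import feedparser

QUERY = ("http://export.arxiv.org/api/query?"
         "search_query=all:%22semantic+communication%22"
         "&sortBy=submittedDate&sortOrder=descending&max_results=10")

feed = feedparser.parse(QUERY)
for entry in feed.entries:
    # Print submission date and title for each of the newest matches.
    print(entry.published[:10], entry.title.replace("\n", " "))
```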
Viniit Mehta (37:39) Yeah, wow, that's amazing. That's a message for people in corporate and enterprise as well: you need to keep at least half a leg inside the lab in some capacity, no matter which city you're in. It's always good to keep in touch with academia, because that's where all the research is happening.

Merouane Debbah (37:51) Yes. Don't wait until the demo is shown, because if you wait until the demo is shown, it's too late. In general, you're no longer competitive, because once the demo is shown, everybody jumps on it. Unless you have essentially unlimited pockets, maybe then you can still gain a competitive advantage, but it's already too late in terms of timing.

Viniit Mehta (38:04) Exactly. Yeah. No, awesome, this has been phenomenal, Mérouane. Thank you so much for your time. I really appreciate it. It's literally been an encyclopedia of what's happening in the region for our listeners. I would encourage our listeners to add Mérouane on LinkedIn; any questions you might have, please follow up once the podcast goes live. And thank you for your time.

Merouane Debbah (38:42) Well, thank you. It was a real pleasure, and I was happy to share these insights. A lot of things are moving here in the region; I think it's quite exceptional. People don't realize the massive changes that have been happening in the realm of AI here since 2017. A lot of people ask me about it: there was a visionary approach with the appointment of the first Minister of AI in 2017. Since 2017, we've been on that ten-year frame, so in two years, in 2027, you'll see a big, big move around a lot of things, whether at the level of talent attraction or at the level of infrastructure. And of course the results are happening now, and I think that's quite great. It's a great moment to be here.

Viniit Mehta (39:24) Yeah, awesome. Anyone who wants to visit Abu Dhabi or Dubai, feel free to visit Khalifa University, which is in Abu Dhabi. I'm based in Dubai, and Agora is alive and kicking in the Middle East as well. Thank you.

Merouane Debbah (39:39) Thank you very much.