Derek Zheng (00:00) I heard that Agnes has just reached a valuation of over $100 million.

Bruce Yang (00:04) Our customer acquisition cost is as low as 20 cents. We are the fastest in the world.

Derek Zheng (00:10) I think it gave me an impressively accurate response.

Bruce Yang (00:14) The role of this agent is to understand human intent. Agora is very popular, very famous for its speed. We want to be the fastest AI consumer product on mobile in the world. Agora is almost the only choice. If we are able to take this opportunity together with Agora, we might be the next Facebook.

Derek Zheng (00:40) Hi everyone, welcome to the Convo AI World Podcast. Today's episode is another special face-to-face interview, and this time we are in Singapore, a city full of passionate AI innovators, developers and entrepreneurs. All right, let's get started. Today we are very honored to have Bruce join our podcast, the founder and CEO of Agnes AI. Bruce, do you want to say hi to our audience?

Bruce Yang (01:05) Hi everybody, this is Bruce. Nice to meet you, Derek.

Derek Zheng (01:09) Nice to meet you. So first of all, congratulations on your latest funding round. I heard that Agnes has just reached a valuation of over 100 million US dollars. You must be incredibly busy these days, so thank you very much for taking the time to join our podcast.

Bruce Yang (01:24) Thank you very much, Derek.

Derek Zheng (01:26) Cool. Would you like to start by introducing Agnes to our audience? What does the platform do, and what's your vision behind it?

Bruce Yang (01:36) Of course. Agnes is a sovereign AI software company in Southeast Asia, originated from Singapore. I was doing my PhD at NUS, and my advisor and a bunch of researchers in the region saw the huge growth opportunity inspired by ChatGPT. During my PhD I did a lot of research, and because the field is growing so fast, new things are happening every day. Our professors asked us not to read any paper more than one quarter old; anything a quarter away is obsolete. So we had the pulse, the feeling that the next big thing was coming, and we wanted to join our efforts to create something from the region, something potentially on par with ChatGPT someday. That is the origin of Agnes. But from the first day until now there has been a shift in positioning, in how we perceive our product. It used to be a productivity tool, the way people perceive ChatGPT or Perplexity. Now we are trying to blend in the social part of it. I want people to interact very well, not only people with AI, but people with people, with AI assistance. That's why you see a lot of social features on Agnes, especially after we launched it on the App Store for the mobile experience. We have seen huge growth. I'm not sure if we'll get into the data later, but the numbers speak for themselves. Since our launch on the App Store in September, within about two months we have grown to something like three million registered users. Which is a crazy number; I haven't seen similar apps anywhere in the world. When Instagram came out, it took about two and a half months to reach one million. When Snapchat came out, it took about ten months to reach one million.
We're reaching three million in about two months. And we see a lot of people adapting to our platform very well. Retention at week eight is 30% and above, which means three out of ten users from the first week after registration are still on the platform even at the end of the second month. So this is how we perceive our product.

Derek Zheng (04:15) Yeah, I remember last month when I checked the registered users of Agnes, it was around two million, right? And it has increased by another million.

Bruce Yang (04:22) I think it's been about two and a half months. The first month might have been one million, then the next month and a half added another two million, and it started to accelerate, especially after we launched beyond Southeast Asia into LATAM and the Middle East. We've seen people coming onto the product, and a lot of it is just organic growth.

Derek Zheng (04:50) Yeah, I think with three million users the product has not only found its PMF but also built strong community momentum. So can you share what reaching three million users means to you?

Bruce Yang (05:04) Well, it means a lot of things. First of all, it's very difficult to grow users very fast. You might see a burst for a high-hype product like ChatGPT. ChatGPT was able to reach one million very fast because it was the first AI product in the world, the only usable large language model product in the world. You might see Manus growing very fast because from the first day of launch everybody was talking about Manus. But with organic growth, bit by bit, week by week, it's difficult to reach a huge user base in two and a half months. So I'd say there is definitely some kind of magic, and we're trying to understand that magic ourselves at the same time. If you look at our CAC, customer acquisition cost, it's as low as 20 cents. I haven't seen any product with that kind of customer acquisition cost, and for a lot of products, even if they get acquisition costs that low, it's very difficult to retain a business, because people hear the hype about an app, come to play with it a bit, and then they're gone forever. For our users, retention is crazy good. Day-30 retention, after we looked into it, is still as high as 10%, which is on par with what Facebook was able to achieve in its early days. And weekly retention at week eight is still around 30%. These numbers do tell us that we might have found a very, very good place to grow our product; the product actually fits what the market wants. We're trying to make some new innovations on the product to hopefully let customers stay longer, use it more, and invite their friends onto the platform so that they form a kind of social network. Anyway, I think up until now we're on the right track.

Derek Zheng (07:19) Nice, that's amazing. I have also talked with my friends about Agnes; I have some friends located in Singapore. They describe Agnes as Singapore's homegrown AI rising star. How do you personally feel about that label?

Bruce Yang (07:36) Well, we kind of thought about that at the beginning, and it does help us tell a good story, because a lot of the audience in Southeast Asia haven't had that kind of belonging, at least that kind of pride, in a product built from the region.
Just like how Grab was able to fend off Uber, and how Shopee and Lazada became the kings of e-commerce where Amazon and Taobao were not able to penetrate. We think this is a good position, at least for now, to stay as potentially the king of AI products in the region and grow as many users as possible. I hope we reach very high penetration, up to 50% or even higher. If we're able to reach that number in the region, we have a baseline to grow from. And we could potentially take the same business model, the same growth model, and copy and paste it into other emerging markets like Latin America, the Middle East, parts of Europe, parts of Africa. If we're able to grow like that, we have a very good chance of becoming a very important player in Gen AI consumer apps in the next two to five years.

Derek Zheng (08:49) I also see Agnes defined as an AI application made for everyone, but you decided not to build a general-purpose AI agent, right? Instead you developed a multi-agent system to make AI accessible and useful for everyone. What's the thinking behind that?

Bruce Yang (09:08) Well, the idea of a general agent was probably brought in by companies like Manus. It's a very good concept in terms of technology, because "agent" itself is a very trendy word. We heard the term agent in early 2024 as PhD students, but a lot of other people only started talking about agents in late 2024 or early 2025. We think it's a trendy word and a good concept, but it does not define a product. Maybe from March to June a lot of people were hyped on the phrase "general agent," but in the next three months, from June to September, the conversation had already moved to the new idea of co-working, because an agent serves a purpose: how do you have a vibe-working environment with different people, using Gen AI to replace the office suite or the Adobe suite? That was the new trend. From September onwards, the trend has been multimodal, like how GPT-4 and Nano Banana are transforming image and video generation. So we see all these trends and we try to ride them; we don't want to stick with a single positioning. The best way to embody all these elements of AI is, number one, to position ourselves as a viral lifestyle, gigantic mainstream app, a potential replacement for maybe ChatGPT or Perplexity in the region. Number two, we want to be the first to embrace social elements in AI, to be the intersection of AI productivity and the social network. That is a very big positioning. And if we do it well, if we are able to get a young audience, the first generation of AI-native users, using our product for their social life, we might have a much better chance.

Derek Zheng (11:18) Yeah, I like that. The first generation of...

Bruce Yang (11:20) AI natives, exactly.

Derek Zheng (11:23) I think you mentioned Manus a few times. Actually, I've also heard a lot of people compare Agnes with Manus, which is another AI company based in Singapore. Do you think the two should be compared side by side? And what do you think are the unique strengths of Agnes?

Bruce Yang (11:40) I think we are two very different companies.
I have a lot of respect for Manus because they were the first to introduce the idea of a general agent. I think they have done extremely well in transforming how people think agents should be built. Previously, people used agents to execute predefined workflows; now, with the concept introduced by companies like Manus, people believe autonomous workflows can be generated by the AI itself. So I have a lot of respect for that kind of strategy in terms of product development and product innovation. But all in all, we are very different apps, very different companies. Agnes has its own models. We post-train a lot of models; we are not a company heavily relying on closed-source models. That's how we protect our positioning as a sovereign AI company in the region. Number two, we don't limit ourselves to agentic workflows or to the positioning of AI agents. We are very focused on mobile. We want the young audience in the region, especially in high-population countries like Indonesia, the Philippines and Vietnam, to stay on the platform and treat us as a very good substitute for ChatGPT in the region, because we are going to support the modern internet better, with social features catered for the region. We support the culture, the languages, the ethics and a lot of other things. So I definitely think Agnes and Manus are very different: we are mobile-focused, we want to be a mainstream network, and we have our own models. But I do have a lot of respect for Manus.

Derek Zheng (13:33) Do you cover both platforms, like mobile and the web?

Bruce Yang (13:38) We have both mobile and PC. On the PC side, to some extent we look similar, but I would still say Manus is doing better on a lot of complicated cases, which we don't spend a lot of effort on. For maybe 90% of tasks we are similar, and we might be faster. I don't really want to make this kind of comparison on a podcast, right? But we are positioned very differently. That's all I would say publicly.

Derek Zheng (14:02) Yeah. You mentioned that Agnes has its own large language models; I think the most famous one is R1, right? That's right. So, is Agnes R1 a fully proprietary model, or was it distilled from another base model?

Bruce Yang (14:18) We do have our own reinforcement learning framework called DSPO, a dynamic phase sequential policy algorithm, which is an enhanced version of GRPO from DeepSeek. Our research team was able to reach pretty good results: on some open benchmarks our reasoning model did better than DeepSeek's, definitely better than the previous version of DeepSeek at the time we open-sourced it and published our paper. So, for that reason, we're not training our models only through distillation. That said, we still do a lot of distillation in our work, because in production we run a mixture of SOTA closed-source models, like ones from Gemini and Grok, and a bunch of self-deployed models, not just one, including Agnes R1 and more recent models, as well as our image-gen and video-gen models. We have a lot of data from the platform, because over 200,000 users are using our product every day, and they are generating a lot of prompts.
By supplying these prompts to the SOTA models, we're able to get a lot of good training data to distill our models. That's the first layer of training. We have also accumulated a lot of conversational data from people using our product, especially from group chat. We use that data, especially minority-language data in the local spoken languages, which is not something you find in literature, for a first round of pre-training. Then during post-training, we use the distilled data from the SOTA models for safety alignment and supervised fine-tuning. And after that, for the last bit, we apply our own policy optimization reinforcement learning framework to try to make our models work even better than the SOTA models. It's difficult, but it's achievable, especially when we define specific tasks per category. We don't have to be better than them at generalizing across all categories, but a post-trained model can be better than them at a specific task, like doing research, generating a PowerPoint, or being a good assistant in a group chat. We have a specific model for each mode in our system. That's how we train our models.

Derek Zheng (17:12) Since you mentioned that, I'm curious: does Agnes R1 achieve SOTA among its peers, and which benchmarks or standards do you use to evaluate it?

Bruce Yang (17:22) Very good question. Agnes R1 was able to achieve SOTA at the time we published the paper, in the 7B range, on quite a few open benchmarks, including HotpotQA and a lot of search questions. By open benchmarks, we mean very standard question sets: search questions, Q&A questions, some related to mathematics or programming. It's very difficult to transfer results from open benchmarks to real-world problems, because in real-life problems you face very subjective matters. You want to judge which research report is better, or comment on which PowerPoint looks more professional. There's no exact right or wrong answer, and it's very hard to give a score. So we had to train an evaluation system on our side to give a score, to give a reward; otherwise there's no way for the model to improve by itself, because it doesn't even know which direction to go. This is what we call the universal verifier that we train ourselves. We haven't disclosed this part; that's our trade secret. But anyway, with the support of the universal verifier, which transfers our results from open benchmarks to real-life problems, we're able to have a very good working model that deals with the most subjective problems: doing good research, writing a good PowerPoint, participating well as a group chat assistant, speaking when it should and keeping quiet when it shouldn't. We spent a lot of effort migrating what we had in research to what we deliver in production.
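To make the "universal verifier" idea concrete, here is a minimal sketch of the generic LLM-as-judge pattern it describes: a judge model scores a subjective output against a rubric, and the score becomes the scalar reward used during policy optimization. This is not Agnes's verifier (Bruce says that part is undisclosed); the rubric and the `call_judge_model` function are placeholder assumptions.

```python
# Sketch of an LLM-as-judge reward signal for subjective tasks (research
# reports, slide decks, group-chat replies), usable inside a GRPO/DSPO-style
# RL loop. All model calls are stubbed; replace call_judge_model with a real
# client (hypothetical name, not an Agnes API).
import json
import re

RUBRIC = """Score the candidate answer from 0 to 10 on:
- factual grounding in the provided task
- structure and clarity
- appropriateness (e.g. stays brief or quiet when the task calls for it)
Return JSON: {"score": <number>, "feedback": "<one sentence>"}"""

def call_judge_model(prompt: str) -> str:
    """Placeholder for a judge-model API call. A real system would send
    `prompt` to a trained evaluator model and return its raw text reply."""
    return '{"score": 7, "feedback": "Clear structure, but cites no sources."}'

def verifier_reward(task: str, candidate: str) -> tuple[float, str]:
    """Turn a subjective quality judgment into a scalar reward in [0, 1]."""
    prompt = f"{RUBRIC}\n\nTask:\n{task}\n\nCandidate answer:\n{candidate}"
    raw = call_judge_model(prompt)
    match = re.search(r"\{.*\}", raw, re.DOTALL)      # tolerate chatty judges
    parsed = json.loads(match.group(0)) if match else {"score": 0, "feedback": ""}
    score = max(0.0, min(10.0, float(parsed.get("score", 0))))
    return score / 10.0, parsed.get("feedback", "")

if __name__ == "__main__":
    reward, feedback = verifier_reward(
        task="Write a one-slide summary of Q3 user growth.",
        candidate="Users grew from 1M to 3M between September and November.",
    )
    print(f"reward={reward:.2f}  feedback={feedback}")
```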
Derek Zheng (19:24) Nice, thanks for sharing. So I want to circle back to the registered users a bit. Three million users is a great achievement. Can you share with our audience a bit about the journey of building your user base? Was it smooth from the beginning, or what major challenges did you encounter along the way?

Bruce Yang (19:44) Very good question. There's not much secret about how we built our user base. We did have to make some noise in the region. We worked with a group of influencers who speak their own languages, compare us with other products, and promote for us. And we found that it's very efficient, very effective, especially in emerging markets. Because we are a Singapore company, we tried it out first in Southeast Asia, in the bigger countries like Indonesia, Vietnam and the Philippines. There are countries with populations of over 100 million in the region, and a lot of people there have not used any AI product before; if they have used anything, it might be ChatGPT. We just have to be better, with our free version better than the free version of ChatGPT. That's how we were able to get a lot of people to jump over to our product and try it out. The magic happened when we launched our mobile app, because a huge part of the population in the region is on Android. They're not PC users; maybe through work they have a PC, but outside of work they may not even have one. When we launched the Android app, we just saw a huge group of people joining our platform on a daily basis, and the retention numbers actually went up too, because we were able to form two-way communication: we can reach out to our users through push notifications, we can send a group chat to users introducing new features. And we just keep pushing hard in the region. Every few days we have a new feature coming out: you should try it out, you should bring your friends to our platform. I think that helped us get a lot of organic growth. One key source of growth we identified from our data is Zalo from Vietnam. Zalo is like the WhatsApp of Vietnam, and we were able to get about 10% of our users from that platform. People are sharing it a lot in their own social networks, telling each other that this is a good product, you should try it out, it's better than ChatGPT, it's better than Perplexity. I don't know whether people question that, but anyway, they find a lot of cool features in the product, and we iterate very fast. We want to make sure that if you like the product, you'll like it even better next week; you have to come back one week later to see whether there's a new feature coming out. This is how we retain users on the platform. We made a mistake by moving too fast into other regions like LATAM and the Middle East. We wanted to try expanding early to see how well we could expand elsewhere, but we've seen that the retention there is not as good as in Southeast Asia, maybe because we already have a very strong user base in this region. So we decided not to put too much effort there, not to move too fast outside of Southeast Asia. We want to find the best, the ultimate growth model in the region for Agnes over another quarter or so before we go very aggressively outside of Southeast Asia.

Derek Zheng (23:25) Great, great. So yeah, I read some articles and research and also found that besides Southeast Asia, you've had great success, I think in Argentina and other South American countries. How has the user feedback been so far? And you just mentioned you made some mistakes, so what is your plan for that?
Bruce Yang (23:47) Well, we do see very good growth potential in the LATAM regions. The cost of user acquisition there is as low as in Southeast Asia, even lower. And we've seen a lot of users using our product; we've become number one or number two in several countries in the LATAM regions, beating Gemini, beating ChatGPT, you wouldn't believe we can do that. But one problem we see is that because we were not locally stationed in the region, we don't really hear the feedback from the region very well. We're not able to maintain the growth very well, and we're not able to maintain retention as high as in Southeast Asia, as we had imagined. That's why we recently decided not to be too aggressive in regions outside Southeast Asia. At our core we are a team of engineers, a team of data scientists, so we look at data almost every day. My team and I have to look at data every day to identify problems, like where our retention drops and what kind of feature might grow our retention. And luckily our retention numbers actually grew pretty well: day-30 retention used to be around six percent, and now it goes up to ten. I think that definitely means we have made a success in terms of PMF, in terms of product development. But the period when we grew very fast in the LATAM regions is where retention dropped, and we don't want to build on that retention. I mean, we still see a lot of potential in the LATAM regions; it's just that we're not ready yet. We are a very smart team, but with the funding we raised, we still have to spend every effort in a very cost-efficient way. That's why I think the best thing we can do, to spend the money wisely, is to focus on the Southeast Asia region, be the king of the region as fast as possible, as early as possible, find the right growth model, and make sure it's transferable to other regions before we take a very bold step forward.

Derek Zheng (25:54) Be the king of the Southeast Asian region. So I also want to touch a little bit on your pricing strategy. I'm personally a paying user of several AI tools like ChatGPT, Gemini and Manus, which often charge for advanced features like image generation, research reports or data analysis. But I noticed Agnes offers many of these premium functions for free, or with generous credits. So at a time when most AI products have decided to move forward and focus on monetization, why did you decide to make these features open access?
Bruce Yang (26:34) Well, again, a very good question. It's not that our strategy is to not focus on monetization; we want to focus on monetization too. It's just that we feel we are not good enough yet to charge people. In the process of product development, we want to find product-market fit. We want to be able to confidently tell ourselves that if we charge a person at this price, they are going to be very willing to spend the money and stay on the platform even more. If we're not able to reach that status, we don't want to charge early. And we did try, similar to other products, giving a very low quota at the beginning and asking people to pay after the first day of use. It did seem to generate pretty good revenue, at least a decent pay rate, but the problem is that it hurts a lot in terms of our daily active user growth. So look at the end game: five or ten years from now, what would the best product in the world, the one representing mainstream AI usage, look like? It's not the one with the strongest ARR. If you look at AI right now, there's a lot of AI marketing, a lot of inflated ARR. Even if you have high ARR right now, if you don't get a lot of people using your product, it's still a failure despite the early success. So looking at the end game, I honestly think our North Star metric should be DAU. If DAU goes up, all the other metrics go up with it. There are multiple ways of making money; there is monetization everywhere in the world. Even though emerging markets like Southeast Asia and LATAM are not as good as North America in terms of subscriptions, we still see a lot of successful businesses, like Grab, like Shopee, because all of these companies are able to make money from traffic. The more traffic you have, the more you are at the center of people's attention, the more say you have in how to make money. So that's our strategic decision: not to charge people too early when the product is not good enough and we haven't built that kind of trust with our users. We are certainly going to charge our users sooner or later, but we are taking a very careful approach; my style is maybe a $1 charge here and there in the app. Not because we lack money. A lot of apps want to charge early because they're not running on their own models, so their costs are very high. We want to solve that problem from our side: we save on cost, we use our own models, we self-deploy, and we put a lot of effort into reducing our costs. So the reason we are starting to charge a little is not because we can't afford not to. We just want to try out whether it's a natural step for people to spend maybe $1 here and there, and maybe after they spend the money they stay on the platform even longer, because a paying customer usually has higher retention. So that's the strategy behind our moves.

Derek Zheng (29:49) Nice. I like your ambition and your strong engineering culture, and as a user of the Agnes application myself, I really appreciate that you put yourself in the customer's shoes.

Bruce Yang (30:01) Thank you very much, Derek.

Derek Zheng (30:03) I'm personally curious, which feature is currently the most loved by your customers?

Bruce Yang (30:11) Well, because our app moves very fast, the feature I mention right now may not be the feature loved by all our customers down the road. At the beginning, I think a lot of users liked our slides, because we're able to generate pretty good slides. Maybe not the best, but top-tier. And we give a very high quota: if you use Gamma, you might be able to generate ten slides and then that's it until next month's refresh. We are letting free users generate a hundred pages of slides every month, high-quality slides, because we're able to keep the cost low. So this is the starting point where people, especially on the PC side, got sticky with our platform. Whenever they think about building slides, generating slides, they think of Agnes. It's a very good strategy to grow early adopters. I think for my last few presentations, in Singapore or anywhere in the world, I just needed half an hour to generate the slides before I went up for the talk.
So that comes in very handy, especially when you don't have your laptop with you; you can build the slides on your phone and show them to the investor. That's something I think is very well used by our customers. The one which I like a lot, and which I'd like to believe people will love and really enjoy using on the platform, is our group chat, especially group chat with memory. That gives us a huge advantage. If I keep a memory of you, of what you did, what kind of questions you asked, when you gave a talk, whose voice you wanted heard, the roles you play, and if I can remember it and keep that memory for you, you won't try another product, even if it's sometimes better, because memory is something that makes people stay with your platform. I have a friend; I meet a better friend, but he doesn't know me, so I stay with my previous friend. That's the idea behind it. So I definitely think we have put a lot of effort in. Our number one research effort is on our

Derek Zheng (32:17) That is very good.

Bruce Yang (32:31) proprietary models, and number two is agentic memory. Our research team is spending a lot of effort to make our memory the best in the world. That's what we're trying to do.

Derek Zheng (32:45) Nice. Memory is definitely very important for agents. But I also noticed, when I tried Agnes and did some research, that as far as I can tell your context window is around 32K tokens, not 128K. Do you think that context window is good enough for now, and do you have any plan to extend it?

Bruce Yang (33:08) This is a very challenging question; it's the first time I've gotten this kind of question. The reason our context window is not as big as other applications' is that we post-train our own models, and the open-source models we build on don't have very good support for long context windows. So we have to rely on other techniques, like file systems, vector DBs, what we call RAG, and other ways to reduce the problem. So far I think we have found pretty good ways to work around the context limit. It used to be a problem that when you edited slides, once in a while the model would not remember what you had said previously because you had reached the context limit. But we introduced a new feature called advanced edit, where people can just drag and drop on Agnes. That way it does not involve the large language model at all, so without lengthening the context window, it solves the problem. Longer term, I think there are two ways to solve it. One is very good context window management: keep memory in a vector DB or maybe a file system, do the heavy lifting there, and only put reference points in the context window. The second way is to work with the open-source model companies, like Qwen or DeepSeek, to see what other ways there are to lengthen the context window, because they pre-train the models that we only host; it would have to come from their side. It could be very difficult for us to lengthen it ourselves.
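As a rough illustration of the first approach Bruce describes (keep the heavy lifting outside the context window and only pull in reference points), here is a minimal sketch under stated assumptions: long-lived notes sit in an external store and only the most relevant snippets are pulled into the prompt under a fixed token budget. A production system would use embeddings and a vector DB; simple keyword overlap and a crude token estimate stand in here, and none of the names are Agnes APIs.

```python
# Minimal sketch of context-window management via external memory retrieval.
from dataclasses import dataclass

TOKEN_BUDGET = 1024            # assumed budget reserved for retrieved memory

@dataclass
class MemoryNote:
    text: str

def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)          # crude chars-to-tokens estimate

def relevance(query: str, note: MemoryNote) -> float:
    q, n = set(query.lower().split()), set(note.text.lower().split())
    return len(q & n) / (len(q) or 1)

def build_context(query: str, store: list[MemoryNote]) -> str:
    """Pick the most relevant notes that still fit within the token budget."""
    ranked = sorted(store, key=lambda n: relevance(query, n), reverse=True)
    picked, used = [], 0
    for note in ranked:
        cost = rough_tokens(note.text)
        if used + cost > TOKEN_BUDGET:
            continue
        picked.append(note.text)
        used += cost
    return "Relevant memory:\n" + "\n".join(f"- {t}" for t in picked)

if __name__ == "__main__":
    store = [
        MemoryNote("Derek asked for a 10-slide deck on Q3 user growth."),
        MemoryNote("Group chat prefers replies in Bahasa Indonesia."),
        MemoryNote("Last week's deck used the dark template."),
    ]
    print(build_context("edit the Q3 growth slide deck", store))
```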
Derek Zheng (35:04) Thanks for sharing. I also want to learn: I think recent research shows that agentic systems are shifting focus from extreme accuracy to token efficiency, right? Especially in long dialogues and multi-agent workflows. So what is Agnes's approach to token efficiency, and what optimizations are possible?

Bruce Yang (35:25) Thank you very much. So we put a lot of effort into reducing the cost of each task. One of the reasons is that we really want to make our product mainstream, and if we keep the cost high, we can't really scale very well while continuing with a free tier. Besides deploying our own models, a whole chain of our own models, to reach the same kind of SOTA level, our multi-agent system routes each different task to a smaller model where it can. Another effort we're working on is what we call code agents: we use pseudocode to replace natural language in non-parallel multi-agent communication. It's a pretty straightforward idea, but it took some time to experiment and make sure we achieve very good results. We're aiming to improve accuracy by about 5 to 10 percent on long-horizon open benchmarks like GAIA and HotpotQA, while reducing token usage by about 40 to 70 percent. The idea behind this is that if you run the same two code snippets, you always get similar results; it's very difficult to get something different if you run the same code. But if you say the same sentence to another person, it might introduce a lot of misunderstanding, because of context, because of the freedom of natural language. The merit of code is that because it's formal, because it has a fixed format and lacks those degrees of freedom, it helps the agents understand each other very well across many hand-offs. That way we actually improve accuracy and reduce token usage at the same time. That's the magic. There are a lot of other optimizations we're working on, and sometimes we need to balance. There are times we have to trade accuracy against speed. Especially on the mobile side, we want to respond very fast, in about one minute. If you spend five minutes on deep research, you may already have forgotten what you were doing. Mobile is very fast-paced, especially in group chat; you want to return everything in about one minute. Our philosophy is to try our best to get the best result within one minute, and after one minute we cut it off. That way we achieve a certain level of accuracy, up to maybe 90% of the best products, while being the fastest in the world. This is a very important trade-off, a strategic decision not only from a technical point of view but from a business point of view.
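To illustrate the "code agents" idea in general terms, here is a small sketch of two agents exchanging a compact, formal plan instead of a verbose natural-language brief. The step names, fields and token estimate are illustrative assumptions, not Agnes's actual inter-agent protocol.

```python
# Sketch: pseudocode-like structured plans versus natural-language briefs
# for multi-agent hand-offs.
import json

def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)          # crude chars-to-tokens estimate

# How a planner agent might brief a worker agent in natural language:
NATURAL_LANGUAGE_BRIEF = (
    "Hi! Could you please search the web for our Q3 registered-user numbers, "
    "then summarise the three most important findings in a friendly tone, and "
    "finally turn that summary into a five-slide deck using the dark template? "
    "Let me know if anything is unclear or if you need more background."
)

# The same instruction as a formal, code-like plan (hypothetical schema):
CODE_PLAN = [
    {"op": "search", "query": "Q3 registered users"},
    {"op": "summarize", "top_k": 3},
    {"op": "make_slides", "pages": 5, "template": "dark"},
]

def run_plan(plan: list[dict]) -> list[str]:
    """A worker agent can dispatch each step unambiguously on the op field."""
    return [f"executing {step['op']} with {step}" for step in plan]

if __name__ == "__main__":
    plan_text = json.dumps(CODE_PLAN)
    print("natural-language tokens ~", rough_tokens(NATURAL_LANGUAGE_BRIEF))
    print("code-plan tokens        ~", rough_tokens(plan_text))
    for line in run_plan(CODE_PLAN):
        print(line)
```

The point of the comparison is the one Bruce makes: the structured form is both shorter and unambiguous to dispatch, whereas the prose version leaves room for interpretation on every hand-off.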
Derek Zheng (38:26) I'm also curious: when I play with the Agnes application and ask it "what are you seeing right now?" and show it a picture or a video, I think it gives me an impressively accurate response, even though, as far as I can tell, you don't use a native VLM. Instead you use a large language model with external APIs for vision, maybe Nano Banana or others, and so far you don't have an audio feature; you can correct me if I'm wrong. So what made you decide not to go multimodal from the very beginning?

Bruce Yang (39:30) We do have post-trained image-gen and video-gen models for our own usage. We post-train on top of some of the best open-source models for image generation and video generation. They might not be as good as, say, Nano Banana, but they do a good job when we mix them well with our large language models. I believe the framework for generating images is a combination of two important factors. The first is understanding human intent. That's not easy: if you just throw your prompt into a diffusion model, even a SOTA model might not understand your intent very well. Nano Banana is already at a certain stage, it understands pretty well, but most image diffusion or autoregressive models are not the best at understanding humans. That's where we use the agents, our multi-agent system, to try to understand the human's intent. This is the first step; we call it the prompting agent. The role of this agent is to understand human intent. The second part is our in-house image-gen and video-gen models, post-trained versions of open-source models such as Flux. The good part of these models is that if you give very precise descriptions of what they should do, they are extremely good at doing whatever you ask. But if you give very abstract, very brief instructions with a lot of underlying requirements not mentioned in your description, they just don't do well; they hallucinate a lot and give a lousy result in the end. So the second part is what we call the generating agent. It takes the instruction from the middleman, the prompting agent, which translates human intent into a very precise description of what the image-generating model should do. These two work together very well. The third piece, after the image is generated, is an evaluating agent, a large language model in the position of LLM-as-judge. What it does is evaluate the image or the video. First, it gives a verdict: if the result doesn't really fulfill the request, we have to regenerate. Second, it gives feedback on where to improve. This evaluating agent, also a large language model, is trained to give good feedback, and that feedback goes back to the prompting agent, which goes through the entire process again with the generating agent, until, after enough feedback and enough retouches on the image or video, we get a result that is satisfactory to the LLM judge, which is calibrated against human preferences. That way we form a system that works better than any one of them alone, whether it's an image diffusion model or an autoregressive model. This is how we developed our system.

Derek Zheng (43:06) And I think you also just corrected me: you do have your own in-house models for image generation and video generation.

Bruce Yang (43:13) They are not as good as the SOTA models, but they serve our use cases, especially with the help of the large language models for prompt rewriting.
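Here is a minimal sketch of the three-agent loop Bruce describes: a prompting agent rewrites the user's intent into a precise prompt, a generating agent produces an image, and an LLM judge either accepts the result or sends feedback back for another round. Every function below is a stub with hypothetical names and toy logic so the loop runs end to end; it is an illustration of the pattern, not Agnes's implementation.

```python
# Sketch of a prompting-agent -> generating-agent -> judge-agent loop.
from dataclasses import dataclass

MAX_ROUNDS = 3
ACCEPT_SCORE = 0.8

@dataclass
class Judgment:
    score: float        # 0..1 quality estimate from the judge
    feedback: str       # where to improve if rejected

def prompting_agent(user_request: str, feedback: str = "") -> str:
    """Stub: an LLM turns vague intent (plus prior feedback) into a precise prompt."""
    detail = f" Incorporate feedback: {feedback}" if feedback else ""
    return f"photorealistic, well-lit rendition of: {user_request}.{detail}"

def generating_agent(precise_prompt: str) -> bytes:
    """Stub: an image model would return image bytes for a precise prompt."""
    return precise_prompt.encode()          # placeholder for real image data

def judge_agent(user_request: str, image: bytes) -> Judgment:
    """Stub: an LLM-as-judge scores the image against the original intent."""
    good_enough = b"feedback" in image      # toy rule so the demo converges
    return Judgment(0.9, "") if good_enough else Judgment(0.5, "add more detail")

def generate_with_review(user_request: str) -> bytes:
    feedback, image = "", b""
    for round_no in range(1, MAX_ROUNDS + 1):
        prompt = prompting_agent(user_request, feedback)
        image = generating_agent(prompt)
        verdict = judge_agent(user_request, image)
        print(f"round {round_no}: score={verdict.score:.2f}")
        if verdict.score >= ACCEPT_SCORE:
            break
        feedback = verdict.feedback
    return image

if __name__ == "__main__":
    generate_with_review("a slide cover about reaching 3 million users")
```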
Derek Zheng (43:24) Cool. Now let's talk about your partnership with Agora; I like this part. Can you share with our audience how this collaboration between Agnes and Agora started?

Bruce Yang (43:25) Okay. Yeah, so it first started when we were thinking about the group chat features. We thought about building everything from scratch, but it would take a lot of effort, a lot of time. And for the audio, video and group chat functions, after a lot of research we found that Agora is the expert. So the best thing to do was simply to collaborate with Agora and get Agora on board for that part. Working with Agora through this process has been a very pleasant experience, because we have been through a lot of late nights together. Our team is under a lot of stress, and we felt the Agora team took on the same stress and gave us hard-working support, with very fast responses. Once a request is sent, a response comes back within one day, a proposal is given within a week, and within two weeks to a month we solve the larger problems. So I definitely hope our partnership will continue and support an even bigger audience.

Derek Zheng (44:59) Nice. I think you mentioned that Agora chat is used in... actually it has a very cool name, Co-vibe. So what other opportunities do you see for combining Agnes's collaborative intelligence with Agora's real-time interaction infrastructure?

Bruce Yang (45:18) Right. Agora is very popular, very famous for its speed, right? Speed and cost. I think that philosophy aligns very well with what Agnes is trying to do. We want to be the fastest AI consumer product on mobile in the world, and we need support from the best infrastructure in the world, like Agora. We also want to keep the cost low, and in terms of cost efficiency I think Agora is doing extremely well. With these two decision factors in mind, Agora is almost the only choice.

Derek Zheng (46:02) Thank you for the kind words. So, we've seen a growing number of voice-first AI agents emerging in 2025. I want to get your opinion: what's your perspective on the future of voice-based interaction, and do you believe in the upcoming wave of conversational AI?
So I feel a lot of young audience growing at that age, very AI native, along with it. And while my father is in Singapore, she can still afford a phone. We can still afford a phone or a PC. A lot of young artists outside of Singapore and Southeast Asia and including a larger emerging markets, they not really need a smartphone in their life. That's how, you know, the voice control, like how Agora can support as APIs to facilitate this kind of low latency, ⁓ know, more and more communication in places where network is not even that good, play a very good role, very powerful. And we hope to, you know, run on this kind of hype for a very, very transformative, very disruptive, you know, era of AI changing the entire lifestyle of consumer internet. If I am able to take this opportunity together with Agora remind me of the next Facebook. I would say remind me of the next Facebook. AI native Facebook. Derek Zheng (48:36) Nice. Yeah. Looking forward to that. Looking forward to that. Yeah. I think you just finished your latest founding and you mentioned that you will focus on regional LLM training and a global commercialization, right? What are the new features that users can expect? Bruce Yang (48:56) Well, by way, we're still in the process of finishing the Firebase. Investors who are interested, feel free to still talk to me. Well, we have received a lot of, I have no doubt that we're this round, but there's still some chance to join us. But it's going to be close by. Anyways, we're in the transition stage of moving from a mainstream, know, summary AI product, just like something very similar to ChatGPT or Perplexity or email address to something very disruptive. We're not going to every customer extremely well because they are very strong contestants like ChatGPT We also want to enter in this area of the world. I'm working on this ChatGPT for OpenAi the emerging markets are their focus starting from 2025 because in their homeland in US, they have faced a lot of competition from Gemini and from Grok home, all that platform. We have to focus on everything. We're not going to compete with them forever. It's not our, at our best advantage to compete with people. So we're in the transition stage to moving from general sovereign AI to social AI. And we see a huge, huge opportunity of that. If everybody is trying to gain a piece of pie in the productivity, it's a messy They're all trying to get a, you know, get get a similar kind of cut from. Microsoft, Google, and maybe Adobe. But the opportunity lies on what Meta is doing right now. You look at Meta don't think Meta is doing too well in AI. And they are dominating in terms of social network. So if we are able to take maybe only 1 % of what Meta is dominating right now, if we can be a very good player, be one of first to enter the AI native social network, we have huge chance to succeed. It requires a very unique combination of the talents and skills. You have to be very strong in terms of understanding about social network. And you have to very strong in terms of AI. We don't have to be the best of AI, AI models kind of change the big key. We just have to be better than Meta. And it's definitely achievable So I think down the road, there will be a lot of social features coming out on our product. starting from the group chat that co-write with Agora There are a couple of new ones, like the newsfeed is very cool. 
All the news is written by AI. It's assembled from news online, but it's very short, in whatever tone you want to tune it to, with whatever content you want to assemble for yourself; you can curate what it digests into your newsfeed. It's something quite new. And you're going to see some other features which I can't reveal at this point: there's going to be hardware coupled with our group chat, and there's potentially a community coming out. I think all the audience can definitely expect to see these kinds of products in the next quarter. We're moving very fast.

Derek Zheng (52:08) Thanks for sharing. Yeah, having spent time in Singapore over the past three months, I'm truly impressed by how fast the AI ecosystem here is evolving, especially when talking to leaders and engineers in the AI industry like you. So before we end, do you have any advice or suggestions for our audience who are trying to build regional

Bruce Yang (52:25) So.

Derek Zheng (52:38) meaningful applications?

Bruce Yang (52:40) Yeah, we definitely encourage everybody to start early. You don't have to be afraid. I think there's a huge amount of opportunity in building the next-gen AI products. When we started building the product earlier this year, we saw a lot of challenges, but the trend, the wave, is bigger than we had imagined. The wave carried the product to this stage, and we have enjoyed riding it a lot. We definitely want to encourage all the potential developers and entrepreneurs to join us and build an ecosystem together, just like how ChatGPT is building that kind of ecosystem in the Valley. We also want to build an ecosystem with smart minds, good products, good companies in Southeast Asia. Together we can probably dominate the market, at least in the emerging markets.

Derek Zheng (53:45) Nice. Okay, so thank you, Bruce, thank you for joining our podcast today. It's been a pleasure having you on the show. I wish Agnes continued success and growth under your leadership. All right, it's time to say goodbye. Thanks everyone for tuning in, and I'll see you in the next episode.

Bruce Yang (54:05) Yeah, thank you very much, you guys.