Derek Zheng (00:21) Hi everyone, welcome to the Convo AI World Podcast. This week I'm in Japan. It's an amazing opportunity to meet and talk to a lot of industry leaders who have truly unique perspectives on AI. It is always special to have face-to-face conversations with people who are shaping the future of this field, and I can't wait to bring their insights to all of you. Alright, let's start today's episode. Today I'm glad to have Jia from AKA Virtual join the podcast. Jia has deep expertise in the AI-driven virtual economy, with a lot of experience in building successful businesses. So let's dive right in. Jia, would you like to say hi to our audience?

Jia Shen (01:06) Hi everybody, my name is Jia. I am originally from the United States, born in Chicago and then San Francisco, but I've been in Japan for 15 years now. Yeah, it's great to be here, and the world is changing at such a rapid pace. Every day I wake up and I'm not sure what to expect. It's pretty crazy these days.

Derek Zheng (01:21) Thank you, Jia. Welcome to the show. Why don't you start with a brief introduction about yourself and what brought you into the virtual economy?

Jia Shen (01:32) Yeah, so I've been doing startups since like 2005. And everything I've been doing is usually around things that are social, cultural, and gaming and entertainment-based. My first couple of companies were in social gaming. And then coming to Japan was really to try to pursue the Japanese gaming and IP world, you know, to kind of enter an industry that I thought was impossible, being able to do Pokémon games and to work on Street Fighter or Dragon Ball Z. I'd like to say that we've been able to do that. I've worked on three Pokémon games. We work with Sega on a bunch of things. We've worked with Capcom and Bandai Namco on all sorts of things throughout the years. Japan, especially this industry, is hard to get inside.
So that's just always been kind of my dream and my goal, and it's what I've been building companies around.

Derek Zheng (02:19) Cool, cool, nice, nice. So I see you and your team also recently joined TGS, right? Tokyo Game Show. Could you share your experience and your takeaways from the show with our audience?

Jia Shen (02:33) Yeah, I mean Tokyo Game Show, kind of going off the last question, has always been a dream for me to go to, ever since I was a kid. Growing up, you dreamed about going to E3, you dreamed about going to Tokyo Game Show, and I didn't even know what ChinaJoy was back in the day, but it became this echelon of events where, if you attended, it meant that you had reached a certain level, right? This is our third year attending Tokyo Game Show, and yeah, on some levels, first of all, it's a dream come true, but it's also been really interesting to see how Tokyo Game Show has been changing over the last 10 years. Because, first of all, the gaming and entertainment industry has changed so much, right? I don't think this is controversial, but the original Tokyo Game Show was just about Japanese-made games, whereas these days, if you think about it, Nintendo doesn't attend and Sony does put up a presence. But the other thing is that there are a ton of Chinese companies. Chinese companies maybe represent a quarter of the floor space. And that presence used to be an unwelcome presence because it just didn't feel right. The quality wasn't there. It just seemed like Chinese companies were spending money and weren't meant to be on the floor. But these days, this year especially, with the Chinese booths, you couldn't even tell the difference. They felt very much at the world level, the AAA stage of game companies. People think about miHoYo and Genshin Impact, but that's just one of the game companies now, right?
And so I think that, you know, with Tokyo Game Show, before COVID and even right after COVID, people were worried about how Tokyo Game Show would continue, because E3 is gone, for instance. But last year Tokyo Game Show was great. This year was even better. So as an attendee, as a game fan, I was very happy. It just feels like gaming is good. Gaming is good in Asia. I think for gaming as a business, business is good, right? And so I think that's very, very exciting just from that world. And obviously for us, it's also been very interesting because we're not a game company, we're a game character or a character company, right? And we're there to access two different audiences. Tokyo Game Show is super different from any other type of game show because it's two days business and two days normal audience. So how do you design a booth and an experience that gets business people to want to come to your booth and then, afterwards, the regular audience? Because we're not a gaming IP. If I have Sonic the Hedgehog or Mario, yeah, both days, I'll come by the booth and go check it out. But on some days, we're kind of a business solution. On some days, we're an entertainment experience, right? And how to design something that actually fits both sides is a difficult thing. And so I'd like to say that we've done a good job, and we can always do better. But Tokyo Game Show is an interesting experience for a company like ours doing characters and AI characters.

Derek Zheng (05:16) Cool, cool. Is this your first time joining TGS as the CEO?

Jia Shen (05:20) No, no. This is our third year doing Tokyo Game Show. Okay. And the first time we really didn't know what we were doing. Last year, we did know what we were doing, and that went much better. And this year actually changed because Tokyo Game Show changed management. And so a lot of the nuances of how it's done were different.
You know, the new operators of Tokyo Game Show brought in bigger sponsors, they were better at promotion, and they were thinking about things a little bit differently, which is good. In Japan, oftentimes they just stick to the formula, but this time around, a lot of things actually were a little bit different, so it was good.

Derek Zheng (05:56) Cool, cool, cool. I noticed that at TGS you showcased two fascinating demos, right? AI Witch and the Fortune Teller. So how was the feedback from the visitors?

Jia Shen (06:07) So obviously I'll say that the feedback was great, but the feedback was really good. We spent a lot of time thinking about the KPIs: attracting people to come, getting people to interact, and having them feel like they walked away with something. That's, I'd like to say, the core challenge for everybody that's attending today's conference, right? For instance, you can have conversational characters, but the question is always gonna be, 'why am I talking to this character?', 'why do I care?' and 'do I walk away with kind of a fun experience?'. And so Tokyo Game Show is a culmination of a bunch of things that we do, but I think it's more interesting to highlight what we have been doing that's allowed us to choose the experiences that we did for Tokyo Game Show. We have a product called DE-AI, and DE-AI is actually two products. One product is an information guidance AI assistant. So it's like a person in a suit speaking, not entertainment, basically providing information. With that, people usually know why they're going to the character, and the character knows exactly what its job is, and that's a very straightforward AI solution. It should do it well, it should be very smart about navigation and communication, and that's it. It should be stupid about everything else. And entertainment characters are totally different.
So, putting it in context for Tokyo Game Show, we have three characters, I think, that are very interesting and that are all in the wild. We did one for the Osaka Expo, and that was for Yamanashi, with a big agency here. We did one for this game center out in Yokohama called Happy Pea Land. And then we did one with this temple called Izumo Taisha, it's a famous, like, Kyoto temple. And they did this big goods pop-up in Omotesando. And that one was our most interesting one because it was a fortune-telling character. With an entertainment character, you need to make sure that they have a very specific reason to exist. And so normally, and we're thinking about Japanese people or tourists at this point, a child will walk up to a character like, 'Oh, this is very entertaining. What's it doing? Oh, there's a bunch of buttons, a bunch of stuff.' But an adult, they don't even know what to say, right? Like, there's this little prompting thing. And so with the fortune teller, the experience is very expected. And it actually can be very, very good and very specialized. And then the user has something to take away. You walk up and the character will just be like, you know, 'What's your zodiac sign?' 'What year were you born?'. It asks you some questions, you expect that, you answer, and they will role-play, for instance, in the case of Izumo Taisha, a specific spirit, and in the case of our witch and fortune teller demo for Tokyo Game Show, it's basically a witch, or actually one of them is one of our handsome guys from one of our Eryx characters. And so that's all actually now a fun experience. It's like a mini game. And when they're done with it, they get to take away a fortune, an omikuji. So the whole experience was actually very good. Because otherwise you'd have an AI character who's like, 'Hi, what's the weather?'
'What do you think about what's happening in global politics?' or whatever. That experience is terrible. So the design between AI characters and user interfaces is very important. And that's what we spend most of our time on.

Derek Zheng (09:14) We've tried a lot of events where we have our AIoT or other things we want to showcase, but we find we always struggle with the poor network at the event and maybe the background noise. So how does your team handle that? We want to learn from you.

Jia Shen (09:31) Well, you know, my background is actually doing social games. It's interesting because, you know, we did social games and now we do a lot of stuff in hardcore AI, right? And AI people are mostly engineers. And so engineers are always like, 'When are you doing smart glasses?' Or like, 'When are you doing simultaneous translation?' Like, basically the most technically advanced and hard things to do. I always think about what the most normal person is going to do, right? And so I think about somebody that's using an iPhone 13 or an iPhone SE, or somebody that's not, you know, thinking too much about even how to use GPT or whatever. They're just not that tech savvy, but they use TikTok every day. They have Line Messenger, that kind of stuff. And so we think more about those types of experiences. And so we've tried all sorts of setups. And our setup right now has two versions that work really well. One is just a big red button. That's it. And that's all they do. No touchscreen, no, like, camera that automatically recognizes somebody's there, no microphone auto-detection. None of that stuff is trustworthy. You need to have the lowest common denominator, best-chance type of thing. Okay. That's how we always think about these types of experiences.
Because obviously the engineers are like, 'I'll take the directional microphone and build an LLM on top of it to basically detect who the speaker is,' and it's just not reliable enough. Right, like we have to make it work in a noisy environment, and so really it's a very directional mic that we really like, that's pretty affordable, and a big red button. And that's our most reliable form of it. For Tokyo Game Show, we tried something different. We have three big buttons so you can input things, but no, we're not big on touchscreens. They're more expensive, people don't want to use them, and they break all the time. And so usually it's a button. Our second version, though, is actually more fun. It's actually a QR code. You can use your own phone. Right, and so you have your character. All Asians are fine with scanning the QR code. And then you can talk using your microphone or you can type, and there are any controls you want. There's like a game controller. You can do anything you want on your phone. And that's our second interaction model, and those are the two models that work best for us.

Derek Zheng (11:36) That's interesting. That's interesting. Thank you for the information. Yeah, so we also want to learn a little bit more about the Japanese style that you have built. We know many other platforms are pushing to be more human-like, right? With the lip-sync, with the body gestures. But you are doing it differently, right? You are pushing for a Japanese anime-inspired style. So why did you choose a different way?

Jia Shen (12:03) There's a lot of reasons for it. The easiest reason to give is that we are in Japan and that animated characters are a widely accepted communication format. The real thing is what we've learned from our VTuber days. Our background at AKA is that we are a VTuber, virtual character company from a technology and an agency standpoint.
And when we entered the business like five years ago, for instance, everybody from America to China was investing big in virtual influencers. There's a company called Genies, China was plowing money into it, Korea was spending a lot of money on it, and everybody was trying to do the same thing. They wanted to do lifelike, realistic virtual celebrities. And none of that worked. You can't say a single one worked, right? None of them exists today. And it's easy for us to say, but the real reason is that the whole point of cartoons is to be a caricature. So first of all, if I were to draw you as a cartoon character, I don't need a digital clone of you. I need a cartoon character version. I need something specific, just like where I was saying with the fortune teller, it was a very concise experience. If I replicated you as an AI assistant, I don't need you, I need the assistant version of you, right? Both the skill set, this subset of you, as well as your visuals should be a subset of you, and I think that's a very important thing to understand. And then when you present it to people, they will also basically understand that it's an AI assistant. We all know that the assistants that we have aren't a replacement for a human. They can't solve everything. And that's important for expectation setting for what you actually put out there. So for us, with virtual characters, the biggest piece for our influencer side is go more caricature, go more cartoon. Make sure that they understand this is another world. And then because of that, they're willing to dive more into it, believe more in it, because now they don't care about the surface. The surface is this beautiful surface, but, like, it's a cartoon, right? And it actually is much more powerful when you are animated. I strongly believe in that.

Derek Zheng (14:02) Yeah, I agree. Like when you push for human-like or lifelike, it always distracts me, like, is it real enough? Yeah.
Jia Shen (14:10) That's what I'm focusing on, just thinking about it the whole time. But for us, when we do VTubers, we want to make them look like anime, right? But when we're doing mascot characters, like we're working on some prefecture characters, they're just hand-drawn, and actually a lower frame rate is better. You don't want it to look that animated. You don't want it to have that high frame rate. The more it feels like a cartoon, the better it is, right? And I think a lot of people would even agree with that.

Jia Shen (14:39) For example, if you take the Crayon Shin-chan movie, we just had the CG movie come out. But the cartoon is better. If you wanted to make an AI character of him, the CG version of him would be okay. But it would be a much better experience if it was just the cartoon character talking to you. So I firmly believe in animated avatars.

Derek Zheng (15:00) I agree, and your success proves it. Okay, so you also mentioned the DE-AI platform several times. Can you explain to our audience what it is and how it serves as a framework for building an AI avatar?

Jia Shen (15:04) Yeah. Yeah, so our company, you know, the founders, me and my brother, both have a technology background, right? And so we built companies based off of technology first. The fundamental base for AKA Virtual is an animation engine for characters. And that allows us to do, you know, generally whatever we want, but we had applied it to VTuber animation. And that's why we actually started looking at AI to power these characters pretty early on, I'd say like three years ago. Because for us, as engineers, you think, I have a great character engine. We have animation. We know how to make an AI character look very, very human. And then the next thing you think is, how do you power it? How do you give it decision making? How do you give it personality? How do you give it intelligence?
So DE-AI is kind of a product manifestation of that. It allows us internally, initially, to build out, like I said, the two different types of AI concierges and assistants. One is very functional, which is to do a job, in this case information in a shopping mall or an airport, or navigation through Shibuya or a train station. Another version is basically a checkout counter, self-register. So you basically get help, like in a convenience store: 'Where is the milk?', that kind of thing, and then it helps you walk through the sales process. And then another key benefit is that all of our characters are multilingual. Japan right now, it's like 45 million tourists this year, right? If you think about it, Japan has 130 million people, and with 45 million tourists every year, there's a language problem too. So that's one: basically functional, vocational AI assistants. The other one is the entertainment ones, right? And that's where, for instance, the fortune teller comes in, and it's usually like a mascot. It's still about information, but they have more personality, and what they need to communicate is more like cultural information, right? So Izumo Taisha is literally a temple. Obviously, the fortune-telling thing is fun, but you need to talk about Buddhism, you need to talk about why the temple exists, why you want to do this type of fortune-telling, why this temple is specifically relevant for this one versus another. And that aspect of it is a different expertise, right? Because the vocational one is some personality and mostly a RAG database. But the cultural one is not, right? Because you have to make sure that it really understands the cultural stuff and can explain it properly. So those are the two pillars of DE-AI.

Derek Zheng (17:40) It sounds very powerful and very impressive. Do you have a plan to open it up so everybody can try it out?
Jia Shen (17:45) Yeah, we're actually building out the platform version of it. The part where we get stuck is that the key thing for us is that we have good-looking characters. And so the first thing that everybody asked for was, 'How can I build my own character? Do you have an avatar builder?' And that's something that we don't have. So with the core platform that we will expose at some point, you have to bring your own character, or you have to use some basic framework to bring your visual character, but then the rest of it works. So we follow, for instance, VRM, the 3D character standard. If you bring a VRM, you can use our platform. If you use Live2D for the 2D version of your characters, you can use our platform. And then the cool part about the backend side of it is that it is infrastructure agnostic. This is why Agora is super interesting and important for us, because we do our own LLMs, but there's the infrastructure piece of it, and we know new models, both from a voice and from an intelligence standpoint, are coming online all the time. We don't intend to lock our platform to one model or another, because we change it ourselves all the time. So we have a little LLM called Shisa, and it's actually basically the strongest Japanese language model. But it's not a model that can rival GPT as far as internet research and all this kind of stuff. And so, for instance, take one of our more advanced agents, a good-looking character that does all this stuff. The first layer it hits is actually Shisa, because it's fast, it's really fast. It'll come back in sub-second while it waits for GPT to think. And GPT is like two seconds, three seconds, 10 seconds, whatever. The smaller AI buys you time. It's like, 'Oh, let me see.' And it basically does the whole conversation thing. And so this is where it's important to be smart about how you mix and match your AI infrastructure, because it's just changing all the time.
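
Editor's sketch: the layering Jia describes, where a small fast model answers first and buys time while a larger model thinks, can be illustrated roughly as below. This is only a minimal sketch under stated assumptions, not AKA Virtual's implementation. The functions `fast_model`, `slow_model`, `respond`, and `run_demo` are hypothetical stand-ins, and the sleep durations simply simulate latency rather than calling Shisa or GPT.

```python
import asyncio
from typing import Callable, List


async def fast_model(prompt: str) -> str:
    """Hypothetical stand-in for a small, low-latency model."""
    await asyncio.sleep(0.01)  # simulated sub-second latency
    return "Oh, let me see..."


async def slow_model(prompt: str) -> str:
    """Hypothetical stand-in for a larger, slower model."""
    await asyncio.sleep(0.05)  # simulated multi-second latency, scaled down
    return f"Here is a fuller answer to: {prompt}"


async def respond(prompt: str, speak: Callable[[str], None]) -> None:
    # Start the slow model immediately so it thinks in the background.
    slow_task = asyncio.create_task(slow_model(prompt))
    # Speak the fast model's filler as soon as it is ready.
    speak(await fast_model(prompt))
    # Then speak the full answer once the slow model returns.
    speak(await slow_task)


def run_demo(prompt: str) -> List[str]:
    """Collect what the character would say, in order."""
    spoken: List[str] = []
    asyncio.run(respond(prompt, spoken.append))
    return spoken


if __name__ == "__main__":
    for line in run_demo("Where is the milk?"):
        print(line)
```

Because the slow request is started before the filler is awaited, the two latencies overlap instead of adding up, which is the point of putting the small model in front. Swapping either stand-in for a different model would not change `respond`, which loosely mirrors the "infrastructure agnostic" idea from the conversation.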
Derek Zheng (19:33) Nice, nice. Thank you for sharing. Next, I want to learn a little bit more about your business model. As far as I understand, you are not only creating the characters, right? You are also managing and operating them. So if you were to choose, what would be the best description of your company? Would it be a technical innovation company, a creative studio, or maybe a talent management agency?

Jia Shen (20:00) Yeah, we're a complicated company, but it's actually pretty easy to understand, in the same way that Amazon's not a book company anymore, right? Yeah, they built out all this. You know, they have AWS. AWS is the infrastructure that supports their core business, right? And AKA is built off of similar concepts. So think of it as, you know, really three layers. The base layer is just fundamental technology, right? So we have great 3D character infrastructure and AI LLM infrastructure. But that as a raw ingredient is useless unless you figure out what the product is, right? And so the next layer on top of that is just product. And when I think about product, there are two types of companies that we want to be, and so we're pursuing both. One, the dream, is to be an IP company. Everyone's like, 'I want to be Disney, I want to be Shueisha, Shonen Jump,' or whatever it is. You're like, okay, how do you do that? It's a difficult thing. So for me, that's like, you have to build it. You have to keep trying. You have to build culture or whatever. That's a core piece of what AKA has been for a long time. And that will continue. I'll talk more about that later. But the other side is the AWS side: basically SaaS services, providing tools that are important and useful outside of entertainment and that leverage the same technology foundation. So that's why there are those two pieces, right? And so it's pretty straightforward: great technology.
One side is to help us push towards IP, because we really want to have IP that lasts forever. We do have a lot of good stuff, but it's not at the level where I'd be like, yeah, we're here for a long time. It's not Pokémon, it's not Disney. And then the other side is building technology that's actually services that we sell to people that they need. And so that leverages character technology and language technology. And not both are required, but we sell that.

Derek Zheng (21:42) Gotcha, gotcha. I'm curious which one comes first. When you start a business, are you thinking, 'Okay, I'm going to start with a talent management agency, but I don't have the technology, so let's look for a solution,' or do you start with, 'I have the solution, I have the technology, why don't I just do the management myself?'

Jia Shen (22:02) It's pretty straightforward. Like I said, when you go into entertainment... Okay, let's say you're building solutions for entertainment. You know, we are connected to lots of gaming companies and lots of entertainment companies. But you have to learn how they do it. For me, it's like, say I'm going to make this special camera for the most professional cameraman for Hollywood movies. I've got to know how to do it, right? And the best way to do it is to first try to start your own business doing that. That's where our agency business came from. It's not because we wanted to do agency. Like, I love entertainment, but I have no ambition to be big in entertainment from an agency standpoint. But we built a team to understand how content is created, how content gets popular, and then what the challenges are, right? Because if I can't get, basically, a Southeast Asian creator to use our stuff to become famous, you know, I can't get anybody else to do it. So that's the thing. In startups we call it dogfooding. So we're dogfooding our own product, but the dogfooding actually became a successful business unit.
So it was a bit of an accident, but it's going well, and that's why we went into that space.

Derek Zheng (23:05) Cool, cool, cool. Thanks for sharing, and it is nice to meet you in Japan, in Tokyo. I know you are doing a lot of business here in Japan, but I also learned you have expanded beyond Japan, especially into Indonesia. Can you tell us a little bit more about your operations there and your strategic thinking about that?

Jia Shen (23:26) Yeah, so Indonesia is the agency that I was talking about, right? Which is like the number one agency. Indonesia represents our entire business unit for Southeast Asia. So in Indonesia we represent JKT48, which is a spin-out of AKB48 here in Japan, a very popular group. And then we are also the most popular virtual influencers in Southeast Asia, period. And by certain metrics, our male VTubers are the most popular VTubers in the world, bigger than any VTuber in Japan and in the United States as well. The cooler part is that that business expanded outside of the scope of what virtual and virtual characters are. We are basically a marketing agency and network for anything Japanese and gaming in Indonesia. And so anybody that wants to do promotion for that stuff, they basically come to us for it. Anybody that wants to do music, Japanese or whatever, they work with us because our characters are the most influential characters in the space in Indonesia. I'd like to say that it's because of me, but the person that's running it is a guy named Wally. He's based here as well. He's super cool. He's Indonesian-Chinese, but he's basically super Indonesian. He's been based in Japan for a long, long time. And he's built a great young team that's very self-sufficient, that's very hungry, and that's actually just creating a business for the new world. The interviews are super interesting because of it.
One thing I like to highlight for people, sometimes people are like, why are you looking at Indonesia?

Derek Zheng (24:59) So yeah, we have been talking a lot about the DE-AI platform and the characters, and you just mentioned JKT48. Why don't you show our audience some virtual characters or virtual character groups so they can have a look?

Jia Shen (25:14) The first thing I'd like to show everybody is actually one of our JKT48 concerts in Indonesia. So I'll put it here. I'll turn the sound down a bit. And so for people actually listening, there's a lot of cool things about this video. One is, this is actually the JKT48 girls performing on stage together with the JKT48V girls. So it's both virtual and real. This is also a ticketed event. It was reported that 25,000 people attended this event and paid for tickets to go to it. And then the other cool thing is that you can hear that they're singing the AKB48 song, Fortune Cookie. So it's like the Japanese licensed song going all the way to Indonesia, sung by the artists there, performed by a virtual artist as well, and performed by a virtual artist online too. And what's cool about this is that the virtual artists are additional characters in the group, not virtual replacements of existing characters. They're new characters in it. And so we've done tons of concerts over the years. Everything from big concerts in Japan to, you know, big concerts in the United States. We produce all those things through our cool live technology. I'm actually curious about what else I can show, because a lot of this is proprietary. As in, you have to pay to be there to see it. Yeah.

Derek Zheng (26:23) People are crazy about that. I wish I could be there too.

Jia Shen (26:26) Yeah, well, I mean, this is one of our artists in Indonesia.
You can see, this is just a capture where you can see how many people are online, how many people are donating at the time, and kind of what they call the parasocial relationship, even in these kinds of concerts. This is one of our more produced JKT48 concerts, if you watch this. So they have obviously the real concerts. The virtual concerts are much higher quality. So you can see how big the difference in fidelity is. And these shows are actually done much like Japanese variety shows, where it's a concert, they talk, they play some games. Yeah, like this. And it's a whole deal.

Derek Zheng (27:03) This is very high quality.

Jia Shen (27:04) Yeah, so we spend a lot of time producing this stuff, and, you know, a lot of people that are outside of this world don't necessarily realize it, but it's wildly... I have this other video that I usually show people that do not understand VTubing. Because, you know, you've heard of VTubing, but you probably don't know the scale of VTubing, right? This is probably not important for the podcast, but I have this video that kind of explains how large VTubing is, because people don't understand. Let me see if I have this video. I'll look for that video, but there's another video of a more recent concert of just the JKT48V girls performing. You can kind of see how many people are in the audience. That's crazy. Yeah, just a lot of people. And the quality is awesome. Again, it's one of those ages where LED screens are super affordable. And so doing virtual characters at life size is actually very easy.

Derek Zheng (28:00) And they sing in Japanese, right? And Indonesian people are still crazy about that.

Jia Shen (28:05) Well, I mean, the girls behind them are Indonesian, and so they speak mostly Bahasa and English, and then they sing in Japanese. And so that's where it's like, yeah, I mean, they're very talented people.
That's really good stuff. I can send you a video later about how big VTubing is, because VTubing is actually quite crazy. As in, I don't know what the ranking is right now, but before, if you looked at YouTube Live and ranked the top channels by revenue, out of the top 20, like... From a game? No, no, no, no.

Derek Zheng (28:36) They'd be big influencers.

Jia Shen (28:38) I should send you this video because I think you don't understand how big it is. In Japan, Hololive and Nijisanji, they're all doing okay. The only server is Japan. They're reporting a billion dollars in revenue on virtual influencers. That's all they do. And so it must be at least doing okay, right? And America has a very different type of market for it, but it's also just monetized quite well. I'm going to show you this video just to talk you through it. You can figure out whether you want to use it later. Very small. So this is my introduction video about VTubers, about virtual influencers. So 50 billion views annually. 6% of all global views, not just Japan, are VTubers. 16 out of the top 20 channels are live streaming VTubers. This is Pekora. She makes 1 million in ad revenue, 3 million in super chat. This is US dollars. One more time. Yes, she has 2.7 million subscribers. So she makes $4 million per annum, based off of her. And this girl, her peak stream had 735,000 viewers. That stream made a quarter million dollars. Yeah, at the time it's hot. Maas has at her peak, 326 subscribers on Twitch. Wow. That's paid subscribers. Monthly. Yeah, like top female Twitch streamer, period. Not just virtual. Okay.

Derek Zheng (29:40) Monthly.

Jia Shen (29:54) All these girls make millions of dollars. Lots of them. That's awesome. They were part of the way we helped out with the app. They were at the Dodgers game this year too and last year. This was our ticket to 48B.

Derek Zheng (30:06) Well...

Jia Shen (30:07) Yeah, they're meaningful influencers across the entire world.
It's still niche, but it's not as big. Derek Zheng (30:15) Do you see anybody using VTubers in technology sharing or education, besides entertainment? Like, I'm thinking we could do VTuber sharing as well, but about how to build developer applications, something like that. Jia Shen (30:23) What do you mean? Oh yeah, I mean, VTubers are being used everywhere now. I think a lot of people do use VTubing for all sorts of content creation. For virtual? Well, I mean, as a virtual character doing... When we first started doing VTubing, I was excited, obviously, for a bunch of animation things and whatever, but I was excited for women, actually. If you think about it, so many influencers, women, they're popular because they're pretty or something else. But they're basically getting harassed, and exposing their identity is pretty dangerous. And so what's cool about VTubing is that you can be a great creator, you can look cute, but you don't have to show your face, right? Yes. And so that's awesome. I mean, it's good for me as a guy too. I'm like, I don't have to look like Brad Pitt to be an actual influencer. To me that's very meaningful, right? Judge me for my content and my personality, don't judge me for how I look. And so obviously, for all sorts of educational content, VTubing is being used. People use software to just be a virtual character, but then do all sorts of tutorials. Like, programming tutorials are even using just AI voices, right? Because people, they don't want to use their own voice, right? And that's more effective. I mean, this stuff's already here, definitely. I think it's more interesting where you could get a famous character to do a tutorial on how to work on something. And with virtual, it's a lot easier to do that. Derek Zheng (31:48) Nice, nice, nice. And it's very exciting to see how you connect virtual characters with real human beings. For the others, we see a lot of AI applications. It's just for AI.
But now I see, for the concert, you put AI and human beings physically together. Jia Shen (31:56) Right. Right. It's amazing. Yeah, immersion. Yeah. Yeah, people want to believe. And that's why animation is also an important piece, right? Okay. Derek Zheng (32:12) Yeah. We want to learn a little bit more about the history, like how do you see large language models having influenced this space? Did virtual idols exist before that, or did the LLMs actually create this economy? Jia Shen (32:19) Okay. Yeah, to be very clear, virtual idols and LLMs don't mix. If you want to think about VTubers, which are different from virtual influencers, VTubers are strictly human. At least that's how they think about it anyway. But the concept of virtual influencers, where different types of creation tools help you become a better influencer, that's proven, that's here. And there are a lot of different cool models to look at. I think LLMs are, in some ways, something people can be scared of, but to me it's always been a way to improve your skills, to augment your ability to create content. Today, VTubers' biggest problem is actually your typical live streamer problem, which is burnout. They have to stream like four to eight hours every day. And it's not just a day job, it's being in front of a camera, like 10,000 people talking to you, you have to be cool, or whatever, you have to go pee, you have to go walk away somewhere else. You can do it for some time, but they always burn out. They just can't do it forever. And so the lifespan of an influencer is like three years usually, on average. And social media requires you to be online all the time. And so one of the things that we are... we're still experimenting with it, but we do think that it will be successful, is basically mixed influencers: core hours are human, like streaming, and then the rest of it is actually AI. That's what we're thinking about.
And I think what pairs very well with that for us is a lot of the e-commerce stuff and a lot of other, more basic interactions with their fans, right? And so it's really about how to move on from virtual influencers. VTubers is, I think, too limiting. I think virtual characters is the definition I'd like to go with, because... you know Mickey Mouse, he's everywhere. You can go to Disneyland. You know, there can only be one Mickey Mouse at Disneyland. You know that? Yeah, right? So you can't have like five Mickey Mouses around here. Oh, you'd turn around the corner, there's another Mickey Mouse. But you can have Mickey Mouse in other Disneylands, right? And then he could be online and he could be on TV. That's okay. And then being able to take this concept of, oh, I'm not an influencer, I'm a character, where the live streaming element is not the key point of your character, that is, I think, where things change. And so for doing virtual characters for the sake of content creation, being IP, and other types of influencer strategy, I think AI, LLMs, are very, very important. Derek Zheng (34:47) Cool, yeah, I think we have the same idea at Agora. We're also working with a lot of live e-commerce platforms, and we do see the challenges. The host cannot always be online, right? They need to take a rest. But the revenue generated is based... So we provide them the RTC SDK: audio, video, chat, signaling, maybe for ordering. For now, we are also trying to... Jia Shen (35:00) When they're on. Derek Zheng (35:10) ...help them with AI technology to do the sale. Let's circle back to the virtual avatar a little bit. So we saw the concert. We want to learn, from a technical perspective, about singing. Technically, is it harder than just normal speaking? Or is it just the same? Yeah, AI singing. Jia Shen (35:27) You mean AI singing?
Well, we don't do AI singing for our concerts, but AI singing is, I mean, in real time, no, but at this point generated AI singing is very good. It's the same thing as songwriting, you just need to make sure you write specifically for the song that you want. Yeah, I'm not sure about your actual question, because from a voice model standpoint... Derek Zheng (35:52) That's my question, from a voice model perspective. Jia Shen (35:54) Yeah, voice and music models right now are great. They're very good. They're a little slow. That's our bigger problem. We focus mostly on the speaking part, because we want our AI guys and girls to feel real. The Western models, like ElevenLabs or whatever, they're really good, except their Japanese is terrible. It's just really not good. And then the models that are good at AI Japanese are usually the Chinese models. You know, we have our own model too that we think is pretty good. But it's like an instrument, really: it's good at certain types of voices, but other types of voices, no, right? And so, because we're in the business of creating characters, yes, it's not just like, this is the best male Japanese voice. Great, that's okay, but I need to have this anime girl, or this famous character that's really pitchy, or you're replicating this celebrity voice where you have to really make sure you hit the comedic nuances of the voice, and that's a lot harder. And that requires more, really more programming on our part, like I said. We know all the details about it. Yeah, this is why we got to this part of AI, because the entertainment version of Japanese is different from the core version of Japanese, where, you know, people just don't speak like anime characters all the time. But if you're talking to an anime character or whatever, they role-play the voice and even the words that they use.
And that's a lot of fun, and that's where a lot of specialization can happen. Derek Zheng (37:22) Cool, cool, cool. Yeah, it is great for Agora to be in a partnership with AKA Virtual, like you mentioned. We also want to get some ideas from you, like what do you see that we can build together in the future? Jia Shen (37:39) Yeah, we are looking at basically spinning up new agents every day, right? And this is where, you know, we're building on our general DE-AI platform to allow us to add our specific special sauce, which is the characters in front, and kind of the general features that our clients want. But we are also getting asked to build new agents really every day. It's kind of crazy. Often with that type of stuff, it's like, what is the plumbing underneath for it... At some level, we don't want to worry about that stuff. That's why an infrastructure company that's thinking about it... you go to AWS or you go to, like, we talk to everybody. Really, really. And they'll give you plumbing up until a certain point. But we just need a little bit more. It's really about, basically, an agent needs to have a brain, needs to have a voice, needs to have some sort of response stuff. And then it has to have some sort of SLA to it. Because our human, our character, needs to talk at a pace, that's important. So that is why Agora as a service is very important. Because I think the last time I came to your event, you had a couple of companies that basically have Agora on a chip. And that really accentuates the value after. Because I know I want to do the stuffed animal. I know which model I want to have in it. How do I get it into this teddy bear? You're like, those companies are already taking it, put the system infrastructure-wide, plugged in, I can just calibrate it online. That's stuff that we could probably do ourselves, but I don't want to. And at scale, that's the whole point, right?
Infrastructure, but infrastructure further, right up to the agent, is what's important. Derek Zheng (39:15) Cool, cool, cool. Thank you for the opinions. Finally, we also would like to hear your thoughts about the future of the virtual economy. So do you think it is good enough, or is it still going to grow? Jia Shen (39:28) The virtual economy, it's such a broad term, right? I had a good conversation with our Indonesia team today. Our business there is robust, but Indonesia is not a big money-spending country. They don't spend a lot of money. People are always like, what's your average revenue per user, that kind of stuff. And the truth of it is that they're not understanding how to approach the common person. The funny thing, the thing that I point to, is that, you know, for TikTok EC, e-commerce, basically their online store, their second-largest country in the entire world, or I guess third: it's China, America, Indonesia. Really? Indonesia's huge. It ranks at the same level as the United States as far as annual revenue. Okay. Right. And it's a pure volume thing. They just have a lot of young people that want to spend on very small, affordable things. And you can buy a lot of small, affordable things, I guess, like cards and whatever, but in the end, the best small, affordable thing that they should be buying is virtual. Right. And so, yeah, I think virtual economy... I can answer this question like 3,000 different ways. The way to do things digitally is much better than the offline way. And I think when we originally entered the virtual character space, it was a very important realization: young people don't care if your influencer is a real person or not. They just want to believe in the value, and that is, they believe that it's real to them. And I think that's the same for any product that they want to buy. Right. So if you have the proper virtual product, virtual experience, whatever it is, yeah, they'd do it virtually.
Like, this world is, you know, changing so drastically. And I thought that five years ago, but even in this month, it's actually been very, very obvious to me. You look at all the AI video generation, it's good. It's too good. You can't tell. You can see something where it's like, I can't tell if this is real or not. Did this really happen? Is this the real news? And so what's going to happen is, anything that looks pretty real, nobody's going to believe it. Right? So the concept of what people value, what they perceive now, it's going to be fundamentally changed. It's really going to change. Before, it was like, I filmed you doing something crazy in the street. You're like, I can't believe you did it. Now you're like, yeah, I don't know if you did that. You no longer care about that video from your iPhone anymore, because you're just like, that could have been faked. It's at a level where it's so not believable. You just have to basically believe in kind of a core system that you fundamentally care about. So entertainment systems, other types of virtual subsystems, those are all the important pieces. The next generation's concept of what's real, what's valuable, is going to be very, very different. Derek Zheng (41:59) Cool, cool, cool. Thank you, Jia, for joining today's podcast and sharing your ideas and insights as the CEO and founder of AKA Virtual. It's a real pleasure to learn about AKA Virtual's journey and your vision for the next generation of the virtual economy. Before we close out, do you have any words for our audience? Any advice or suggestions for AI creators? Jia Shen (42:23) I think I've talked a lot already. I think the easy thing is that everybody should just start playing with tools all the time. That's it. There's so much power in your hands now. It's really kind of crazy. Derek Zheng (42:34) Thank you, Jia, again. We wish you and your team continued success in the virtual economy, and we're excited to see what comes next.
Awesome. Awesome. So thanks to everyone for tuning in, and see you in the next episode. Bye bye. Jia Shen (42:48) Bye bye, thank you very much.