Yongle Yang (00:00) Dify is an AI platform that helps non-developers build AI applications in just a few minutes, while retaining all the flexibility for users with a technical background to implement any advanced feature they have in mind. This, I'd say, intuitive interface makes everything much easier. I think I will not be able to tell if it's AI or human, maybe in one year or two.
Derek Zheng (00:35) Hi everyone, welcome to the Convo AI podcast. This week, I'm in Japan. It's an amazing opportunity to meet so many industry leaders who have truly unique perspectives on AI. It's always special to have face-to-face conversations with the people who are shaping the future of this industry, and I can't wait to bring those insights to all of you. Today, we have Yang from Dify joining our podcast. Yang has very strong expertise in building agentic AI and has had a lot of success. So Yang, do you want to say hi to our audience?
Yongle Yang (01:11) Hi Derek. Thank you for having me today. My name is Yang. I am a Solution Architect at Dify. I handle integration design and commercial terms with our clients.
Derek Zheng (01:24) OK, let's kick off with your story. Could you briefly introduce yourself and tell us what brought you to the Dify team?
Yongle Yang (01:31) Okay, so the story starts, I think, at the beginning of the AI era. The first time I tried ChatGPT, I fell in love with it, even though I was using GPT-3. It was just so amazing, and I started to look into how to actually build this kind of stuff. I started to look into what neural networks are and what machine learning is. I started to dig into frameworks like TensorFlow and PyTorch. Then I moved on to how to build an AI application, and I started to learn frameworks like LangChain. And then I learned about Dify and instantly joined the company. And since then, I've never stopped.
Derek Zheng (02:16) Wow, okay. So for someone hearing about Dify for the first time, can you describe what Dify is in one sentence?
Yongle Yang (02:25) So Dify is an AI platform that helps non-developers build AI applications in just a few minutes, while retaining all the flexibility for users with a technical background to implement any advanced feature they have in mind.
Derek Zheng (02:44) Very good one, very good one. We are also curious about the meaning behind the name Dify. We heard from someone that Dify stands for Define and Simplify. Is that true? What's the story behind that?
Yongle Yang (02:58) There are actually many versions of what the acronym stands for. A motto from our CEO is that if you want to learn about something, you always have to keep getting your hands dirty. So from my point of view, Dify stands for Do It For Yourself.
Derek Zheng (03:16) Do it for yourself? That's a good one. Okay, so I also did some research about Dify before this podcast. I see Dify is always recognized as an LLM Ops platform, right? Can you explain to our audience what LLM Ops is and how efficient Dify is in practice?
Yongle Yang (03:40) Okay, so basically, when you're trying to build something with AI, there are some components it usually takes, for example prompts, knowledge, and context, and you also have to implement some code logic in the application. What LLM Ops means is that you integrate all these components into one place. And after everything is developed, after everything is done, you also need to monitor and analyze the application you've built. So we also have an analytics dashboard for you to inspect how the application performs in terms of, say, token consumption or active users, that kind of stuff. Integrating all of this together is what makes an LLM Ops platform.
Derek Zheng (04:31) That's LLM Ops. As you mentioned, developers need to integrate so many things to build an AI agent or application. So what was the specific pain point you heard from your developers, from your users, that made them say, "We need Dify"?
Yongle Yang (04:52) Dify has been around for two years, and one of the major challenges we've identified from talking with our clients is that companies have a really hard time with technical onboarding for any kind of AI usage. That's where Dify comes in. I can give anyone a 10-minute lesson on what it takes and how to actually build an AI chatbot with Dify. And I can also let a developer build, in a few minutes, an AI workflow that integrates all the complex features they want to include in their AI application.
Derek Zheng (05:36) That's awesome, that's awesome. And as an SA at Dify, can you walk us through the architecture of Dify? What makes it so efficient that in a few minutes you can build an AI agent with complex features?
Yongle Yang (05:53) Okay, so first of all, the solution is open-source. It's in our GitHub repository, and everyone can inspect our code. You can set up your platform with Docker Compose in about a minute. Or, if you don't want to do all the setting up, you can go to our cloud version, which is free to use. Once you're in, you can create something like a chatbot or a workflow, or you can integrate all your private data into our knowledge base. We also have something called the Marketplace, which provides API calls to third-party services, making those calls reliable, secure, and easy to integrate into your own workflow. The way you should work through the components in Dify is this: you first build a chatbot, and then you might find that an AI chatbot doesn't satisfy your current needs, so you move on to the workflow. In some circumstances, you will need some context engineering to make the AI more reliable, so you integrate that through our knowledge base. And then, if you want to build the application on top of any third-party service, you just integrate one of the plugins from our Marketplace.
Derek Zheng (07:15) Okay, yeah, I see you have so many features and tools on your platform. How can developers immediately identify, "Okay, this is something I need"?
Yongle Yang (07:26) I think it all comes from what's in the developer's mind. The features themselves are very easy to use. Developers just have to decide on the right overall structure for their AI application. For example, in terms of workflows, we have the agentic workflow and the AI workflow. Each of these two kinds of framework has its own strengths and weaknesses, and developers just need to pick the right methodology at the beginning to make the whole process much easier. If you're actually building something with Dify, you'll find all of our features are very easy to use. You just need to worry about the big picture of your project.
Derek Zheng (08:12) So is all of this like a visual AI studio, or do developers integrate with SDKs and APIs? How does it work?
Yongle Yang (08:22) Just like I mentioned, I can teach anyone how to build an AI chatbot or workflow in a few minutes. That's all thanks to our chatbot development interface. All they have to do is write some prompts, connect their private data, and place some guardrails. And if they're using our workflow canvas, it basically just gives them a bunch of nodes that can handle HTTP requests, execute Python or JavaScript code, or, of course, interact with a large language model. So this, I'd say, intuitive interface makes everything much easier.
Derek Zheng (09:04) Nice, nice, nice. Okay, we also want to hear your philosophy on agent design. If you were to give advice to our developers, would you prefer a single smart agent or multiple functional agents to build the app?
Yongle Yang (09:21) I'm really a big fan of agentic systems: leave everything to AI, make AI 100% autonomous.
So in most of our cases, if you want to build an office assistant, you're going to need a multi-agent system. What I mean by that is you need a master agent to assign tasks according to the user's query and then distribute those tasks to the sub-agents, where each sub-agent is specialized in one type of task. For example, handling email, writing email drafts, sending email, and also scheduling meetings, that kind of stuff. So I'm a big fan of multi-agent systems, and the essential idea of a multi-agent system is actually that you integrate all the single-agent systems into one place. So it's not a question of pros and cons between single-agent and multi-agent systems. They work together, and you use the main one, the master agent, to distribute all the work that the single agents should be doing.
Derek Zheng (10:36) Okay, and Dify will take care of the synchronization and workflow between the agents, right?
Yongle Yang (10:43) Yes. It basically assigns everything automatically, based on the judgment of the large language model.
Derek Zheng (10:51) Okay, okay. I also heard you mention that you have an analytics dashboard. Is that to help developers or the company measure agent quality after they build an application?
Yongle Yang (11:06) It doesn't actually work like that, but we do have an open interface for tracing the actual performance of the language model. We also have an annotation feature that helps you mark the desired answer from the large language model. So if the large language model is answering some question right, I would just annotate that. Next time, if it's doing the same job, it will pull up the annotation, find that this is the perfect answer, and try to answer like that one. So we do have this feature, and we have also opened up an interface for tracing, so you can always integrate with LangSmith or Langfuse to inspect the actual output from the large language model.
This inspection of the output can feed back into how you design your prompt and how you make the system more agentic.
Derek Zheng (12:13) So it is a feedback system.
Yongle Yang (12:14) But it doesn't work automatically. You have to modify it; it has to be remediated by a human, of course.
Derek Zheng (12:21) You mentioned there are some metrics in the feedback system. Can you name a few of them and introduce them to us? In the Convo AI system, for example, we measure metrics like TTFB for latency. So what metrics are you using?
Yongle Yang (12:40) Latency is one thing that definitely matters. You have to decide which large language model is the best model for your task. You don't want the reasoning or the output token consumption to be reduced, though. Another metric we use is the accuracy against the desired answer. Basically, you have an evaluator, which is a large language model that checks whether the answer generated by the other large language model fits your desired answer. We usually have a scoring system that puts a score on each answer, and a score between 8 or 9 and 10 would be desirable for most of your work.
Derek Zheng (13:37) I got you, I got you. So you are focusing more on response quality.
Yongle Yang (13:41) Yes, and everything will be done by AI.
Derek Zheng (13:44) Okay, that's great. That's great. We're talking to a lot of customers and partners, and we are seeing the trend of one-sentence-to-app emerging, right? You mentioned that on the Dify platform right now, you need to pull the features and functions you need into the workflow to make it work. I think the one-sentence-to-app idea is that developers just need to use natural language to describe what they want. For example, "I want to build an application for live broadcasting," and it may automatically integrate the Agora audio and video broadcasting SDK into it. Do you think this will come true?
Yongle Yang (14:28) I think this is a very amazing feature. First of all, if you're using Cursor or any other prompt IDE, you just have to tell it what you're trying to build, and it might automatically use Agora or Convo AI for the whole project. At Dify, we are an application builder. We have workflows to implement any kind of use case you have, so we are also trying to build a workflow prompt IDE. All you have to do is type in what kind of feature you want. Or you don't even have to type in the feature; you just type in your use case, what the input is and what the output is, and everything will be integrated together. You will have a fully ready workflow canvas that implements all the features you mentioned. I think it is still in development. It might be ready in two to three months.
Derek Zheng (15:24) Two to three months? That would be awesome. So you will automatically integrate the most popular extensions into it.
Yongle Yang (15:31) As long as it is in our plugin system, any third-party tool will be integrated into that. We also have a code execution sandbox, so you're free to use any available dependencies and packages. So yes, as long as it's in our plugin system, all the extensions and integrations will be bundled into this workflow.
Derek Zheng (16:02) Nice, that would be awesome. You mentioned the plugin system of Dify; at Agora, we call it the extension marketplace. I see you have already onboarded a lot of partners, tools, and applications there. So is the plugin system a big focus for Dify?
Yongle Yang (16:22) Yes. We are seeing many, many use cases from our clients, and essentially what they're trying to do is this: they first need an AI workflow to get their internal data right, and then they want this data to be integrated into their own systems, which are typically third-party tools from Microsoft, from Google, or from some office software provider.
So the big plan is to make every third-party tool available in our Marketplace, and of course, I think we now have Agora ready in our Marketplace as well.
Derek Zheng (17:03) Yeah, so how's the feedback about that?
Yongle Yang (17:06) Alright, so about that, I mean, we can do TTS and STT interaction with just a plain Dify application, but compared to the Agora integration, it is very poor. With Agora, I can build a real-time chatbot with private knowledge in just five minutes, and you can use it on your mobile or on your computer, and everything is just so fast and accurate. I always wonder, how do you guys do that? My pronunciation is sometimes still poor, but it's recognized somehow.
Derek Zheng (17:43) Thank you for the kind words. From your perspective, and based on the feedback you just mentioned on Convo AI, do you think that in the near future conversations with AI will eventually become as frequent and natural as human-to-human ones? Do you think that will happen?
Yongle Yang (18:03) Yeah, I mean, if you're actually building a real-time voice project, you might find that it's sometimes very hard to tell whether it's a human or an AI talking to you. And as long as the voice part keeps getting more interactive, it's getting more case by case. I think I will not be able to tell if it's AI or human, maybe in one year or two.
Derek Zheng (18:30) What do you think is the biggest challenge for now? Is it turn detection? Is it the voice, which can sound very robotic, or the recognition? What's the biggest challenge for now?
Yongle Yang (18:41) I think it's the text. Sometimes the large language model gets very boring. Boring? What it's saying basically doesn't make sense, or it's sometimes very flat.
Derek Zheng (18:48) Boring.
Yongle Yang (18:57) It makes you feel like talking to a large language model is very tedious. Yeah, I think if someone could design a system that's more interactive, that talks in a more human way, that would be awesome.
Derek Zheng (19:11) Gotcha. It requires a lot of engineering effort and business strategy, I think. Cool, so let's circle back to Dify. For your next big release, what would be the thing that would surprise us?
Yongle Yang (19:26) Something that's already in our pre-release repository is a Webhook and Trigger feature. You can basically start a workflow by any activation of a webhook, and whenever a trigger fires, you will also be able to activate the whole workflow. So this is one, and the other is the one I mentioned, the workflow prompt IDE, which will maybe be available in two to three months, and I'm definitely looking forward to that.
Derek Zheng (20:01) I will definitely give it a try. It sounds very awesome. Would there be a free tier that everybody can try?
Yongle Yang (20:04) Yeah, thank you. I think it will be free. I don't know the price. I think it will be available in our open-source version. As long as you connect your own large language model, the IDE will be ready.
Derek Zheng (20:24) Nice, nice, good to know. We also want to talk a little bit about the developer community. Dify has had great success in the developer community. I see your focus on the APAC region, especially Japan, where we are right now. Can you share your insights with us on why Japan is so important to you guys?
Yongle Yang (20:45) So the community starts from GitHub. We currently have 110,000 GitHub stars, so it's a very big project. And when we opened up the free cloud version, I think that was 23rd November, the number of subscriptions and registered users went up very, very quickly. I think that was because of the hype around building AI stuff, and we clearly saw a unique increase in Japan, no matter whether it was free users or subscriptions. The figures looked very good back at that time. For now, I think we are at around 500,000 users, and I think half of that comes from Japan. Really? Yeah. So that would be
Derek Zheng (21:41) 250,000.
Yongle Yang (21:43) I don't know if the number is up to date, but we do have lots of users in Japan, and that's why we had a major event in Tokyo this October. It was called IF Con Tokyo, and around 200 people registered and came in for a one-day presentation on how industry leaders are actually implementing Dify. We also have some community meetups, and we have ambassadors in Japan who help us promote everything and help us organize community meetups, hackathons, that kind of stuff. And I think people here are really amazing. Every time we release a feature, they will try it and go hands-on in like a minute, and post on Twitter and talk about how they like the feature, that kind of stuff.
Derek Zheng (22:53) Nice, nice, nice. So what are maybe the top three most popular features or applications that developers are building right now?
Yongle Yang (23:02) I think the top feature in Dify is the knowledge base. If you're doing context engineering, you would integrate an embedding model, a vector database, and your private data interface together.
Derek Zheng (23:07) Knowledge base.
Yongle Yang (23:22) It's very hard to understand what a vector database is. But in Dify, you don't have to worry about that. You just need to upload your PDF file or connect to, say, your Notion knowledge base, your OneDrive, or your Google Drive. The documents will just be embedded into the vector database. And if you want to query it, you just connect it to your workflow. There is nothing very hard to understand. You just need to do the uploading part, and you will find the accuracy of the retrieval is very good.
Derek Zheng (24:00) Okay, that's pretty straightforward. Cool, cool, cool. So, yeah, I think it's time for us to close the podcast. Before we close: when teams or developers build their first application, what would be your advice to them, as an experienced SA?
Yongle Yang (24:18) I think, for most of us, you should always try to build a project that can replace yourself. You should find the right prompt and the right workflow, and you'll find that as the large language model keeps evolving, it just keeps getting better at doing your job. At some point, you will find, "Wow, this guy can 100% replace me," and I can maybe do something more meaningful. I can maybe do a side hustle, and then this goes on and on. I can let AI do my side hustle, and I can instantly start looking for a new side hustle. Yeah, so basically, you just try to make AI replace yourself.
Derek Zheng (25:10) Okay, okay, okay, that's great. That's great. Any final words for the developers watching the show today?
Yongle Yang (25:18) So yeah, you guys should definitely try out Dify and also Agora. I've been experimenting with Agora this week, and I'm trying to show developers some use cases you could build with Dify and Agora. These are the right tools; they just try to abstract all the complex stuff for you. And yeah, you should definitely try them if you've got a minute.
Derek Zheng (25:48) Nice, thank you so much. It's a pleasure having you here today on our podcast. I wish you and the Dify team continued success in the developer community and in your business.
Yongle Yang (26:00) Okay, thank you, Derek. Thanks for having me. Nice.
Derek Zheng (26:03) So yeah, thanks, everyone, for tuning in. I hope you enjoyed the episode today, and see you in the next one.