A NOVEL MULTIMODAL DEEP LEARNING FRAMEWORK FOR DRUG-TARGET INTERACTION USING LLM AND KAN

1 Dr A Praveen, 2 A Shiva Vaibhav, 3 B Harshith, 4 A Aditya, 5 A Sivakrishna

doi:10.62643/

Abstract

This project presents an AI-based image generation system with user-controlled attributes, built on the Stable Diffusion model, that produces high-quality, customizable images from textual prompts. The system integrates a latent diffusion model with a conditional guidance mechanism that lets users manipulate attributes such as color, style, and object features. The methodology comprises text encoding with CLIP, noise initialization in latent space, and iterative denoising guided by user-defined parameters. The Stable Diffusion model is fine-tuned on publicly available datasets such as LAION-5B and on custom curated datasets to improve attribute control accuracy. The algorithm combines prompt-based conditioning with classifier-free guidance to balance creativity and precision. Experimental results show that the proposed system improves image fidelity, attribute consistency, and generation speed relative to baseline models. Quantitative evaluation using the FID score and qualitative user feedback confirm the improved performance and usability. The system is efficient, scalable, and suitable for real-world applications such as digital art, content creation, and design automation.
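
The abstract names classifier-free guidance but does not spell out the combination step. As a minimal sketch, and assuming the standard formulation rather than the authors' exact implementation, each denoising step blends an unconditional noise prediction with a prompt-conditioned one, weighted by a guidance scale. The function name `cfg_combine`, the parameter `guidance_scale`, and the toy values below are illustrative assumptions and do not come from the paper.

```python
from typing import List


def cfg_combine(eps_uncond: List[float],
                eps_cond: List[float],
                guidance_scale: float) -> List[float]:
    """Blend unconditional and prompt-conditioned noise predictions.

    Standard classifier-free guidance (assumed here, not confirmed by the paper):
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("noise predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions standing in for the denoiser's outputs (hypothetical values).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

# A scale of 1.0 reproduces the conditional prediction unchanged.
unguided = cfg_combine(eps_uncond, eps_cond, 1.0)

# A scale above 1.0 pushes the prediction further toward the prompt.
guided = cfg_combine(eps_uncond, eps_cond, 2.0)
```

In this formulation a larger guidance scale trades sample diversity for closer adherence to the prompt, which matches the balance between creativity and precision that the abstract describes.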