OPTIMIZING VIRAL DNA SEQUENCE CLASSIFICATION WITH DEEP LEARNING AND GENETIC ALGORITHMS

Authors

  • S. Bavankumar
  • Dr. V. Rathikarani
  • Dr. R. Santhoshkumar

DOI:

https://doi.org/10.62643/

Keywords:

Deep learning, viral genome classification, convolutional neural networks, genetic algorithm, DNA sequence encoding

Abstract

DNA sequence classification plays a vital role in biological data analysis, particularly in identifying and categorizing novel viral genomes. Accurate classification of these sequences helps mitigate the risks of viral outbreaks such as COVID-19 by expediting vaccine development. This study introduces a hybrid deep learning model designed to improve the efficiency and accuracy of viral DNA sequence classification. The proposed model combines Convolutional Neural Networks (CNN) with Long Short-Term Memory (LSTM) and bidirectional CNN-LSTM architectures. To further enhance performance, a Genetic Algorithm (GA) was used to optimize the weights of the CNN. GA was selected for its ability to search complex spaces effectively, which strengthens the model's feature extraction. Three encoding techniques were investigated for transforming DNA sequences into numerical formats suitable as model input: k-mer encoding, label encoding, and one-hot vector encoding. In addition, an advanced oversampling method was applied to address class imbalance in the dataset. Among the tested configurations, the GA-optimized CNN hybrid model with label encoding achieved the highest classification accuracy, 94.88%, outperforming the configurations that used k-mer and one-hot encoding.
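
The three encodings named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the integer assignment (A=1, C=2, G=3, T=4, with 0 for unknown symbols), and the k-mer size k=3 are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet; other symbols are treated as unknown


def label_encode(seq):
    """Map each base to an integer (A=1, C=2, G=3, T=4; 0 for unknown)."""
    mapping = {base: i + 1 for i, base in enumerate(ALPHABET)}
    return [mapping.get(base, 0) for base in seq.upper()]


def one_hot_encode(seq):
    """Map each base to a 4-element indicator vector (all zeros for unknown)."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def kmer_encode(seq, k=3):
    """Slide a window of length k over the sequence and map each k-mer
    to an integer ID from a fixed vocabulary (0 for k-mers with unknown bases)."""
    vocab = {"".join(p): i + 1 for i, p in enumerate(product(ALPHABET, repeat=k))}
    seq = seq.upper()
    return [vocab.get(seq[i:i + k], 0) for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    sample = "ACGT"  # hypothetical toy sequence
    print(label_encode(sample))
    print(one_hot_encode(sample))
    print(kmer_encode(sample, k=3))
```

Label encoding yields one integer per base, one-hot encoding yields a length-4 vector per base, and k-mer encoding yields one token ID per overlapping window, so the three differ in both input shape and vocabulary size.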

Published

07-03-2025

How to Cite

OPTIMIZING VIRAL DNA SEQUENCE CLASSIFICATION WITH DEEP LEARNING AND GENETIC ALGORITHMS. (2025). International Journal of Engineering Research and Science & Technology, 21(1), 164-170. https://doi.org/10.62643/