Improving Data Entry Accuracy Using Distil-BERT: An Efficient Extension to BERT-Based NLP Models
DOI: https://doi.org/10.62647/

Keywords: NLP, BERT, classification, data validation, risk management

Abstract
In corporate applications, accurate data entry is essential for process efficiency and sound decision-making. Existing systems rely largely on traditional machine-learning pipelines such as TF-IDF + SVM and Word2Vec + SVM, which capture little contextual information, so their accuracy on data-validation tasks remains modest. To overcome these limitations, the proposed system applies modern NLP and deep-learning methods, in particular the BERT model, to automatically check the quality of data entries. BERT's bidirectional transformer architecture captures deep semantic and contextual relationships, making it far more effective at detecting missing or incorrect data. In addition, an improved variant, Distil-BERT, is developed to increase efficiency by reducing model complexity and computation time while maintaining or improving classification accuracy. The resulting model is lightweight yet powerful, making data validation faster, more scalable, and less resource-intensive, and therefore better suited to real-time business applications.
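To make the comparison concrete, the classical TF-IDF + SVM baseline that the abstract describes can be sketched as a small text classifier over data-entry strings. This is an illustrative sketch, not the paper's code: the training examples and valid/invalid labels below are synthetic, and scikit-learn is assumed as the implementation library.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data: free-text entries labeled valid (1) or invalid (0).
entries = [
    "invoice 1042 paid in full on 2024-03-01",
    "customer address 12 Baker Street London",
    "purchase order approved by finance department",
    "asdf qwerty lorem placeholder text",
    "n/a unknown missing value todo",
    "xxxx test test dummy entry",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF features (unigrams + bigrams) feeding a linear SVM,
# as in the baseline pipelines the abstract refers to.
baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
baseline.fit(entries, labels)

print(baseline.predict(["order confirmed by finance on 2024-05-02",
                        "dummy placeholder n/a"]))
```

Because TF-IDF treats entries as bags of (n-)grams, this baseline cannot model word order or deeper context; in the proposed system it would be replaced by a fine-tuned Distil-BERT classifier (for example via a transformer library), which is what the abstract credits with the accuracy gains.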
Downloads
License
Copyright (c) 2025 E. Monisree, Mr.R. Althaf (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.