ISSN: 2582-9793

Biomedical Named Entity Identification using Machine Learning

Original Research (Published On: 12-Mar-2026)
DOI: https://doi.org/10.54364/AAIML.2026.62288

Saba Mahmood, Mehroz Sadiq, Fatima Khalique, Riad Alherby, Sachi Arafat and Ali Daud

Adv. Artif. Intell. Mach. Learn., XX (XX):-

1. Mehroz Sadiq: Bahria University

2. Fatima Khalique: Bahria University Islamabad

3. Saba Mahmood: Bahria University Islamabad

4. Riad Alherby: University of Jeddah

5. Sachi Arafat: Sarafat@kau.edu.sa

6. Ali Daud: Adaud@ra.ac.ae



Article History: Received on: 03-Dec-25, Accepted on: 05-Mar-26, Published on: 12-Mar-26

Corresponding Author: Saba Mahmood

Email: smahmood.buic@bahria.edu.pk

Citation: Mehroz Sadiq, et al. Biomedical Named Entity Identification using Machine Learning. Advances in Artificial Intelligence and Machine Learning. 2026. (Ahead of Print). https://dx.doi.org/10.54364/AAIML.2026.62288


Abstract

    

Recognizing drug and chemical named entities and automatically extracting relevant information from biological literature remain difficult tasks for domain experts. Data mining techniques are needed to build a system that extracts this information automatically, reducing the effort of finding it manually. This paper proposes a methodology for recognizing biological entities in which five entity types (Protein, DNA, RNA, Cell type, Cell line) are recognized. At the core of the solution, the Biomedical Named Entity Recognizer, are Conditional Random Fields (CRFs) trained on orthographic and contextual features to segment and label sequence data. The system can also interpret chemical formulas. It annotates chemical entities using 3000 abstracts as training data, 3500 abstracts as development data, and 14000 records containing a 7000-record subset as test data. The results are encouraging: 92.2% precision, 93.2% recall, and 92.48% F-score for Chemical Entity Mention in Patent (CEMP), and 92% precision, 95.21% recall, and 93.4% F-score for Chemical Passage Detection (CPD).
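To make the notion of orthographic and contextual features concrete, the following is a minimal sketch of how per-token features for a CRF sequence labeler might be computed. It is not the authors' feature set: the function names, the feature names, and the formula heuristic are all hypothetical and chosen only for illustration.

```python
import re

# Crude, hypothetical heuristic: two or more element-like symbols (e.g. "NaCl", "H2O").
# It also matches all-caps acronyms such as "DNA", so it is only a weak signal.
FORMULA_PATTERN = re.compile(r"(?:[A-Z][a-z]?\d*){2,}")


def token_features(tokens, i):
    """Return illustrative orthographic and contextual features for tokens[i]."""
    tok = tokens[i]
    feats = {
        # Orthographic features: derived from the surface form of the token itself.
        "lower": tok.lower(),
        "is_upper": tok.isupper(),
        "is_title": tok.istitle(),
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "suffix3": tok[-3:],
        "looks_like_formula": bool(FORMULA_PATTERN.fullmatch(tok)),
    }
    # Contextual features: derived from the neighbouring tokens.
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def sentence_features(tokens):
    """Return one feature dictionary per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    example = ["IL-2", "gene", "expression", "requires", "NaCl"]
    for token, feats in zip(example, sentence_features(example)):
        print(token, feats)
```

A list of such dictionaries per sentence, paired with a label sequence, is the kind of input that CRF toolkits commonly accept; the features actually used by the system described here are not specified at this level of detail in the abstract.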
