Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds)

Authors

Molodchenkov A.

Abstract

This article proposes an algorithm for extracting information from biomedical patents and scientific publications. The algorithm is based on machine learning methods. Experiments on patents from the USPTO database showed that a BioBERT-based model achieved the best extraction quality.

External links

DOI: 10.22363/2658-4670-2023-31-1-64-74

Download PDF from the Discrete and Continuous Models and Applied Computational Science journal website: https://journals.rudn.ru/miph/article/view/34463

Download PDF from the Proceedings of the Institute for Systems Analysis of the Russian Academy of Sciences journal website (in Russian): http://www.isa.ru/proceedings/images/documents/2023-73-1/159-166.pdf

ResearchGate: https://www.researchgate.net/publication/370171808_Methods_of_extracting_biomedical_information_from_patents_and_scientific_publications_on_the_example_of_chemical_compounds

Presentation by Nikolai Kolpakov at the DAMDID 2022 conference.

Citation

Kolpakov N. A., Molodchenkov A. I., Lukin A. V. Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds) // Discrete and Continuous Models and Applied Computational Science. 2023. Vol. 31. No. 1. P. 64–74.