The paper discusses the main scientific and technical problems of creating methods and software tools for processing unstructured textual data in support of search and rescue operations. We review systems that use data and messages from social media for information and analytical support of response and recovery operations in emergency situations. The paper considers methods for focused crawling, for preliminary parsing of crawled data, and for information extraction from natural language texts. Against the background of this review, we propose approaches to the tasks that arise when developing methods and software tools for processing unstructured data and providing search and analytical support of search and rescue operations. We propose intelligent (ontology-based) crawl strategies for focused crawling and machine learning techniques for classifying individual pages of target resources. Natural language processing and indexing of texts will be performed by means of the Exactus platform. Rule-based approaches as well as machine learning techniques will be adapted to extract information from natural language texts related to emergency situations. Ontological resources and lexicons will be created to extract geographical objects and the names of ships and aircraft from texts. Structured data will be stored in distributed, scalable NoSQL databases, which can load and process huge amounts of data given sufficient hardware. We formulate requirements for software tools that support search and rescue operations. To satisfy these requirements, we propose a distributed service-based architecture. It makes it possible to process large streams of information gathered online and provides scalability, information security, and low cost of implementation and maintenance of intelligent data processing systems.
We plan to experimentally evaluate the considered methods and software tools on openly available retrospective data about emergencies that occurred in the Arctic zone.
e-LIBRARY: https://elibrary.ru/item.asp?id=25454472
At the Radiotekhnika publishing house: http://radiotec.ru/en/journal/Highly_available_systems/number/2015-4/article/17248
Devyatkin D. A., Shelmanov A. O. Processing unstructured textual data for support of search and rescue operations // Highly Available Systems. - 2015. - Vol. 11. - No. 4. - Pp. 45-60.