This paper describes a prototype system that uses video, audio, and text data to recognize states of fatigue and reduced human performance. To this end, the Visual Question Answering (VQA) task is also examined and described in detail, along with features of its implementation drawn from examples in other research. Experiments were conducted on datasets covering a wide range of tasks: standard VQA on the VQA v2 dataset, complex scenarios on CLEVR CoGenT, and analysis of cash receipts on Receipt-AVQA-2023.
DOI: 10.56304/S2949609823010045
Download PDF or read online at the FizMat journal website (in Russian): https://sciencejournals.ru/issues/fizmat/2023/vol_1/iss_1/FizMat2301004Veitsenfeld/FizMat2301004Veitsenfeld.pdf
Download PDF from eLibrary (in Russian, registration required): https://www.elibrary.ru/item.asp?id=57140023
Weizenfeld D. A., Kiselev G. A., Korovin Y. S., Makov S. V. Prototype system for recognizing human fatigue states using video, audio, and text data // FizMat, 2023, vol. 1, no. 1, pp. 65–73.