Abstract:
Non-contact video-based physiological measurement has many applications in health
care and human-computer interaction. Contactless health monitoring has become increasingly
important during the SARS-CoV-2 pandemic, a shift that is likely to have a lasting effect on
health care practice. Contactless tools can help reduce the risk of exposing
patients and medical staff to infection, make health care services more accessible, and allow
providers to see more patients. They could also give personal care robots the ability to
monitor people's health. However, objective measurement of vital
signs is challenging without direct contact with a patient.
In this research, we present a video-based cardiovascular vital sign measurement
approach. We trained a multi-task temporal shift convolutional attention network (MTTS-CAN)
on real-world data that we collected to predict cardiovascular measurements. We created the
video-based dataset with multiple input sources (a high-frame-rate webcam at 120 Hz, a phone
front camera at 30 Hz, and an infrared camera at 9 Hz) directed at the participant.
Systematic experimentation on the datasets shows that our approach leads to substantial
(5%-10%) reductions in error compared to the baseline convolutional attention networks
(3D-CAN, Conv-LSTM).
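
The "temporal shift" in MTTS-CAN refers to moving part of each frame's feature channels to
neighbouring frames, so that a 2D convolution can mix information across time. The following
is a minimal sketch of that idea only, not the implementation used in this work. It assumes
each frame is reduced to a plain list of channel values, that one third of the channels is
shifted in each direction (the hypothetical fold_div parameter), and that missing neighbours
are zero-padded.

```python
def temporal_shift(frames, fold_div=3, pad=0.0):
    """Shift channel groups along the time axis (illustrative sketch).

    frames: list of T frames, each a list of C channel values.
    The first C // fold_div channels are taken from the previous frame,
    the next C // fold_div channels from the following frame, and the
    remaining channels are left in place. Frames with no neighbour in
    the required direction receive `pad` (assumed zero-padding).
    """
    n_frames = len(frames)
    n_channels = len(frames[0])
    fold = n_channels // fold_div

    shifted = []
    for t in range(n_frames):
        row = []
        for c in range(n_channels):
            if c < fold:
                src = t - 1      # channel group that looks one frame back
            elif c < 2 * fold:
                src = t + 1      # channel group that looks one frame ahead
            else:
                src = t          # unshifted channels
            row.append(frames[src][c] if 0 <= src < n_frames else pad)
        shifted.append(row)
    return shifted


# Toy clip: 4 frames with 6 channels each; value = (frame + 1) * 10 + channel.
clip = [[(t + 1) * 10 + c for c in range(6)] for t in range(4)]
shifted = temporal_shift(clip)
```

In this toy clip, every frame after the shift holds two channels from the frame before it,
two from the frame after it, and two of its own, which is how temporal context is added
without extra parameters.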