Re-identifying people with wearable health data and machine learning


Researchers at the University of Massachusetts Lowell have identified a new type of privacy attack based on wearable health data. The Person Re-identification Attack (PRI-Attack) uses publicly available, HIPAA-compliant data from wearable health devices to establish the identity of individuals based on heart rate, respiration, and hand gestures, among other signals.

The vulnerability is made possible in the United States by the fact that the Health Insurance Portability and Accountability Act (HIPAA), while requiring that medical data remain anonymous, does not consider raw sensor data (such as skin temperature and accelerometer (ACC) data) to be privacy sensitive. It therefore does not require publicly shared data of this type to be encrypted, or to be subject to the same general protections that it affords to traditional forms of patient data, such as health records.

From vector to visual

A PRI attack uses interpreted image data to discern common patterns that correlate with other types of health data. A person’s skin response, for example, can be assessed from video (photoplethysmography) and correlated with what should be completely anonymous vector information from health monitoring devices such as smartwatches and other wearables. Photoplethysmography yields heart rate data, which can then be matched against de-identified heart data from wearables.

Gesture recognition is another ‘key’ that can be trivially translated from vector data into a visual matrix which, again, allows the interpreted image or video data to be correlated with the apparently anonymous accelerometer information within the health data.
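The correlation step described above can be illustrated with a minimal sketch. It assumes a heart-rate series has already been estimated from video and is time-aligned with each de-identified wearable record; the function names, record IDs and values are hypothetical, and Pearson correlation is used here only as a simple stand-in for whatever matching the actual attack performs.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat series carries no pattern to match
    return cov / sqrt(var_a * var_b)

def best_match(video_hr, anonymous_records):
    """Return (record_id, score) for the record most correlated with the video series."""
    scores = {rid: pearson(video_hr, hr) for rid, hr in anonymous_records.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical data: heart rate in beats per minute, sampled at the same instants.
video_hr = [72, 75, 81, 90, 86, 78]
anonymous_records = {
    "record_a": [65, 64, 66, 65, 67, 66],
    "record_b": [70, 74, 80, 88, 85, 77],
    "record_c": [95, 90, 84, 80, 83, 91],
}
match_id, score = best_match(video_hr, anonymous_records)
print(match_id, round(score, 3))
```

In this toy data the video-derived series rises and falls in step with one record only, which is the kind of linkage that would tie an identifiable person in a video to a supposedly anonymous data stream.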

Hand gesture information from wearable data. Source:
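As a rough illustration of how accelerometer vectors might be turned into a visual matrix, the sketch below rescales each axis of a short window of (x, y, z) samples to a grayscale range and stacks the axes as image rows. The article does not describe the encoding the researcher actually used, so the function name, the min-max scaling and the one-row-per-axis layout are all assumptions.

```python
def acc_window_to_image(samples, levels=256):
    """Map a window of (x, y, z) accelerometer samples to a 3-row grayscale matrix.

    Each row is one axis, each column one time step, each value in 0..levels-1.
    This layout is an illustrative assumption, not the published method.
    """
    if not samples:
        raise ValueError("window is empty")
    image = []
    for axis in range(3):
        values = [s[axis] for s in samples]
        lo, hi = min(values), max(values)
        span = hi - lo
        if span == 0:
            row = [0] * len(values)  # no movement on this axis
        else:
            row = [round((v - lo) / span * (levels - 1)) for v in values]
        image.append(row)
    return image

# Hypothetical three-sample window; real windows would hold many more samples.
window = [(0.0, 1.0, -1.0), (0.5, 1.0, -0.5), (1.0, 1.0, 0.0)]
image = acc_window_to_image(window)
```

Once the motion is in matrix form, it can be compared with gesture patterns extracted from image or video data using ordinary computer vision tooling.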

Sensor data as PII

The research, by UML assistant professor Mohammad Arif Ul Alam, argues that physiological sensing data may indeed constitute personal information, and that it is in effect a biological analogue of the browser fingerprinting techniques currently believed to undermine new initiatives to protect user privacy on the web.

To test the hypothesis, the researcher developed a hand gesture recognition and localization framework that interprets gesture data (recorded vector movement) from a wearable accelerometer, and translates the movements into a visual recording that can be correlated with movements recorded by wearable health devices.

A multimodal Siamese neural network (mm-SNN) was built to interpret the gestural information classified via a support vector machine (SVM). One network processes vector information (interpreted as image information in 3D space) and the second network processes physiological data recorded from the sensor data.
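The two-branch idea can be sketched as follows. This is only a toy forward pass under several assumptions: the article does not give the mm-SNN architecture, so the single tanh layer per branch, the separate (unshared) weights, the feature sizes, the contrastive loss and all names and values here are hypothetical, and no training is shown.

```python
import math
import random

def make_branch(in_dim, out_dim, rng):
    """Randomly initialised weights for one single-layer branch (illustrative only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

def embed(weights, features):
    """Project a feature vector into the shared embedding space with a tanh layer."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]

def distance(u, v):
    """Euclidean distance between two embeddings of equal size."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(dist, same_person, margin=1.0):
    """Pull matching pairs together; push non-matching pairs at least `margin` apart."""
    if same_person:
        return dist ** 2
    return max(0.0, margin - dist) ** 2

rng = random.Random(0)
# Separate weights per branch, because the two modalities differ in input size.
gesture_branch = make_branch(in_dim=6, out_dim=4, rng=rng)
physio_branch = make_branch(in_dim=3, out_dim=4, rng=rng)

gesture_features = [0.2, 0.1, 0.9, 0.4, 0.0, 0.7]  # hypothetical gesture-image features
physio_features = [0.72, 0.16, 0.33]               # hypothetical scaled physiological features

d = distance(embed(gesture_branch, gesture_features),
             embed(physio_branch, physio_features))
loss_if_same = contrastive_loss(d, same_person=True)
loss_if_different = contrastive_loss(d, same_person=False)
```

In a trained model of this kind, a small distance between the two embeddings would suggest that the gesture data and the physiological data belong to the same person.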


The system was tested on a variety of data sets, including a ‘Gamer Fatigue Data Set’ obtained by collecting data on five student volunteers, aged 19 to 25, who played video games for seven days while wearing the Empatica E4 wristband. The wristband is equipped with ACC, electrodermal activity (EDA), skin temperature and photoplethysmography (PPG) sensors.

The E4 was also used in a new ‘restaurant data’ dataset, in which eight volunteers made and ate sandwiches for twenty minutes, and in an ‘elderly’ dataset, where 22 older subjects, aged 75 to 95, performed 13 scripted activities while wearing the device.

Finally, the researchers used the publicly available healthy adult fatigue data set, which followed 28 healthy men and women with an average age of 42 for 1 to 219 consecutive days while they wore a multi-sensor wearable device with data collection capabilities broadly similar to those of the E4, including a 3-axis ACC, a galvanic skin response electrode, temperature and photo sensors, and a barometer.

The results indicate that heart rate and respiratory rate are the most reliable signals for re-identification, with an average accuracy of more than 66%.

PRI-Attack methodology test results. Key: PPG: photoplethysmography; HR: heart rate; BR: respiratory rate; BVP: blood volume pulse (obtained from PPG); IBI: inter-beat interval (obtained from PPG); TC: tonic component of the EDA signal; phasic component of the EDA signal; Temp: temperature.

The research concludes:

“While modern computer vision technology can be easily used to learn hand gestures and the corresponding physiological signal (heart rate, respiratory rate) from a public surveillance camera, these huge amounts of recorded video can be easily used by attackers to learn user-specific biometrics to reveal identity from HIPAA-compliant wearable sensing data.”

HIPAA considers PHR data “anonymized by default”

The US government has recognized the growth of personal health records (PHRs) and defines such records (including data from wearable health devices) as “An electronic record of an individual’s health information whereby the individual controls access to information and may have the ability to manage, monitor and participate in their own health care”.

However, since this is a private-sector phenomenon, the government does not provide any official oversight of this data, having established that it does not contain Personally Identifiable Information (PII). A June 2016 report from the US Department of Health and Human Services on entities not covered by HIPAA states:

‘[Large] gaps in policies regarding access, security and privacy persist, and confusion persists among consumers and innovators. Wearable fitness trackers, health social networks, and mobile health apps are all built on the idea of consumer engagement. However, our laws and regulations have not kept pace with these new technologies.’
