TensorFlow implementation of Supervised Pre-training with LODO:
This work was published at SMARTCOMP 2023.
Combining Public Human Activity Recognition Datasets to Mitigate Labeled Data Scarcity [Paper]
Riccardo Presotto, Sannara Ek, Gabriele Civitarese, François Portet, Philippe Lalanda, Claudio Bettini
If our project is helpful for your research, please consider citing:
@inproceedings{presotto2023combining,
title={Combining Public Human Activity Recognition Datasets to Mitigate Labeled Data Scarcity},
author={Presotto, Riccardo and Ek, Sannara and Civitarese, Gabriele and Portet, Fran{\c{c}}ois and Lalanda, Philippe and Bettini, Claudio},
booktitle={2023 IEEE International Conference on Smart Computing (SMARTCOMP)},
pages={33--40},
year={2023},
organization={IEEE}
}
11/07/2023 Initial commit: Code of LODO is released.
This code was implemented with Python 3.7, TensorFlow 2.10.1 and CUDA 11.2. Please refer to the official installation guide. If CUDA 11.2 has been properly installed:
pip install tensorflow==2.10.0
Another core library of our work is Hickle, which handles data storage. Run the following command to be able to run our data partitioning scripts:
pip install hickle==5.0.2
To run our training and evaluation pipeline, additional dependencies are needed. Install them with the following command:
pip install -r requirements.txt
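After installing, it can help to confirm that the pinned versions are the ones actually present in the environment. The following is a minimal sketch and not part of the repository: the pins are copied from the pip commands above, and the helper names `installed_version` and `check_pins` are hypothetical. On Python 3.7 it assumes the `importlib_metadata` backport is available.

```python
import sys

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is installed.
    from importlib_metadata import PackageNotFoundError, version

# Pins taken from the pip commands in this README.
PINS = {"tensorflow": "2.10.0", "hickle": "5.0.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for package, (expected, found) in check_pins(PINS).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty result from `check_pins(PINS)` means both packages match the pinned versions.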
Our baseline experiments were conducted on a Debian GNU/Linux 10 (buster) machine with the following specs:
CPU : Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz
GPU : Nvidia GeForce Titan Xp 12GB VRAM
Memory: 80GB
We provide scripts to automate downloading and preprocessing the datasets used in this study, with the exception of the MobiAct dataset, which must be requested manually from its authors. See the scripts in the dataset folders; e.g., for the UCI dataset, run DATA_UCI.py
For running the 'Combined' dataset training and evaluation pipeline, all datasets must first be downloaded and processed. Please run all scripts in the 'datasets' folder.
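Running every processing script by hand can be tedious, so a small driver may be convenient. The sketch below is not part of the repository. It assumes that every processing script follows the DATA_*.py naming seen in DATA_UCI.py, that the scripts live somewhere under the 'datasets' folder, and that each script is meant to be run from its own directory. The function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_dataset_scripts(root):
    """Return all DATA_*.py scripts found under root, in sorted order."""
    return sorted(Path(root).rglob("DATA_*.py"))


def run_dataset_scripts(root="datasets"):
    """Run each script from its own folder and return the ones that failed."""
    failed = []
    for script in find_dataset_scripts(root):
        print(f"Running {script} ...")
        # Assumption: scripts resolve their paths relative to their own folder.
        result = subprocess.run([sys.executable, script.name], cwd=script.parent)
        if result.returncode != 0:
            failed.append(script)
    return failed
```

Calling `run_dataset_scripts("datasets")` from the repository root returns the list of scripts that exited with an error, which can then be retried or replaced by a manual download.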
Tip: If the download keeps failing via the provided scripts, manually downloading the datasets and placing them in the 'datasets/dataset' folder may be a more stable alternative.
UCI
https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones
MotionSense
https://github.com/mmalekzadeh/motion-sense/tree/master/data
HHAR
http://archive.ics.uci.edu/ml/datasets/Heterogeneity+Activity+Recognition
RealWorld
https://www.uni-mannheim.de/dws/research/projects/activity-recognition/#dataset_dailylog
PAMAP2
https://archive.ics.uci.edu/dataset/231/pamap2+physical+activity+monitoring
MobiAct
The MobiAct dataset is only available upon request from its authors. Please contact them to request access at:
https://bmi.hmu.gr/the-mobifall-and-mobiact-datasets-2/
We provide both a Jupyter notebook (.ipynb) and a Python script (.py) version of all the code.
After downloading and running all the DATA processing scripts in the dataset folder, launch the LODO.ipynb jupyter notebook OR LODO.py script to partition the datasets as used in our study.
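To illustrate the idea behind the partitioning, the sketch below enumerates the splits, reading LODO as leave-one-dataset-out: each dataset is held out once for testing while the remaining ones form the training pool. This is only an illustration under that assumption and is not the logic of LODO.py, which also handles the data itself. The dataset names are the values accepted by the 'testingDataset' flag, and `lodo_splits` is a hypothetical name.

```python
# Dataset names as accepted by the 'testingDataset' flag.
DATASETS = ["HHAR", "MobiAct", "MotionSense", "RealWorld_Waist", "UCI", "PAMAP"]


def lodo_splits(datasets):
    """Yield (training_datasets, held_out_dataset) pairs, one per dataset."""
    for held_out in datasets:
        training = [name for name in datasets if name != held_out]
        yield training, held_out


for training, held_out in lodo_splits(DATASETS):
    print(f"test on {held_out}, pre-train on {', '.join(training)}")
```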
After running the provided LODO scripts, run the Supervised_Pretraining.ipynb Jupyter notebook OR the Supervised_Pretraining.py script to start our pre-training pipeline.
An example command is shown below:
python3.7 Supervised_Pretraining.py --architecture hart --testingDataset MotionSense --SSL_epochs 200 --SSL_batch_size 64 --finetune_epoch 50 --finetune_batch_size 128
To select different architectures for the pre-training, change the value of the 'architecture' flag to one of the following:
mobilehart_xs,hart,ispl,deepConvlstm
To select the left-out dataset, change the value of the 'testingDataset' flag to one of the following:
HHAR,MobiAct,MotionSense,RealWorld_Waist,UCI,PAMAP
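To evaluate one architecture with every dataset left out in turn, the example command can be generated in a loop. The sketch below is a convenience wrapper and not part of the repository. It reuses only the flags and hyperparameter values from the example command above, and the helper names `build_command` and `planned_commands` are hypothetical.

```python
import subprocess

ARCHITECTURES = ["mobilehart_xs", "hart", "ispl", "deepConvlstm"]
DATASETS = ["HHAR", "MobiAct", "MotionSense", "RealWorld_Waist", "UCI", "PAMAP"]


def build_command(architecture, testing_dataset, python="python3.7"):
    """Build the argument list for one run, using the values from the example command."""
    return [
        python, "Supervised_Pretraining.py",
        "--architecture", architecture,
        "--testingDataset", testing_dataset,
        "--SSL_epochs", "200",
        "--SSL_batch_size", "64",
        "--finetune_epoch", "50",
        "--finetune_batch_size", "128",
    ]


def planned_commands(architecture):
    """Yield one command per left-out dataset for the given architecture."""
    for dataset in DATASETS:
        yield build_command(architecture, dataset)


def sweep(architecture, dry_run=True):
    """Print the commands, or execute them one after another if dry_run is False."""
    for command in planned_commands(architecture):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Calling `sweep("hart")` prints the six commands without running them; `sweep("hart", dry_run=False)` executes them sequentially and stops at the first failure.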
Due to file storage limits, the pre-trained models are available at the following link:
https://drive.google.com/file/d/1W7eE-DV_tLraSdTIiGVet38EPi2_C-lc/view?usp=drive_link
This work has been partially funded by Naval Group and by MIAI@Grenoble Alpes (ANR-19-P3IA-0003), and was granted access to the HPC resources of IDRIS under the allocation 2023-AD011013233R1 made by GENCI.
Part of this research was also supported by projects SERICS (PE00000014) and by project MUSA – Multilayered Urban Sustainability Action, funded by the European Union – NextGenerationEU, under the National Recovery and Resilience Plan (NRRP) Mission 4 Component 2 Investment Line 1.5: Strengthening of research structures and creation of R&D “innovation ecosystems”, set up of “territorial leaders in R&D”.