Source code of our ICMR 2020 paper "Deep Semantic-Alignment Hashing for Unsupervised Cross-Modal Retrieval"
Deep hashing methods have achieved tremendous success in cross-modal retrieval due to their low storage consumption and fast retrieval speed. In real cross-modal retrieval applications, label information is often hard to obtain, so unsupervised cross-modal hashing has recently received increasing attention. However, existing methods fail to exploit the intrinsic connections between images and their corresponding descriptions or tags (the text modality). In this paper, we propose a novel Deep Semantic-Alignment Hashing (DSAH) method for unsupervised cross-modal retrieval, which fully exploits co-occurring image-text pairs. DSAH explores the similarity information of the different modalities, and we elaborately design a semantic-alignment loss function that aligns the similarities between features with those between hash codes. Moreover, to further bridge the modality gap, we propose to reconstruct the features of one modality from the hash codes of the other. Extensive experiments on three cross-modal retrieval datasets demonstrate that DSAH achieves state-of-the-art performance.
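The semantic-alignment idea described above (matching hash-code similarities to feature similarities across modalities) can be sketched roughly as follows. This is a minimal NumPy illustration with hypothetical shapes and a simple MSE alignment term; the actual loss, scaling, and reconstruction terms used in DSAH may differ:

```python
import numpy as np

def cosine_sim(x):
    """Row-wise cosine similarity matrix of a feature batch."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T

def semantic_alignment_loss(img_feat, txt_feat, img_codes, txt_codes, bit):
    """Sketch: align cross-modal hash-code similarities with
    feature-level similarities of each modality (MSE)."""
    s_img = cosine_sim(img_feat)               # image feature similarities
    s_txt = cosine_sim(txt_feat)               # text feature similarities
    s_code = (img_codes @ txt_codes.T) / bit   # code similarities in [-1, 1]
    return np.mean((s_code - s_img) ** 2) + np.mean((s_code - s_txt) ** 2)

# Toy batch: 4 image-text pairs, hypothetical feature dims, 16-bit codes
rng = np.random.default_rng(0)
n, bit = 4, 16
img_feat = rng.normal(size=(n, 128))
txt_feat = rng.normal(size=(n, 300))
img_codes = np.sign(rng.normal(size=(n, bit)))  # binary codes in {-1, +1}
txt_codes = np.sign(rng.normal(size=(n, bit)))
loss = semantic_alignment_loss(img_feat, txt_feat, img_codes, txt_codes, bit)
```

Minimizing such a term pushes the binary codes of paired images and texts to reproduce the similarity structure already present in the continuous features.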
- Python: 3.x
- Other dependencies: see `env.yaml`
- Update `setting.py` with your `data_dir`.
- For training, set `EVAL` to `False` in `setting.py`, then run:
  `python train.py`
- For testing, set `EVAL` to `True`; the model will load `checkpoint/DATASET_CODEBIT_bit_best_epoch.pth`.
For the datasets, we follow the GitHub repository of Deep Cross-Modal Hashing (Jiang et al., CVPR 2017). You can download them from:
- Wikipedia articles: [Link]
- MIRFLICKR25K: [OneDrive], [Baidu Pan, password: 8dub]
- NUS-WIDE (top-10 concepts): [OneDrive], [Baidu Pan, password: ml4y]
If you find this code useful, please cite our paper:
@inproceedings{10.1145/3372278.3390673,
author = {Yang, Dejie and Wu, Dayan and Zhang, Wanqian and Zhang, Haisu and Li, Bo and Wang, Weiping},
title = {Deep Semantic-Alignment Hashing for Unsupervised Cross-Modal Retrieval},
year = {2020},
isbn = {9781450370875},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3372278.3390673},
doi = {10.1145/3372278.3390673},
booktitle = {Proceedings of the 2020 International Conference on Multimedia Retrieval},
pages = {44–52},
numpages = {9},
keywords = {cross-modal hashing, cross-media retrieval, semantic-alignment},
location = {Dublin, Ireland},
series = {ICMR ’20}
}
All rights are reserved by the authors.