biclustlib is a Python library of biclustering algorithms, evaluation measures and datasets distributed under the GPLv3 license.
This library is under active development. We expect to review its code and release a first version soon.
First, make sure that all dependencies are installed:
- Python packages listed in requirements.txt;
- R >= 3.5;
- biclust R package;
- isa2 R package;
- Other libraries may be required by the third-party implementations wrapped in this package.
If any of the Python packages are missing, you can install them with:
pip install -r requirements.txt
Once all dependencies are installed, install biclustlib with:
python setup.py install
If you use biclustlib in a scientific publication, we would appreciate a citation of the paper in which this library was first mentioned and used.
To cite biclustlib use: Padilha, V. A. & Campello, R. J. G. B. (2017). A systematic comparative evaluation of biclustering techniques. BMC Bioinformatics, 18(1):55.
For TeX/LaTeX:
@article{padilha2017,
title={A systematic comparative evaluation of biclustering techniques},
author={Padilha, Victor A and Campello, Ricardo J G B},
journal={BMC Bioinformatics},
volume={18},
number={1},
pages={55},
year={2017},
publisher={BioMed Central}
}
- Bi-Correlation Clustering Algorithm (BCCA);
- Bit-Pattern Biclustering Algorithm (BiBit);
- Cheng and Church's Algorithm (CCA);
- Large Average Submatrices (LAS);
- Plaid;
- Conserved Gene Expression Motifs (xMOTIFs);
- Factor Analysis for Bicluster Acquisition (FABIA) (wrapper for the pyfabia package);
- Spectral Biclustering (wrapper for the scikit-learn implementation);
- Binary Inclusion-Maximal Biclustering Algorithm (Bimax) (wrapper for the biclust package);
- Cheng and Church's Algorithm (CCA) (wrapper for the biclust package);
- Plaid (wrapper for the biclust package);
- Iterative Signature Algorithm (ISA) (wrapper for the isa2 package);
- Conserved Gene Expression Motifs (xMOTIFs) (wrapper for the biclust package);
- Bayesian BiClustering (BBC) (wrapper for the executable of the authors' original implementation);
- Binary Inclusion-Maximal Biclustering Algorithm (Bimax) (wrapper for the executable of the authors' original implementation);
- Order-Preserving Submatrix (OPSM) (wrapper for the BicAT software using the executable jar file available here);
- QUalitative BIClustering (QUBIC) (wrapper for the executable of the authors' original implementation);
- RInClose (wrapper for the executable of the authors' original implementation).
All the binaries are available with biclustlib and are compiled for the x86_64 architecture.
- Saccharomyces cerevisiae microarray dataset from Tavazoie et al. (1999) which was used in (Cheng and Church, 2000);
- Saccharomyces cerevisiae and Arabidopsis thaliana microarray datasets used in (Prelić et al. 2006);
- Benchmark of 17 Saccharomyces cerevisiae microarray datasets compiled and preprocessed by Jaskowiak et al. (2013);
- Benchmark of 35 cancer microarray datasets compiled and preprocessed by Souto et al. (2008).
- The Relative Non-Intersecting Area (RNIA) and Clustering Error (CE) measures proposed by Patrikainen and Meila (2006);
- The recovery and relevance scores proposed by Prelić et al. (2006);
- The match score proposed by Liu and Wang (2007).
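To make the recovery and relevance scores listed above more concrete, here is a minimal, self-contained sketch of the gene (row) match score as defined by Prelić et al. (2006). It does not use biclustlib: the function names `prelic_match_score`, `relevance` and `recovery` are illustrative assumptions, biclusters are represented only by their sets of row indices, and the library's own implementation may differ in name, signature and details.

```python
def prelic_match_score(biclusters_a, biclusters_b):
    """Average, over the biclusters in A, of the best Jaccard overlap
    between their row set and the row set of any bicluster in B.

    Hypothetical helper for illustration; not part of biclustlib.
    """
    if not biclusters_a or not biclusters_b:
        return 0.0

    total = 0.0
    for rows_a in biclusters_a:
        rows_a = set(rows_a)
        best = 0.0
        for rows_b in biclusters_b:
            rows_b = set(rows_b)
            union = rows_a | rows_b
            if union:
                best = max(best, len(rows_a & rows_b) / len(union))
        total += best
    return total / len(biclusters_a)


def relevance(found, reference):
    # How well each found bicluster is represented in the reference solution.
    return prelic_match_score(found, reference)


def recovery(found, reference):
    # How well each reference bicluster is recovered by the found solution.
    return prelic_match_score(reference, found)


# Toy example: row sets of two reference biclusters and three found biclusters.
reference = [{0, 1, 2, 3}, {10, 11, 12}]
found = [{0, 1, 2}, {10, 11, 12}, {20, 21}]

print("relevance:", relevance(found, reference))
print("recovery:", recovery(found, reference))
```

The score is asymmetric: the spurious third bicluster in `found` lowers relevance but leaves recovery unchanged, which is why the two values are usually reported together.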
import numpy as np
from biclustlib.algorithms import ChengChurchAlgorithm
from biclustlib.datasets import load_yeast_tavazoie
# load the yeast data used in Cheng and Church's original paper
data = load_yeast_tavazoie().values
# missing value imputation suggested by Cheng and Church
missing = np.where(data < 0.0)
data[missing] = np.random.randint(low=0, high=800, size=len(missing[0]))
# creating an instance of the ChengChurchAlgorithm class and running with the parameters of the original study
cca = ChengChurchAlgorithm(num_biclusters=100, msr_threshold=300.0, multiple_node_deletion_threshold=1.2)
biclustering = cca.run(data)
print(biclustering)
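The `msr_threshold` parameter in the example above presumably bounds the mean squared residue (MSR), the bicluster coherence measure introduced by Cheng and Church (2000). The sketch below computes the MSR of a submatrix with NumPy only. The function name `mean_squared_residue` is an assumption made for illustration and is not taken from biclustlib.

```python
import numpy as np


def mean_squared_residue(submatrix):
    """Mean squared residue of a bicluster given as a 2-D array.

    The residue of entry (i, j) is a_ij - rowmean_i - colmean_j + overall_mean.
    Hypothetical helper for illustration; not part of biclustlib.
    """
    sub = np.asarray(submatrix, dtype=float)
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    overall_mean = sub.mean()
    residues = sub - row_means - col_means + overall_mean
    return float(np.mean(residues ** 2))


# A perfectly additive pattern (row effect + column effect) has an MSR of
# zero, up to floating-point error.
additive = np.array([[1.0, 2.0, 3.0],
                     [2.0, 3.0, 4.0],
                     [5.0, 6.0, 7.0]])

# A checkerboard pattern is not additive, so its MSR is positive.
checkerboard = np.array([[1.0, 0.0],
                         [0.0, 1.0]])

print(mean_squared_residue(additive))
print(mean_squared_residue(checkerboard))
```

A lower MSR means the rows and columns of the submatrix follow a more coherent additive pattern, so a threshold on it controls how noisy an accepted bicluster may be.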
biclustlib: A Python library of biclustering algorithms and evaluation measures.
Copyright (C) 2017 Victor Alexandre Padilha
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.