Showing 2 changed files with 53 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Unipept Index | ||
|
||
![Codecov](https://img.shields.io/codecov/c/github/unipept/unipept-index?token=IZ75A2FY98&logo=Codecov) | ||
|
||
The Unipept index is written entirely in `Rust`. This repository consists of multiple Rust projects that depend on | ||
each other. More information about each project can be found in its respective `README.md` file. | ||
|
||
## Installation | ||
|
||
Clone this repository with the following command: | ||
|
||
```bash | ||
git clone https://github.com/unipept/unipept-index.git | ||
``` | ||
|
||
And build the projects using: | ||
|
||
```bash | ||
cargo build --release | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Functional Annotation Compression | ||
|
||
![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/unipept/unipept-index/test.yml?logo=github) | ||
![Codecov](https://img.shields.io/codecov/c/github/unipept/unipept-index?token=IZ75A2FY98&flag=fa-compression&logo=codecov) | ||
![Static Badge](https://img.shields.io/badge/doc-rustdoc-green) | ||
|
||
The `fa-compression` library offers compression for Unipept's functional annotation strings. These strings follow a very specific | ||
format, which the compression algorithm exploits to guarantee a compression of at least **50%** for both very large and very | ||
small input strings. In practice, the compression ratio is often around **60-70%**. | ||
|
||
The compression algorithm never has to allocate extra memory to build an encoding table or something similar, so each | ||
string can be encoded separately. This is particularly useful when strings have to be encoded or decoded individually: there is no need to decode | ||
an entire database to fetch a single entry. | ||
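
To give some intuition for how a table-free, per-string scheme can reach a 50% bound, the sketch below packs two symbols into one byte using a fixed 4-bit code. This is an illustration only: the 15-symbol `ALPHABET`, the `PAD` value and the `pack`/`unpack` helpers are hypothetical names chosen for this example, and the actual encoding used by `fa-compression` may differ.

```rust
/// Hypothetical fixed alphabet: a symbol's index is its 4-bit code.
/// This is an assumption for illustration, not fa-compression's real table.
const ALPHABET: &[u8; 15] = b"0123456789.-;:,";

/// The one remaining 4-bit value marks padding for odd-length inputs.
const PAD: u8 = 0xF;

/// Packs two symbols per byte. Returns `None` if the input contains
/// a character outside the fixed alphabet.
fn pack(input: &str) -> Option<Vec<u8>> {
    // Map every input byte to its 4-bit code; fail on unknown symbols.
    let nibbles: Vec<u8> = input
        .bytes()
        .map(|b| ALPHABET.iter().position(|&s| s == b).map(|i| i as u8))
        .collect::<Option<Vec<u8>>>()?;

    // High nibble first; pad the low nibble when the length is odd.
    Some(
        nibbles
            .chunks(2)
            .map(|pair| (pair[0] << 4) | pair.get(1).copied().unwrap_or(PAD))
            .collect(),
    )
}

/// Reverses `pack`. Needs only the bytes of this one string,
/// no shared dictionary or per-database state.
fn unpack(bytes: &[u8]) -> String {
    bytes
        .iter()
        .flat_map(|&b| [b >> 4, b & 0x0F])
        .filter(|&n| n != PAD)
        .map(|n| ALPHABET[n as usize] as char)
        .collect()
}
```

Because the code table is a compile-time constant, nothing has to be built or stored per input, and every packed string can be unpacked on its own. Under these assumptions, an input of `n` symbols always occupies `ceil(n / 2)` bytes.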
|
||
## Example | ||
|
||
```rust | ||
use fa_compression; | ||
|
||
fn main() { | ||
let encoded: Vec<u8> = fa_compression::encode( | ||
"IPR:IPR016364;EC:1.1.1.-;IPR:IPR032635;GO:0009279;IPR:IPR008816" | ||
); | ||
|
||
// [ 44, 44, 44, 189, 17, 26, 56, 173, 18, 116, 117, 225, 67, 116, 110, 17, 153, 39 ] | ||
println!("{:?}", encoded); | ||
|
||
let decoded: String = fa_compression::decode(&encoded); | ||
|
||
// The decoded annotations come back grouped by type (EC, GO, IPR), not in input order: | ||
// "EC:1.1.1.-;GO:0009279;IPR:IPR016364;IPR:IPR032635;IPR:IPR008816" | ||
println!("{:?}", decoded); | ||
} | ||
``` |