Commit

Implemented suffix array construction of a long 16-bit array (libsais16x64).
IlyaGrebnov committed Jun 13, 2024
1 parent d4f940b commit f8c7124
Showing 10 changed files with 7,958 additions and 15 deletions.
5 changes: 4 additions & 1 deletion CHANGES
@@ -1,5 +1,8 @@
Changes in 2.8.3 (June 11, 2024)
- Implemented suffix array construction of a long 16-bit array (libsais16x64).

Changes in 2.8.2 (May 27, 2024)
- Implemented suffix array construction of a long integer array (libsais64).
- Implemented suffix array construction of a long 64-bit array (libsais64).

Changes in 2.8.1 (April 5, 2024)
- Fixed out-of-bound memory access issue for large inputs (libsais64).
4 changes: 3 additions & 1 deletion CMakeLists.txt
@@ -1,6 +1,6 @@
cmake_minimum_required(VERSION 3.10)

project(libsais VERSION 2.8.2 LANGUAGES C DESCRIPTION "libsais is a library for linear time suffix array, longest common prefix array and burrows wheeler transform construction based on induced sorting algorithm.")
project(libsais VERSION 2.8.3 LANGUAGES C DESCRIPTION "libsais is a library for linear time suffix array, longest common prefix array and burrows wheeler transform construction based on induced sorting algorithm.")

set(CMAKE_C_STANDARD 99)
set(CMAKE_C_STANDARD_REQUIRED ON)
@@ -20,9 +20,11 @@ add_library(libsais ${LIBSAIS_LIBRARY_TYPE})
target_sources(libsais PRIVATE
include/libsais.h
include/libsais16.h
include/libsais16x64.h
include/libsais64.h
src/libsais.c
src/libsais16.c
src/libsais16x64.c
src/libsais64.c
)

15 changes: 9 additions & 6 deletions README.md
@@ -23,8 +23,10 @@ The libsais provides simple C99 API to construct suffix array and Burrows-Wheele
The libsais is released under the [Apache License Version 2.0](LICENSE "Apache license")

## Changes
* June 11, 2024 (2.8.3)
* Implemented suffix array construction of a long 16-bit array (libsais16x64).
* May 27, 2024 (2.8.2)
* Implemented suffix array construction of a long integer array (libsais64).
* Implemented suffix array construction of a long 64-bit array (libsais64).
* April 5, 2024 (2.8.1)
* Fixed out-of-bound memory access issue for large inputs (libsais64).
* March 3, 2024 (2.8.0)
@@ -65,11 +67,12 @@ The libsais is released under the [Apache License Version 2.0](LICENSE "Apache l

## Versions of the libsais
* [libsais.c](src/libsais.c) (and corresponding [libsais.h](include/libsais.h)) is for suffix array, PLCP, LCP, forward BWT and reverse BWT construction over 8-bit inputs smaller than 2GB (2147483648 bytes).
* This version of the library could also be used to construct suffix array of an integer array (with a caveat that input array must be mutable).
* [libsais64.c](src/libsais64.c) (and corresponding [libsais64.h](include/libsais64.h)) is optional extension of the library for inputs larger or equlas to 2GB (2147483648 bytes).
* [libsais16.c](src/libsais16.c) (and corresponding [libsais16.h](include/libsais16.h)) is independent version of the library for 16-bit inputs.
* [libsais64.c](src/libsais64.c) (and corresponding [libsais64.h](include/libsais64.h)) is optional extension of the library for inputs larger or equlas to 2GB (2147483648 bytes).
* This versions of the library could also be used to construct suffix array of an integer array (with a caveat that input array must be mutable).
* [libsais16.c](src/libsais16.c) + [libsais16x64.c](src/libsais16x64.c) (and corresponding [libsais16.h](include/libsais16.h) + [libsais16x64.h](include/libsais16x64.h)) is independent version of the library for 16-bit inputs.
* This version of the library could also be used to construct suffix array and BWT of a set of strings by adding a unique end-of-string symbol to each string and then computing the result for the concatenated string.
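
The set-of-strings approach described in the last bullet can be sketched as follows. This is a minimal illustration, not code from libsais: the helper name `concat_with_terminators` and the `first_terminator` parameter are hypothetical, and the sketch assumes the caller has reserved a range of 16-bit values that never occur in the input strings. The resulting buffer is what would then be passed as `T` to a 16-bit construction routine such as `libsais16`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libsais): concatenates `count` 16-bit
 * strings into `out`, appending a distinct end-of-string symbol after each.
 * String i is terminated by (first_terminator + i). The caller must ensure
 * that none of these terminator values occurs in any input string and that
 * `out` has room for sum(lengths) + count symbols.
 * Returns the number of symbols written.
 */
static size_t concat_with_terminators(const uint16_t *const *strings,
                                      const size_t *lengths,
                                      size_t count,
                                      uint16_t first_terminator,
                                      uint16_t *out)
{
    /* All terminator values must fit into 16 bits. */
    assert(count == 0 || (size_t)first_terminator + (count - 1) <= UINT16_MAX);

    size_t pos = 0;
    for (size_t i = 0; i < count; ++i) {
        /* Copy the symbols of string i unchanged. */
        for (size_t j = 0; j < lengths[i]; ++j) {
            out[pos++] = strings[i][j];
        }
        /* Append the end-of-string symbol unique to string i. */
        out[pos++] = (uint16_t)(first_terminator + i);
    }
    return pos;
}
```

Because every terminator is distinct, no suffix comparison can run past a string boundary, so the suffix array of the concatenation orders the suffixes of each individual string correctly. Which values to reserve for terminators depends on the application; the sketch leaves that choice to the caller.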

## Examples of APIs (see [libsais.h](include/libsais.h), [libsais16.h](include/libsais16.h) and [libsais64.h](include/libsais64.h) for complete APIs list)
## Examples of APIs (see [libsais.h](include/libsais.h), [libsais16.h](include/libsais16.h), [libsais16x64.h](include/libsais16x64.h) and [libsais64.h](include/libsais64.h) for complete APIs list)
```c
/**
* Constructs the suffix array of a given string.
@@ -124,7 +127,7 @@ The libsais is released under the [Apache License Version 2.0](LICENSE "Apache l
CPMAddPackage(
NAME libsais
GITHUB_REPOSITORY IlyaGrebnov/libsais
GIT_TAG v2.8.1
GIT_TAG v2.8.3
OPTIONS
"LIBSAIS_USE_OPENMP OFF"
"LIBSAIS_BUILD_SHARED_LIB OFF"
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
2.8.2
2.8.3
4 changes: 2 additions & 2 deletions include/libsais.h
@@ -26,8 +26,8 @@ Please see the file LICENSE for full copyright information.

#define LIBSAIS_VERSION_MAJOR 2
#define LIBSAIS_VERSION_MINOR 8
#define LIBSAIS_VERSION_PATCH 2
#define LIBSAIS_VERSION_STRING "2.8.2"
#define LIBSAIS_VERSION_PATCH 3
#define LIBSAIS_VERSION_STRING "2.8.3"

#ifdef _WIN32
#ifdef LIBSAIS_SHARED
29 changes: 27 additions & 2 deletions include/libsais16.h
@@ -26,8 +26,8 @@ Please see the file LICENSE for full copyright information.

#define LIBSAIS16_VERSION_MAJOR 2
#define LIBSAIS16_VERSION_MINOR 8
#define LIBSAIS16_VERSION_PATCH 2
#define LIBSAIS16_VERSION_STRING "2.8.2"
#define LIBSAIS16_VERSION_PATCH 3
#define LIBSAIS16_VERSION_STRING "2.8.3"

#ifdef _WIN32
#ifdef LIBSAIS_SHARED
@@ -83,6 +83,18 @@ extern "C" {
*/
LIBSAIS16_API int32_t libsais16(const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq);

/**
* Constructs the suffix array of a given integer array.
* Note, during construction input array will be modified, but restored at the end if no errors occurred.
* @param T [0..n-1] The input integer array.
* @param SA [0..n-1+fs] The output array of suffixes.
* @param n The length of the integer array.
* @param k The alphabet size of the input integer array.
* @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance).
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
LIBSAIS16_API int32_t libsais16_int(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs);
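
Since `libsais16_int` modifies `T` during construction and restores it only when no error occurred, it can be useful to cross-check its output on small inputs against an independent reference. The routine below is a deliberately naive sketch for that purpose. It is not part of libsais and does not use induced sorting; the names `naive_suffix_array_int` and `compare_suffixes` are hypothetical. It sorts suffix start positions with `qsort`, so it is only suitable for tiny test arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Input shared with the qsort comparator. File scope keeps the sketch short,
 * at the cost of making it non-reentrant. */
static const int32_t *g_text;
static int32_t g_length;

/* Orders two suffix start positions by lexicographic comparison of the
 * suffixes; a suffix that is a proper prefix of another sorts first. */
static int compare_suffixes(const void *lhs, const void *rhs)
{
    int32_t a = *(const int32_t *)lhs;
    int32_t b = *(const int32_t *)rhs;
    while (a < g_length && b < g_length) {
        if (g_text[a] != g_text[b]) {
            return g_text[a] < g_text[b] ? -1 : 1;
        }
        ++a;
        ++b;
    }
    /* Both ran out together only when the start positions were equal. */
    if (a == g_length && b == g_length) {
        return 0;
    }
    /* Otherwise the suffix that ran out first is the shorter one. */
    return a == g_length ? -1 : 1;
}

/* Hypothetical reference routine (not part of libsais): fills SA[0..n-1]
 * with suffix start positions in sorted order. T is never modified. */
static void naive_suffix_array_int(const int32_t *T, int32_t *SA, int32_t n)
{
    assert(T != NULL && SA != NULL && n >= 0);
    for (int32_t i = 0; i < n; ++i) {
        SA[i] = i;
    }
    g_text = T;
    g_length = n;
    qsort(SA, (size_t)n, sizeof SA[0], compare_suffixes);
}
```

In a test, one would run both routines on the same small array and compare the first `n` entries of the two outputs, and also confirm that `T` is unchanged after a successful library call.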

/**
* Constructs the suffix array of a given 16-bit string using libsais16 context.
* @param ctx The libsais16 context.
@@ -107,6 +119,19 @@ extern "C" {
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
LIBSAIS16_API int32_t libsais16_omp(const uint16_t * T, int32_t * SA, int32_t n, int32_t fs, int32_t * freq, int32_t threads);

/**
* Constructs the suffix array of a given integer array in parallel using OpenMP.
* Note, during construction input array will be modified, but restored at the end if no errors occurred.
* @param T [0..n-1] The input integer array.
* @param SA [0..n-1+fs] The output array of suffixes.
* @param n The length of the integer array.
* @param k The alphabet size of the input integer array.
* @param fs Extra space available at the end of SA array (can be 0, but 4k or better 6k is recommended for optimal performance).
* @param threads The number of OpenMP threads to use (can be 0 for OpenMP default).
* @return 0 if no error occurred, -1 or -2 otherwise.
*/
LIBSAIS16_API int32_t libsais16_int_omp(int32_t * T, int32_t * SA, int32_t n, int32_t k, int32_t fs, int32_t threads);
#endif

/**