
Lexical, to- and from-string conversion routines.


mdrach/rust-lexical

 
 

lexical

Build Status Latest Version Rustc Version 1.37+

Fast lexical conversion routines for both std and no_std environments. Lexical converts numbers to and from decimal strings, is simple to use, and focuses on performance and correctness. lexical-core is also suitable for environments without a memory allocator, since it requires no internal allocations by default. As of version 2.0, lexical uses minimal unsafe features, limiting the chance of memory-unsafe code.


Getting Started

Add lexical to your Cargo.toml:

[dependencies]
lexical = "^5.1"

And get started using lexical:

extern crate lexical;

// Number to string.
lexical::to_string(3.0);            // "3.0", always has a fraction suffix.
lexical::to_string(3);              // "3"

// String to number.
let i: i32 = lexical::parse("3").unwrap();   // 3, auto-type deduction.
let f: f32 = lexical::parse("3.5").unwrap(); // 3.5
let d = lexical::parse::<f64, _>("3.5");     // Ok(3.5), error checking parse.
let d = lexical::parse::<f64, _>("3a");      // Err(Error(_)), failed to parse.

Lexical has both partial and complete parsers. The complete parsers require the entire buffer to be consumed and treat trailing characters as an error. The partial parsers parse as many characters as possible and return both the parsed value and the number of characters parsed. When parsing fails, lexical returns an error indicating both the error type and the index in the buffer at which the error occurred.

// This returns Err(Error(ErrorKind::InvalidDigit(3))), indicating
// the first invalid character occurred at index 3 in the input
// string (the space character). Calling unwrap() on it would panic.
let x = lexical::parse::<i32, _>("123 456");
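To make the distinction between the two modes concrete, here is a minimal sketch of partial and complete parsing for unsigned decimal integers, using only the standard library. It is not lexical's implementation: the names `ParseError`, `parse_partial_u32`, and `parse_complete_u32` are hypothetical and exist only for illustration. The partial parser stops at the first non-digit and reports how many bytes it consumed, while the complete parser treats any unconsumed byte as an error at that index.

```rust
/// Hypothetical error type for this sketch: the kind of failure plus the
/// byte index at which it was detected.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// No digits were found at the start of the input.
    Empty(usize),
    /// A non-digit byte was found at this index.
    InvalidDigit(usize),
    /// The value does not fit in a `u32`; the index is the offending digit.
    Overflow(usize),
}

/// Partial parse: consume as many leading ASCII digits as possible and
/// return the value together with the number of bytes consumed.
pub fn parse_partial_u32(input: &str) -> Result<(u32, usize), ParseError> {
    let mut value: u32 = 0;
    let mut count = 0;
    for (index, byte) in input.bytes().enumerate() {
        if !byte.is_ascii_digit() {
            break;
        }
        let digit = u32::from(byte - b'0');
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(digit))
            .ok_or(ParseError::Overflow(index))?;
        count = index + 1;
    }
    if count == 0 {
        return Err(ParseError::Empty(0));
    }
    Ok((value, count))
}

/// Complete parse: the whole input must be consumed, so any trailing byte
/// is reported as an invalid digit at its index.
pub fn parse_complete_u32(input: &str) -> Result<u32, ParseError> {
    let (value, count) = parse_partial_u32(input)?;
    if count != input.len() {
        return Err(ParseError::InvalidDigit(count));
    }
    Ok(value)
}
```

With the input "123 456", the partial sketch yields the value 123 with 3 bytes consumed, while the complete sketch reports an invalid digit at index 3, mirroring the behavior described above.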

For floating-point numbers, lexical also includes parse_lossy, which avoids slow algorithms that can seriously degrade performance, at the cost of minor rounding error (relative error of ~1e-16) in rare cases (see implementation details for more information).

let x: f32 = lexical::parse_lossy("3.5").unwrap();   // 3.5
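As an illustration of where such rounding error can come from, the following sketch parses `digits[.digits]` by accumulating an integer mantissa and scaling it with ordinary f64 arithmetic. This is a naive approach chosen for clarity, not the algorithm lexical uses, and the names `parse_lossy_sketch` and `relative_error` are hypothetical. Each floating-point step may round, so the result can differ from a correctly rounded parse by a small relative error.

```rust
/// Hypothetical lossy parser for inputs of the form `digits[.digits]`.
/// Returns `None` for malformed input or when the digits overflow a `u64`.
pub fn parse_lossy_sketch(input: &str) -> Option<f64> {
    let mut mantissa: u64 = 0;
    let mut fraction_digits: i32 = 0;
    let mut seen_point = false;
    let mut seen_digit = false;
    for byte in input.bytes() {
        match byte {
            b'0'..=b'9' => {
                // Accumulate every digit into one integer mantissa.
                mantissa = mantissa
                    .checked_mul(10)?
                    .checked_add(u64::from(byte - b'0'))?;
                seen_digit = true;
                if seen_point {
                    fraction_digits += 1;
                }
            }
            // Accept a single decimal point.
            b'.' if !seen_point => seen_point = true,
            _ => return None,
        }
    }
    if !seen_digit {
        return None;
    }
    // Both the integer-to-float conversion and the division may round,
    // which is the source of the small relative error.
    Some(mantissa as f64 / 10f64.powi(fraction_digits))
}

/// Relative difference between an approximation and a reference value.
pub fn relative_error(approx: f64, exact: f64) -> f64 {
    ((approx - exact) / exact).abs()
}
```

Comparing the sketch against the standard library's `str::parse::<f64>` with `relative_error` shows how close a fast path of this kind stays to the correctly rounded value for a given input.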

In order to use lexical in generics, the type may use the trait bounds FromLexical (for parse), ToLexical (for to_string), or FromLexicalLossy (for parse_lossy).

/// Multiply a value in a string by multiplier, and serialize to string.
fn mul_2<T>(value: &str, multiplier: T)
    -> Result<String, lexical::Error>
    where T: lexical::ToLexical + lexical::FromLexical + std::ops::Mul<Output = T>
{
    let value: T = lexical::parse(value)?;
    Ok(lexical::to_string(value * multiplier))
}

Benchmarks

Most of the following benchmarks measure the time it takes to convert 10,000 random values of different types. The values were randomly generated using NumPy, and the benchmarks were run in both std (rustc 1.29.2) and no_std (rustc 1.31.0) contexts (only std is shown) on an x86-64 Intel processor. More information on these benchmarks can be found in the benches folder and in the source code for the respective algorithms. The flags "target-cpu=native" and "link-args=-s" were also used; however, they minimally affected the relative performance difference between the lexical conversion implementations.
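For readers who want a feel for this kind of measurement, here is a rough sketch that times the conversion of a batch of pseudo-random integers to strings using only the standard library. It is not the harness used for the results in this README (see the benches folder for that): the `Lcg` generator stands in for the NumPy-generated data, and the names `Lcg`, `random_values`, and `time_to_string` are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Small linear congruential generator, so the sketch needs no external crates.
pub struct Lcg(pub u64);

impl Lcg {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }
}

/// Generate `count` pseudo-random values from a fixed seed.
pub fn random_values(seed: u64, count: usize) -> Vec<u64> {
    let mut rng = Lcg(seed);
    (0..count).map(|_| rng.next_u64()).collect()
}

/// Time how long it takes to convert every value to a string.
/// Returns the elapsed time and the total number of bytes written.
pub fn time_to_string(values: &[u64]) -> (Duration, usize) {
    let start = Instant::now();
    let mut total_len = 0;
    for value in values {
        // black_box discourages the optimizer from removing the conversion.
        total_len += std::hint::black_box(value.to_string()).len();
    }
    (start.elapsed(), total_len)
}
```

A single timed pass like this is noisy; a dedicated benchmarking harness repeats the measurement many times and reports statistics, which is why the sketch should only be read as an outline of what is being measured.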

The cross-language benchmarks measure the time it takes to convert digit series of near-halfway decimal floating-point representations. The C++ benchmarks (RapidJSON, strtod, and double-conversion) were done using GCC 8.2.1 with glibc/libstdc++, Google Benchmark, and the -O3 flag. The Python benchmark was done using IPython on Python 3.6.6. The Go benchmark was done using go1.10.4. All benchmarks used the same data. For RapidJSON, the benchmark was done by publicly exposing the ParseNumber method with a custom handler.

For all the following benchmarks, lower is better.

Float to String

ftoa benchmark

Integer To String

itoa benchmark

String to Integer

atoi benchmark

String to f64 Simple, Random Data

atof64 benchmark

String to f64 Complex, Large Data Cross-Language Comparison

atof64 simple language benchmark

String to f64 Complex, Denormal Data Cross-Language Comparison

Note: Rust was able to parse only the 10-digit benchmark; for all others it produced an error result of ParseFloatError { kind: Invalid }. On the 10-digit benchmark, it performed ~2,000x worse than lexical.

atof64 simple language benchmark

Backends

For float-to-string conversions, lexical uses one of three backends: an internal Grisu2 algorithm, an external Grisu3 algorithm, or an external Ryu algorithm (~2x as fast).

Documentation

Lexical's documentation can be found on docs.rs. For detailed background on the algorithms and features in lexical, see lexical-core. Finally, for information on how to use lexical from C, C++, or Python, see lexical-capi.

Roadmap

Ideally, lexical's float-parsing algorithm or approach would be incorporated into libcore. Although lexical greatly improves on Rust's float-parsing algorithm, in its current state it is unsuitable for inclusion in the standard library, because it includes numerous "anti-features":

  1. It supports non-decimal radices for float parsing, leading to significant binary bloat and increased code branching, for almost non-existent use-cases.
  2. It supports rounding schemes other than round-to-nearest, tie-even.
  3. It inlines aggressively, producing significant binary bloat.
  4. It contains effectively dead code for efficient higher-order arbitrary-precision integer algorithms, for rare use-cases requiring asymptotically faster algorithms.

Versioning and Version Support

Version Support

The currently supported versions are:

  • v5.x
  • v4.x (Maintenance)

Rustc Compatibility

v5.x is tested to work on Rustc 1.37+, including stable, beta, and nightly. v4.x is the last version to support Rustc 1.24+, including stable, beta, and nightly.

Please report any errors compiling a supported lexical version on a compatible Rustc version.

Versioning

Lexical uses semantic versioning. Removing support for older Rustc versions is considered an incompatible API change, requiring a major version change.

Changelog

All changes since 2.2.0 are documented in CHANGELOG.

License

Lexical is dual licensed under the Apache 2.0 license as well as the MIT license. See the LICENSE-MIT and the LICENSE-APACHE files for the licenses.

Contributing

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in lexical by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
