Releases: mulias/possum_parser_language

v0.12.0

11 Dec 21:54
  • Preserve stack trace info in prebuilt binaries. This should make some error messages more helpful if possum fails unexpectedly. The trade-off is that some binaries (mostly Linux) will now be larger.
  • Implement @dbg(p) which prints the value returned by the parser p. The printed output does not yet include context such as source code, the name of the parser being run, etc.
  • Implement @Add(A, B), @Subtract(A, B), @Multiply(A, B), @Divide(A, B), and @Power(A, B) which all perform common algebraic operations on numbers. Any null values are converted to 0 or 1 depending on the operation's identity.
  • Allow parser and value names to use . as a character to indicate scope.
  • Add Num.Add, Num.Sub, Num.Mul, Num.Div, and Num.Pow as aliases for the builtin functions.
  • Also add Num.Abs(N) which returns the absolute value of a number and does not depend on the new builtin functions.
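
The null handling described for the arithmetic builtins can be modeled outside Possum. The Python sketch here is an illustration only, not Possum's implementation: None stands in for null, the function names are hypothetical, and the choice of 0 for add/subtract and 1 for multiply/divide/power is an assumption read from "the operation's identity".

```python
# Hypothetical model of null-to-identity conversion; None stands in for null.
# Assumption: identity is 0 for add/subtract and 1 for multiply/divide/power.

def _or_identity(value, identity):
    """Replace a missing (None) operand with the operation's identity."""
    return identity if value is None else value

def add(a, b):
    return _or_identity(a, 0) + _or_identity(b, 0)

def subtract(a, b):
    return _or_identity(a, 0) - _or_identity(b, 0)

def multiply(a, b):
    return _or_identity(a, 1) * _or_identity(b, 1)

def divide(a, b):
    return _or_identity(a, 1) / _or_identity(b, 1)

def power(a, b):
    return _or_identity(a, 1) ** _or_identity(b, 1)
```

Under these assumptions a null operand leaves the other operand unchanged, so add(None, 5) and multiply(None, 5) both give 5.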

Full Changelog: v0.11.0...v0.12.0

v0.11.0

03 Dec 12:38

Language

  • Allow trailing commas in function params, arrays, and objects.
  • Add @NumberOf(V) for converting a string value to a number value.
  • Patterns can include variables in place of object keys. If the variable is bound then the variable value will be substituted in the pattern. This is the first step in implementing proper pattern matching for objects, and makes it possible to access values in objects via an arbitrary key. For example, a utility function to get a value from an object could now be written as ObjectGet(Obj, Key) = Obj -> {Key: V} & V.

Standard Library

  • Add repeat2(p) through repeat9(p) as an alternative to the more general repeat(p, N).
  • Rename scan(p) to find(p) in stdlib.
  • Update find_all(p) to include all text after the last parsed p.
  • Remove maybe_find_all(p).
  • Add find_before(p, stop) and find_all_before(p, stop) which only search until stop.
  • Add Filter(A, Pred) for filtering array elements by a predicate function.

Full Changelog: v0.10.0...v0.11.0

v0.10.0

30 Oct 17:02
  • Fix compiling negative number parsers. Parsers like -100 were not compiling correctly because - is now treated as a prefix operator instead of part of the number.
  • Implement basic spread syntax and merge patterns for objects. This means things like const({"a": 1, "b": 2}) -> {"a": 1, ...B} and const({"a": 1, "b": 2}) -> (B + {"a": 1}) will now compile and pattern match correctly.
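
As a rough illustration of what the spread pattern {"a": 1, ...B} does, here is a Python sketch. It is not Possum code and not how Possum implements matching. The function name is hypothetical, and it assumes the spread variable is bound to whatever entries remain after the fixed keys have matched.

```python
# Hypothetical model of the object spread pattern {"a": 1, ...B}.

def match_with_rest(obj, fixed):
    """Match the `fixed` key/value pairs against `obj`.

    Returns the remaining entries (what the spread variable would be bound
    to under the stated assumption), or None if any fixed pair is missing
    or has a different value.
    """
    for key, expected in fixed.items():
        if key not in obj or obj[key] != expected:
            return None
    return {k: v for k, v in obj.items() if k not in fixed}

rest = match_with_rest({"a": 1, "b": 2}, {"a": 1})
print(rest)  # {'b': 2}
```

Read this way, the merge pattern (B + {"a": 1}) asks the same question from the other direction: which B, merged with {"a": 1}, gives the matched object.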

Full Changelog: v0.9.0...v0.10.0

v0.9.0

28 Oct 22:08
  • Introduce $ as a value label prefix. The value label indicates that the prefixed expression is a value. Any value may be labeled. The label is required for string literals, number literals, true, false, and null, but only in a context where the value could accidentally be interpreted as a parser. For example record2(Key1, value1, Key2, value2) used to be callable as record2("foo", foo_value, "bar", bar_value), but now must be invoked as record2($"foo", foo_value, $"bar", bar_value). This makes it explicit that "foo" and "bar" are string values, not parsers.
  • Improve the json_string parser and add a bunch of tests for handling encoding/decoding edge cases.
  • Add @Codepoint(HexStr) meta function to convert a hexadecimal number encoded as a string into a UTF-8 Unicode codepoint.
  • Add @SurrogatePairCodepoint(High, Low) meta function to convert two hexadecimal numbers encoded as strings into a UTF-16 surrogate pair codepoint.
  • Add hex_numeral, hex_digit, find_all(p), maybe_find_all(p), chars_until(stop), json_number, json_boolean, and json_null to the standard library.
  • Add True, False, Null, Inc(N), Dec(N) to the standard library.
  • Change ast_op_precedence(op_node, BindingPower) to AstOpPrecedence(OpNode, BindingPower) and change ast_infix_op_precedence(op_node, LeftBindingPower, RightBindingPower) to AstInfixOpPrecedence(OpNode, LeftBindingPower, RightBindingPower). Unfortunately the value label change made precedence parsing code look noisier, since all the binding power numbers had to be prefixed with $. Using value functions is a bit better.
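
The hexadecimal conversions that @Codepoint and @SurrogatePairCodepoint are described as performing follow standard Unicode arithmetic, which can be sketched in Python. The function names are hypothetical stand-ins, and the range check on the surrogates is an assumption; the notes do not say how Possum treats invalid input.

```python
# Illustrative Python equivalents; names and error handling are assumptions.

def codepoint(hex_str):
    """Convert a hex string such as "41" to the character at that codepoint."""
    return chr(int(hex_str, 16))

def surrogate_pair_codepoint(high_hex, low_hex):
    """Combine a UTF-16 high/low surrogate pair (hex strings) into one character."""
    high = int(high_hex, 16)
    low = int(low_hex, 16)
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid UTF-16 surrogate pair")
    # Standard UTF-16 decoding: 10 bits from each surrogate, offset by 0x10000.
    return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))

print(codepoint("41"))                           # A
print(surrogate_pair_codepoint("D83D", "DE00"))  # U+1F600
```

Surrogate pairs matter for JSON because strings escape characters outside the Basic Multilingual Plane as two \uXXXX sequences.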

Full Changelog: v0.8.0...v0.9.0

v0.8.0

24 Oct 23:02

v0.7.0

19 Sep 19:25
  • Fix @Crash which was missing a bytecode op. Oops.
  • 1-2 is now parsed as 1 - 2 instead of 1 -2.
  • Values and patterns can use unary negation, like --5 or -X. Number parsers can still only use a single negative sign, since parsing numbers relies on matching the exact characters of the number, and --5 is not a valid JSON number.
  • NumberStrings can be negated an arbitrary number of times without having to convert to a different number representation.
  • Range patterns can have bound and unbound variables. When a pattern is matched the unbound variable is bound to the matching codepoint or integer.

v0.6.0

16 Sep 18:47
  • Allow range parsers to have an open upper or lower bound. For example 0.. is a parser for integers greater than or equal to zero.
  • Range parsers may be used in patterns. For example int -> 0.. fails if the parsed integer is not greater than or equal to zero.
  • Add meta function @Crash(Message) to immediately halt the program with a custom error message.
  • Add stdlib parsers repeat(p, N), repeat_between(p, N, M), tuple(elem, N), and tuple_sep(elem, sep, N) for repeating a parser a fixed number of times.

In addition to enabling the new standard library parsers, with range patterns we can write a naive function for the nth Fibonacci number as a suitably feral one-liner:

$ possum -p '0.. -> N $ Fib(N) ; Fib(N) = N -> ..1 | (Fib(N - 1) + Fib(N - 2))' -i 11
89

~~(##)'>
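
For readers who don't speak Possum yet, here is one way to read that one-liner in Python. This is a sketch of the semantics as described in the notes above, not a translation the project provides. The helper names are hypothetical, and treating the upper bound of ..1 as inclusive is inferred from the printed result.

```python
# Python reading of the Fibonacci one-liner; names are hypothetical.

def in_range(n, low=None, high=None):
    """Model a range pattern `low..high`; None means that bound is open."""
    return (low is None or n >= low) and (high is None or n <= high)

def fib(n):
    # `N -> ..1` succeeds when N is at most 1 and yields N itself.
    if in_range(n, high=1):
        return n
    # Otherwise fall through to the recursive alternative after `|`.
    return fib(n - 1) + fib(n - 2)

def run(n):
    # `0.. -> N` rejects negative input before Fib is called.
    if not in_range(n, low=0):
        raise ValueError("input does not match 0..")
    return fib(n)

print(run(11))  # 89
```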

v0.5.0

17 Jul 15:37
  • Improve formatting of JSON returned by CLI and WASM.
  • Add Tabular(Headers, Rows) value function for transforming a table of data into an array of objects.
  • Add json, json_string, json_array, and json_object parsers to make it easier to parse text that contains valid JSON.
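
The transformation Tabular(Headers, Rows) is described as doing, a table of data into an array of objects, can be sketched in Python. The function here is a hypothetical stand-in. How Possum handles rows that are shorter or longer than the header list is not stated, so this sketch simply pairs values up to the shorter length.

```python
# Hypothetical Python model of Tabular(Headers, Rows).

def tabular(headers, rows):
    """Turn a header list and a list of rows into a list of dicts, one per row."""
    return [dict(zip(headers, row)) for row in rows]

table = tabular(["name", "legs"], [["possum", 4], ["crow", 2]])
print(table)  # [{'name': 'possum', 'legs': 4}, {'name': 'crow', 'legs': 2}]
```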

v0.4.0

07 Jul 01:46

~~~(##)'> After a year and four months it's finally time for a new release!

Goodbye OCaml

The previous v0.3.0 release was based on the old OCaml implementation of Possum. This implementation proved out the core idea of Possum, but was starting to collapse under the weight of its own haphazard abstractions and nested function closures. On top of that, building OCaml cross-platform was an absolute mess. Time for a fresh start!

Hello Zig

As much as I'll try to justify my choices, I really just wanted to build something in Zig. The new implementation uses a bytecode virtual machine instead of a tree walk interpreter, and is based on the latter half of Crafting Interpreters by Robert Nystrom. The VM in Nystrom's book is implemented in C, so Zig felt like a more natural choice than something like Rust or OCaml. This new version of Possum has hugely benefited from the pain points I discovered the first time around, resulting in a code base that's lower-level while simultaneously being easier to understand. Zig also makes cross-platform compilation almost effortless. You can now build Possum for a wide range of platforms with a single command, including WASM which is what I'm using to run interactive Possum examples on my website. Neat!

Where are we now?

The new implementation is pretty much at feature parity. The primary regression from v0.3.0 is in error messages, which have gone from cute and usually helpful to practically nonexistent. In exchange we're getting a handful of new goodies and conceptual simplifications to the language. There's string interpolation and value functions now. Pattern matching has changed from Pattern <- parser to parser -> Pattern. Literally no one but me knows how Possum used to work, so trying to list everything that has changed and why seems silly.

What's Left?

The big ticket items:

  • Finish implementing pattern matching. Most common uses for Possum's destructure syntax are now fully supported, but destructuring against objects is still buggy. There are also a bunch of obscure edge cases in pattern matching that should never come up in practice but that I'm going to implement for the sake of completeness.
  • Error messages. Some error message cases should be pretty easy to implement. The difficult case is reporting when a parser fails to match on an input. Combinator-based parser libraries frequently struggle with reporting parsing failure, but Possum is a full language so it should be possible to produce detailed failure reports that emphasize the parsing paths that got closest to successfully parsing the input.
  • Garbage collection. Currently Possum doesn't free any memory allocated at runtime. Text parsing is rarely a long running process so this should be fine in the majority of cases, but we're going to add a garbage collector anyways. Possum gotta eat trash.
  • Support operator precedence parsing out of the box. Possum should be a viable tool for prototyping language syntax, and one wrinkle that comes up frequently in this context is operators with varied precedence and associativity. Possum should either have language features or stdlib functions to handle precedence parsing, but I haven't quite figured out what this looks like yet.
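
To make the precedence and associativity wrinkle concrete, here is a small binding-power (Pratt-style) expression parser in Python. It shows the general technique only and says nothing about how Possum will or does support it. The operator table, token rules, and function names are all hypothetical.

```python
import re

# Hypothetical operator table: symbol -> (left binding power, right binding power).
# Left-associative operators bind tighter on the right; "^" is right-associative.
BINDING_POWER = {"+": (1, 2), "-": (1, 2), "*": (3, 4), "/": (3, 4), "^": (6, 5)}

def tokenize(text):
    """Split the input into integer literals, operators, and parentheses."""
    return re.findall(r"\d+|[-+*/^()]", text)

def parse(tokens, min_bp=0):
    """Parse tokens (consumed from the front) into a nested tuple AST."""
    token = tokens.pop(0)
    if token == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    else:
        left = int(token)
    while tokens and tokens[0] in BINDING_POWER:
        left_bp, right_bp = BINDING_POWER[tokens[0]]
        if left_bp < min_bp:
            break  # the operator belongs to an enclosing, tighter call
        op = tokens.pop(0)
        right = parse(tokens, right_bp)
        left = (op, left, right)
    return left

print(parse(tokenize("1 + 2 * 3")))  # ('+', 1, ('*', 2, 3))
print(parse(tokenize("2 ^ 3 ^ 2")))  # ('^', 2, ('^', 3, 2))
```

The pair of numbers per operator is what encodes associativity: swapping which side is larger flips an operator between left- and right-associative without touching the parsing loop.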

v0.3.0

22 Feb 19:16
5faeac6