Releases: mulias/possum_parser_language
v0.12.0
- Preserve stack trace info in prebuilt binaries. This should make some error messages more helpful if possum fails unexpectedly. The trade-off is that some binaries (mostly Linux) will now be larger.
- Implement `@dbg(p)` which prints the value returned by the parser `p`. The printed output does not yet include context such as source code, the name of the parser being run, etc.
- Implement `@Add(A, B)`, `@Subtract(A, B)`, `@Multiply(A, B)`, `@Divide(A, B)`, and `@Power(A, B)` which all perform common algebraic operations on numbers. Any `null` values are converted to `0` or `1` depending on the operation's identity.
- Allow parsers and value names to use `.` as a character to indicate scope.
- Add `Num.Add`, `Num.Sub`, `Num.Mul`, `Num.Div`, and `Num.Pow` as aliases for the builtin functions.
- Also add `Num.Abs(N)` which returns the absolute value of a number and does not depend on the new builtin functions.
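The `null` handling can be pictured with a small Python model. This is only a sketch of the behavior described above, not Possum's implementation: it assumes `null` becomes `0` for addition and subtraction and `1` for multiplication, division, and exponentiation, and the names `IDENTITY`, `OPS`, and `apply_op` are made up for illustration.

```python
import operator

# Assumed identity element per operation (hypothetical mapping, not taken from Possum's source).
IDENTITY = {"Add": 0, "Subtract": 0, "Multiply": 1, "Divide": 1, "Power": 1}
OPS = {
    "Add": operator.add,
    "Subtract": operator.sub,
    "Multiply": operator.mul,
    "Divide": operator.truediv,
    "Power": operator.pow,
}

def apply_op(name, a, b):
    """Apply a numeric operation, replacing None (standing in for null) with the identity."""
    identity = IDENTITY[name]
    a = identity if a is None else a
    b = identity if b is None else b
    return OPS[name](a, b)

print(apply_op("Add", 5, None))       # 5
print(apply_op("Multiply", None, 7))  # 7
```

Under this reading a missing operand simply leaves the other operand unchanged, which is presumably why the identity element was chosen as the substitute.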
Full Changelog: v0.11.0...v0.12.0
v0.11.0
Language
- Allow trailing commas in function params, arrays, and objects.
- Add `@NumberOf(V)` for converting a string value to a number value.
- Patterns can include variables in place of object keys. If the variable is bound then the variable value will be substituted in the pattern. This is the first step in implementing proper pattern matching for objects, and makes it possible to access values in objects via an arbitrary key. For example, a utility function to get a value from an object could now be written as `ObjectGet(Obj, Key) = Obj -> {Key: V} & V`.
Standard Library
- Add `repeat2(p)` through `repeat9(p)` as an alternative to the more general `repeat(p, N)`.
- Rename `scan(p)` to `find(p)` in stdlib.
- Update `find_all(p)` to include all text after the last parsed `p`.
- Remove `maybe_find_all(p)`.
- Add `find_before(p, stop)` and `find_all_before(p, stop)` which only search until `stop`.
- Add `Filter(A, Pred)` for filtering array elements by a predicate function.
Full Changelog: v0.10.0...v0.11.0
v0.10.0
- Fix compiling negative number parsers. Parsers like `-100` were not compiling correctly because `-` is now treated as a prefix operator instead of part of the number.
- Implement basic spread syntax and merge patterns for objects. This means things like `const({"a": 1, "b": 2}) -> {"a": 1, ...B}` and `const({"a": 1, "b": 2}) -> (B + {"a": 1})` will now compile and pattern match correctly.
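As a rough mental model for the spread pattern, the Python sketch below matches an object against a set of required key/value pairs and returns whatever is left over as the rest binding. The function name `match_with_spread` and the use of `None` for a failed match are assumptions for illustration, not Possum's actual semantics or implementation.

```python
def match_with_spread(value, fixed):
    """Match dict `value` against required pairs `fixed`; return the leftover pairs, or None on failure."""
    for key, expected in fixed.items():
        if key not in value or value[key] != expected:
            return None
    return {k: v for k, v in value.items() if k not in fixed}

# Models const({"a": 1, "b": 2}) -> {"a": 1, ...B}
print(match_with_spread({"a": 1, "b": 2}, {"a": 1}))  # {'b': 2}
```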
Full Changelog: v0.9.0...v0.10.0
v0.9.0
- Introduces `$` as a value label prefix. The value label indicates that the prefixed expression is a value. Any value may be labeled, but the label is only required for string literals, number literals, `true`, `false`, and `null` when they are used in a context where the value could accidentally be interpreted as a parser. For example, `record2(Key1, value1, Key2, value2)` used to be callable as `record2("foo", foo_value, "bar", bar_value)`, but now must be invoked as `record2($"foo", foo_value, $"bar", bar_value)`. This makes it explicit that `"foo"` and `"bar"` are string values, not parsers.
- Improve `json_string` parser, add a bunch of tests for handling encoding/decoding edge cases.
- Add `@Codepoint(HexStr)` meta function to convert a hexadecimal number encoded as a string into a UTF-8 unicode codepoint.
- Add `@SurrogatePairCodepoint(High, Low)` meta function to convert two hexadecimal numbers encoded as strings into a UTF-16 surrogate pair codepoint.
- Add `hex_numeral`, `hex_digit`, `find_all(p)`, `maybe_find_all(p)`, `chars_until(stop)`, `json_number`, `json_boolean`, and `json_null` to the standard library.
- Add `True`, `False`, `Null`, `Inc(N)`, `Dec(N)` to the standard library.
- Change `ast_op_precedence(op_node, BindingPower)` to `AstOpPrecedence(OpNode, BindingPower)` and change `ast_infix_op_precedence(op_node, LeftBindingPower, RightBindingPower)` to `AstInfixOpPrecedence(OpNode, LeftBindingPower, RightBindingPower)`. Unfortunately the value label change made precedence parsing code look noisier, since all the binding power numbers had to be prefixed with `$`. Using value functions is a bit better.
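For reference, the standard UTF-16 surrogate pair arithmetic that a function like `@SurrogatePairCodepoint(High, Low)` has to perform can be sketched in Python. The function name and the range check are illustrative assumptions; Possum's own implementation may differ.

```python
def surrogate_pair_codepoint(high_hex, low_hex):
    """Combine two hex-encoded UTF-16 surrogates into one Unicode code point."""
    high = int(high_hex, 16)
    low = int(low_hex, 16)
    # A valid pair is a high surrogate followed by a low surrogate.
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

# The JSON escape sequence \uD83D\uDE00 encodes U+1F600.
print(hex(surrogate_pair_codepoint("D83D", "DE00")))  # 0x1f600
```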
Full Changelog: v0.8.0...v0.9.0
v0.8.0
- Implemented `ast_with_operator_precedence(value, prefix, infix, postfix)`, a parser that can parse an entire AST with mixed operator precedence.
- Created a new documentation page for AST parsing.
- Implemented `possum.possum`, a full parser for Possum's syntax. It's a parser that can parse its own source code! It's 168 lines long!
v0.7.0
- Fix `@Crash` which was missing a bytecode op. Oops.
- `1-2` is now parsed as `1 - 2` instead of `1 -2`.
- Values and patterns can use unary negation, like `--5` or `-X`. Number parsers can still only use a single negative sign, since parsing numbers relies on matching the exact characters of the number, and `--5` is not a valid JSON number.
- `NumberString`s can be negated an arbitrary number of times without having to convert to a different number representation.
- Range patterns can have bound and unbound variables. When a pattern is matched the unbound variable is bound to the matching codepoint or integer.
v0.6.0
- Allow range parsers to have an open upper or lower bound. For example `0..` is a parser for integers greater than or equal to zero.
- Range parsers may be used in patterns. For example `int -> 0..` fails if the parsed integer is not greater than or equal to zero.
- Add meta function `@Crash(Message)` to immediately halt the program with a custom error message.
- Add stdlib parsers `repeat(p, N)`, `repeat_between(p, N, M)`, `tuple(elem, N)`, and `tuple_sep(elem, sep, N)` for repeating a parser a fixed number of times.
In addition to enabling the new standard library parsers, range patterns let us write a naive function for the nth Fibonacci number as a suitably feral one-liner:
$ possum -p '0.. -> N $ Fib(N) ; Fib(N) = N -> ..1 | (Fib(N - 1) + Fib(N - 2))' -i 11
89
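For readers unfamiliar with the syntax, the one-liner reads roughly as: parse a non-negative integer `N`, then return `Fib(N)`, where `Fib` returns `N` itself when `N` matches `..1` and otherwise recurses. The Python rendering below is an interpretation, not a description of how Possum evaluates the program. It assumes `..1` matches integers up to and including 1, which is consistent with the output of 89 for input 11. The names `fib` and `run` are illustrative.

```python
def fib(n):
    # Base case mirrors `N -> ..1`: assumed to match any integer <= 1 and return it unchanged.
    if n <= 1:
        return n
    # Otherwise mirror `Fib(N - 1) + Fib(N - 2)`.
    return fib(n - 1) + fib(n - 2)

def run(text):
    # Mirrors `0.. -> N`: accept only a non-negative integer as input.
    n = int(text)
    if n < 0:
        raise ValueError("expected an integer >= 0")
    return fib(n)

print(run("11"))  # 89
```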
~~(##)'>
v0.5.0
- Improve formatting of JSON returned by CLI and WASM.
- Add `Tabular(Headers, Rows)` value function for transforming a table of data into an array of objects.
- Add `json`, `json_string`, `json_array`, and `json_object` parsers to make it easier to parse text that contains valid JSON.
v0.4.0
~~~(##)'>
After a year and four months it's finally time for a new release!
Goodbye OCaml
The previous `v0.3.0` release was based on the old OCaml implementation of Possum. This implementation proved out the core idea of Possum, but was starting to collapse under the weight of its own haphazard abstractions and nested function closures. On top of that, building OCaml cross-platform was an absolute mess. Time for a fresh start!
Hello Zig
As much as I'll try to justify my choices, I really just wanted to build something in Zig. The new implementation uses a bytecode virtual machine instead of a tree-walk interpreter, and is based on the latter half of Crafting Interpreters by Robert Nystrom. The VM in Nystrom's book is implemented in C, so Zig felt like a more natural choice than something like Rust or OCaml. This new version of Possum has hugely benefited from the pain points I discovered the first time around, resulting in a code base that's lower-level while simultaneously being easier to understand. Zig also makes cross-platform compilation almost effortless. You can now build Possum for a wide range of platforms with a single command, including WASM, which is what I'm using to run interactive Possum examples on my website. Neat!
Where are we now?
The new implementation is pretty much at feature parity. The primary regression from `v0.3.0` is in error messages, which have gone from cute and usually helpful to practically nonexistent. In exchange we're getting a handful of new goodies and conceptual simplifications to the language. There's string interpolation and value functions now. Pattern matching has changed from `Pattern <- parser` to `parser -> Pattern`. Literally no one but me knows how Possum used to work, so trying to list everything that has changed and why seems silly.
What's Left?
The big ticket items:
- Finish implementing pattern matching. Most common uses for Possum's destructure syntax are now fully supported, but destructuring against objects is still buggy. There are also a bunch of obscure edge cases in pattern matching that should never come up in practice but I'm going to implement for the sake of completeness.
- Error messages. Some error message cases should be pretty easy to implement. The difficult case is reporting when a parser fails to match on an input. Combinator-based parser libraries frequently struggle with reporting parsing failure, but Possum is a full language so it should be possible to produce detailed failure reports that emphasize the parsing paths that got closest to successfully parsing the input.
- Garbage collection. Currently Possum doesn't free any memory allocated at runtime. Text parsing is rarely a long running process so this should be fine in the majority of cases, but we're going to add a garbage collector anyways. Possum gotta eat trash.
- Support operator precedence parsing out of the box. Possum should be a viable tool for prototyping language syntax, and one wrinkle that comes up frequently in this context is operators with varied precedence and associativity. Possum should either have language features or stdlib functions to handle precedence parsing, but I haven't quite figured out what this looks like yet.
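For context on the last item, one common approach to mixed precedence and associativity is Pratt parsing (precedence climbing), where each infix operator carries a left and a right binding power. The Python sketch below shows the general technique only; the token format, the `BINDING_POWER` table, and the function names are invented for illustration and say nothing about how Possum will ultimately support this.

```python
import re

# Hypothetical table: operator -> (left binding power, right binding power).
# left < right gives left associativity; left > right gives right associativity.
BINDING_POWER = {"+": (1, 2), "-": (1, 2), "*": (3, 4), "/": (3, 4), "^": (6, 5)}

def tokenize(text):
    """Split the input into integer literals, operators, and parentheses."""
    return re.findall(r"\d+|[-+*/^()]", text)

def parse_expr(tokens, min_power=0):
    """Parse tokens (consumed from the front) into a nested tuple AST."""
    token = tokens.pop(0)
    if token == "(":
        left = parse_expr(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    else:
        left = int(token)
    while tokens and tokens[0] in BINDING_POWER:
        left_power, right_power = BINDING_POWER[tokens[0]]
        # Stop if this operator binds less tightly than the caller requires.
        if left_power < min_power:
            break
        op = tokens.pop(0)
        right = parse_expr(tokens, right_power)
        left = (op, left, right)
    return left

print(parse_expr(tokenize("1 + 2 * 3")))  # ('+', 1, ('*', 2, 3))
print(parse_expr(tokenize("2 ^ 3 ^ 2")))  # ('^', 2, ('^', 3, 2))
print(parse_expr(tokenize("8 - 3 - 2")))  # ('-', ('-', 8, 3), 2)
```

Encoding associativity as a pair of binding powers keeps the parsing loop itself free of special cases, which is one reason the technique is popular for prototyping language syntax.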