Compiler design
XP Compiler aims to make PHP syntax and features from newer versions run in older runtimes. To achieve this, PHP code is first parsed into an intermediate representation in the form of an abstract syntax tree (AST):
use lang\ast\{Language, Tokens};
use io\streams\FileInputStream;
// Load language implementation and parse the tokenized code
$parse= Language::named('PHP')->parse(new Tokens(new FileInputStream('Source.php')));
echo $parse->tree()->toString();
// lang.ast.ParseTree(source: (string))@{
//   scope => lang.ast.Scope {
//     parent => null
//     package => null
//     imports => []
//     types => []
//   }
//   children => [lang.ast.nodes.EchoStatement {
//     kind => "echo"
//     expressions => [lang.ast.nodes.Literal {
//       kind => "literal"
//       expression => "PHP_VERSION"
//       line => 1
//     }]
//     line => 1
//   }]
// }
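To make the tokenization step that precedes parsing more concrete, here is a minimal sketch. It assumes PHP's bundled tokenizer extension (`token_get_all()`) as a stand-in and does not use `lang.ast.Tokens`, whose token format may well differ; the helper name `tokens_of` is hypothetical.

```php
<?php
// Illustration only: uses PHP's bundled tokenizer, not lang.ast.Tokens
function tokens_of(string $code): array {
  $names= [];
  foreach (token_get_all($code) as $token) {

    // Tokens are either [id, text, line] arrays or single-character strings
    if (is_array($token)) {
      if (T_WHITESPACE === $token[0]) continue;
      $names[]= token_name($token[0]);
    } else {
      $names[]= $token;
    }
  }
  return $names;
}

echo implode(' ', tokens_of('<?php echo PHP_VERSION;')), "\n";
// T_OPEN_TAG T_ECHO T_STRING ;
```

A parser consumes such a token stream and builds nodes like the EchoStatement and Literal shown in the tree above.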
In the second step, emitters turn this AST into code for the respective target runtime:
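use lang\ast\{Emitter, Language, Result};
use io\streams\FileOutputStream;
// The language implementation is needed again below to set up its extensions
$lang= Language::named('PHP');
$parse= ... // see above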
// Load emitter implementation for the given PHP version
$emit= Emitter::forRuntime('PHP.'.PHP_VERSION)->newInstance();
foreach ($lang->extensions() as $extension) {
  $extension->setup($lang, $emit);
}
// Emit result to a given file
$emit->emitAll(new Result(new FileOutputStream('Compiled.php')), $parse->stream());
This generated code can then be loaded and run by the PHP runtime.
XP Compiler can operate in two modes:
- As a standalone command line tool which compiles input sources to given output, e.g. xp compile src/Fixture.php dist/Fixture.php.
- As part of the class loading chain, compiling sources when PHP's autoloading triggers it.
Compiling on demand enables the PHP-idiomatic "code - save - reload/rerun" development process. To make this an enjoyable experience, the compiler aims to be as fast as possible in this mode. The command line version runs at the same speed but might include more expensive optimizations.
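The on-demand mode can be pictured with the following minimal sketch, which hooks into PHP's autoloading via `spl_autoload_register()`. The function names, the directory layout and the `compile_file()` stand-in (which only copies the source) are assumptions made for illustration; this is not how XP Compiler's class loader is actually implemented.

```php
<?php
// Hypothetical stand-in for the real compilation step: copies the source unchanged
function compile_file(string $source, string $target): void {
  file_put_contents($target, file_get_contents($source));
}

// Registers a loader which compiles <src>/<Class>.php to <dist>/<Class>.php on demand
function register_compiling_loader(string $src, string $dist): void {
  spl_autoload_register(function($class) use($src, $dist) {
    $rel= strtr($class, '\\', DIRECTORY_SEPARATOR).'.php';
    $source= $src.DIRECTORY_SEPARATOR.$rel;
    $target= $dist.DIRECTORY_SEPARATOR.$rel;
    if (!is_file($source)) return;  // Leave it to other loaders in the chain

    // Recompile only if the output is missing or older than the source
    if (!is_file($target) || filemtime($target) < filemtime($source)) {
      is_dir(dirname($target)) || mkdir(dirname($target), 0777, true);
      compile_file($source, $target);
    }
    require $target;
  });
}

register_compiling_loader('src', 'dist');
```

Because the modification time check skips unchanged files, only sources edited since the last request are compiled again, which is what keeps the "code - save - reload" cycle responsive.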
Design principles
- Support PHP only: no HTML mode and, consequently, no alternative control structure syntax
- Adopt PHP RFCs once they have been voted upon successfully; offer support for experimental features through compiler extensions
- Support PHP's autoloading and compile on demand
- Integrate well with the XP Framework
- Keep it fast (~50 milliseconds per 1000 LOC)
- If in question, offer a degraded experience for older PHP versions
- If in question, defer error handling to the PHP runtime instead of performing extensive checks
- Have great unit test coverage for all of the features
- Offer migration paths through major releases
These design principles manifest themselves in the following examples:
Most language features present in newer PHP versions can easily be rewritten to code that works in older PHP versions. One simple example is the non-capturing catch statement, introduced in PHP 8.0:
try {
  // ...
} catch (Throwable) {
  // ...
}
Simply emitting a temporary variable after the Throwable type makes this run in any PHP 7.x version:
try {
  // ...
} catch (Throwable $_0) {
  // ...
}
These kinds of rewriting operations are performed in the emitter class for the specific PHP version they target. Other language features like the ?-> operator, throw expressions or fn-style closures require generating more extensive code, but follow the same general principle. Although the generated code may no longer be easily readable, it behaves exactly the same as its newer counterpart except for a small and typically negligible performance decrease.
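The idea of version-specific emitters overriding how a node is written out can be sketched as follows. The class names, the array-based node shape and the `emitCatch()` method are hypothetical and chosen for brevity; XP Compiler's actual emitter classes and AST nodes are structured differently.

```php
<?php
// Hypothetical emitter for targets that support non-capturing catches (PHP 8.0+)
class SketchEmitter {
  protected $temp= 0;

  public function emitCatch(array $node): string {
    $variable= null === $node['variable'] ? '' : ' $'.$node['variable'];
    return 'catch ('.implode('|', $node['types']).$variable.')';
  }
}

// Hypothetical emitter for PHP 7.x targets
class SketchPHP7Emitter extends SketchEmitter {

  public function emitCatch(array $node): string {

    // Non-capturing catches are unsupported before PHP 8.0: add a temporary variable
    if (null === $node['variable']) {
      $node['variable']= '_'.$this->temp++;
    }
    return parent::emitCatch($node);
  }
}

$node= ['types' => ['Throwable'], 'variable' => null];
echo (new SketchEmitter())->emitCatch($node), "\n";      // catch (Throwable)
echo (new SketchPHP7Emitter())->emitCatch($node), "\n";  // catch (Throwable $_0)
```

Keeping the rewrite in the subclass for the older target means the emitter for current PHP versions stays a straightforward pass-through.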
Some language features cannot be rewritten to the exact equivalent, or would generate code which is noticeably slower at runtime. One example is typed properties:
// This:
class Fixture {
  public string $name;
}
// ...would require __get() and __set() magic along the lines of the following:
class Fixture {
  private $__properties= ['name' => [null, 'string']];
  public function __get($name) {
    return $this->__properties[$name][0] ?? null;
  }
  public function __set($name, $value) {
    $this->__properties[$name][0]= cast($value, $this->__properties[$name][1]);
  }
}
In this case, XP Compiler currently opts for simply omitting the type check. We might rethink this behavior in a future (major) release though!
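The practical consequence of omitting the type check can be seen by comparing a typed property with its untyped counterpart. This sketch needs PHP 7.4 or newer to run, since it declares a typed property itself; the class names and the `accepts()` helper are hypothetical, and the untyped class is only an assumption about the general shape of the emitted code.

```php
<?php
class Typed { public string $name; }  // As written in the source
class Untyped { public $name; }       // Assumed shape when the type is omitted

// Returns whether assigning the given value to the name property succeeds
function accepts($object, $value): bool {
  try {
    $object->name= $value;
    return true;
  } catch (TypeError $e) {
    return false;
  }
}

var_dump(accepts(new Typed(), []));    // bool(false)
var_dump(accepts(new Untyped(), []));  // bool(true)
```

Code that relies on the TypeError being raised therefore behaves differently on older runtimes, while code that only ever assigns correct types is unaffected.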
Another example of degraded support for older PHP versions is named arguments. Reordering them would incur considerable runtime overhead:
function divide($a, $b) { return $a / $b; }
// Flipped order
echo divide(b: 1, a: 2);
This would require expensive lookups during compilation, paired with runtime introspection of the invocation target in cases where static code analysis fails. Thus, for older PHP versions, the argument names are simply omitted, as if they had not been put there in the first place; the flipped call above is treated like divide(1, 2). As long as the arguments are passed in declaration order, named arguments work.
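To give an impression of what the runtime introspection mentioned above would involve, here is a minimal sketch using PHP's reflection API. The `invoke_named()` helper is hypothetical and not part of XP Compiler; it only handles plain functions referenced by name.

```php
<?php
// Hypothetical helper: reorders named arguments by inspecting the target function
function invoke_named(string $function, array $named) {
  $ordered= [];
  foreach ((new ReflectionFunction($function))->getParameters() as $param) {
    $name= $param->getName();
    if (array_key_exists($name, $named)) {
      $ordered[]= $named[$name];
    } else if ($param->isDefaultValueAvailable()) {
      $ordered[]= $param->getDefaultValue();
    } else {
      throw new ArgumentCountError('Missing argument $'.$name);
    }
  }
  return $function(...$ordered);
}

function divide($a, $b) { return $a / $b; }

echo invoke_named('divide', ['b' => 1, 'a' => 2]), "\n";  // 2 (reordered to 2 / 1)
echo divide(1, 2), "\n";                                  // 0.5 (names omitted: 1 / 2)
```

Every call site would pay for the reflection lookup and the parameter loop, which illustrates why simply dropping the names is the cheaper trade-off.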