Merge remote-tracking branch 'origin/develop' into develop
MatthewSteen committed Sep 16, 2024
2 parents cd5e949 + e24d741 commit 219c9c8
Showing 11 changed files with 371 additions and 36 deletions.
76 changes: 71 additions & 5 deletions .github/workflows/ci.yml
@@ -1,7 +1,9 @@
name: continuous integration

on:
push
push:
pull_request_review:
types: [submitted]

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
@@ -10,6 +12,7 @@ concurrency:
jobs:

styling:
if: github.event_name == 'push'
runs-on: ubuntu-latest
steps:
- name: checkout
@@ -31,7 +34,10 @@ jobs:
poetry run isort . --check
poetry run black . --check
# The "testing" job verifies the base SDK functionality across
# all supported Python versions.
testing:
if: github.event_name == 'push'
needs: styling
runs-on: ubuntu-latest
strategy:
@@ -62,19 +68,79 @@ jobs:
run: poetry run mypy --ignore-missing-imports
- name: unit tests
run: poetry run pytest tests/unit --cov=./ --cov-report=xml
- name: build tests
run: poetry build

# We only run "integration" tests on the latest Python version.
# These tests detect if changes to ontologies, libraries, models and BuildingMOTIF
# affect correct operation of notebooks and BACnet scans. Library integration testing
# is a separate job
integration:
if: github.event_name == 'push'
needs: styling
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.11']
steps:
- name: checkout
uses: actions/checkout@v4
- uses: actions/setup-java@v4 # for topquadrant shacl support
with:
distribution: 'temurin'
java-version: '21'
- name: setup-python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: install-poetry
uses: snok/install-poetry@v1
with:
version: 1.4.0
virtualenvs-in-project: false
virtualenvs-path: ~/.virtualenvs
- name: poetry install
run: poetry install --all-extras
- name: integration tests
run: poetry run pytest tests/integration
- name: library tests
run: poetry run pytest tests/library
- name: bacnet tests
run: |
cd tests/integration/fixtures/bacnet
docker compose build device buildingmotif
docker compose run -d device
docker compose run buildingmotif poetry run pytest -m bacnet
docker compose down
- name: build tests
run: poetry build
# We only run "library" tests on the latest Python version.
# These tests detect if changes to ontologies, libraries, models and BuildingMOTIF
# affect correct operation of templates, shapes, and validation
libraries:
if: github.event.review.state == 'approved' || github.ref == 'refs/heads/main' || github.ref == 'refs/heads/develop'
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.11']
steps:
- name: checkout
uses: actions/checkout@v4
- uses: actions/setup-java@v4 # for topquadrant shacl support
with:
distribution: 'temurin'
java-version: '21'
- name: setup-python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: install-poetry
uses: snok/install-poetry@v1
with:
version: 1.4.0
virtualenvs-in-project: false
virtualenvs-path: ~/.virtualenvs
- name: poetry install
run: poetry install --all-extras
- name: library tests
run: poetry run pytest tests/library

coverage:
needs: testing
32 changes: 15 additions & 17 deletions buildingmotif/dataclasses/model.py
@@ -256,28 +256,26 @@ def test_model_against_shapes(

results = {}

targets = model_graph.query(
f"""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?target
WHERE {{
?target rdf:type/rdfs:subClassOf* <{target_class}>
}}
"""
)
# skolemize the shape graph so we have consistent identifiers across
# validation through the interpretation of the validation report
ontology_graph = ontology_graph.skolemize()

for shape_uri in shapes_to_test:
targets = model_graph.query(
f"""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?target
WHERE {{
?target rdf:type/rdfs:subClassOf* <{target_class}>
}}
"""
)
temp_model_graph = copy_graph(model_graph)
for (s,) in targets:
temp_model_graph.add((URIRef(s), A, shape_uri))

temp_model_graph += ontology_graph.cbd(shape_uri)

# skolemize the shape graph so we have consistent identifiers across
# validation through the interpretation of the validation report
ontology_graph = ontology_graph.skolemize()

valid, report_g, report_str = shacl_validate(
temp_model_graph, ontology_graph, engine=self._bm.shacl_engine
)
10 changes: 9 additions & 1 deletion buildingmotif/dataclasses/template.py
@@ -337,7 +337,15 @@ def inline_dependencies(self) -> "Template":
deptempl_opt_args.update(deptempl.parameters)

# convert our set of optional params to a list and assign to the parent template
templ.optional_args = list(templ_optional_args.union(deptempl_opt_args))
# 1. get required parameters from the original template
# 2. calculate all optional requirements from the dependency template and the original template
# 3. remove required parameters from the optional requirements
# This avoids a bug where an optional dependency makes reference to a required parameter, and then
# subsequent inlining of the dependency without optional args would remove the required parameter
required = templ.parameters - templ_optional_args
templ.optional_args = list(
templ_optional_args.union(deptempl_opt_args) - required
)

# append the inlined template into the parent's body
templ.body += deptempl.body
1 change: 1 addition & 0 deletions docs/_toc.yml
@@ -27,6 +27,7 @@ parts:
- file: explanations/templates.md
- file: explanations/shapes-and-templates.md
- file: explanations/shacl_to_sparql.md
- file: explanations/point-label-parsing.md
- caption: Appendix
chapters:
- file: bibliography.md
210 changes: 210 additions & 0 deletions docs/explanations/point-label-parsing.md
@@ -0,0 +1,210 @@
# Point Label Parsing

The purpose of this explanation is to describe the framework for defining point label parsing rules and provide examples of how to use it.

One common source of building metadata is the "point labels" used in building management systems to label or tag input and output data points with a human-readable description.
It is often useful to extract structured information from these labels to help with constructing a semantic model of the building.

BuildingMOTIF provides a framework for defining point label naming conventions and parsing them into structured data.
The output of this process is a set of typed <a href="../reference/apidoc/_autosummary/buildingmotif.label_parsing.html#buildingmotif.label_parsing.Token">Token</a> objects that can be input into a "Semantic Graph Synthesis" process to generate a semantic model of the building.

```{admonition} Semantic Graph Synthesis
This feature is coming soon! This label parsing framework is just part of the larger BuildingMOTIF toolkit for generating semantic models of buildings.
```

## Background

The point label parsing framework in BuildingMOTIF is based on the concept of "parser combinators".
Parser combinators are a way of defining parsers by combining smaller parsers together.
In BuildingMOTIF, the "combinators" are defined as Python functions that take a string as input and return a list of <a href="../reference/apidoc/_autosummary/buildingmotif.label_parsing.html#buildingmotif.label_parsing.TokenResult">TokenResult</a>s.
These combinators can be combined together to create more complex parsers.


Here is a short example:

```python
from typing import List

from buildingmotif.label_parsing import (
    Constant, Delimiter, Identifier, TokenResult, regex, sequence, string
)
from buildingmotif.namespaces import BRICK

def parse_ahu_label(label: str) -> List[TokenResult]:
    return sequence(
        string("AHU", Constant(BRICK.Air_Handling_Unit)),
        string("-", Delimiter),
        regex(r"\d+", Identifier),
    )(label)
```

This defines a parser that matches strings like "AHU-1" or "AHU-237" and returns a list of `Token`s.
The `sequence` combinator combines the three parsers together, and the `string` and `regex` combinators match specific strings or regular expressions.
Using parser combinators in this way allows you to define complex parsing rules in a concise and readable way.

The example output of the `parse_ahu_label` function might look like this:

```python
parse_ahu_label("AHU-1")
# [TokenResult(value='AHU', token=Constant(value=rdflib.term.URIRef('https://brickschema.org/schema/Brick#Air_Handling_Unit')), length=3, error=None, id=None),
# TokenResult(value='-', token=Delimiter(value='-'), length=1, error=None, id=None),
# TokenResult(value='1', token=Identifier(value='1'), length=1, error=None, id=None)]

parse_ahu_label("AH-1")
# [TokenResult(value=None, token=Null(value=None), length=0, error='Expected AHU, got AH-', id=None)]
```

## Parser Combinators

The `buildingmotif.label_parsing.combinators` module provides a set of parser combinators for defining point label parsing rules.
Here are some of the most commonly used combinators:

- `string`: Matches a specific string and returns a `Token` with a constant value.
- `regex`: Matches a regular expression and returns a `Token` with the matched value.
- `choice`: Matches one of a list of parsers. Uses the first one that matches.
- `sequence`: Matches a sequence of parsers and returns a list of `Token`s.
- `constant`: Returns a `Token` with a constant value. Does not consume any input.
- `many`: Matches zero or more occurrences of a parser.
- `maybe`: Matches zero or one occurrence of a parser.
- `until`: Matches a parser until another parser is matched.
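To make the mechanics concrete, here is a deliberately simplified, self-contained sketch of string-, regex-, and sequence-style combinators. The names (`lit`, `rx`, `seq`, `Tok`) are invented for this sketch and are not BuildingMOTIF's API; the real combinators return richer `TokenResult` objects.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Toy stand-in for TokenResult; the real class carries more fields
# (token type, match length, id).
@dataclass
class Tok:
    value: Optional[str]
    error: Optional[str] = None

Parser = Callable[[str], List[Tok]]

def lit(s: str) -> Parser:
    """Factory for a parser that matches an exact prefix."""
    def parse(text: str) -> List[Tok]:
        if text.startswith(s):
            return [Tok(s)]
        return [Tok(None, error=f"Expected {s}, got {text[:len(s)]}")]
    return parse

def rx(pattern: str) -> Parser:
    """Factory for a parser that matches a regex at the start of the input."""
    def parse(text: str) -> List[Tok]:
        m = re.match(pattern, text)
        if m:
            return [Tok(m.group(0))]
        return [Tok(None, error=f"Expected /{pattern}/, got {text!r}")]
    return parse

def seq(*parsers: Parser) -> Parser:
    """Run parsers in order; each consumes what the previous one left."""
    def parse(text: str) -> List[Tok]:
        out: List[Tok] = []
        rest = text
        for p in parsers:
            toks = p(rest)
            out.extend(toks)
            if any(t.error for t in toks):
                break  # stop at the first failure
            rest = rest[sum(len(t.value) for t in toks):]
        return out
    return parse

ahu = seq(lit("AHU"), lit("-"), rx(r"\d+"))
print([t.value for t in ahu("AHU-12")])  # ['AHU', '-', '12']
print(ahu("AH-1")[0].error)              # Expected AHU, got AH-
```

The essential pattern is the same in BuildingMOTIF: each factory returns a function from input text to tokens, and the sequence combinator threads the unconsumed remainder from one parser to the next.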


### Defining New Combinators

These are all just Python functions, so you can define your own combinators as needed.

```python
# equip_abbreviations and point_abbreviations are abbreviation parsers
# built with the abbreviations() combinator (see the next section)
delimiters = regex(r"[._:/\- ]", Delimiter)
identifier = regex(r"[a-zA-Z0-9]+", Identifier)
named_equip = sequence(equip_abbreviations, maybe(delimiters), identifier)
named_point = sequence(point_abbreviations, maybe(delimiters), identifier)
```

More generally, a combinator is any function that takes a string as input and returns a list of `TokenResult`s.
Functions like `regex` and `sequence` are combinator *factories*: they take patterns or other combinators as arguments and *return* a new combinator (`delimiters` above is one such returned combinator).

### Abbreviations

Abbreviations are a common feature of point labels.
Strings like "AHU" for "Air Handling Unit" or "VAV" for "Variable Air Volume" are often used to save space on labels.
You can use the `abbreviations` combinator to define a set of abbreviations and automatically expand them in the input string.

We can define a dictionary of abbreviations like this:

```python
my_abbreviations = {
    "AHU": BRICK.Air_Handling_Unit,
    "FCU": BRICK.Fan_Coil_Unit,
    "VAV": BRICK.Variable_Air_Volume_Box,
    "CRAC": BRICK.Computer_Room_Air_Conditioner,
    "HX": BRICK.Heat_Exchanger,
    "PMP": BRICK.Pump,
    "RVAV": BRICK.Variable_Air_Volume_Box_With_Reheat,
    "HP": BRICK.Heat_Pump,
    "RTU": BRICK.Rooftop_Unit,
    "DMP": BRICK.Damper,
    "STS": BRICK.Status,
    "VLV": BRICK.Valve,
    "CHVLV": BRICK.Chilled_Water_Valve,
    "HWVLV": BRICK.Hot_Water_Valve,
    "VFD": BRICK.Variable_Frequency_Drive,
    "CT": BRICK.Cooling_Tower,
    "MAU": BRICK.Makeup_Air_Unit,
    "R": BRICK.Room,
}

my_abbreviations_parser = abbreviations(my_abbreviations)
```

Then we can use `my_abbreviations_parser` in our label parsing rules to automatically expand abbreviations.
Note how the key of the `my_abbreviations` dictionary is the abbreviation and the value is the RDF Brick class that the abbreviation expands to.
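As a rough illustration of what an abbreviations-style combinator does internally, matching can be thought of as a longest-prefix-first lookup over the dictionary keys. The helper below is hypothetical, not BuildingMOTIF's implementation:

```python
from typing import Dict, Optional, Tuple

def toy_abbrev_matcher(mapping: Dict[str, str]):
    """Toy sketch of an abbreviations-style combinator (not BuildingMOTIF's API)."""
    # try longer abbreviations first so e.g. "RVAV" is not shadowed by "R"
    keys = sorted(mapping, key=len, reverse=True)
    def parse(text: str) -> Optional[Tuple[str, str]]:
        for abbrev in keys:
            if text.startswith(abbrev):
                return abbrev, mapping[abbrev]  # matched prefix + expansion
        return None  # no abbreviation matched
    return parse

match_equip = toy_abbrev_matcher(
    {"R": "Room", "RTU": "Rooftop_Unit", "RVAV": "VAV_With_Reheat"}
)
print(match_equip("RTU-2"))  # ('RTU', 'Rooftop_Unit')
print(match_equip("AHU-1"))  # None
```

Ordering matters here: with a dictionary like the one above, a naive first-match loop could let `"R"` shadow `"RTU"` and `"RVAV"`.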

To expand our earlier example to work for other abbreviations, we can rewrite the parser like this:

```python
def parse_label(label: str) -> List[TokenResult]:
    return sequence(
        my_abbreviations_parser,
        string("-", Delimiter),
        regex(r"\d+", Identifier),
    )(label)

parse_label("AHU-1")
# [TokenResult(value='AHU', token=Constant(value=rdflib.term.URIRef('https://brickschema.org/schema/Brick#Air_Handling_Unit')), length=3, error=None, id=None),
#  TokenResult(value='-', token=Delimiter(value='-'), length=1, error=None, id=None),
#  TokenResult(value='1', token=Identifier(value='1'), length=1, error=None, id=None)]

parse_label("FCU-123")
# [TokenResult(value='FCU', token=Constant(value=rdflib.term.URIRef('https://brickschema.org/schema/Brick#Fan_Coil_Unit')), length=3, error=None, id=None),
#  TokenResult(value='-', token=Delimiter(value='-'), length=1, error=None, id=None),
#  TokenResult(value='123', token=Identifier(value='123'), length=3, error=None, id=None)]

parse_label("AH-1")
# [TokenResult(value=None, token=Null(value=None), length=0, error='Expected
# AHU, got AH- | Expected FCU, got AH- | Expected VAV, got AH- | Expected CRAC,
# got AH-1 | Expected HX, got AH | Expected PMP, got AH- | Expected RVAV, got
# AH-1 | Expected HP, got AH | Expected RTU, got AH- | Expected DMP, got AH- |
# Expected STS, got AH- | Expected VLV, got AH- | Expected CHVLV, got AH-1 |
# Expected HWVLV, got AH-1 | Expected VFD, got AH- | Expected CT, got AH |
# Expected MAU, got AH- | Expected R, got A', id=None)]
```

### Error Handling

The parser combinators in BuildingMOTIF provide detailed error messages when a parsing rule fails.
This can be useful for debugging and understanding why a particular label did not match the expected format.
The error messages include information about what was expected and what was found in the input string.

If any `TokenResult` in the list has a non-empty `error` field, the parsing rule failed at that point in the input.
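A minimal sketch of how calling code might surface the first failure; the `Result` record here is a hypothetical stand-in for `TokenResult`:

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for a TokenResult-like record (illustrative only).
@dataclass
class Result:
    value: Optional[str]
    error: Optional[str] = None

def first_failure(results: List[Result]) -> Optional[str]:
    """Return the first error message, or None if every token parsed."""
    return next((r.error for r in results if r.error), None)

assert first_failure([Result("AHU"), Result("-"), Result("1")]) is None
assert first_failure([Result(None, "Expected AHU, got AH-")]) == "Expected AHU, got AH-"
```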

## Example

Consider these point labels:

```
:BuildingName_02:FCU503_ChwVlvPos
:BuildingName_02:FCU510_EffOcc
:BuildingName_02:FCU507_UnoccHtgSpt
:BuildingName_02:FCU415_UnoccHtgSpt
:BuildingName_01:FCU203_OccClgSpt
:BuildingName_02:FCU529_UnoccHtgSpt
:BuildingName_01:FCU243_EffOcc
:BuildingName_01:FCU362_ChwVlvPos
```

We can define a set of parsing rules to extract structured data from these labels.
This is essentially just an expression of the building point naming convention.

```python
equip_abbreviations = abbreviations(COMMON_EQUIP_ABBREVIATIONS_BRICK)
# define our own for Points (specific to this building)
point_abbreviations = abbreviations({
    "ChwVlvPos": BRICK.Position_Sensor,
    "HwVlvPos": BRICK.Position_Sensor,
    "RoomTmp": BRICK.Air_Temperature_Sensor,
    "Room_RH": BRICK.Relative_Humidity_Sensor,
    "UnoccHtgSpt": BRICK.Unoccupied_Air_Temperature_Heating_Setpoint,
    "OccHtgSpt": BRICK.Occupied_Air_Temperature_Heating_Setpoint,
    "UnoccClgSpt": BRICK.Unoccupied_Air_Temperature_Cooling_Setpoint,
    "OccClgSpt": BRICK.Occupied_Air_Temperature_Cooling_Setpoint,
    "SaTmp": BRICK.Supply_Air_Temperature_Sensor,
    "OccCmd": BRICK.Occupancy_Command,
    "EffOcc": BRICK.Occupancy_Status,
})

def custom_parser(target):
    return sequence(
        string(":", Delimiter),
        # regex until the underscore
        constant(Constant(BRICK.Building)),
        regex(r"[^_]+", Identifier),
        string("_", Delimiter),
        # number for AHU name
        constant(Constant(BRICK.Air_Handling_Unit)),
        regex(r"[0-9a-zA-Z]+", Identifier),
        string(":", Delimiter),
        # equipment types
        equip_abbreviations,
        # equipment ident
        regex(r"[0-9a-zA-Z]+", Identifier),
        string("_", Delimiter),
        maybe(
            sequence(regex(r"[A-Z]+[0-9]+", Identifier), string("_", Delimiter)),
        ),
        # point types
        point_abbreviations,
    )(target)
```
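As an independent sanity check, the same convention can also be approximated with a single regular expression. This is a hypothetical helper, not part of BuildingMOTIF, and it ignores the optional segment handled by `maybe` above:

```python
import re

# Hypothetical regex mirroring the convention:
# ":<building>_<number>:<equip abbrev><ident>_<point abbrev>"
LABEL_RE = re.compile(
    r"^:(?P<building>[^_]+)_(?P<bnum>[0-9a-zA-Z]+):"  # :BuildingName_02:
    r"(?P<equip>[A-Z]+)(?P<ident>\d+)_"               # FCU503_
    r"(?P<point>\w+)$"                                # ChwVlvPos
)

m = LABEL_RE.match(":BuildingName_02:FCU503_ChwVlvPos")
print(m.group("equip"), m.group("ident"), m.group("point"))  # FCU 503 ChwVlvPos
```

A quick pre-screen like this can flag labels that will not parse before running the combinators, but the combinator version remains the source of truth because it also attaches the Brick classes to each token.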
