
Point Label Parsing

This explanation describes the framework for defining point label parsing rules and shows how to use it through examples.

One common source of building metadata is the “point labels” used in building management systems to label or tag the input and output data points with a human-readable description. It is often useful to extract structured information from these labels to help construct a semantic model of the building.

BuildingMOTIF provides a framework for defining point label naming conventions and parsing them into structured data. The output of this process is a set of typed Token objects that can be input into a “Semantic Graph Synthesis” process to generate a semantic model of the building.

Semantic Graph Synthesis


This feature is coming soon! This label parsing framework is just part of the larger BuildingMOTIF toolkit for generating semantic models of buildings.


Background

The point label parsing framework in BuildingMOTIF is based on the concept of “parser combinators”. Parser combinators are a way of defining parsers by combining smaller parsers. In BuildingMOTIF, the “combinators” are defined as Python functions that take a string as input and return a list of TokenResults. These combinators can be combined to create more complex parsers.

Here is a short example:

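# Import paths are an assumption based on the buildingmotif.label_parsing package layout.
from typing import List
from buildingmotif.namespaces import BRICK
from buildingmotif.label_parsing.combinators import regex, sequence, string
from buildingmotif.label_parsing.tokens import Constant, Delimiter, Identifier, TokenResult

def parse_ahu_label(label: str) -> List[TokenResult]:
    return sequence(
        string("AHU", Constant(BRICK.Air_Handling_Unit)),
        string("-", Delimiter),
        regex(r"\d+", Identifier)
    )(label)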

This defines a parser that matches strings like “AHU-1” or “AHU-237” and returns a list of TokenResults, one per matched token. The sequence combinator combines the three parsers, and the string and regex combinators match specific strings or regular expressions. Using parser combinators in this way allows you to define complex parsing rules in a concise and readable way.

The example output of the parse_ahu_label function might look like this:

parse_ahu_label("AHU-1")
# [TokenResult(value='AHU', token=Constant(value=rdflib.term.URIRef('https://brickschema.org/schema/Brick#Air_Handling_Unit')), length=3, error=None, id=None),
# TokenResult(value='-', token=Delimiter(value='-'), length=1, error=None, id=None),
# TokenResult(value='1', token=Identifier(value='1'), length=1, error=None, id=None)]

parse_ahu_label("AH-1")
# [TokenResult(value=None, token=Null(value=None), length=0, error='Expected AHU, got AH-', id=None)]
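
To make the idea concrete, here is a minimal, self-contained sketch of how string, regex, and sequence could be written as plain Python closures. It is a simplified stand-in and not BuildingMOTIF's implementation: the `Result` tuple and the `toy_` names are hypothetical, and tokens are reduced to a plain label string.

```python
import re
from typing import Callable, List, NamedTuple, Optional


class Result(NamedTuple):
    """Hypothetical, simplified stand-in for BuildingMOTIF's TokenResult."""
    value: Optional[str]
    kind: str  # e.g. "Constant", "Delimiter", "Identifier", "Null"
    length: int
    error: Optional[str] = None


Parser = Callable[[str], List[Result]]


def toy_string(expected: str, kind: str) -> Parser:
    """Match an exact prefix of the input."""
    def parse(text: str) -> List[Result]:
        if text.startswith(expected):
            return [Result(expected, kind, len(expected))]
        got = text[: len(expected)]
        return [Result(None, "Null", 0, f"Expected {expected}, got {got}")]
    return parse


def toy_regex(pattern: str, kind: str) -> Parser:
    """Match a regular expression anchored at the start of the input."""
    compiled = re.compile(pattern)

    def parse(text: str) -> List[Result]:
        match = compiled.match(text)
        if match:
            return [Result(match.group(0), kind, match.end())]
        return [Result(None, "Null", 0, f"Expected {pattern}, got {text}")]
    return parse


def toy_sequence(*parsers: Parser) -> Parser:
    """Run parsers one after another, each on the remaining input."""
    def parse(text: str) -> List[Result]:
        results: List[Result] = []
        rest = text
        for parser in parsers:
            step = parser(rest)
            results.extend(step)
            if any(r.error for r in step):
                break  # stop at the first failure
            rest = rest[sum(r.length for r in step):]
        return results
    return parse


toy_ahu = toy_sequence(
    toy_string("AHU", "Constant"),
    toy_string("-", "Delimiter"),
    toy_regex(r"\d+", "Identifier"),
)

print([r.value for r in toy_ahu("AHU-237")])  # ['AHU', '-', '237']
print(toy_ahu("AH-1")[0].error)               # Expected AHU, got AH-
```

Each combinator is a function that returns another function, so larger parsers are built by passing small parsers as arguments.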

Parser Combinators

The buildingmotif.label_parsing.combinators module provides a set of parser combinators for defining point label parsing rules. Here are some of the most commonly used combinators:

  • string: Matches a specific string and returns a Token with a constant value.
  • regex: Matches a regular expression and returns a Token with the matched value.
  • choice: Matches one of a list of parsers. Uses the first one that matches.
  • sequence: Matches a sequence of parsers and returns a list of Tokens.
  • constant: Returns a Token with a constant value. Does not consume any input.
  • many: Matches zero or more occurrences of a parser.
  • maybe: Matches zero or one occurrence of a parser.
  • until: Matches a parser until another parser is matched.
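
The semantics of a few of these combinators can be illustrated with a small, self-contained sketch. It is not BuildingMOTIF's code: the tuple-based `Result`, the `literal` helper, the `toy_` names, and the exact failure behavior are assumptions made for the illustration, and until and constant are left out for brevity.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical simplified result: (matched value, consumed length, error message).
Result = Tuple[Optional[str], int, Optional[str]]
Parser = Callable[[str], List[Result]]


def literal(expected: str) -> Parser:
    """Toy stand-in for `string`: match an exact prefix."""
    def parse(text: str) -> List[Result]:
        if text.startswith(expected):
            return [(expected, len(expected), None)]
        return [(None, 0, f"Expected {expected}, got {text[:len(expected)]}")]
    return parse


def failed(results: List[Result]) -> bool:
    """True if any result in the list carries an error message."""
    return any(error is not None for _, _, error in results)


def toy_choice(*parsers: Parser) -> Parser:
    """Return the result of the first parser that succeeds."""
    def parse(text: str) -> List[Result]:
        errors: List[str] = []
        for parser in parsers:
            results = parser(text)
            if not failed(results):
                return results
            errors.extend(e for _, _, e in results if e)
        return [(None, 0, " | ".join(errors))]
    return parse


def toy_maybe(parser: Parser) -> Parser:
    """Zero or one occurrence: a failure becomes an empty match."""
    def parse(text: str) -> List[Result]:
        results = parser(text)
        return [] if failed(results) else results
    return parse


def toy_many(parser: Parser) -> Parser:
    """Zero or more occurrences: repeat until the parser fails."""
    def parse(text: str) -> List[Result]:
        collected: List[Result] = []
        while text:
            results = parser(text)
            consumed = sum(length for _, length, _ in results)
            if failed(results) or consumed == 0:
                break  # also guards against parsers that consume nothing
            collected.extend(results)
            text = text[consumed:]
        return collected
    return parse


ahu_or_vav = toy_choice(literal("AHU"), literal("VAV"))
print(ahu_or_vav("VAV-3"))                 # [('VAV', 3, None)]
print(toy_maybe(literal("-"))("7"))        # []
print(toy_many(literal("ab"))("ababx"))    # [('ab', 2, None), ('ab', 2, None)]
```

In this sketch, toy_maybe and toy_many never fail; they simply report that nothing was consumed, which lets an enclosing sequence continue with the next parser.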

Defining New Combinators

These are all just Python functions, so you can define your own combinators as needed.

# equip_abbreviations and point_abbreviations are abbreviation parsers,
# defined in the Example section of this page.
delimiters = regex(r"[._:/\- ]", Delimiter)
identifier = regex(r"[a-zA-Z0-9]+", Identifier)
named_equip = sequence(equip_abbreviations, maybe(delimiters), identifier)
named_point = sequence(point_abbreviations, maybe(delimiters), identifier)

More generally, a combinator is any function that takes a string as input and returns a list of TokenResults. The functions used above (regex, sequence, maybe) take arguments and return a combinator; names such as delimiters and identifier hold the combinators they return.

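As a sketch of that definition, the block below writes one combinator by hand and one factory that returns a combinator. `ToyResult`, `floor_number`, and `exactly` are hypothetical names for this illustration; in real code the functions would return BuildingMOTIF's TokenResult objects instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyResult:
    """Hypothetical stand-in for TokenResult with only the fields used here."""
    value: Optional[str]
    length: int
    error: Optional[str] = None


def floor_number(text: str) -> List[ToyResult]:
    """A hand-written combinator: matches 'F' followed by digits, e.g. 'F12'."""
    digits = ""
    if text.startswith("F"):
        for ch in text[1:]:
            if not ch.isdigit():
                break
            digits += ch
    if digits:
        return [ToyResult("F" + digits, 1 + len(digits))]
    return [ToyResult(None, 0, f"Expected floor number, got {text[:3]}")]


def exactly(width: int) -> Callable[[str], List[ToyResult]]:
    """A factory: takes an argument and returns a combinator."""
    def parse(text: str) -> List[ToyResult]:
        head = text[:width]
        if len(head) == width and head.isdigit():
            return [ToyResult(head, width)]
        return [ToyResult(None, 0, f"Expected {width} digits, got {head}")]
    return parse


print(floor_number("F12_RoomTmp"))   # [ToyResult(value='F12', length=3, error=None)]
print(exactly(3)("503_ChwVlvPos"))   # [ToyResult(value='503', length=3, error=None)]
```

Because both have the same string-in, list-out shape, either one could be passed to a combinator such as sequence or maybe, provided it returns the result type those combinators expect.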

Abbreviations

Abbreviations are a common feature of point labels. Strings like “AHU” for “Air Handling Unit” or “VAV” for “Variable Air Volume” are often used to save space on labels. You can use the abbreviations combinator to define a set of abbreviations and automatically expand them in the input string.

We can define a dictionary of abbreviations like this:

my_abbreviations = {
    "AHU": BRICK.Air_Handling_Unit,
    "FCU": BRICK.Fan_Coil_Unit,
    "VAV": BRICK.Variable_Air_Volume_Box,
    "CRAC": BRICK.Computer_Room_Air_Conditioner,
    "HX": BRICK.Heat_Exchanger,
    "PMP": BRICK.Pump,
    "RVAV": BRICK.Variable_Air_Volume_Box_With_Reheat,
    "HP": BRICK.Heat_Pump,
    "RTU": BRICK.Rooftop_Unit,
    "DMP": BRICK.Damper,
    "STS": BRICK.Status,
    "VLV": BRICK.Valve,
    "CHVLV": BRICK.Chilled_Water_Valve,
    "HWVLV": BRICK.Hot_Water_Valve,
    "VFD": BRICK.Variable_Frequency_Drive,
    "CT": BRICK.Cooling_Tower,
    "MAU": BRICK.Makeup_Air_Unit,
    "R": BRICK.Room,
}

my_abbreviations_parser = abbreviations(my_abbreviations)

Then we can use my_abbreviations_parser in our label parsing rules to automatically expand abbreviations. Note that each key of the my_abbreviations dictionary is an abbreviation and each value is the Brick class (an RDF URI) that the abbreviation expands to.

To expand our earlier example to work for other abbreviations, we can rewrite the parser like this:

def parse_label(label: str) -> List[TokenResult]:
    return sequence(
        my_abbreviations_parser,
        string("-", Delimiter),
        regex(r"\d+", Identifier)
    )(label)

parse_label("AHU-1")
# [TokenResult(value='AHU', token=Constant(value=rdflib.term.URIRef('https://brickschema.org/schema/Brick#Air_Handling_Unit')), length=3, error=None, id=None),
# TokenResult(value='-', token=Delimiter(value='-'), length=1, error=None, id=None),
# TokenResult(value='1', token=Identifier(value='1'), length=1, error=None, id=None)]

parse_label("FCU-123")
# [TokenResult(value='FCU', token=Constant(value=rdflib.term.URIRef('https://brickschema.org/schema/Brick#Fan_Coil_Unit')), length=3, error=None, id=None),
# TokenResult(value='-', token=Delimiter(value='-'), length=1, error=None, id=None),
# TokenResult(value='123', token=Identifier(value='123'), length=3, error=None, id=None)]

parse_label("AH-3")
# [TokenResult(value=None, token=Null(value=None), length=0, error='Expected
# AHU, got AH- | Expected FCU, got AH- | Expected VAV, got AH- | Expected CRAC,
# got AH-3 | Expected HX, got AH | Expected PMP, got AH- | Expected RVAV, got
# AH-3 | Expected HP, got AH | Expected RTU, got AH- | Expected DMP, got AH- |
# Expected STS, got AH- | Expected VLV, got AH- | Expected CHVLV, got AH-3 |
# Expected HWVLV, got AH-3 | Expected VFD, got AH- | Expected CT, got AH |
# Expected MAU, got AH- | Expected R, got A', id=None)]
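
The error message above lists the alternatives in dictionary order, which suggests the abbreviations are tried one after another. Assuming such first-match behavior (an assumption, not a documented guarantee), the order of the dictionary matters whenever one abbreviation is a prefix of another, such as "R" and "RTU". The sketch below uses a hypothetical `toy_abbreviations` helper and plain strings in place of Brick classes to show the effect and one defensive ordering.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical simplified result: (abbreviation, expansion, consumed length, error).
Result = Tuple[Optional[str], Optional[str], int, Optional[str]]


def toy_abbreviations(table: Dict[str, str]) -> Callable[[str], List[Result]]:
    """First-match lookup over the table, in insertion order."""
    def parse(text: str) -> List[Result]:
        errors: List[str] = []
        for abbrev, expansion in table.items():
            if text.startswith(abbrev):
                return [(abbrev, expansion, len(abbrev), None)]
            errors.append(f"Expected {abbrev}, got {text[:len(abbrev)]}")
        return [(None, None, 0, " | ".join(errors))]
    return parse


def longest_first(table: Dict[str, str]) -> Dict[str, str]:
    """Reorder so that longer abbreviations are tried before their prefixes."""
    return dict(sorted(table.items(), key=lambda item: -len(item[0])))


# Plain strings stand in for the Brick classes used in the text.
table = {"R": "Room", "RTU": "Rooftop_Unit", "AHU": "Air_Handling_Unit"}

naive = toy_abbreviations(table)
safe = toy_abbreviations(longest_first(table))

print(naive("RTU-1")[0][:3])  # ('R', 'Room', 1)
print(safe("RTU-1")[0][:3])   # ('RTU', 'Rooftop_Unit', 3)
```

In the my_abbreviations dictionary above, "R" is listed last, so longer abbreviations starting with "R" are tried before it.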

Error Handling

The parser combinators in BuildingMOTIF provide detailed error messages when a parsing rule fails. This can be useful for debugging and understanding why a particular label did not match the expected format. The error messages include information about what was expected and what was found in the input string.

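Every TokenResult carries an error field. If that field is not None for any TokenResult in the list, the parsing rule failed at that point; as the examples above show, the failing result has a Null token and a length of 0.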

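A caller might check a result list along the following lines. This is a sketch under stated assumptions: `ToyResult` is a hypothetical stand-in that mirrors only the value, length, and error fields shown above, and treating unconsumed trailing input as a failure is a choice made here, not something the framework is known to enforce.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyResult:
    """Hypothetical stand-in mirroring the value/length/error fields."""
    value: Optional[str]
    length: int
    error: Optional[str] = None


def first_error(results: List[ToyResult]) -> Optional[str]:
    """Return the first error message in the list, or None if all succeeded."""
    return next((r.error for r in results if r.error is not None), None)


def fully_parsed(label: str, results: List[ToyResult]) -> bool:
    """True only if nothing failed and the whole label was consumed."""
    consumed = sum(r.length for r in results)
    return first_error(results) is None and consumed == len(label)


good = [ToyResult("AHU", 3), ToyResult("-", 1), ToyResult("1", 1)]
bad = [ToyResult(None, 0, "Expected AHU, got AH-")]

print(fully_parsed("AHU-1", good))   # True
print(fully_parsed("AHU-1x", good))  # False: one character was left over
print(first_error(bad))              # Expected AHU, got AH-
```

Summing the length fields gives the number of characters consumed, which makes it easy to spot labels that matched only in part.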

Example

Consider these point labels:

:BuildingName_02:FCU503_ChwVlvPos
:BuildingName_02:FCU510_EffOcc
:BuildingName_02:FCU507_UnoccHtgSpt
:BuildingName_02:FCU415_UnoccHtgSpt
:BuildingName_01:FCU203_OccClgSpt
:BuildingName_02:FCU529_UnoccHtgSpt
:BuildingName_01:FCU243_EffOcc
:BuildingName_01:FCU362_ChwVlvPos

We can define a set of parsing rules to extract structured data from these labels. This is essentially just an expression of the building point naming convention.

# equipment abbreviations: reuse the predefined Brick mapping
equip_abbreviations = abbreviations(COMMON_EQUIP_ABBREVIATIONS_BRICK)
# define our own for Points (specific to this building)
point_abbreviations = abbreviations({
    "ChwVlvPos": BRICK.Position_Sensor,
    "HwVlvPos": BRICK.Position_Sensor,
    "RoomTmp": BRICK.Air_Temperature_Sensor,
    "Room_RH": BRICK.Relative_Humidity_Sensor,
    "UnoccHtgSpt": BRICK.Unoccupied_Air_Temperature_Heating_Setpoint,
    "OccHtgSpt": BRICK.Occupied_Air_Temperature_Heating_Setpoint,
    "UnoccClgSpt": BRICK.Unoccupied_Air_Temperature_Cooling_Setpoint,
    "OccClgSpt": BRICK.Occupied_Air_Temperature_Cooling_Setpoint,
    "SaTmp": BRICK.Supply_Air_Temperature_Sensor,
    "OccCmd": BRICK.Occupancy_Command,
    "EffOcc": BRICK.Occupancy_Status,
})

def custom_parser(target):
    return sequence(
        string(":", Delimiter),
        # building name: everything up to the first underscore
        constant(Constant(BRICK.Building)),
        regex(r"[^_]+", Identifier),
        string("_", Delimiter),
        # identifier for the AHU
        constant(Constant(BRICK.Air_Handling_Unit)),
        regex(r"[0-9a-zA-Z]+", Identifier),
        string(":", Delimiter),
        # equipment type abbreviation
        equip_abbreviations,
        # equipment identifier
        regex(r"[0-9a-zA-Z]+", Identifier),
        string("_", Delimiter),
        # optional extra segment (uppercase letters, then digits) before the point name
        maybe(
            sequence(regex(r"[A-Z]+[0-9]+", Identifier), string("_", Delimiter)),
        ),
        # point type abbreviation
        point_abbreviations,
    )(target)
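
As a cross-check of the naming convention itself, the same label structure can be restated as a single regular expression using only Python's standard library. This is an independent illustration and not part of BuildingMOTIF: the group names and the short list of equipment abbreviations are hypothetical choices for the example.

```python
import re

# The equipment abbreviations here are a hypothetical subset chosen for the example.
LABEL_PATTERN = re.compile(
    r"^:(?P<building>[^_]+)_"          # building name, up to the first underscore
    r"(?P<unit>[0-9a-zA-Z]+):"         # the segment before the second colon
    r"(?P<equip_type>FCU|AHU|VAV)"     # equipment abbreviation
    r"(?P<equip_id>[0-9a-zA-Z]+)_"     # equipment identifier
    r"(?:(?P<extra>[A-Z]+[0-9]+)_)?"   # optional extra segment
    r"(?P<point>[A-Za-z_]+)$"          # point abbreviation
)

labels = [
    ":BuildingName_02:FCU503_ChwVlvPos",
    ":BuildingName_01:FCU243_EffOcc",
]

for label in labels:
    match = LABEL_PATTERN.match(label)
    print(match.groupdict() if match else f"no match: {label}")
```

A single regular expression yields only strings and a match-or-no-match answer. The combinator version attaches a type to each piece (for example a Brick class) and reports which part of the label failed to match.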