Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tagset Inference misses some Classes #72

Open
david-waterworth opened this issue Sep 14, 2021 · 1 comment
Open

Tagset Inference misses some Classes #72

david-waterworth opened this issue Sep 14, 2021 · 1 comment

Comments

@david-waterworth
Copy link

There's an issue with the order in which inference first computes the most likely target, then filters to remove points or equips which can result in missing classes. I noticed this when I tried to adapt the HaystackInferenceSession for my own use.

In particular, I believe Chilled Water Valve Command should map to ['Chilled_Water_Valve', 'Valve_Command'] since there's equipment and a point implied by the tags.

The issue is in infer_entity the method most_likely_tagsets is called twice, first with the set of tags minus {"point","sensor","command","setpoint","alarm","status","parameter","limit"} plus {"equip"} with the result filtered to retain only equipment, and then again with the original set of tags, with the result filtered to only retain points.

The problem is there's filtering inside most_likely_tagsets which results in it returning Chilled Water Valve in both cases, so the point is never inferred.

The easiest fix is to lift the filtering by class type from infer_entity into most_likely_tagsets so it only considers candidates of the correct type.

The code below seems to work, I added is_valid_class as an argument and use this to pre-filter. It may be better to simply pass a string representation, plus I couldn't be bothered thinking about how to make it a list comprehension.

    def most_likely_tagsets(self, orig_s, num=-1, is_valid_class=lambda x:True):
        """
        Returns the list of likely classes for a given set of tags,
        as well as the list of tags that were 'leftover', i.e. not
        used in the inference of a class

        Args:
            tagset (list of str): a list of tags
            num (int): number of likely tagsets to be returned; -1 returns all

        Returns:
            results (tuple): a 2-element tuple containing (1)
            most_likely_classes (list of str): list of Brick classes
            and (2) leftover (set of str): list of tags not used

        """
        s = set(map(_to_tag_case, orig_s))
        tagsets_ = self.lookup_tagset(s)

        # filter classes which aren't of required type
        tagsets = []
        for classes,tagset in tagsets_:
            classes = [c for c in classes if is_valid_class(c)]
            if len(classes) > 0:
                tagsets.append((classes,tagset))

        if len(tagsets) == 0:
            # no tags
            return [], orig_s
        # find the highest number of tags that overlap
        most_overlap = max(map(lambda x: len(s.intersection(x[1])), tagsets))

        # return the class with the fewest tags >= the overlap size
        candidates = list(
            filter(lambda x: len(s.intersection(x[1])) == most_overlap, tagsets)
        )

        # When calculating the minimum difference, we calculate it form the
        # perspective of the candidate tagsets because they will have more tags
        # We want to find the tag set(s) who has the fewest tags over what was
        # provided
        min_difference = min(map(lambda x: len(x[1].difference(s)), candidates))
        most_likely = list(
            filter(lambda x: len(x[1].difference(s)) == min_difference, candidates)
        )

        leftover = s.difference(most_likely[0][1])
        most_likely_classes = list(set([list(x[0])[0] for x in most_likely]))
        # return most likely classes (list) and leftover tags
        # (what of 'orig_s' wasn't used)
        if num < 0:
            return most_likely_classes, leftover
        else:
            return most_likely_classes[:num], leftover
@gtfierro
Copy link
Member

gtfierro commented Oct 7, 2021

Thanks for the detailed bug report @david-waterworth ! I'll try to take a look soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants