mol2any: get feature names from elements #106

Open
wants to merge 4 commits into base: main

JochenSiegWork
Collaborator

- Feature names could not be extracted from fingerprint pipeline elements.
- Added common interface to get names for fingerprints and descriptors.

Collaborator

@c-w-feldmann c-w-feldmann left a comment

We should discuss how to handle potential non-uniqueness of feature names in the MolToConcatenatedVector. Besides that, we can merge.

@@ -32,6 +32,7 @@ class MolToConcatenatedVector(MolToAnyPipelineElement):
    def __init__(
        self,
        element_list: list[tuple[str, MolToAnyPipelineElement]],
        feature_names_prefix: Optional[str] = None,
Collaborator

If you have a default MorganFP and a FeatureMorganFP, both would get the same prefix, making them indistinguishable.
I think it would make more sense to make this a dict that maps the element name to a feature prefix. But to be honest, I would drop the feature_names_prefix.
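
A minimal sketch of how the dict suggestion could look. The helper `prefixed_feature_names`, the `__` separator, and the fallback to the element name are all assumptions for illustration and are not part of this PR. Each element is represented by a plain list of stand-in feature names rather than a real pipeline element.

```python
from typing import Optional


def prefixed_feature_names(
    element_feature_names: list[tuple[str, list[str]]],
    prefixes: Optional[dict[str, str]] = None,
) -> list[str]:
    """Prefix each element's feature names (hypothetical helper)."""
    prefixes = prefixes or {}
    names: list[str] = []
    for element_name, feature_names in element_feature_names:
        # Fall back to the element name when no explicit prefix is given.
        prefix = prefixes.get(element_name, element_name)
        names.extend(f"{prefix}__{name}" for name in feature_names)
    return names


# Stand-in feature names; real ones would come from the pipeline elements.
elements = [
    ("MorganFP", ["bit_0", "bit_1"]),
    ("FeatureMorganFP", ["bit_0", "bit_1"]),
]
names = prefixed_feature_names(elements, prefixes={"FeatureMorganFP": "feat"})
```

With a per-element mapping, the two fingerprints stay distinguishable as long as the element names (or their mapped prefixes) differ. Dropping the prefix argument entirely and always using the element name would be the simpler variant of the same idea.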

conc_elem = MolToConcatenatedVector(
    list(elements_subset), feature_names_prefix=feature_names_prefix
)
feature_names = conc_elem.feature_names
Collaborator

Maybe check if feature names are unique?
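
Such a check could be sketched as follows. The helper `duplicated_names` and the sample names are hypothetical; in the actual test the list would come from `conc_elem.feature_names`.

```python
from collections import Counter


def duplicated_names(feature_names: list[str]) -> list[str]:
    """Return the names occurring more than once, in first-seen order."""
    counts = Counter(feature_names)
    return [name for name, count in counts.items() if count > 1]


# Stand-in data: two elements that received the same prefix.
colliding = ["fp_bit_0", "fp_bit_1", "fp_bit_0", "fp_bit_1"]
# Stand-in data: names that are already distinguishable.
unique = ["morgan_bit_0", "feature_morgan_bit_0"]
```

Inside a `unittest` test case, the one-line variant would be `self.assertEqual(len(feature_names), len(set(feature_names)))`; returning the offending names instead makes a failure easier to read.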

morgan_elem = (
    "MorganFP",
    MolToMorganFP(n_bits=16),
)
Collaborator

@c-w-feldmann c-w-feldmann Nov 15, 2024

also add

feature_morgan_elem = (
    "MorganFP",
    MolToMorganFP(n_bits=16, use_features=True),
)
