Skip to content

v0.7.0

Compare
Choose a tag to compare
@x-tabdeveloping x-tabdeveloping released this 23 Oct 13:16
· 78 commits to main since this release

New in version 0.7.0

Component re-estimation, refitting and topic merging

Some models can now easily be modified after being trained in an efficient manner,
without having to recompute all attributes from scratch.
This is especially significant for clustering models and $S^3$.

from turftopic import SemanticSignalSeparation, ClusteringTopicModel

s3_model = SemanticSignalSeparation(5, feature_importance="combined").fit(corpus)
# Re-estimating term importances
s3_model.estimate_components(feature_importance="angular")
# Refitting S^3 with a different number of topics (very fast)
s3_model.refit(n_components=10, random_seed=42)

clustering_model = ClusteringTopicModel().fit(corpus)
# Reduces number of topics automatically with a given method
clustering_model.reduce_topics(n_reduce_to=20, reduction_method="smallest")
# Merge topics manually
clustering_model.join_topics([0,3,4,5])
# Resets original topics
clustering_model.reset_topics()
# Re-estimates term importances based on a different method
clustering_model.estimate_components(feature_importance="centroid")

Manual topic naming

You can now manually label topics in all models in Turftopic.

# you can specify a dict mapping IDs to names
model.rename_topics({0: "New name for topic 0", 5: "New name for topic 5"})
# or a list of topic names
model.rename_topics([f"Topic {i}" for i in range(10)])

Saving, loading and publishing to HF Hub

You can now load, save and publish models with dedicated functionality.

from turftopic import load_model

model.to_disk("out_folder/")
model = load_model("out_folder/")

model.push_to_hub("your_user/model_name")
model = load_model("your_user/model_name")