Improve ludwig feature dict #3904

dennisrall · 2024-01-21T13:56:25Z

Some minor improvements to the LudwigFeatureDict:

- remove the strange __next__ method
- remove the internal_key_to_original_name_map dict, as its values were never read
- use MutableMapping from collections.abc → get useful and more performant methods
- add tests for the new functionality and split up the old test (I kept the old test to show it is still running, but I think it can be removed?)
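For illustration, here is a minimal sketch of what subclassing MutableMapping provides. It is not the LudwigFeatureDict implementation: the class name `SuffixedDict` and the `"__suffix"` marker are hypothetical stand-ins for the key renaming discussed in this PR. Only the five abstract methods are written by hand; `keys()`, `items()`, `update()`, `pop()`, and `in` come from the mixin.

```python
from collections.abc import MutableMapping

SUFFIX = "__suffix"  # hypothetical marker, not Ludwig's actual suffix


class SuffixedDict(MutableMapping):
    """Hypothetical mapping that stores values under internally renamed keys."""

    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        return self._store[key + SUFFIX]

    def __setitem__(self, key, value):
        self._store[key + SUFFIX] = value

    def __delitem__(self, key):
        del self._store[key + SUFFIX]

    def __iter__(self):
        # Yield the original names, not the internal ones.
        return (internal[: -len(SUFFIX)] for internal in self._store)

    def __len__(self):
        return len(self._store)


d = SuffixedDict()
d.update({"type": 1, "to": 2})  # update() is provided by MutableMapping
keys = list(d.keys())           # keys() is provided by Mapping
items = list(d.items())         # items() is provided by Mapping
popped = d.pop("type")          # pop() is provided by MutableMapping
has_to = "to" in d              # __contains__ is provided by Mapping
```

The mixin methods are all defined in terms of the five hand-written ones, so the renaming logic lives in one place.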

github-actions · 2024-01-21T14:14:32Z

Unit Test Results

 4 files  -  2     4 suites -  2    2m 58s ⏱️ - 11m 42s
12 tests  ±  0     6 ✔️ -  3     5 💤 + 2    1 ❌ + 1
40 runs   - 20    16 ✔️ - 26    20 💤 + 2    4 ❌ + 4

For more details on these failures, see this check.

Results for commit acfa198. ± Comparison against base commit 0a24d0a.

This pull request skips 2 tests.

tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.gbm.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.gbm.yaml]

♻️ This comment has been updated with latest results.

alexsherstinsky · 2024-01-21T23:50:03Z

ludwig/models/llm.py

@@ -65,13 +65,13 @@ def __iter__(self) -> None:
        return iter(self.obj.keys())

    def keys(self) -> List[str]:
-        return self.obj.keys()
+        return self.obj.key_list()


@dennisrall Do you think it might be simpler to retain the coding pattern of returning list(my_object.keys()) or list(my_object.items()), etc., instead of introducing special additional methods such as item_list() and value_list()? It seems to me that doing so would be consistent with other cases in these Ludwig modules as well as with general Python collection patterns. Thank you.
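The pattern suggested in this comment can be sketched as follows. The `Wrapper` class is a hypothetical stand-in for the wrapper in ludwig/models/llm.py, and a plain dict stands in for the wrapped feature dict; the point is only that the caller converts the view to a list, so the mapping needs no extra `*_list()` methods.

```python
from typing import Dict, List


class Wrapper:
    """Hypothetical wrapper around a mapping-like object."""

    def __init__(self, obj: Dict[str, int]) -> None:
        self.obj = obj

    def keys(self) -> List[str]:
        # Convert the keys view at the call site instead of adding key_list().
        return list(self.obj.keys())


wrapper = Wrapper({"a": 1, "b": 2})
key_snapshot = wrapper.keys()
wrapper.obj["c"] = 3  # later changes do not affect the earlier snapshot
```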

alexsherstinsky · 2024-01-22T16:27:07Z

ludwig/features/feature_utils.py

@@ -157,7 +158,7 @@ def get_name_from_module_dict_key(key: str, feature_name_suffix_length: int = FE
    return name[:-feature_name_suffix_length]


-class LudwigFeatureDict(torch.nn.Module):
+class LudwigFeatureDict(torch.nn.Module, MutableMapping):


@dennisrall Thank you for incorporating my previous suggestion -- I think that part looks clean now. Thank you for the idea and the implementation!

For this one, I am not sure whether the benefits of adding the MutableMapping subclassing justify the risk brought about by the multiple inheritance. Do the tests cover all the eventualities that might arise from this change?

Thank you. /cc @justinxzhao @arnavgarg1 @Infernaught

dennisrall · 2024-01-22

No problem, I was just playing around a bit.

I can't think of any problems with the multiple inheritance, but you know the code better than me 😉

It is also possible to remove the MutableMapping inheritance and implement the other methods by hand. But I think this way is a bit cleaner, as long as it doesn't cause any problems...
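One concrete effect of this kind of multiple inheritance can be sketched without torch. In the example below, `FrameworkBase` is a hypothetical stand-in for torch.nn.Module. Because `collections.abc.Mapping` defines `__eq__`, it sets `__hash__` to `None`, so the combined class becomes unhashable unless it defines `__hash__` itself. The constant hash shown is an assumption chosen to match the commit title "use static hash value for feature dict"; the PR's actual implementation is not shown in this conversation.

```python
from collections.abc import MutableMapping


class FrameworkBase:
    """Hypothetical stand-in for a base class such as torch.nn.Module."""


class FeatureDictSketch(FrameworkBase, MutableMapping):
    """Implements only the five abstract methods MutableMapping requires."""

    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


class HashableFeatureDictSketch(FeatureDictSketch):
    def __hash__(self):
        # Assumption: a constant hash. It stays consistent with Mapping.__eq__,
        # which compares contents, at the cost of hash collisions.
        return 0


# The framework base comes first in the method resolution order.
mro_names = [cls.__name__ for cls in FeatureDictSketch.__mro__[:4]]

try:
    hash(FeatureDictSketch())
    unhashable = False
except TypeError:
    unhashable = True

# With an explicit __hash__, instances can be placed in sets again.
usable_in_set = len({HashableFeatureDictSketch()}) == 1
```

Whether this matters in practice depends on whether the surrounding code ever hashes the object, for example by placing it in a set or using it as a dict key.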

alexsherstinsky · 2024-01-22T16:28:11Z

tests/ludwig/features/test_feature_utils.py

+
+
+@pytest.fixture
+def type_module() -> torch.nn.Module:


@dennisrall This fixture and the to_module() one appear to be the same: both just instantiate the PyTorch Module() class. Without a docstring, it is a bit difficult to justify having both. Thank you.

dennisrall · 2024-01-22

Thanks for your feedback. I added a docstring for both fixtures and also for the hash method of the LudwigFeatureDict.


justinxzhao

Overall PR LGTM, thanks for the change @dennisrall.

I pushed up a couple of commits, largely around fixing the CI.

It appears we'll need to bump the integration tests CI up to torch==2.1.0, which we're absolutely OK with, since that's what our Docker images are using now.

mhabedank · 2024-10-21T20:32:13Z

LGTM, can be merged as soon as possible.

dennisrall added 4 commits January 21, 2024 12:59

tests: split test into multiple tests

c27d50d

feat: improvate implementation of feature dict

f8d7982

feat: use MutableMapping of collections.abc for feature dict

9dc7ef8

tests: add tests for the key-name mapping methods

4ca184a

dennisrall requested review from w4nderlust, tgaddair, justinxzhao, arnavgarg1, geoffreyangus, jeffkinnison, Infernaught and alexsherstinsky as code owners January 21, 2024 13:56

dennisrall changed the title ~~Improve ludwig feature dict~~ [WIP] Improve ludwig feature dict Jan 21, 2024

alexsherstinsky reviewed Jan 21, 2024


dennisrall added 2 commits January 22, 2024 14:16

feat: use static hash value for feature dict

c5dcda9

refactor: remove list methods

3edaa08

alexsherstinsky reviewed Jan 22, 2024


docs: add docstrings to tests and to hash func

cd36fed

dennisrall changed the title ~~[WIP] Improve ludwig feature dict~~ Improve ludwig feature dict Jan 22, 2024

justinxzhao added 2 commits January 23, 2024 16:30

Merge branch 'master' of github.com:ludwig-ai/ludwig into improve-lud…

4a9aa22

…wig-feature-dict

Upgrade integration tests to use torch 2.1.0.

326c40d

justinxzhao approved these changes Jan 24, 2024


dennisrall and others added 3 commits January 28, 2024 15:28

Merge branch 'ludwig-ai:master' into improve-ludwig-feature-dict

1c685cf

Try adding sox to requirements.

23a2da7

Merge branch 'ludwig-ai:master' into improve-ludwig-feature-dict

acfa198
