Improve performance when loading a given module with numerous alternative versions or non-modulefile #561

Open
xdelaruelle opened this issue Jan 14, 2025 · 2 comments


@xdelaruelle
Collaborator

As a follow-up to a discussion on the modules-interest mailing list, poor performance is observed on module load when targeting a fully qualified name/version module if the modpath/name directory contains:

  • numerous version files (400+)
  • a large number of non-modulefiles

The ancient 3.2 version of Modules was not affected by this performance issue, as only the specified modulefile was analyzed. Newer versions of Modules evaluate all files within the modpath/name directory to correctly fetch all symbols applying to the specified modulefile.

Introducing an option to skip the other files located next to the specified modulefile would restore performance similar to version 3.2 on setups where modulefiles are mixed with many versions of the same module and with non-modulefiles.

@xdelaruelle
Collaborator Author

Additional ideas came up on the mailing list regarding this issue: new modulefile commands could be introduced to define the path patterns to ignore, or the only path patterns to take into account.

A new modulepath-ignore command could take a list of file path patterns as arguments (1..N). When walking down the content of a modulepath directory, entries that match any of these patterns are ignored: files are not evaluated to check whether they are modulefiles, and directories are not searched for other modulefiles.
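
To make the intended walk concrete, here is a rough Python sketch (Modules itself is written in Tcl, so this is only an illustration, not the implementation). The helper names `walk_modulepath` and `is_ignored` are hypothetical, and the sketch assumes patterns are shell-style globs matched against the full path, which the proposal does not specify.

```python
import os
from fnmatch import fnmatchcase


def is_ignored(path, ignore_patterns):
    """Return True if path matches any ignore pattern (assumed glob syntax)."""
    return any(fnmatchcase(path, pattern) for pattern in ignore_patterns)


def walk_modulepath(modpath, ignore_patterns):
    """Yield files under modpath that would still be checked as modulefiles."""
    for dirpath, dirnames, filenames in os.walk(modpath):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if not is_ignored(os.path.join(dirpath, d), ignore_patterns)
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Ignored files are skipped before any content check would happen.
            if not is_ignored(path, ignore_patterns):
                yield path
```

The saving comes from pruning: an ignored directory full of installed software files is never listed, so none of its entries have to be opened.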

In the issue described on the mailing list, modulefiles are a few files among a large number of other kinds of files (installed software files). In this case another command, modulepath-only, may be useful to define a list of path patterns (1..N) that are the only ones considered when checking for modulefiles. When modulepath-only is used, patterns defined via modulepath-ignore are ignored, as the logic is no longer "accept all, ignore some" but "ignore all, accept some".
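
The precedence between the two commands could look like the following Python sketch. The function name `is_candidate` and the glob matching are assumptions made for illustration only; how directories on the way to an accepted file should be treated under modulepath-only is left open here, as the proposal does not say.

```python
from fnmatch import fnmatchcase


def is_candidate(path, ignore_patterns=(), only_patterns=()):
    """Decide whether path should be checked as a possible modulefile."""
    if only_patterns:
        # "Ignore all, accept some": ignore patterns are not consulted at all.
        return any(fnmatchcase(path, p) for p in only_patterns)
    # "Accept all, ignore some": anything not explicitly ignored is kept.
    return not any(fnmatchcase(path, p) for p in ignore_patterns)
```

With an only-pattern such as `/mp/foo/[0-9]*` (a made-up example path), version files like `/mp/foo/1.0` are accepted even if an ignore pattern would otherwise exclude them, while everything else under `/mp/foo` is skipped.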

Preferably, such commands should be set in the top .modulerc of the modulepath, as it determines how to walk through its content. If used in a .modulerc file found deeper, the directory containing that file is prepended to the pattern. For instance, if /path/to/modpath/modname/.modulerc defines the foo* pattern to ignore, it is converted into /path/to/modpath/modname/foo*.
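
The conversion described above amounts to joining the pattern onto the directory of the .modulerc that declares it. A small Python sketch, where `anchor_pattern` is a hypothetical name and the pass-through of already absolute patterns is an assumption not covered by the proposal:

```python
import posixpath


def anchor_pattern(pattern, modulerc_path):
    """Anchor a pattern to the directory of the .modulerc that declares it."""
    if posixpath.isabs(pattern):
        # Assumption: an absolute pattern is left untouched.
        return pattern
    return posixpath.join(posixpath.dirname(modulerc_path), pattern)
```

Calling `anchor_pattern("foo*", "/path/to/modpath/modname/.modulerc")` gives `/path/to/modpath/modname/foo*`, matching the example in the text.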

If the user specifies a file that matches a modulepath-ignore pattern, or that does not match any modulepath-only pattern when one is set:

  • the user should get a file path not found error
  • such files should be excluded from the generated .modulecache

@zerothi

zerothi commented Jan 15, 2025

I really like where this is heading. My main hurdle with using Lmod is that it's so slow compared to env-modules. If the overhead became the same in both, it would be really annoying IMHO.

Another approach could be to build the module names based only on file names. Then, when loading, only the files that actually get loaded are sourced.
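
A rough Python sketch of that idea, purely illustrative: `index_module_names` and `load_module` are hypothetical names, and skipping dot files such as .modulerc during indexing is an assumption. The only file content ever read is the header of the file being loaded, checked against the `#%Module` magic cookie.

```python
import os


def index_module_names(modpath):
    """Map module names to paths using file names only; no file is opened."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(modpath):
        for fname in filenames:
            if fname.startswith("."):
                continue  # assumption: rc files like .modulerc are not modules
            full = os.path.join(dirpath, fname)
            name = os.path.relpath(full, modpath).replace(os.sep, "/")
            index[name] = full
    return index


def load_module(name, index):
    """Validate only the requested file, at load time."""
    path = index[name]
    with open(path) as fh:
        header = fh.readline()
    if not header.startswith("#%Module"):
        raise ValueError(f"{path} is not a modulefile")
    return path
```

The trade-off is that non-modulefiles stay in the name index until someone tries to load them, since nothing is validated up front.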
