Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add custom sampling option to PySMO #1298

Merged
merged 24 commits into from
Dec 18, 2023

Conversation

OOAmusat
Copy link
Contributor

@OOAmusat OOAmusat commented Dec 8, 2023

Summary/Motivation:

This PR adds a sampling method that allows users to specify the sampling distributions to be used for each input variable/column. Current interface provides random, uniform and normal distribution sampling options.

Changes proposed in this PR:

  • Add CustomSampling class
  • Add tests

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

Copy link

codecov bot commented Dec 8, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (021fb14) 77.36% compared to head (361077c) 77.38%.

Files Patch % Lines
idaes/core/surrogate/pysmo/sampling.py 98.05% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1298      +/-   ##
==========================================
+ Coverage   77.36%   77.38%   +0.02%     
==========================================
  Files         390      390              
  Lines       63462    63547      +85     
  Branches    11671    11699      +28     
==========================================
+ Hits        49099    49179      +80     
- Misses      11832    11836       +4     
- Partials     2531     2532       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@OOAmusat OOAmusat marked this pull request as ready for review December 8, 2023 19:15
@OOAmusat OOAmusat requested review from andrewlee94, bpaul4, dangunter and bknueven and removed request for andrewlee94, bpaul4, carldlaird and ksbeattie December 8, 2023 19:15
Copy link
Contributor

@bpaul4 bpaul4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks very good, and the coverage is very high as well. Nicely done!

Copy link
Member

@andrewlee94 andrewlee94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few requests, mostly for more specific exception types and descriptive test names.

idaes/core/surrogate/pysmo/sampling.py Show resolved Hide resolved
idaes/core/surrogate/pysmo/sampling.py Outdated Show resolved Hide resolved
idaes/core/surrogate/pysmo/sampling.py Outdated Show resolved Hide resolved
idaes/core/surrogate/pysmo/sampling.py Outdated Show resolved Hide resolved
idaes/core/surrogate/pysmo/sampling.py Outdated Show resolved Hide resolved
idaes/core/surrogate/pysmo/sampling.py Show resolved Hide resolved
idaes/core/surrogate/pysmo/sampling.py Outdated Show resolved Hide resolved
idaes/core/surrogate/pysmo/tests/test_sampling.py Outdated Show resolved Hide resolved
idaes/core/surrogate/pysmo/tests/test_sampling.py Outdated Show resolved Hide resolved

@pytest.mark.unit
@pytest.mark.parametrize("array_type", [np.array])
def test_sample_points_01(self, array_type):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look like tests that could (should) be done using fixed inputs. I.e. do not use the sample generation and mock up a set of known distributions that we can then use to ensure we get the right answer out.

As far as I can tell, sample_points is a case where we can test the method in isolation used manufactured data - i.e. provide the input specification, a mock up of a random dataset and a concrete answer we expect to get back out.

@OOAmusat OOAmusat requested a review from andrewlee94 December 11, 2023 22:34
@ksbeattie ksbeattie added the Priority:Normal Normal Priority Issue or PR label Dec 14, 2023
@dangunter dangunter merged commit 6352629 into IDAES:main Dec 18, 2023
37 checks passed
@ksbeattie
Copy link
Member

@OOAmusat looks like this did not make the Dec 2.3.0 release...

@OOAmusat
Copy link
Contributor Author

@OOAmusat looks like this did not make the Dec 2.3.0 release...

Yes, I don't know why as it had the required approvals and tests passing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority:Normal Normal Priority Issue or PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants