
Design test procedures for examples/ #245

Open · FilipBolt opened this issue Dec 18, 2020 · 7 comments

Comments

FilipBolt (Collaborator) commented Dec 18, 2020

We need a way to automatically test the examples to ensure they keep working as the framework core changes. One solution would be to mark these tests as slow so they are not run on every commit. Also, dataset loading and downloading, model training, and similar slow-running operations should all be mocked (perhaps a set of mocking tools could be created).
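Something along these lines could work (a rough sketch, assuming pytest; the example module, its `load_dataset` helper, and its `main()` entry point are hypothetical placeholders):

```python
import pytest
from unittest import mock

# Hypothetical layout: examples/classification.py defines load_dataset()
# and main(), with the script body under `if __name__ == "__main__":`
# so that importing it has no side effects.
from examples import classification


@pytest.mark.slow  # deselect on regular runs with: pytest -m "not slow"
def test_classification_example():
    with mock.patch("examples.classification.load_dataset") as fake_load:
        # a tiny in-memory dataset instead of a real download
        fake_load.return_value = [("some text", "positive")] * 8
        classification.main()  # should run end to end on the fake data
```

The `slow` marker would also need to be registered under `markers` in `pytest.ini` so pytest doesn't warn about an unknown mark.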

mariosasko (Collaborator) commented

Currently, the examples are not part of the docs, and some of them are outdated.

Once we add them, it shouldn't be too hard to implement automatic execution of these examples as part of the existing test suite. I'm not sure about the mocking tools; I think that's a long-term goal (probably because I'm not a big fan of mocking). Ideally, we would have an external runner (e.g. a TakeLab server) connected to CI that runs the slow examples whenever file contents change or new files are added. This means we would need the server only from time to time. IMO, these examples (only BERT comes to mind) are "slow", but not so slow that we need to mock parts of them.
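For the "run the examples as part of the existing suite" part, a minimal sketch (assuming pytest and a flat examples/ directory of standalone scripts; the timeout value is arbitrary):

```python
import pathlib
import subprocess
import sys

import pytest

EXAMPLE_SCRIPTS = sorted(pathlib.Path("examples").glob("*.py"))


@pytest.mark.slow
@pytest.mark.parametrize("script", EXAMPLE_SCRIPTS, ids=lambda p: p.name)
def test_example_runs(script):
    # Smoke test only: the example must exit cleanly; its output and any
    # metrics it prints are not checked here.
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        timeout=600,
    )
    assert result.returncode == 0, result.stderr
```

A CI job could then run `pytest -m slow` only when files under examples/ change.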

FilipBolt (Collaborator, Author) commented

This sounds fantastic. If we could set up a CI job that detects changes and runs the examples only in those cases, we could probably get away without any mocking whatsoever, which sounds great.

I'd say the docs are a separate issue (though a valid point), so I wouldn't want to expand this issue too much.

ivansmokovic (Collaborator) commented Dec 23, 2020

How would we go about checking the correctness of the non-deterministic parts of the examples, e.g. training a model, which should be a fairly common case? I see breaking examples into subfunctions and testing those subfunctions separately as a good first step, but would that hurt the "aesthetics" of the examples?
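For illustration, a toy sketch of what such a decomposition could look like (all names and the "model" here are placeholders, not real framework code); the example still reads top to bottom through `main()`:

```python
def load_data():
    # stands in for a slow dataset download; easy to mock in tests
    return [("good movie", 1), ("bad movie", 0)]


def build_model():
    return {"weight": 0.0}  # placeholder for a real model


def train(model, data, epochs=1):
    # a deterministic toy update, so the function is unit-testable
    for _ in range(epochs):
        for _, label in data:
            model["weight"] += 0.1 * (label - 0.5)
    return model


def main():
    model = train(build_model(), load_data())
    print(model)


if __name__ == "__main__":
    main()
```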

mttk (Member) commented Dec 23, 2020

We can either fix training of the model (you can make it deterministic at the cost of speed) or simply not care about performance metrics (unless they are relevant) as long as the training completes.
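A minimal sketch of the "make it deterministic" route, assuming the examples use a recent version of PyTorch (on GPU, some ops may additionally require the CUBLAS_WORKSPACE_CONFIG environment variable to be set):

```python
import random

import numpy as np
import torch


def make_deterministic(seed: int = 42):
    # pin every source of randomness the example touches
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # raise an error if an op has no deterministic implementation
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False
```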

ivansmokovic (Collaborator) commented

> We can either fix training of the model (you can make it deterministic at the cost of speed) or simply not care about performance metrics (unless they are relevant) as long as the training completes.

But can we guarantee the correctness of training examples?

@mttk
Copy link
Member

mttk commented Jan 8, 2021

> But can we guarantee the correctness of training examples?

Not sure what you mean by this.

I'd postpone this until after 1.1.0.

mttk mentioned this issue Mar 31, 2021
mttk (Member) commented Apr 2, 2021

Will be closed via #318
