23 Aug 16:18

c663a73

0.11.0 Latest


What's Changed

This significant update refactors much of the PyTerrier source code and renames many classes as we progress towards a PyTerrier 1.0 release.

The most significant changes are:

pt.init() is no longer required 😃. If necessary, pt.java methods can be used to change Java initialisation.
pt.BatchRetrieve is now pt.terrier.Retriever, and similar changes for other Terrier indexers and retrievers
pt.AnseriniBatchRetrieve is now in its own separate project, PyTerrier-Anserini, with various improvements

All changes in this release are backwards compatible: deprecation warnings will guide you on how to update your code.
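
The backwards-compatible renames can be pictured with a small, self-contained sketch. This is not PyTerrier's actual shim code: `Retriever` and `BatchRetrieve` below are hypothetical stand-in classes, and the warning text is invented for illustration. It only shows the general pattern of keeping an old name alive while emitting a `DeprecationWarning` that points to the new one.

```python
import warnings


class Retriever:
    """Hypothetical stand-in for a renamed class (not the real pt.terrier.Retriever)."""

    def __init__(self, index_location, wmodel="BM25"):
        self.index_location = index_location
        self.wmodel = wmodel


class BatchRetrieve(Retriever):
    """Hypothetical old name, kept as a thin subclass that warns on construction."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "BatchRetrieve is deprecated; use Retriever instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        super().__init__(*args, **kwargs)


# Old code keeps working, but the caller is told what to change.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = BatchRetrieve("./index", wmodel="BM25")

print(caught[0].category.__name__, "-", caught[0].message)
```

In a setup like this, existing scripts that use the old name run unchanged, and the warning names the replacement to migrate to.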

More details below:

Improvements

Move all Java/JNIUS code into pt.java, move all Terrier code into pt.terrier; remove pt.init() by @seanmacavaney in #447
dynamic module loading by @seanmacavaney in #461
Incorporate Retrieval Scores into RM3 by @mam10eks in #453
pt.apply for making an indexer by @cmacdonald in #467
query_toks support for terrier.Retriever by @cmacdonald in #466
add save_mode='warn' and save_mode='error' to pt.Experiment (warn as default) by @cmacdonald in #408

Refactoring

Deprecate DFIndexer by @cmacdonald in #457
pt.terrier.rewrite revisions - remove Axiomatic, remove terrier-prf by @seanmacavaney in #472
shims for deprecated modules by @seanmacavaney in #476
text_loader abstraction for pt.text.get_text by @seanmacavaney in #469
move Anserini to a separate project by @seanmacavaney in #473

Documentation

Add RankVicuna and RankZephyr Plugins by @kaustubhdhole in #441
Update tuning.rst by @albertoueda in #446
Add PyTerrier_ChatNoir to the plugin section by @mam10eks in #452
Remove nptyping dependency to assure numpy 2 compatibility by @cmacdonald in #445

Minor

change all tests to use new terrier retriever names, but check old names too by @cmacdonald in #458
Parallel fixes by @seanmacavaney in #462
fix logger error by @seanmacavaney in #464
Add comments to requirements.txt by @cmacdonald in #465
failing anserini tests due to version 0.36.0, disabling for now by @seanmacavaney in #468
remove the writing of a default terrier.properties file by @cmacdonald in #470
fix test_maven by @seanmacavaney in #471
Python 3.12 in GHA by @cmacdonald in #459
Bump most JDK version tested in GHA to 21 by @cmacdonald in #475
Update pt.terrier.Retriever str and repr #474

New Contributors

@kaustubhdhole made their first contribution in #441
@mam10eks made their first contribution in #452

Full Changelog: 0.10.1...0.11.0

Contributors

cmacdonald, albertoueda, and 3 other contributors


02 May 18:14

cmacdonald

0.10.1

7274586

0.10.1

Minor release with minor improvements and bug fixes.

What's Changed

Bugfix: Delete baseline pvalue from correction method input by @JorgeGabin in #440
Fix: fix msmarco location by @cmacdonald in #435
Feature: added corpus_iter for Terrier index by @cmacdonald in #426
remove sklearn as required dependency by @cmacdonald in #410
Add troubleshoot for installation and certification error by @Krissy510 in #411
fix parsing of trecxml topics by @lukaszett in #414
paired t-tost by @seanmacavaney in #420
read_results optimization by @seanmacavaney in #421
pickling QE pipelines to parallelised QE gridsearch by @cmacdonald in #430
Require Python 3.8 minimum by @cmacdonald in #431
Bump logback from 1.2.0 to 1.2.13 in /terrier-python-helper by @dependabot
improved error message pt.apply.query - from #433 by @cmacdonald in #434
Improved testing of FeaturesBatchRetrieve by @cmacdonald in #437

New Contributors

@Krissy510 made their first contribution in #411
@JorgeGabin made their first contribution in #440

Full Changelog: 0.10.0...0.10.1

Contributors

cmacdonald, seanmacavaney, and 4 other contributors


02 Nov 13:32

cmacdonald

0.10.0

14bb260

0.10.0

What's Changed

New Features

Transformer.__call__ now supports both dataframe and iterdicts by @cmacdonald in #381
Terrier: Custom stopwords by @cmacdonald in #372
Terrier: Access the stemmer of Terrier from PyTerrier by @cmacdonald in #382
Terrier: Improved API for loading Terrier indices into memory by @cmacdonald in #386

Improvements

added tokenizer as arg for pt.text.sliding by @mihirs16 in #387
addresses #367 - include qid in pt.apply Exception by @cmacdonald in #370
addresses #377: pt.apply.query() raises exception if the query column does not exist by @cmacdonald in #380
let pt.tqdm exist without pt.init() by @cmacdonald in #399
deprecate pt.Utils by @cmacdonald in #384
removes two warnings by @cmacdonald in #385
work on test failure by @cmacdonald in #401
Test pyterrier with newer Python versions by @cmacdonald in #400
bump supported Anserini version by @cmacdonald in #406, addresses #404
Terrier: allow to put term and LexiconEntry into a tuple by @cmacdonald in #369

Bugs:

stringify properties and controls, addresses #357 by @cmacdonald in #358
fix bug in metadata size warning by @seanmacavaney in #362

Documentation

Update pipeline_examples.md by @gurcankavakci in #359
Fixed typo by @hermlon in #364
Update ltr.rst by @Hermi-Mire in #371
Update transformer.rst by @albertoueda in #383
clarify docstring for indexing with regards to metadata by @lukaszett in #394
Query Rewriting & Expansion by @cakiki in #402, #403

New Contributors

@gurcankavakci made their first contribution in #359
@hermlon made their first contribution in #364
@Hermi-Mire made their first contribution in #371
@lukaszett made their first contribution in #394
@cakiki made their first contribution in #402
@mihirs16 made their first contribution in #387

Full Changelog: 0.9.2...0.10.0

Contributors

cmacdonald, gurcankavakci, and 7 other contributors


19 Dec 12:31

cmacdonald

0.9.2

b0827bf

0.9.2

Minor release with minor improvements and bug fixes.

What's Changed

add sbert example notebook by @cmacdonald in #344
Update the requirement to scikit-learn from the deprecated sklearn package, which was intermittently causing build errors.
adding batching operations to apply.generic() and apply.by_query() by @cmacdonald in #351 - thanks to Xun Zhou, University of Michigan via #350
improve error messages for invalid indexing configurations by @cmacdonald in #349 -- thanks to @maxhenze in #348
Various empty dataframe fixes by @cmacdonald in #353 -- thanks to report by Prithvijit Dasgupta, University of Michigan in #352
improved error message for add_ranks by @cmacdonald in #354
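
The batching added to apply.generic() and apply.by_query() can be illustrated with a minimal sketch. It does not use PyTerrier: `apply_in_batches` and its `batch_size` parameter are hypothetical names, and plain lists of dicts stand in for result dataframes. The idea is simply that the function is called on fixed-size slices of the input and the outputs are concatenated.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def apply_in_batches(
    rows: List[Row],
    fn: Callable[[List[Row]], List[Row]],
    batch_size: int,
) -> List[Row]:
    """Call fn on consecutive slices of at most batch_size rows and join the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    output: List[Row] = []
    for start in range(0, len(rows), batch_size):
        output.extend(fn(rows[start:start + batch_size]))
    return output


batch_sizes_seen: List[int] = []


def add_query_length(batch: List[Row]) -> List[Row]:
    """Example function: record the batch size and add a query_len field to each row."""
    batch_sizes_seen.append(len(batch))
    return [{**row, "query_len": len(str(row["query"]).split())} for row in batch]


rows = [{"qid": str(i), "query": "chemical reactions"} for i in range(5)]
result = apply_in_batches(rows, add_query_length, batch_size=2)
print(batch_sizes_seen, len(result))
```

Batching of this kind bounds how much data a single call to the applied function sees, which matters when that function is memory-hungry.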

Full Changelog: 0.9.1...0.9.2

Contributors

cmacdonald and maxhenze


11 Nov 15:20

cmacdonald

0.9.1

72a5c55

0.9.1

Bugfix release addressing a problem with pretokenised indices on Windows

What's Changed

Nofifo pretok indexing fixes by @cmacdonald in #343

Full Changelog: 0.9.0...0.9.1

Contributors

cmacdonald


10 Nov 19:17

cmacdonald

0.9.0

7e97dde

0.9.0

Significant update - refactoring of public API (e.g. pt.transformer.TransformerBase -> pt.Transformer) and support in the Terrier backend for making indices from pre-tokenised documents. Python 3.10 is now supported.

What's Changed

fix error in IRDSDataset when a query field is named "query" by @seanmacavaney in #303
Fix type annotation by @heinrichreimer in #313
addresses #315 IRDS corpus_iter are not subscriptable by @cmacdonald in #316
Missing comma in bm25_qe example by @JohnGiorgi in #319
Argument meta should be supplied as dictionary by @JohnGiorgi in #320
use Jnius 1.4 by @cmacdonald in #249
Python 3.10 support by @cmacdonald in #322
Lz4 support for pt.io.autoopen() by @cmacdonald in #323
addresses #326 faster version of add_ranks for single queries by @cmacdonald in #327
addresses #321 pt.apply.doc_score batching by @cmacdonald in #325
IterDictIndexer can index pre-tokenised documents by @cmacdonald in #328
Bump logback-core from 1.2.0 to 1.2.9 in /terrier-python-helper by @dependabot in #336
documenting BM25F controls and tuning by @cmacdonald in #296, addresses #294
0.9refactor by @cmacdonald in #314, #339, addresses #271
pt.Experiment() alters the input measures list to drop "mrt" #301
Expose Termpipelines in Terrier index backend by @cmacdonald in #338
pt.rewrite.tokenise() impl by @cmacdonald in #340 addresses #252 #253
upgraded GitHub actions by @cmacdonald in #341, #342
fix LTR groupby for xgboost & lightgbm by @cmacdonald in #284

New Contributors

@heinrichreimer made their first contribution in #313
@JohnGiorgi made their first contribution in #319

Full Changelog: 0.8.1...0.9.0

Contributors

cmacdonald, seanmacavaney, and 3 other contributors


10 Apr 08:56

cmacdonald

0.8.1

586b983

0.8.1

Minor release with minor improvements and bug fixes.

What's Changed

fixed bug with is_transformer by @seanmacavaney in #274
addresses #275 issue k in kmaxavg, improved testing by @cmacdonald in #276
defer loading ir_datasets by @seanmacavaney in #280
Set meta and meta_lengths in constructor by @MWschutte in #282
Anserini fixes by @cmacdonald in #279, reported by @Azouu
prevent use of nptyping v2 by @cmacdonald in #291, reported by @tabonnet
SourceTransformer pass through extra columns, addresses #287 by @cmacdonald in #288, reported by @Xiao0728
more transformers with repr by @cmacdonald in #289

New Contributors

@MWschutte made their first contribution in #282

Full Changelog: 0.8.0...0.8.1

Contributors

cmacdonald, seanmacavaney, and 4 other contributors


18 Jan 13:24

cmacdonald

0.8.0

4014063

0.8.0

PyTerrier 0.8.0 Release Notes

Released on 18/01/2022

What's Changed - Major

Require Python 3.7 by @cmacdonald in #255
Deprecate automatic coercion of transformers by @cmacdonald in #258
introduce pt.Transformer as public API; pt.transformer.TransformerBase will be deprecated in 0.9, by @cmacdonald in #258
introduce query biased summarisation - addresses #205 by @cmacdonald in #223, suggested by @adambaker
provide re-ranking runs from datasets by @seanmacavaney in #262

What's Changed - Minor

faster testing in Github Actions: focus on requested jnius, rather than changing jnius version 3 times by @cmacdonald in #256
Faster tests by @cmacdonald in #257
Use Flake to identify bugs, reduce imports etc by @cmacdonald in #259
pyterrier loaded message to stderr by @seanmacavaney in #260
Fix code block in ltr.rst in section Working with Features by @bart-kosmala in #261
get_dataset() for non-existent irds dataset by @seanmacavaney in #263
Filter out non-indexed/metadata fields when indexing by @seanmacavaney in #267, reported by @bjoernengelmann in #266
mirroring of Vaswani dataset files by @seanmacavaney in #268
pt.io.read_results() can merge topics by @seanmacavaney in #265
addresses #264, text.scorer() will default to takes='docs' by @cmacdonald in #269
change paths and exercise names by @cmacdonald in #270

New Contributors

@bart-kosmala made their first contribution in #261

Full Changelog: 0.7.2...0.8.0

Contributors

adambaker, cmacdonald, and 3 other contributors


20 Dec 13:31

cmacdonald

0.7.2

0414f71

0.7.2

Minor release addressing some useful bug fixes and small features. This is the last release that will support Python 3.6.

What's Changed

using chunked instead of ichunked - this fixes indexing speed/memory-consumption/crashes with indexing pipelines, by @seanmacavaney in #238
remove deprecated code by @cmacdonald in #239
combsum dropping documents not appearing on both sides of + by @cmacdonald in #240
addresses #203, verbose in pt.Experiment by @cmacdonald in #245
Py37 minimum warning, addresses #241 by @cmacdonald in #246
use caching in GitHub Actions by @cmacdonald in #248
save run files automatically in pt.Experiment #163 by @cmacdonald in #247
Set dtype for qrels columns at read time in io method by @jjdelvalle in #254
support meta config in IterDictIndexer constructor, addresses #250, by @cmacdonald in #251
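
The CombSUM fix in #240 concerns documents retrieved by only one side of the `+` operator. A minimal sketch of the intended behaviour follows, assuming each ranking is a plain dict from docno to score. The `combsum` function here is hypothetical and is not PyTerrier's implementation: it shows a document that is missing from one side contributing a score of 0 from that side rather than being dropped.

```python
from typing import Dict


def combsum(left: Dict[str, float], right: Dict[str, float]) -> Dict[str, float]:
    """Sum scores per document over the union of both rankings."""
    combined: Dict[str, float] = {}
    for docno in left.keys() | right.keys():
        # A document absent from one ranking contributes 0 from that side.
        combined[docno] = left.get(docno, 0.0) + right.get(docno, 0.0)
    return combined


bm25 = {"d1": 2.0, "d2": 1.5}
pl2 = {"d2": 0.5, "d3": 1.0}

fused = combsum(bm25, pl2)
# Sort by descending score, breaking ties by docno for a stable order.
ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
```

Combining over the intersection of the two rankings instead would silently discard `d1` and `d3`, which is the kind of behaviour the fix title describes.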

New Contributors

@jjdelvalle made their first contribution in #254

Full Changelog: 0.7.1...0.7.2

Contributors

cmacdonald, jjdelvalle, and seanmacavaney


02 Nov 23:01

cmacdonald

0.7.1

e6ba40c

0.7.1

PyTerrier 0.7.1

Minor update to support activities for the CIKM 2021 tutorial. In particular:

pt.debug.print_num_rows() added
Terrier Data Repository support for TREC Covid test collection.

Releases: terrier-org/pyterrier