Commit

Merge branch 'master' into benchmark_support
flying-sheep authored Feb 22, 2024
2 parents d69a90c + 48b495d commit f8368c6
Showing 11 changed files with 60 additions and 187 deletions.
9 changes: 4 additions & 5 deletions .azure-pipelines.yml
@@ -52,21 +52,21 @@ jobs:
- script: |
python -m pip install --upgrade pip
- pip install wheel coverage
+ pip install wheel
pip install .[dev,$(TEST_EXTRA)]
displayName: 'Install dependencies'
condition: eq(variables['DEPENDENCIES_VERSION'], 'latest')
- script: |
python -m pip install --pre --upgrade pip
- pip install --pre wheel coverage
+ pip install --pre wheel
pip install --pre .[dev,$(TEST_EXTRA)]
pip install -v "anndata[dev,test] @ git+https://github.com/scverse/anndata"
displayName: 'Install dependencies release candidates'
condition: eq(variables['DEPENDENCIES_VERSION'], 'pre-release')
- script: |
- python -m pip install pip wheel tomli packaging pytest-cov
+ python -m pip install pip wheel tomli packaging
pip install `python3 ci/scripts/min-deps.py pyproject.toml --extra dev test`
pip install --no-deps .
displayName: 'Install dependencies minimum version'
@@ -81,8 +81,7 @@ jobs:
condition: eq(variables['TEST_TYPE'], 'standard')

- script: |
- coverage run -m pytest
- coverage xml
+ pytest --cov --cov-report=xml --cov-context=test
displayName: 'PyTest (coverage)'
condition: eq(variables['TEST_TYPE'], 'coverage')
36 changes: 16 additions & 20 deletions docs/release-notes/1.10.0.md
@@ -1,40 +1,37 @@
### 1.10.0 {small}`the future`
### 1.10.0rc1 {small}`2024-02-22`

```{rubric} Features
```

* {func}`~scanpy.pp.scrublet` and {func}`~scanpy.pp.scrublet_simulate_doublets` were moved from {mod}`scanpy.external.pp` to {mod}`scanpy.pp`. The `scrublet` implementation is now maintained as part of scanpy {pr}`2703` {smaller}`P Angerer`
* {func}`scanpy.pp.pca`, {func}`scanpy.pp.scale`, {func}`scanpy.pl.embedding`, and {func}`scanpy.experimental.pp.normalize_pearson_residuals_pca` now support a `mask` parameter {pr}`2272` {smaller}`C Bright, T Marcella, & P Angerer`
* Enhanced dask support for some internal utilities, paving the way for more extensive dask support {pr}`2696` {smaller}`P Angerer`
* {func}`scanpy.pp.highly_variable_genes` supports dask for the default `seurat` and `cell_ranger` flavors {pr}`2809` {smaller}`P Angerer`
* New function {func}`scanpy.get.aggregate` which allows grouped aggregations over your data. Useful for pseudobulking! {pr}`2590` {smaller}`Isaac Virshup` {smaller}`Ilan Gold` {smaller}`Jon Bloom`
* {func}`scanpy.pp.neighbors` now has a `transformer` argument allowing the use of different ANN/ KNN libraries {pr}`2536` {smaller}`P Angerer`
* {func}`scanpy.experimental.pp.highly_variable_genes` using `flavor='pearson_residuals'` now uses numba for variance computation and is faster {pr}`2612` {smaller}`S Dicks & P Angerer`
* {func}`scanpy.tl.leiden` now offers `igraph`'s implementation of the leiden algorithm via `flavor` when set to `igraph`. `leidenalg`'s implementation is still default, but discouraged. {pr}`2815` {smaller}`I Gold`
* {func}`scanpy.pp.highly_variable_genes` has new flavor `seurat_v3_paper` that is in its implementation consistent with the paper description in Stuart et al 2018. {pr}`2792` {smaller}`E Roellin`
* {func}`scanpy.datasets.blobs` now accepts a `random_state` argument {pr}`2683` {smaller}`E Roellin`
* {func}`scanpy.pp.pca` and {func}`scanpy.pp.regress_out` now accept a layer argument {pr}`2588` {smaller}`S Dicks`
* {func}`scanpy.pp.subsample` with `copy=True` can now be called in backed mode {pr}`2624` {smaller}`E Roellin`
* {func}`scanpy.pp.neighbors` now has a `transformer` argument allowing for more flexibility {pr}`2536` {smaller}`P Angerer`
* {func}`scanpy.experimental.pp.highly_variable_genes` using `flavor='pearson_residuals'`
now uses numba for variance computation {pr}`2612` {smaller}`S Dicks & P Angerer`
* {func}`scanpy.external.pp.harmony_integrate` now runs with 64 bit floats improving reproducibility {pr}`2655` {smaller}`S Dicks`
* {func}`~scanpy.pp.scrublet` and {func}`~scanpy.pp.scrublet_simulate_doublets` were moved from {mod}`scanpy.external.pp` to {mod}`scanpy.pp`.
The `scrublet` implementation is now maintained as part of scanpy {pr}`2703` {smaller}`P Angerer`
* Enhanced dask support for some internal utilities, paving the way for more extensive dask support {pr}`2696` {smaller}`P Angerer`
* {func}`scanpy.pp.pca`, {func}`scanpy.pp.scale`, {func}`scanpy.pl.embedding`, and {func}`scanpy.experimental.pp.normalize_pearson_residuals_pca`
now support a `mask` parameter {pr}`2272` {smaller}`C Bright, T Marcella, & P Angerer`
* New function {func}`scanpy.get.aggregate` which allows grouped aggregations over your data. Useful for pseudobulking! {pr}`2590` {smaller}`Isaac Virshup` {smaller}`Ilan Gold` {smaller}`Jon Bloom`
* {func}`scanpy.tl.rank_genes_groups` no longer warns that its default was changed from t-test_overestim_var to t-test {pr}`2798` {smaller}`L Heumos`
* {func}`scanpy.tl.leiden` now offers `igraph`'s implementation of the leiden algorithm via `flavor` when set to `igraph`. `leidenalg`'s implementation is still default, but discouraged. {pr}`2815` {smaller}`I Gold`
* {func}`scanpy.pp.highly_variable_genes` has new flavor `seurat_v3_paper` that is in its implementation consistent with the paper description in Stuart et al 2018. {pr}`2792` {smaller}`E Roellin`
* {func}`scanpy.pp.highly_variable_genes` supports dask for the default `seurat` and `cell_ranger` flavors {pr}`2809` {smaller}`P Angerer`
* Auto conversion of strings to collections in `scanpy.pp.calculate_qc_metrics` {pr}`2859` {smaller}`N Teyssier`
* `scanpy.pp.calculate_qc_metrics` now allows `qc_vars` to be passed as a string {pr}`2859` {smaller}`N Teyssier`

```{rubric} Docs
```

* Re-add search-as-you-type, this time via `readthedocs-sphinx-search` {pr}`2805` {smaller}`P Angerer`
* Fixed a lot of broken usage examples {pr}`2605` {smaller}`P Angerer`
* Improved harmonization of return field of `sc.pp` and `sc.tl` functions {pr}`2742` {smaller}`E Roellin`
* Re-add search-as-you-type, this time via `readthedocs-sphinx-search` {pr}`2805` {smaller}`P Angerer`
* Improved docs for `percent_top` argument of {func}`~scanpy.pp.calculate_qc_metrics` {pr}`2849` {smaller}`I Virshup`

```{rubric} Bug fixes
```

* Updated {func}`~scanpy.read_visium` such that it can read spaceranger 2.0 files {smaller}`L Lehner`
* Fix {func}`~scanpy.pp.normalize_total` {pr}`2466` {smaller}`P Angerer`
* Fix testing package build {pr}`2468` {smaller}`P Angerer`
* Fix {func}`~scanpy.pp.normalize_total` for dask {pr}`2466` {smaller}`P Angerer`
* Fix setting `sc.settings.verbosity` in some cases {pr}`2605` {smaller}`P Angerer`
* Fix all remaining pandas warnings {pr}`2789` {smaller}`P Angerer`
* Fix some annoying plotting warnings around violin plots {pr}`2844` {smaller}`P Angerer`
@@ -45,13 +42,12 @@
```

* Scanpy is now tested against python 3.12 {pr}`2863` {smaller}`ivirshup`

```{rubric} Ecosystem
```
* Fix testing package build {pr}`2468` {smaller}`P Angerer`

```{rubric} Deprecations
```

* Dropped support for Python 3.8. [More details here](https://numpy.org/neps/nep-0029-deprecation_policy.html). {pr}`2695` {smaller}`P Angerer`
* Deprecated specifying large numbers of function parameters by position as opposed to by name/keyword in all public APIs.
e.g. prefer `sc.tl.umap(adata, min_dist=0.1, spread=0.8)` over `sc.tl.umap(adata, 0.1, 0.8)` {pr}`2702` {smaller}`P Angerer`
* Dropped support for `umap<0.5` for performance reasons. {pr}`2870` {smaller}`P Angerer`
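
The positional-parameter deprecation listed in the release notes above can be illustrated with a small decorator. This is a minimal sketch and not scanpy's implementation: `deprecate_extra_positionals` and `umap_like` are hypothetical names, and using `FutureWarning` is an assumption.

```python
import functools
import warnings


def deprecate_extra_positionals(max_positional: int):
    """Hypothetical helper: warn when more than `max_positional`
    arguments are passed by position."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > max_positional:
                warnings.warn(
                    f"Pass at most {max_positional} positional argument(s) to "
                    f"{func.__name__}(); use keywords for the rest.",
                    FutureWarning,  # assumption: the warning category is illustrative
                    stacklevel=2,
                )
            # The call still goes through, so existing code keeps working.
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecate_extra_positionals(1)
def umap_like(data, min_dist=0.5, spread=1.0):
    # Stand-in for a public API function; it only echoes its settings.
    return {"min_dist": min_dist, "spread": spread}
```

Calling `umap_like(data, min_dist=0.1, spread=0.8)` stays silent, while `umap_like(data, 0.1, 0.8)` returns the same result and emits the warning.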
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -63,7 +63,7 @@ dependencies = [
"natsort",
"joblib",
"numba>=0.56",
- "umap-learn>=0.3.10",
+ "umap-learn>=0.5,!=0.5.0",
"pynndescent>=0.5",
"packaging>=21.3",
"session-info",
@@ -87,6 +87,7 @@ test-min = [
"pytest>=7.4.2",
"pytest-nunit",
"pytest-mock",
+ "pytest-cov",
"profimp",
]
test = [
@@ -159,7 +160,6 @@ addopts = [
"--import-mode=importlib",
"--strict-markers",
"--doctest-modules",
- "-pscanpy.testing._pytest",
]
testpaths = ["scanpy"]
norecursedirs = ["scanpy/tests/_images"]
13 changes: 0 additions & 13 deletions scanpy/_utils/__init__.py
@@ -84,10 +84,6 @@ def set_igraph_random_state(random_state: int):


def check_versions():
- from .._compat import pkg_version
-
- umap_version = pkg_version("umap-learn")
-
if version.parse(anndata_version) < version.parse("0.6.10"):
from .. import __version__

@@ -96,15 +92,6 @@ def check_versions():
f"not {anndata_version}.\nRun `pip install anndata -U --no-deps`."
)

- if umap_version < version.parse("0.3.0"):
- from . import __version__
-
- # make this a warning, not an error
- # it might be useful for people to still be able to run it
- logg.warning(
- f"Scanpy {__version__} needs umap " f"version >=0.3.0, not {umap_version}."
- )
-

def getdoc(c_or_f: Callable | type) -> str | None:
if getattr(c_or_f, "__doc__", None) is None:
6 changes: 1 addition & 5 deletions scanpy/neighbors/_connectivity.py
@@ -123,7 +123,7 @@ def umap(
from umap.umap_ import fuzzy_simplicial_set

X = coo_matrix(([], ([], [])), shape=(n_obs, 1))
- connectivities = fuzzy_simplicial_set(
+ connectivities, _sigmas, _rhos = fuzzy_simplicial_set(
X,
n_neighbors,
None,
@@ -134,8 +134,4 @@ def umap(
local_connectivity=local_connectivity,
)

- if isinstance(connectivities, tuple):
- # In umap-learn 0.4, this returns (result, sigmas, rhos)
- connectivities = connectivities[0]
-
return connectivities.tocsr()
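
With the `umap-learn` floor raised to 0.5 in `pyproject.toml`, the change above unpacks the three-element return value directly instead of checking for a tuple at runtime. A minimal sketch of the two styles follows. It uses a stand-in, `fuzzy_set_stub`, which is hypothetical and not the umap-learn function, and which is assumed to always return `(graph, sigmas, rhos)`.

```python
def fuzzy_set_stub(n_obs):
    """Stand-in (not the umap-learn function): always returns a 3-tuple."""
    graph = [[0.0] * n_obs for _ in range(n_obs)]
    sigmas = [1.0] * n_obs
    rhos = [0.0] * n_obs
    return graph, sigmas, rhos


def connectivities_old(n_obs):
    # Old style: tolerate both a bare result and a tuple.
    result = fuzzy_set_stub(n_obs)
    if isinstance(result, tuple):
        result = result[0]
    return result


def connectivities_new(n_obs):
    # New style: the return shape is fixed, so unpack it directly.
    graph, _sigmas, _rhos = fuzzy_set_stub(n_obs)
    return graph
```

Direct unpacking fails loudly with a `ValueError` if the return shape ever changes, whereas the `isinstance` branch would pass a differently shaped result through without complaint.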
2 changes: 1 addition & 1 deletion scanpy/preprocessing/_docs.py
@@ -26,7 +26,7 @@
By default uses them if they have been determined beforehand.
.. deprecated:: 1.10.0
- Use `mask` instead
+ Use `mask_var` instead
"""

doc_obs_qc_args = """\
2 changes: 2 additions & 0 deletions scanpy/tests/conftest.py
@@ -7,6 +7,8 @@

import pytest

+ pytest_plugins = ["scanpy.testing._pytest"]
+
# just import for the IMPORTED check
import scanpy as _sc # noqa: F401

9 changes: 8 additions & 1 deletion scanpy/tests/test_aggregated.py
@@ -4,6 +4,7 @@
import numpy as np
import pandas as pd
import pytest
+ from packaging.version import Version
from scipy.sparse import csr_matrix

import scanpy as sc
@@ -113,7 +114,6 @@ def test_aggregate_vs_pandas(metric, array_type):
.groupby(["louvain", "percent_mito_binned"], observed=True)
.agg(metric)
)
- # TODO: figure out the axis names
expected.index = expected.index.to_frame().apply(
lambda x: "_".join(map(str, x)), axis=1
)
@@ -124,6 +124,13 @@ def test_aggregate_vs_pandas(metric, array_type):
result_df.index.name = None
result_df.columns.name = None

+ if Version(pd.__version__) < Version("2"):
+ # Order of results returned by groupby changed in pandas 2
+ assert expected.shape == result_df.shape
+ assert expected.index.isin(result_df.index).all()
+
+ expected = expected.loc[result_df.index]
+
pd.testing.assert_frame_equal(result_df, expected, check_dtype=False, atol=1e-5)


2 changes: 1 addition & 1 deletion scanpy/tests/test_pca.py
@@ -382,7 +382,7 @@ def test_mask_order_warning(request):
UserWarning,
match="When using a mask parameter with anndata<0.9 on a dense array",
):
- sc.pp.pca(adata, mask=mask)
+ sc.pp.pca(adata, mask_var=mask)


def test_mask_defaults(array_type, float_dtype):