Prepare for 0.3.0 release
mthh committed Aug 23, 2022
1 parent 1efeb40 commit beb46e2
Showing 10 changed files with 20,876 additions and 1,660 deletions.
33 changes: 26 additions & 7 deletions CHANGES.rst
@@ -1,45 +1,64 @@
Changes
=======

0.3.0 (2022-08-23)
------------------

- Add NumPy as a mandatory dependency.

- Only compute matrices in C code and move sorting of the values, casting to double, and computing the actual breaks to Python/Cython code for better maintainability.

- Improve performance by using 1D arrays instead of 2D arrays in ``JenksBreakValues`` C function.

- Preserve the precision of the original list/array of values in the returned breaks.

- Fix bug when requesting a number of classes equal to the number of values.

- Raise an exception when the number of classes is greater than the number of unique values (however this might change in the future by choosing to return a list of breaks shorter than the one requested by the user).

- Rename ``nb_class`` parameter to ``n_classes`` (notably to be closer to sklearn ``n_clusters`` parameter).
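
The rule about the number of classes described above can be sketched without the library. The helper ``check_n_classes`` below is hypothetical and not part of jenkspy, and the exception type (``ValueError``) is an assumption, since the changelog only states that an exception is raised:

```python
def check_n_classes(values, n_classes):
    """Hypothetical pre-check mirroring the 0.3.0 changelog entries.

    Asking for as many classes as there are unique values is accepted;
    asking for more raises an exception (ValueError is assumed here).
    """
    n_unique = len(set(values))
    if n_classes > n_unique:
        raise ValueError(
            f"n_classes ({n_classes}) is greater than the number "
            f"of unique values ({n_unique})"
        )
    return n_classes


# Three unique values, three classes: accepted.
check_n_classes([1, 2, 3], n_classes=3)
```

Callers upgrading from 0.2.x would also need to pass ``n_classes`` where they previously passed ``nb_class``.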


0.2.4 (2022-08-18)
------------------

- Update package metadata and docstrings.


0.2.3 (2022-08-18)
------------------

- Check size of integer values given to `jenks_breaks` function to avoid Segfault when casting to C double (fixes #23).
- Check size of integer values given to ``jenks_breaks`` function to avoid Segfault when casting to C double (fixes #23).

- Raise an error (instead of printing a warning) when target array contains non-finite values (fixes #23).

- Raise an error when the target numpy.ndarray is not one-dimensional (fixes #25).

- Improve implementation of `JenksBreakValues` C function by using better variable naming and by simplifying the construction of the 'breaks' array (should partly fix #22).
- Improve implementation of ``JenksBreakValues`` C function by using better variable naming and by simplifying the construction of the 'breaks' array (should partly fix #22).

- Add docstrings to `JenksNaturalBreaks` methods.
- Add docstrings to ``JenksNaturalBreaks`` methods.
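
The input checks listed for 0.2.3 can be illustrated with a minimal standard-library sketch. The function ``validate_values`` is hypothetical and is not the code used by jenkspy; the choice of ``ValueError`` and the error messages are assumptions:

```python
import math


def validate_values(values):
    """Hypothetical validation loosely following the 0.2.3 entries."""
    checked = []
    for v in values:
        # Nested sequences would mean the input is not one-dimensional.
        if isinstance(v, (list, tuple)):
            raise ValueError("values must be one-dimensional")
        try:
            # Very large Python integers cannot be converted to a double.
            x = float(v)
        except OverflowError as exc:
            raise ValueError("integer too large for a C double") from exc
        # Reject NaN and +/- infinity instead of only warning about them.
        if not math.isfinite(x):
            raise ValueError("values must be finite")
        checked.append(x)
    return checked
```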


0.2.2 (2022-08-12)
------------------

- Update docstring to fix return type of `jenks_breaks` (fix #26).
- Update docstring to fix return type of ``jenks_breaks`` (fix #26).


0.2.1 (2022-08-12)
------------------

- Add a method to the `JenksNaturalBreaks` class that calculates the Goodness of Fit Variance thanks to Maurício Gomes / @mgomesq (#17).
- Add a method to the ``JenksNaturalBreaks`` class that calculates the Goodness of Fit Variance thanks to Maurício Gomes / @mgomesq (#17).

- Add optional download numpy using `[interface]` thanks to Muhammad Yasirroni / @yasirroni (#16).
- Add optional download numpy using ``[interface]`` thanks to Muhammad Yasirroni / @yasirroni (#16).

- Replace Travis / AppVeyor by GitHub Actions to build wheels for currently supported python versions on Windows / MacOs / Linux (according to https://devguide.python.org/versions/#supported-versions)


0.2.0 (2020-10-18)
------------------

- Add `JenksNaturalBreaks` for computing breaks in a more object-oriented manner, with an interface similar to those provided by scikit-learn *(requires Numpy to take full advantage of it)* (thanks to @yasirroni, #11)
- Add ``JenksNaturalBreaks`` for computing breaks in a more object-oriented manner, with an interface similar to those provided by scikit-learn *(requires Numpy to take full advantage of it)* (thanks to @yasirroni, #11)


0.1.6 (2020-09-02)
46 changes: 13 additions & 33 deletions README.rst
@@ -14,50 +14,40 @@ Wheels are provided via PyPI for Windows / MacOS / Linux users - Also available
Usage
-----

This package consists of a single function (named ``jenks_breaks``) which takes as input a `list <https://docs.python.org/3/library/stdtypes.html#list>`_ / `tuple <https://docs.python.org/3/library/stdtypes.html#tuple>`_ / `array.array <https://docs.python.org/3/library/array.html#array.array>`_ / `numpy.ndarray <https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html>`_ of integers or floats.
It returns a list of values that correspond to the limits of the classes (starting with the minimum value of the series - the lower bound of the first class - and ending with its maximum value - the upper bound of the last class).
Two ways of using `jenkspy` are available:

- by using the ``jenks_breaks`` function which takes as input a `list <https://docs.python.org/3/library/stdtypes.html#list>`_ / `tuple <https://docs.python.org/3/library/stdtypes.html#tuple>`_ / `array.array <https://docs.python.org/3/library/array.html#array.array>`_ / `numpy.ndarray <https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html>`_ of integers or floats and returns a list of values that correspond to the limits of the classes (starting with the minimum value of the series - the lower bound of the first class - and ending with its maximum value - the upper bound of the last class).

.. code:: python
>>> import jenkspy
>>> import random
>>> list_of_values = [random.random()*5000 for _ in range(12000)]
>>> breaks = jenkspy.jenks_breaks(list_of_values, nb_class=6)
>>> breaks
(0.1259707312994962, 1270.571003315598, 2527.460251085392, 3763.0374498649376, 4999.87456576267)
>>> import json
>>> with open('tests/test.json', 'r') as f:
...     # Read some data from a JSON file
...     data = json.loads(f.read())
...
>>> jenkspy.jenks_breaks(data, nb_class=5) # Asking for 5 classes
(0.0028109620325267315, 2.0935479691252112, 4.205495140049607, 6.178148351609707, 8.09175917180255, 9.997982932254672)
>>> jenkspy.jenks_breaks(data, n_classes=5) # Asking for 5 classes
[0.0028109620325267315, 2.0935479691252112, 4.205495140049607, 6.178148351609707, 8.09175917180255, 9.997982932254672]
# ^ ^ ^ ^ ^ ^
# Lower bound Upper bound Upper bound Upper bound Upper bound Upper bound
# 1st class 1st class 2nd class 3rd class 4th class 5th class
# (Minimum value) (Maximum value)
This package also support a ``JenksNaturalBreaks`` class as interface (it requires `NumPy` and it is inspired by ``scikit-learn`` classes).
- by using the ``JenksNaturalBreaks`` class, which is inspired by ``scikit-learn`` classes.

The ``.fit`` and ``.group`` behavior differs slightly from ``jenks_breaks``: values outside the range spanned by the minimum and maximum of ``breaks_`` are accepted, and the output keeps the size of the input. This means that fit and group use only the ``inner_breaks_``. All values below the lower bound are placed in the first group, and all values above the upper bound are placed in the last group.
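
The grouping rule described above can be sketched with the standard library. The function ``assign_groups`` and the break values below are made up for illustration and are not jenkspy code; the sketch also assumes that each break is the inclusive upper bound of its class:

```python
from bisect import bisect_left


def assign_groups(values, breaks):
    """Hypothetical grouping that uses only the inner breaks."""
    # Drop the minimum and maximum so out-of-range values still get a label.
    inner_breaks = breaks[1:-1]
    # bisect_left keeps a value equal to a break in the lower class.
    return [bisect_left(inner_breaks, v) for v in values]


breaks = [0.0, 2.0, 5.0, 8.0, 11.0]  # made-up breaks for 4 classes
labels = assign_groups([-3.0, 1.0, 2.0, 2.5, 8.0, 9.0, 42.0], breaks)
print(labels)  # [0, 0, 0, 1, 2, 3, 3]
```

Here ``-3.0`` falls below the minimum and lands in the first group, while ``42.0`` exceeds the maximum and lands in the last one, so the output has the same length as the input.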

Install using ``pip install jenkspy[interface]`` to automatically include ``NumPy``.


.. code:: python
>>> from jenkspy import JenksNaturalBreaks
>>> x = [0,1,2,3,4,5,6,7,8,9,10,11]
>>> x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> jnb = JenksNaturalBreaks(4) # Asking for 4 clusters
>>> jnb.fit(x)
>>> jnb.fit(x) # Create the clusters according to values in 'x'
>>> print(jnb.labels_) # Labels for fitted data
... print(jnb.groups_) # Content of each group
... print(jnb.breaks_) # Break values (including min and max)
@@ -87,12 +77,6 @@ Installation
pip install jenkspy
+ **To include numpy in pypi**

.. code:: shell
pip install jenkspy[interface]
+ **From source**

.. code:: shell
@@ -101,7 +85,6 @@ Installation
cd jenkspy/
python setup.py install
+ **For anaconda users**

.. code:: shell
@@ -110,15 +93,12 @@ Installation
Requirements :
----------------------------------------------
--------------

- NumPy\ :sup:`*`
- C compiler\ :sup:`+`
- Python C headers\ :sup:`+`
- `Numpy <https://numpy.org>`_

\ :sup:`*` only for using ``JenksNaturalBreaks`` interface
- Only for building from source: C compiler, Python C headers and optionally Cython.

\ :sup:`+` only for building from source

Motivation :
------------
@@ -128,7 +108,7 @@ Motivation :
using *appveyor* / *travis* at first - now it uses *GitHub Actions*).
- Getting the break values! (and fast!). No fancy functionality provided,
but contributions/forks/etc are welcome.
- Other python implementations are currently existing but not as fast nor available on PyPi.
- Other Python implementations exist, but they are either not as fast or not available on PyPI.

.. |Build status GH| image:: https://github.com/mthh/jenkspy/actions/workflows/wheel.yml/badge.svg
:target: https://github.com/mthh/jenkspy/actions/workflows/wheel.yml
5 changes: 3 additions & 2 deletions jenkspy/__init__.py
@@ -1,9 +1,10 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__version__ = "0.2.4"
__version__ = "0.3.0"

from .core import jenks_breaks
from .core import _jenks_matrices
from .core import JenksNaturalBreaks


__all__ = ['jenks_breaks', 'JenksNaturalBreaks']
__all__ = ['jenks_breaks', '_jenks_matrices', 'JenksNaturalBreaks']