-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Square root filtration values for alpha and delaunay-cech complex #1138
base: master
Are you sure you want to change the base?
Square root filtration values for alpha and delaunay-cech complex #1138
Conversation
…nay cech. Python hints for delaunay complex family
// For users to be able to define their own sqrt function on their desired Filtration_value type | ||
using std::sqrt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also put it next to the single place where it is used, but it is ok here.
std::clog << "Alpha complex can be, or not, weighted (requires a file containing weights values).\n\n"; | ||
std::clog << "Weighted Alpha complex can have negative filtration values. If '-square-root-filtrations' is"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we mention something like that in the documentation of Alpha_complex as well?
if constexpr (square_root_filtrations) | ||
filt = sqrt(filt); | ||
complex.assign_filtration(sh, max(filt, Filtration_value(0))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be safer to do the max with 0 before the sqrt.
filt = sqrt(filt); | ||
// maxf = filt except for rounding errors | ||
maxf = max(maxf, filt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a slight risk that a rounding error can make filt negative, with unpredictable results. It feels silly to use sqrt(max(filt,0))
, but I don't know if there is an easy way to avoid it. Well, maybe replacing max(maxf,filt)
by a hand-written comparison so we know what happens when filt is NaN, but it feels like a gratuitous use of NaN.
:param precision: Delaunay complex precision can be 'fast', 'safe' or 'exact'. Default is 'safe'. | ||
:type precision: string | ||
Args: | ||
points (Iterable[Iterable[float]]): A list of points in d-Dimension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the type redundant?
@@ -354,8 +355,10 @@ class Alpha_complex { | |||
* It also computes the filtration values accordingly to the \ref createcomplexalgorithm if default_filtration_value | |||
* is not set. | |||
* | |||
* \tparam square_root_filtrations If false (default value), it assigns to each simplex a filtration value equal to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(ignore the exact names I use here)
I wonder what is more natural between do_sqrt=true/false and output_squared_values=true/false. As the person writing the code, it is natural to ask if you should call sqrt. As the user... I am not sure, it might be the other one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comments on the python interface, I did not have the chance to test this.
Quick questions that I forgot:
- are weights with
sqrt=True
well-defined (e.g., via$x\mapsto \mathrm{sgn}(x)\sqrt{|x|}$ ) ? - do you think it would be reasonable to put the
sqrt
to True ? (I'm in favor of it, to make it consistent with other complexes).
def create_simplex_tree(self, max_alpha_square: float = float('inf'), | ||
filtration: Optional[Literal['alpha', 'cech']] = None, | ||
square_root_filtrations: bool = False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For double / bools (or even vectors / anything that needs a copy to be translated in cpp), I usually find this cleaner to put the cpp type directly so that you don't need to do a copy below (even though I think the compiler will optimize both cases the same); furthermore the errors are less cryptic in python as they don't refer to the inside of a pyx function (or even a "no signature found") but only to the signature.
def create_simplex_tree(self, max_alpha_square: float = float('inf'), | |
filtration: Optional[Literal['alpha', 'cech']] = None, | |
square_root_filtrations: bool = False): | |
def create_simplex_tree(self, double max_alpha_square = float('inf'), | |
filtration: Optional[Literal['alpha', 'cech']] = None, | |
bool square_root_filtrations = False) -> SimplexTree: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know (or remember) that this was possible, but it looks good, there are probably more places where we could simplify the code this way. We need to check what the generated documentation looks like (I think we want to see the Python types there).
return stree | ||
|
||
@staticmethod | ||
def set_float_relative_precision(precision): | ||
def set_float_relative_precision(precision: float): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def set_float_relative_precision(precision: float): | |
def set_float_relative_precision(double precision): |
Also, instead of returning nothing, I usually like to return self
to be able to do some functional programming like pattern, e.g.,
stuff = (
thing.do_this(...)
.do_that(...)
.transform(...)
)
but that's completely subjective. This doesn't cost anything performance-wise (every python object is a fat pointer IIRC).
def get_point(self, vertex): | ||
def get_point(self, vertex: int) -> list[float]: | ||
"""This function returns the point corresponding to a given vertex from the :class:`~gudhi.SimplexTree` (the | ||
same as the k-th input point, where `k=vertex`) | ||
|
||
:param vertex: The vertex. | ||
:type vertex: int | ||
:rtype: list of float | ||
:returns: the point. | ||
Args: | ||
vertex: The vertex. | ||
Returns: | ||
the point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it costly to make so that AlphaComplexes do not reorder points (in the simplextree at least) ? so that this function becomes non-necessary ?
There is also the question of removed points, but
- is it generically not the case ?
- we can also "recreate" the deleted point in the simplextree (with a flag enabled by default)
I think that this behavior is a bit dangerous by default.
For the return type I think it is better to return a numpy array or "à défaut" a tuple.
As loops are cursed in python, another function that could be useful in that case, is a get_points()
method, doing the equivalent of this (which is optimized but still slow)
ac = DelaunayComplex(...)
st = ac.create_simplex_tree(...)
points = np.fromiter((ac.get_point(i) for i in st.get_vertices()), dtype = np.dtype((np.float32, ndim)), count=st.num_vertices())
In general, in python I feel that the classes RipsComplex
, AlphaComplex
, etc. are not that useful in the interface, and could be replaced by helper functions, directly returning the simplextree;
but that's another topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it costly to make so that AlphaComplexes do not reorder points (in the simplextree at least) ? so that this function becomes non-necessary ?
That's already the case, the doc you are quoting says "same as the k-th input point". This function is a left-over from an old release where points could be shuffled (only if they were in dimension 3 IIRC). We could hide this function in the doc if it causes too much confusion.
There is also the question of removed points, but
1. is it generically not the case ?
With weights, it is perfectly generic for points to disappear.
2. we can also "recreate" the deleted point in the simplextree (with a flag enabled by default)
I don't understand what you mean by that. If we just add an extra vertex connected to nothing, that will add a connected component, that's much more problematic than a "missing" point.
For the return type I think it is better to return a numpy array or "à défaut" a tuple.
For such a legacy function that mostly remains to avoid breaking old user code, I am not convinced it is worth changing it and risking to break said user code.
In general, in python I feel that the classes
RipsComplex
,AlphaComplex
, etc. are not that useful in the interface, and could be replaced by helper functions, directly returning the simplextree; but that's another topic.
I think @VincentRouvreau must be tired of me saying exactly that for years now 😉 The class is not just unnecessary for users, it also complicates the implementation because half of the work is done in the constructor, forcing us to store intermediate stuff for later use in create_simplex_tree
.
@@ -18,6 +19,7 @@ from libcpp.string cimport string | |||
from libcpp cimport bool | |||
from libc.stdint cimport intptr_t | |||
import warnings | |||
from typing import Literal, Optional, Iterable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw that typing.Iterable
was deprecated (and aliased to collections.abc
)
from typing import Literal, Optional, Iterable | |
from typing import Literal, Optional | |
from collections.abc import Iterable |
If the weights make some (squared) values negative, I don't think theory provides anything meaningful for the sqrt, no. There are a number of applications where the weights can only increase filtration values, so they remain non-negative. Adding the same constant to all weights (without sqrt, so possibly negative) preserves a lot of things (the triangulation, the order between filtration values), that's one way to avoid negative values, but it obviously does not preserve filtration values, so it would be confusing.
I am rather strongly against changing the behavior of old code, that's really not a nice thing to do to users. However
|
Fix #80 and #771