gh-127937: convert decimal module to use import API for ints (PEP 757) #127925

skirpichev · 2024-12-13T16:17:34Z

| Benchmark | ref | patch |
|---|---|---|
| int(Decimal(1<<7)) | 648 ns | 474 ns: 1.37x faster |
| int(Decimal(1<<38)) | 740 ns | 501 ns: 1.48x faster |
| int(Decimal(1<<300)) | 2.06 us | 2.02 us: 1.02x faster |
| int(Decimal(1<<3000)) | 115 us | 115 us: 1.00x faster |
| Geometric mean | (ref) | 1.20x faster |
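As a sanity check on the summary row, the geometric mean can be recomputed from the four per-benchmark speedups. This is only a sketch: it uses the rounded ratios from the table as inputs, so it approximates what pyperf derives from the raw timings.

```python
from statistics import geometric_mean

# Speedup ratios copied from the "patch" column above (already rounded).
speedups = [1.37, 1.48, 1.02, 1.00]

# The geometric mean is the n-th root of the product of the ratios.
overall = geometric_mean(speedups)
print(f"{overall:.2f}x faster")
```

The geometric mean is the usual choice for averaging ratios, because a 2x speedup and a 2x slowdown then cancel out to 1.00x.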

>>> sys.int_info[:2]
(30, 4)
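The tuple above is `(bits_per_digit, sizeof_digit)`: this build stores integers as arrays of 30-bit digits, each held in a 4-byte word. The sketch below illustrates that layout in pure Python. `to_digits` and `from_digits` are hypothetical helpers written for this illustration, not CPython functions.

```python
def to_digits(value, bits_per_digit=30):
    """Split abs(value) into little-endian digits of `bits_per_digit` bits."""
    mask = (1 << bits_per_digit) - 1
    value = abs(value)
    digits = []
    while True:
        digits.append(value & mask)   # lowest digit first
        value >>= bits_per_digit
        if not value:
            break
    return digits


def from_digits(digits, bits_per_digit=30):
    """Rebuild the non-negative integer from its little-endian digits."""
    return sum(d << (i * bits_per_digit) for i, d in enumerate(digits))


print(to_digits(1 << 7))    # [128]
print(to_digits(1 << 38))   # [0, 256]
```

With this layout, `1<<7` fits in one digit and `1<<38` needs two, which is why those two values were picked for the benchmark.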

# bench_Decimal-to-int.py

import pyperf
from decimal import Decimal

values = ['1<<7', '1<<38', '1<<300', '1<<3000']

runner = pyperf.Runner()
for v in values:
    d = Decimal(eval(v))
    bn = 'int(Decimal('+v+'))'
    runner.bench_func(bn, int, d)

Issue: Remove private _PyLong_FromDigits() function #127937

picnixz · 2024-12-13T17:31:54Z

hide _PyLong_FromDigits()? it's not used outside of the longobject.c anymore

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

news

Not needed I think, unless you want to indicate the performance gain (it's always nice to know that something is faster). I did report the improvements of fnmatch.translate, so I think you can report those improvements as well.

Modules/_decimal/_decimal.c

Co-authored-by: Bénédikt Tran <[email protected]>

skirpichev · 2024-12-14T00:47:15Z

Modules/_decimal/_decimal.c

+    n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd;
+    PyLongWriter *writer = PyLongWriter_Create(mpd_isnegative(x), n,
+                                               (void**)&ob_digit);
+    /* mpd_sizeinbase can overestimate size by 1 digit, set it to zero. */


BTW, this looks like a bug in mpdecimal. Cf. GNU GMP, where the mpz_sizeinbase docs say: "If base is a power of 2, the result is always exact".
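To make the one-digit overestimate concrete, here is a rough Python sketch of the size computation in the snippet above. It assumes 30-bit digits and that the bit count is estimated from the number of decimal digits, as `mpd_sizeinbase` does. The function names are hypothetical, and zero and non-integer values are not handled.

```python
import math
from decimal import Decimal

BPD = 30  # assumed bits per digit (sys.int_info.bits_per_digit on this build)


def estimated_bits(d: Decimal) -> int:
    """Upper bound on the bit length, derived from the decimal digit count."""
    t = d.as_tuple()
    ndigits = len(t.digits) + t.exponent
    return int(ndigits / math.log10(2)) + 1


def estimated_ndigits(d: Decimal) -> int:
    """Mirrors n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd."""
    return (estimated_bits(d) + BPD - 1) // BPD


def exact_ndigits(d: Decimal) -> int:
    """Number of BPD-bit digits actually needed for the integer value."""
    return max(1, (int(d).bit_length() + BPD - 1) // BPD)


d = Decimal(10**9)            # 10 decimal digits, but only 30 bits
print(estimated_ndigits(d))   # 2
print(exact_ndigits(d))       # 1
```

In this case the writer is created with two digits although only one is filled, so the extra top digit has to be set to zero.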

skirpichev · 2024-12-14T01:05:31Z

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

I've updated the PR description with my research. So far, I've found just one use case.

At the very least, I think we should deprecate this (a real deprecation, not a soft one). It apparently affects few projects, and there is now a public alternative. @picnixz, what do you think?

picnixz · 2024-12-14T01:30:40Z

At least, I think we should deprecate (not soft) this

I would be fine with deprecating it and saying which alternative to use, so that we can simply remove it in a later version. I think Victor was the one who removed and restored it, so we should ask him as well.

picnixz · 2024-12-14T01:31:31Z

should dec_from_long() be modified here? (To use the PyLong_Export API.) I would prefer to do this in a separate PR.

If you prefer doing it in a follow-up PR because you fear it would be too hard to review, then that's better. If the change is minimal, we can do it in this one (I didn't check the code to change).

skirpichev · 2024-12-14T02:03:21Z

If the change is minimal, we can do it in this one

You can estimate the changes by looking at the gmpy2 PR (referenced in the PEP): aleaxit/gmpy#495. In principle, I don't think this will complicate the review too much. On the other hand, the changes look logically independent. I would rather include the deprecation here.

picnixz · 2024-12-14T02:11:14Z

Let's change dec_from_long in another PR since the changes are independent (sorry it's 3 AM here and I don't have much energy).
For deprecating _PyLong_FromDigits, maybe it's better to make a separate PR so that we have a dedicated NEWS entry and re-use the issue that actually removed the private API (and not the issue that reverted the removal). WDYT? (we would also be able to change PyLong_Copy accordingly)

Modules/_decimal/_decimal.c

Misc/NEWS.d/next/C_API/2024-12-14-03-40-15.gh-issue-127925.FF7aov.rst

* cleanup: forgotten PyLongWriter_Discard, pylong variable
* clarify news

skirpichev · 2024-12-26T06:44:39Z

@serhiy-storchaka, I now memset the digits array, zeroing all digits before the import. I don't see a measurable difference:

| Benchmark | ref | patch-no-memset | patch |
|---|---|---|---|
| int(Decimal(1<<7)) | 626 ns | 516 ns: 1.21x faster | 520 ns: 1.20x faster |
| int(Decimal(1<<38)) | 719 ns | 505 ns: 1.42x faster | 515 ns: 1.39x faster |
| int(Decimal(1<<300)) | 2.07 us | 1.98 us: 1.04x faster | 2.00 us: 1.03x faster |
| Geometric mean | (ref) | 1.16x faster | 1.15x faster |

Benchmark hidden because not significant (1): int(Decimal(1<<3000))

skirpichev · 2025-01-07T03:16:57Z

OK, I did some cleanup and added asserts. I think @serhiy-storchaka's concerns have been addressed: the digits array is now initialized.

Should we add a safe path for systems with broken log10?

From my benchmarks, it seems that caching the layout parameters has very little effect on performance (or none at all). So, I don't think we should do this.

serhiy-storchaka

If libmpdec uses floating-point log10, it will likely not work for integers with more than 2**53 bits (and perhaps even below this limit). The maximal Decimal has 2**62 bits.

cc @tim-one, @mdickinson, @skrah
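The 2**53 threshold is where IEEE 754 double precision stops representing every integer exactly, which is why a size computed through a double becomes suspect for larger inputs. A small illustration, assuming the platform's C double is IEEE 754 binary64:

```python
limit = 2 ** 53

# Every integer up to 2**53 converts to a double and back without loss...
assert int(float(limit - 1)) == limit - 1
assert int(float(limit)) == limit

# ...but 2**53 + 1 is not representable and rounds to a neighboring value.
rounded = int(float(limit + 1))
print(rounded == limit + 1)   # False
print(rounded - limit)        # 0
```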

Modules/_decimal/_decimal.c

skirpichev · 2025-01-07T11:17:37Z

If libmpdec uses floating-point log10

It's applied to the base argument, which is a uint32_t.

Modules/_decimal/_decimal.c

vstinner

LGTM.

mpd_qexport_*() functions are used here with the assumption that no resizing
occurs, i.e. len was obtained by a call to mpd_sizeinbase.

IMO it's a reasonable trade-off and an acceptable risk.

Modules/_decimal/_decimal.c

serhiy-storchaka · 2025-01-07T14:42:18Z

It is not guaranteed, and there is no way to enforce that resize does not occur in mpd_qexport_*() functions.

How do we estimate the risk? If Python has undefined behavior in one in a billion cases, is that an acceptable risk?

Co-authored-by: Victor Stinner <[email protected]>

skirpichev · 2025-01-08T02:48:00Z

It is not guaranteed, and there is no way to enforce that resize does not occur in mpd_qexport_*() functions.

@serhiy-storchaka, we have confirmation from the library author that this expectation is correct unless libm is broken. I guess this is not the only place where we depend on the quality of system libraries.

Or do you believe that mpd_sizeinbase() can underestimate the size even with a correct log10? If so, that's a bug. Let's just fix it. Here is the function (IIRC it's the same in the latest upstream version):

cpython/Modules/_decimal/libmpdec/mpdecimal.c

Lines 8084 to 8113 in 65ae3d5

    
size_t
mpd_sizeinbase(const mpd_t *a, uint32_t base)
{
    double x;
    size_t digits;
    double upper_bound;

    assert(mpd_isinteger(a));
    assert(base >= 2);

    if (mpd_iszero(a)) {
        return 1;
    }

    digits = a->digits+a->exp;

#ifdef CONFIG_64
    /* ceil(2711437152599294 / log10(2)) + 4 == 2**53 */
    if (digits > 2711437152599294ULL) {
        return SIZE_MAX;
    }

    upper_bound = (double)((1ULL<<53)-1);
#else
    upper_bound = (double)(SIZE_MAX-1);
#endif

    x = (double)digits / log10(base);
    return (x > upper_bound) ? SIZE_MAX : (size_t)x + 1;
}
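For experimenting with this estimate outside of C, the CONFIG_64 branch can be mirrored in Python and compared against an exact digit count. This is a rough sketch, not the library's code: `sizeinbase_estimate` and `exact_size` are hypothetical names, a 64-bit `size_t` is assumed, and the check covers only a handful of sample sizes.

```python
import math

SIZE_MAX = 2 ** 64 - 1  # assumes a 64-bit size_t (the CONFIG_64 branch)


def sizeinbase_estimate(ndigits: int, base: int) -> int:
    """Python mirror of the CONFIG_64 branch for a nonzero integer
    that has `ndigits` decimal digits."""
    if ndigits > 2711437152599294:
        return SIZE_MAX
    upper_bound = float((1 << 53) - 1)
    x = ndigits / math.log10(base)
    return SIZE_MAX if x > upper_bound else int(x) + 1


def exact_size(value: int, base: int) -> int:
    """Exact number of base-`base` digits of a positive integer."""
    count = 0
    while value:
        value //= base
        count += 1
    return count


for k in (1, 3, 10, 28, 59, 100):
    for base in (2, 2 ** 30):
        largest = 10 ** k - 1       # largest value with k decimal digits
        smallest = 10 ** (k - 1)    # smallest value with k decimal digits
        est = sizeinbase_estimate(k, base)
        # The estimate must never be smaller than the real size.
        assert exact_size(smallest, base) <= exact_size(largest, base) <= est
```

Passing for these samples does not prove the bound in general. It only shows how the estimate behaves for values at the edges of each decimal length.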

Edit:
In fact, we only need a much simpler case, since the base is a power of 2. So, we want ndigits * log2(10)/shift. This should be a correct bound:

(size_t)(3.321928094887363*((ndigits + shift - 1)/shift))

For shift=30 and ndigits ~ 1<<53 (the upper_bound in the typical case), it will overestimate the size by just 1 digit.

vstinner · 2025-01-08T10:06:18Z

@picnixz: Would you mind reviewing the latest version of the PR? It has changed a lot since last month.

picnixz

A few final comments on English wording and some variables. Otherwise, LGTM. Sorry Victor, the ping slipped under my radar.

Modules/_decimal/_decimal.c

Co-authored-by: Bénédikt Tran <[email protected]>

pythongh-102471: convert decimal module to use PyLongWriter API (PEP …

80f1a04

…757)

bedevere-app bot mentioned this pull request Dec 13, 2024

The C-API for Python to C integer conversion is, to be frank, a mess. #102471

Open

skirpichev requested review from vstinner and picnixz December 13, 2024 16:42

picnixz reviewed Dec 13, 2024

skirpichev and others added 2 commits December 14, 2024 03:40

+ news

c13b7d2

Apply suggestions from code review

589f926

Co-authored-by: Bénédikt Tran <[email protected]>

skirpichev commented Dec 14, 2024

skirpichev marked this pull request as ready for review December 14, 2024 01:05

bedevere-app bot added the awaiting review label Dec 14, 2024

skirpichev marked this pull request as draft December 14, 2024 05:07

bedevere-app bot removed the awaiting review label Dec 14, 2024

skirpichev added 2 commits December 14, 2024 08:42

Merge branch 'master' into long_export-decimal

f27adef

+ adapt dec_from_long() to use PEP 757

6669b89

skirpichev changed the title ~~gh-102471: convert decimal module to use PyLongWriter API (PEP 757)~~ gh-102471: convert decimal module to use import/export API for ints (PEP 757) Dec 14, 2024

skirpichev requested a review from picnixz December 14, 2024 06:53

skirpichev marked this pull request as ready for review December 14, 2024 07:10

bedevere-app bot added the awaiting review label Dec 14, 2024

skirpichev mentioned this pull request Dec 14, 2024

gh-127937: deprecate _PyLong_FromDigits() function #127939

Draft

Merge branch 'master' into long_export-decimal

05ec274

vstinner reviewed Dec 16, 2024

skirpichev added 2 commits December 16, 2024 10:56

Don't use PyLong_GetNativeLayout()

6e46bc1

Address review:

7f0061f

* cleanup: forgotten PyLongWriter_Discard, pylong variable
* clarify news

skirpichev mentioned this pull request Dec 26, 2024

gh-102471: convert decimal module to use PyLong_Export API (PEP 757) #128267

Merged

Merge branch 'master' into long_export-decimal

37ec841

skirpichev marked this pull request as draft January 6, 2025 11:10

bedevere-app bot removed the awaiting merge label Jan 6, 2025

skirpichev changed the title ~~gh-127937: convert decimal module to use import/export API for ints (PEP 757)~~ gh-127937: convert decimal module to use import API for ints (PEP 757) Jan 6, 2025

bedevere-app bot added the awaiting review label Jan 6, 2025

skirpichev added 3 commits January 6, 2025 15:52

Merge branch 'master' into long_export-decimal

d4728c4

+ cleanup, add asserts

f6a4afb

Merge branch 'master' into long_export-decimal

4b07189

skirpichev marked this pull request as ready for review January 7, 2025 03:10

skirpichev requested review from vstinner and picnixz January 7, 2025 03:10

serhiy-storchaka reviewed Jan 7, 2025

skirpichev added 2 commits January 7, 2025 14:03

Merge branch 'master' into long_export-decimal

4db7917

address review

0fec6e1

serhiy-storchaka reviewed Jan 7, 2025

address review

59636f9

vstinner approved these changes Jan 7, 2025

bedevere-app bot added awaiting merge and removed awaiting review labels Jan 7, 2025

Update Modules/_decimal/_decimal.c

b8bf49f

Co-authored-by: Victor Stinner <[email protected]>

picnixz approved these changes Jan 11, 2025

Update Modules/_decimal/_decimal.c

a8189e6

Co-authored-by: Bénédikt Tran <[email protected]>
