Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce PyCapsule #1012

Open
wants to merge 7 commits into
base: production
Choose a base branch
from
Open

Introduce PyCapsule #1012

wants to merge 7 commits into from

Conversation

cv3d
Copy link
Contributor

@cv3d cv3d commented Jul 28, 2021

Problem

The association between a line of code calling some PyArray_* functions and the other line of code deallocating its data is not very obvious, which results in memory leaks when some of these lines are adapted in new places. In the following example, the developer might not notice the importance of Py_DECREF.

PyArrayObject *pPyObj = PyArray_GETCONTIGUOUS(/* some code goes here */);
/* some lines of code using pPyObj and potentially other logic as well */
Py_DECREF(pPyObj);

AutoPyArrayObjectDereferencer simplified memory deallocation by simply exiting a scope. It also brought Py_DECREF much nearer to the initial allocation line. However, having two classes in two lines of code is still not ideal, and mistakes can still happen easily.

Solution

Rather than storing the array pointer in two different classes (PyArrayObject and AutoPyArrayObjectDereferencer), this PR suggests storing them in a single one (PyArrayCapsule) that does the job of both. This way, making accidental memory leakages is highly unlikely. Moreover, inspired by smart pointers, PyArrayCapsule is more elegant and allows better flexibility/extensibility.

@cv3d cv3d requested a review from rdiankov July 28, 2021 15:24

private:
PyArrayObject* _ptr;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is more or less equivalent to:

class PyArrayCapsule : public boost::shared_ptr<PyArrayObject>
{
public:
    PyArrayCapsule(PyArrayObject* ptr) : boost::shared_ptr<PyArrayObject>(ptr, PyArrayCapsule::Deleter()) {}
    PyArrayCapsule() : PyArrayCapsule(nullptr) {}

private:
    class Deleter
    {
    public:
        void operator()(PyArrayObject *ptr) {
            if ( ptr != nullptr ) {
                Py_DECREF(p);
            }
        }
    };
};

I would recommend against writing your own shared pointer because it is error-prone. For instance, in your case copy constructor is not deleted and will cause double free.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziyan I don't think the purpose is to use reference counting here as much as just trigger the destructor, which the old implementation should be fine except for the fact that it throws an exception whenever the pointer is null.
what advantage does boost::shared_ptr have besides reduced code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It prevents this misuse:

PyArrayCapsule SomeFunction() {
    PyArrayCapsule cap(...);
    return cap
}

Or

PyArrayCapsule a(...);
PyArrayCapsule b = a;

The current implementation, the above 2 patterns will result in Py_DECREF being called twice in each case.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziyan hmm... worrying about usages like that means even the current implementation has this problem, but i highly doubt anyone will do that. nevertheless, i made your point.
would the following one-liner work?

boost::shared_ptr<PyArrayObject> parray(PyArray_GETCONTIGUOUS(reinterpret_cast<PyArrayObject*>(obj.ptr())), Py_DECREF);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Py_DECREF is not a class, so you have to have a deleter class

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Py_DECREF just needs to be a function for this to work..

Copy link
Owner

@rdiankov rdiankov Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like it is a macro ;0/

#define Py_DECREF(op)                                   \
    do {                                                \
        if (_Py_DEC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
        --((PyObject*)(op))->ob_refcnt != 0)            \
            _Py_CHECK_REFCNT(op)                        \
        else                                            \
        _Py_Dealloc((PyObject *)(op));                  \
    } while (0)

however, there's also a function

PyAPI_FUNC(void) Py_DecRef(PyObject *);

perhaps that can be used instead..?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't know that. Then yes, you really just need to use it like this:

boost::shared_ptr<PyArrayObject> a(..., Py_DECREF);

For convienence, consider:

class PyArrayCapsule : public boost::shared_ptr<PyArrayObject>
{
public:
    PyArrayCapsule(PyArrayObject* ptr) : boost::shared_ptr<PyArrayObject>(ptr, Py_DECREF) {}
    PyArrayCapsule() : PyArrayCapsule(nullptr) {}
};

Copy link
Collaborator

@ziyan ziyan Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this?

template<class T>
class PyCapsule : public boost::shared_ptr<T>
{
public:
    PyCapsule(T* ptr) : boost::shared_ptr<T>(ptr, PyCapsule<T>::_Delete) {}
    PyCapsule() : PyCapsule(nullptr) {}

private:
    static void _Delete(T *ptr) {
        if (ptr != nullptr) {
            Py_DECREF(ptr);
        }
    }
};

typedef PyCapsule<PyArrayObject> PyArrayCapsule;

Copy link
Contributor Author

@cv3d cv3d Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziyan @rdiankov I adapted the proposal according to your valuable comments. Many thanks~

  • prevented the copy constructor to prevent more than a single decrement
  • made the proposal as a template class to support more use cases

However, I think using the smart pointer is an overkill for this small task, especially with its atomic and thread-safety stuff.
More importantly, this class is going to be used in other projects, in which explicitly raising on cases of nullptr being passed is really needed, mostly to catch bugs easily, thus I wonder if you can be fine with the current version?

@cv3d cv3d force-pushed the PyArrayCapsule branch from 0a0aeec to 24e6b63 Compare July 30, 2021 04:35
@cv3d cv3d changed the title Introduce PyArrayCapsule Introduce PyCapsule Jul 31, 2021
@cv3d
Copy link
Contributor Author

cv3d commented Aug 9, 2021

@rdiankov @ziyan Is these anything I should do to get this shipped?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants