Skip to content

Conversation

@bradh352
Copy link
Contributor

@bradh352 bradh352 commented Feb 6, 2025

Memory management is a foreign concept to python in general, however it
appears DNode and Context require .free() and .destroy() to be called,
respectively, to avoid a memory leak.

This seems easily avoidable by simply attaching a destructor to the
class so the python garbage collector can clean up automatically when all
references go out of scope.

We add a reference to any DNode object derived from another DNode object
(but not duplicated) so we can determine if we are allowed to be the one
to destroy the DNode cdata, as well as to keep proper reference counting
in case the owner goes out of scope but a dependent reference is still
in scope.

This change should also be compatible with existing integrations which may be
aware of this as it sets the internal cdata reference to None and checks
it to ensure it doesn't run the cleanup again. That said, I think calling
DNode.free() in this case is still risky since there are no checks to
ensure this is truly valid like the destructor does.

Signed-off-by: Brad House (@bradh352)

@bradh352
Copy link
Contributor Author

bradh352 commented Feb 6, 2025

guess I need to dig into the test system to see what it doesn't like about this....

Memory management is a foreign concept to python in general, however it
appears DNode and Context require .free() and .destroy() to be called,
respectively, to avoid a memory leak.

This seems easily avoidable by simply attaching a destructor to the
class so the python garbage collector can clean up automatically when all
references go out of scope.

We add a reference to any DNode object derived from another DNode object
(but not duplicated) so we can determine if we are allowed to be the one
to destroy the DNode cdata, as well as to keep proper reference counting
in case the owner goes out of scope but a dependent reference is still
in scope.

This change should also be compatible with existing integrations which may be
aware of this as it sets the internal `cdata` reference to None and checks
it to ensure it doesn't run the cleanup again.  That said, I think calling
DNode.free() in this case is still risky since there are no checks to
ensure this is truly valid like the destructor does.

Signed-off-by: Brad House <[email protected]>
@bradh352 bradh352 marked this pull request as draft February 7, 2025 14:04
@bradh352
Copy link
Contributor Author

bradh352 commented Feb 7, 2025

tests are getting further, but it looks like I'll need to dig deeper, this is fairly hard to debug since its crashing in C code, not sure how to dereference the python line that causes the crash or if its in the garbage collecting process.

likely i need to extend the reference linking even further to additional child structures from DNode, and possibly also enforce each DNode references the Context it is associated with, it looks like in a few places that may not be true.

marking as a draft, going to put this on hold as I've sprinkled some .free() and .destroy() in the implementation I'm working on for now ...
bradh352/sonic-buildimage@5019c24

@steweg
Copy link
Contributor

steweg commented Nov 27, 2025

Hi, first at all I am not a maintainer, but it seems that you are breaking existing API definition. On top of that the construct easily lead to segfaults due to access to non existing data in various APIs. E.g. using children API from parent DNode will return different instance of DNode but with the same DNode.cdata. So playing with free and unlink operation can be very tricky.

In my project, I am solving the same problem by putting a wrapper around DNode called AutoFreeDNode, which unlink itself (+ its subtree) from the rest of the data tree and free the cdata. Therefore if you are traversing the data tree in anyway (e.g. using find_one or find_all or children), you will get just the regular DNode as the API result. If you need to keep it valid for whatever reason, you need to keep it valid you need to inject dependency that node deeper in the tree keep the nearest AutoFreeDNode alive. So the wrapper needs to created explicitly from existing DNode when necessary. The top level DNodes shall either not have any dependency or have dependency on "first sibling". The reason is that by unlinking first sibling the most expected state is that whole data tree including siblings will be released (similarly as if you put dict_to_dnode you get multiple siblings on top level, but you have just single DNode reference, which is exactly "first sibling").

Having as a parent Context is very non-intuitive and more importantly, it shall not be Context but rather the exact Module as if you somehow lost the Module from Context, you will lose ability to interpret the data as the associated schema is already gone even if Context itself is still valid.

And the last part is that you need to be sure that you have exactly one and only AutoFreeDNode wrapper around DNode.cdata. Otherwise you are risking double free or segfaults.

Also a side node that relaying on garbage collector is not always the best as different python implementations treat it differently. E.g. regular Python 3.10+ will call garbage collector almost immediately after you free the last reference. But pypy will not and you need to do that more/less explicitly. So you can get unexpected results in certain environments. For this part I didn't found any better solution.

If there is a common need, I can probably provide a patch based on my project code. It just need a small rework as it have more functionality as it is required for this memory management purpose.

@bradh352
Copy link
Contributor Author

This sort of library is really hard to wrap in python, since you're getting pointers to a leaf in a much larger data structure. Those leafs could be invalidated at any point due to manipulation of the parent. This is sort of why I stalled, I haven't thought of the best way to handle that. That said, it may just be best to document that as a limitation.

If you have another solution that you can submit that works and is simpler, I'm all for that. I was just looking for a solution that would work for migrating from old libyang1 python to the newer CFFI wrapper. I ended up just adding the necessary frees where needed rather than using this due to time constraints in what I was doing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants