[WIP] gh-151722: Defer GC tracking of frozendict.fromkeys() as possible#152021
[WIP] gh-151722: Defer GC tracking of frozendict.fromkeys() as possible#152021corona10 wants to merge 4 commits into
Conversation
Co-authored-by: tonghuaroot <tonghuaroot@gmail.com>
|
cc @tonghuaroot (I add you as the co-author), too many things need to be addressed from #151967 |
| int status; | ||
|
|
||
| PyTypeObject *cls_type = _PyType_CAST(cls); | ||
| d = _PyObject_CallNoArgs(cls); |
There was a problem hiding this comment.
If there is a good way to obtain an untracked object here, that would be great, but I think that is a separate topic.
There was a problem hiding this comment.
Is these cases considered?
- subclass of frozendict implements
__init__instead of__new__, passes itsselfto another thread - another thread obtains a reference to the instance via GC while the
__init__is being executed
| done: | ||
| // Built untracked above; GC-track now that it is complete. | ||
| if (d != NULL) { | ||
| assert(!_PyObject_GC_IS_TRACKED(d)); |
There was a problem hiding this comment.
@tonghuaroot
By this, we don't need additional test code.
There was a problem hiding this comment.
Agreed, the assert covers it. Thanks for taking this over.
| } | ||
| // The constructor returns a tracked object; keep it untracked while it is | ||
| // filled and GC-track it once complete. | ||
| _PyObject_GC_UNTRACK(d); |
There was a problem hiding this comment.
Thanks for driving this. I believe the test_fromkeys CI failure originates here: for an empty exact frozendict, cls() already returns an untracked object (thanks to your gh-151740 defer-track), so this unconditional _PyObject_GC_UNTRACK trips its internal IS_TRACKED assertion. Guarding it with if (_PyObject_GC_IS_TRACKED(d)) should resolve it. Please feel free to adjust as you see fit.
There was a problem hiding this comment.
tl; dr I wrote PR gh-152067 which is a similar fix, but with a safer design.
I don't think that calling _PyObject_GC_UNTRACK(d) on any type is a good approach. We need to track again the dictionary in many code paths which is very tricky. For example, if dict_dict_fromkeys() fails (return NULL), we need to track again the dictionary... but we no longer have a (safe) reference to d since dict_dict_fromkeys() called Py_DECREF(d). Having to track again the dictionary on error is really complicated to implement.
Moreover, we cannot use Py_INCREF(d) at the beginning (to keep a safe reference to the dictionary) since assert(can_modify_dict(mp)); fails in this case on frozendict.
IMO we should only be worried about the dictionary being not tracked by the GC if the dictionary is a frozendict. And it's a frozendict, we can copy it using frozendict_new_untracked() to create a copy which is not tracked by the GC.
Uh oh!
There was an error while loading. Please reload this page.