Description
Even though #171 and the subsequent fix in PR #188/270 addressed one class of socket leaks, we’re still seeing file-descriptor leaks when creating and tearing down Zeroconf instances in rapid succession (e.g. one per “cycle”). Over time, the process exhausts its FD limit with errors like OSError: [Errno 24] Too many open files.
What’s happening
Persistent sockets remain open
Every time Zeroconf(interfaces=[…]) is called, it creates a set of listening and responding sockets. Although those sockets are closed (via .close()) when the instance is torn down, some underlying file descriptors remain open in the OS.
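To narrow down which kind of descriptor is left behind, it may help to group the open FDs by type before and after a create/close cycle. The following is only a sketch using the standard library; the helper name fd_kinds is made up for illustration and is not part of python-zeroconf.

```python
import os
import stat
from collections import Counter


def _fd_dir() -> str:
    # /proc/self/fd exists on Linux; /dev/fd is a fallback on macOS/BSD.
    return "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"


def fd_kinds() -> Counter:
    """Count this process's open file descriptors, grouped by type."""
    kinds = Counter()
    for name in os.listdir(_fd_dir()):
        try:
            mode = os.fstat(int(name)).st_mode
        except (OSError, ValueError):
            # The descriptor used to list the directory is already closed
            # by the time we get here, so fstat fails for it; skip it.
            continue
        if stat.S_ISSOCK(mode):
            kinds["socket"] += 1
        elif stat.S_ISFIFO(mode):
            kinds["pipe"] += 1
        else:
            kinds["other"] += 1
    return kinds
```

Taking a snapshot with before = fd_kinds() ahead of the loop and printing fd_kinds() - before afterwards should show whether the growth is in sockets, pipes (for example event-loop self-pipes), or something else.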
Lingering asyncio selectors and threads
The library spins up an asyncio event loop per instance. Even after .close(), we observe threads (in selector_events.py and _core.py) that never fully tear down, holding onto their self-pipes or sockets.
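One way to check the thread observation is to compare the set of live threads before and after a create/close cycle. This is a minimal sketch with the standard threading module; thread_snapshot and leaked_threads are hypothetical helper names, not library functions.

```python
import threading


def thread_snapshot() -> set:
    """Return the set of Thread objects that are currently alive."""
    return set(threading.enumerate())


def leaked_threads(before: set) -> list:
    """Return the sorted names of live threads missing from the snapshot."""
    return sorted(t.name for t in threading.enumerate() if t not in before)
```

Calling before = thread_snapshot() ahead of Zeroconf(...) and leaked_threads(before) after zc.close() should return an empty list if every thread was joined. A thread may need a short moment to exit, so a brief sleep before the second call is probably reasonable.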
ifaddr adapter enumeration leak
In our environment, each new Zeroconf triggers an ifaddr.get_adapters() call, which on certain Linux distributions seems to leave open netlink sockets or file handles that are never garbage-collected.
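To test the ifaddr suspicion in isolation, a single call can be wrapped in a context manager that records which descriptors are new afterwards and what they point to. This sketch is Linux-only because it reads /proc/self/fd, and the names open_fds and new_fds are assumptions made for illustration.

```python
import os
from contextlib import contextmanager


def open_fds() -> dict:
    """Map each open descriptor number to the target it refers to."""
    fd_dir = "/proc/self/fd"  # Linux-specific
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # Closed between listing and readlink (e.g. the listing's own fd).
            continue
    return result


@contextmanager
def new_fds():
    """Yield a dict that, after the block exits, holds the descriptors it left open."""
    before = open_fds()
    leaked = {}
    yield leaked
    after = open_fds()
    # Compare number and target so a reused descriptor number is still noticed.
    leaked.update(
        {fd: target for fd, target in after.items() if before.get(fd) != target}
    )
```

Running ifaddr.get_adapters() inside "with new_fds() as leaked:" and printing leaked afterwards should show entries such as socket:[...] if the enumeration really leaves something open; an empty dict would point the investigation elsewhere.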
Reproduction steps
Write a simple loop that does:
from zeroconf import Zeroconf
import time
import os

# Run a tight loop creating & closing Zeroconf instances
for i in range(1, 1001):
    zc = Zeroconf(interfaces=['127.0.0.1'])
    zc.close()
    if i % 100 == 0:
        print(f"Iteration {i}")
        time.sleep(0.1)  # give the OS a moment

# Same loop, but also report the number of open file descriptors (Linux only)
def fd_count():
    return len(os.listdir(f"/proc/{os.getpid()}/fd"))

for i in range(1, 1001):
    zc = Zeroconf(interfaces=['127.0.0.1'])
    zc.close()
    if i % 100 == 0:
        print(f"Iteration {i}, open-fds: {fd_count()}")

Expected behavior
After zc.close(), all sockets, event loops, and threads associated with that Zeroconf instance should be fully torn down, and the OS file descriptor count should return to its prior level.
Environment
python-zeroconf version: ≥ 0.28.0 (includes the fixes from PR #188/270)
OS: Ubuntu 24.04, Linux kernel 6.x
Python: 3.12
Suggested investigation areas
- Verify that every internal selector and asyncio loop is explicitly closed and that its underlying sockets/pipes are closed too.
- Audit ifaddr.get_adapters() and its use of netlink sockets—ensure that any sockets opened for adapter enumeration are closed.
- Add a regression test that repeatedly constructs and closes Zeroconf objects and asserts that FD count remains constant.
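The regression test suggested in the last point could look roughly like the sketch below. It is not taken from the project's test suite: count_fds, assert_no_fd_growth, and zeroconf_cycle are hypothetical names, and the warm-up and cycle counts are arbitrary choices.

```python
import os
from typing import Callable


def count_fds() -> int:
    """Return the number of descriptors currently open in this process."""
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    return len(os.listdir(fd_dir))


def assert_no_fd_growth(
    cycle: Callable[[], None], cycles: int = 50, warmup: int = 5
) -> None:
    """Run cycle() repeatedly and fail if the open-FD count has grown."""
    # Warm-up runs let one-time allocations (lazy imports, caches) settle.
    for _ in range(warmup):
        cycle()
    baseline = count_fds()
    for _ in range(cycles):
        cycle()
    grown = count_fds() - baseline
    assert grown <= 0, f"{grown} descriptors leaked over {cycles} cycles"


def zeroconf_cycle() -> None:
    """One create/close cycle, as in the reproduction steps."""
    # Imported lazily so the helpers above work without zeroconf installed.
    from zeroconf import Zeroconf

    zc = Zeroconf(interfaces=["127.0.0.1"])
    zc.close()
```

A test would then call assert_no_fd_growth(zeroconf_cycle). If teardown finishes asynchronously, a short sleep inside the cycle or a small tolerance on the comparison might be needed to avoid false positives.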
I'm trying to write a unidirectional mDNS reflector. It works flawlessly if I start and then stop the main reflector script rather than simply closing the sockets, but this requires a wrapper script to handle the restarts, which adds complexity as well as noise: advertisements appear, then disappear when the reflector script exits.
Unfortunately, I don't know enough to patch this and implement a fix effectively. I've been fault-finding with the help of AI to get this write-up where it is, so there could be some incorrect information, but by running the code above I can reproduce the issue on different systems.
If you need any help testing, I'd be happy to assist where I can.