Issue with mDNS Queries/Responses #22
Comments
I'm also running VLANs. I use pfSense and the Avahi package to provide mDNS. Awesome work @cbpowell! |
I'm still reading through this (excellent!) report, but I just have to take a moment and wonder
what the hell was I thinking when I wrote that? 😆 Okay, back to reading... |
I'm going to take some notes here while I work through this:
I suspect something in your stack is inspecting (and maybe altering) mDNS responses as they come back by re-writing the transaction ID and/or refusing to pass packets with a malformed source port. I'll get a change out right now to fix the source port bug. Can you share the script you're using to test |
Sure thing - sorry I should have just done that yesterday. The code is in this gist. With a preceding Absolutely makes sense on the unique What's weird with the |
Super weird. The internet appears to disagree about whether mDNS should have a qid of 0, but I can't find anything in the spec saying it must be. What's weirder is that I can reproduce your results, even directly on my RPi running Homebridge, which means the current code should never work, right? |
Ah, okay, here we go, from RFC 6762 section 18.1:
|
Ah, here we go, from RFC 6762 section 6.7:
|
Okay, got it. In the original code, I set the source port to 0, which causes mDNS responders to behave as though the requestor wants a "Legacy unicast response". |
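For anyone following along, here is a minimal sketch of the responder-side rule as I read RFC 6762 sections 6.7 and 18.1. It is not code from the plugin or from any real responder. `planResponse` and the shape of `query` are hypothetical names made up for illustration, and the QU (unicast-response) bit is ignored to keep it short. Binding to port 0 makes the OS pick an ephemeral source port, which is why the responder treats the query as coming from a legacy resolver.

```javascript
// Hypothetical illustration of how a responder chooses its reply style.
const MDNS_PORT = 5353;

function planResponse(query) {
  // Any source port other than 5353 marks the sender as a "legacy" resolver.
  const legacy = query.sourcePort !== MDNS_PORT;
  return legacy
    ? { mode: 'legacy-unicast', id: query.id } // reply directly, echoing the query ID
    : { mode: 'multicast', id: 0 };            // multicast responses carry ID 0
}

// A query sent from an ephemeral port gets its ID echoed back:
console.log(planResponse({ sourcePort: 49152, id: 1234 }));
// A query sent from port 5353 gets a multicast response with ID 0:
console.log(planResponse({ sourcePort: 5353, id: 1234 }));
```

If that reading is right, matching responses on the query ID can only work while the query leaves from a port other than 5353.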
Check out this gist. I get the following output:
|
Okay, so the behavior here is broken in the case of multiple bridges, as I could get multiple responses for the query for |
I'm going to have to think about this, and probably refactor how discovery works (again 😆). The DNS-SD library I use only looks for PTR records, but the SRV record contains the bridge ID. Maybe asking for PTR records is redundant, and I could possibly only ask for SRV records, but I'm not sure what the implications of that would be w.r.t. DNS-SD (does DNS-SD require PTR record requests or something?). At the very least, I need to back out my most recent change, I think, as I'm relying on having the port set in order for the qid to work. |
Alright, I unpublished 2.3.7. |
Oh man, as a quick response - I'm at work and it's killing me that I can't reply and work on it live! I had done some Wireshark investigation since yesterday too, but I don't think I uncovered anything that you didn't also just describe. When you request a SRV you do also get an A record with the IP in the |
It looks like I do get an A record along with the SRV response:
|
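A rough sketch of pairing the SRV record with its A record, assuming the decoded packet has `answers` and `additionals` arrays whose records look like `{ name, type, data }`, with SRV `data` shaped as `{ target, port }`. That shape is my assumption about the decoder output, and `resolveSrv` plus every name and address in the sample are placeholders, not values from a real bridge.

```javascript
// Hypothetical helper: find the SRV record and the A record for its target.
function resolveSrv(packet) {
  const records = [...(packet.answers || []), ...(packet.additionals || [])];
  const srv = records.find((r) => r.type === 'SRV');
  if (!srv) return null;
  const target = srv.data.target.toLowerCase();
  // DNS names compare case-insensitively.
  const a = records.find((r) => r.type === 'A' && r.name.toLowerCase() === target);
  return {
    name: srv.name,
    host: srv.data.target,
    port: srv.data.port,
    address: a ? a.data : null, // null when no matching A record came along
  };
}

// Placeholder packet for illustration only.
const samplePacket = {
  answers: [
    { name: 'Example Bridge._example._tcp.local', type: 'SRV', data: { target: 'example-bridge.local', port: 8081 } },
  ],
  additionals: [
    { name: 'example-bridge.local', type: 'A', data: '192.0.2.10' },
  ],
};
console.log(resolveSrv(samplePacket));
```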
Yep same with the mdns-resolver script:
|
Side note, those dig queries just don't seem to work across the OPNsense mdns repeater. |
Yeah, I'd actually expect they wouldn't since it's not well-formed mDNS. Send a packet (using my gist above) with qid zero and source port 5353, and I bet it repeats it. |
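In case the gist isn't handy, this is roughly what such a packet looks like when built by hand with only Node's `dgram` and `Buffer`. It is a sketch, not the gist's code: `encodeQuery` and `sendQuery` are names I made up, and `_example._tcp.local` is a placeholder service name. Binding to 5353 can fail if another mDNS daemon on the host holds the port without address reuse.

```javascript
const dgram = require('dgram');

const MDNS_ADDRESS = '224.0.0.251';
const MDNS_PORT = 5353;
const TYPE_PTR = 12;
const CLASS_IN = 1;

// Encode a single-question DNS query; names are written as length-prefixed labels.
function encodeQuery(name, type, id = 0) {
  const header = Buffer.alloc(12);
  header.writeUInt16BE(id, 0); // transaction ID (0 for a multicast query)
  header.writeUInt16BE(0, 2);  // flags: standard query
  header.writeUInt16BE(1, 4);  // QDCOUNT = 1; the other counts stay 0
  const labels = name.split('.').filter(Boolean).map((label) => {
    const bytes = Buffer.from(label, 'utf8');
    return Buffer.concat([Buffer.from([bytes.length]), bytes]);
  });
  const tail = Buffer.alloc(5); // root label (0), then QTYPE and QCLASS
  tail.writeUInt16BE(type, 1);
  tail.writeUInt16BE(CLASS_IN, 3);
  return Buffer.concat([header, ...labels, tail]);
}

// Send the query from source port 5353 to the IPv4 mDNS group, then close.
function sendQuery(name, type = TYPE_PTR) {
  const socket = dgram.createSocket({ type: 'udp4', reuseAddr: true });
  socket.on('error', (err) => { console.error(err); socket.close(); });
  socket.bind(MDNS_PORT, () => {
    socket.send(encodeQuery(name, type), MDNS_PORT, MDNS_ADDRESS, (err) => {
      if (err) console.error(err);
      socket.close();
    });
  });
  return socket;
}

// Usage (placeholder name): sendQuery('_example._tcp.local');
```

Watching the far side of the repeater in Wireshark while calling `sendQuery` should show whether the packet is forwarded.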
I was getting the same problem as @readybeginn in homebridge-lutron-caseta-leap#10, so I poked around a bit and tried some things. I have an OPNsense router with multiple VLANs, though I've turned on the mDNS repeater and it's generally worked fine for all other things HomeKit. As far as I could tell the mDNS queries/responses were getting through the firewall, but the plugin wasn't seeing my hub at initial setup. My Homebridge also runs in Docker with `--net=host`.

I tried running a simple standalone `multicast-dns` script that fires off the same query as the plugin here and seemed to be getting responses, so I tried working through the plugin code. I found that the response packets never have an `id` value other than 0 for me, with the plugin or with `multicast-dns` independently. Not sure if that's an issue with my network/devices/firewall or what?

I also had to change the `resolver` configuration to use port `5353` (or leave it off). With the setting of `0` it never received any packets - I'm not sure of the total significance of using `0` here.

I edited the plugin code to instead look for incoming response packets that had a single answer, were of the `SRV` type, and had a name that matched `Lutron`.

This change didn't immediately work by itself, but I tried simultaneously running the independent `multicast-dns` script (note: running on my laptop, instead of my Homebridge server). That DID result in a response packet that matched the above criteria, and the bridge appeared in the Homebridge UI. Of course that doesn't solve the long-term problem of needing to do that on every Homebridge boot so the plugin can find the Lutron bridge again.

I'm brushing the edge of my Node async experience here - it seemed like the resolver query might be going out, but before the resolver listener was running to handle it? Or some other sequencing/timing type thing? (Ignoring the whole absence of a packet `id`.)

There's probably a much better way than this hacky approach, but I edited the `getBridgeId` function to use the above packet criteria and fire off a new query upon receiving any packet that isn't the correct one. It then destroys the resolver as (I think) it's no longer needed.

Everything seems to work fine with this code, including initial pairing and Homebridge reboots. But I totally wouldn't be surprised if there's something simple I missed!
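The approach described above can be sketched roughly as follows. This is not the plugin's actual `getBridgeId`. `isBridgeSrvResponse` and `waitForBridge` are hypothetical names, `resolver` is assumed to be any event emitter that fires `'response'` with a decoded packet shaped like `{ answers: [{ name, type, data }] }`, and `sendQuery` is whatever function issues the query. The listener is attached before the first query goes out, which rules out the ordering problem guessed at above.

```javascript
// Criteria from the description: one answer, SRV type, name matching "Lutron".
function isBridgeSrvResponse(packet, pattern = /lutron/i) {
  const answers = packet.answers || [];
  return answers.length === 1
    && answers[0].type === 'SRV'
    && pattern.test(answers[0].name);
}

// Resolve with the SRV answer; re-query on every packet that doesn't match.
function waitForBridge(resolver, sendQuery, { timeoutMs = 5000 } = {}) {
  return new Promise((resolve, reject) => {
    const onResponse = (packet) => {
      if (isBridgeSrvResponse(packet)) {
        cleanup();
        resolve(packet.answers[0]);
      } else {
        sendQuery(); // wrong packet: ask again
      }
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('Timed out waiting for bridge SRV response'));
    }, timeoutMs);
    function cleanup() {
      clearTimeout(timer);
      resolver.removeListener('response', onResponse);
    }
    resolver.on('response', onResponse); // listen first...
    sendQuery();                         // ...then send the first query
  });
}
```

Once the promise settles, the caller can destroy the resolver. Re-querying on every non-matching packet could get chatty on a busy network, so rate limiting the retries may be worth considering.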