Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix several goroutine leaks #262

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

Elara6331
Copy link
Contributor

This PR fixes several goroutine leaks that I've found:

  • When EnableNotifications(nil) is called on a DeviceCharacteristic, the property channel is now closed, which releases the notification handler goroutine and allows it to exit.
  • When an adapter connects to a device, the signal channel is now closed, releasing the goroutine that waits for the device to connect.
  • When a device disconnects, all characteristics now close their property channels and release their notification handler goroutines.

Fixes #260

@deadprogram deadprogram changed the base branch from release to dev April 3, 2024 10:22
@jasday
Copy link

jasday commented Apr 8, 2024

I've done some testing on this branch and seem to be getting some interesting results:

  • EnableNotifications no longer works on repeated connections. Steps below.
    • Connect to device, enable notifications, receive message/decode value, disconnect from device.
    • Connect to device again, enable notifications, no messages are received.

This seems to only be a problem when the device is connected to repeatedly without another connection in between. If I connect to one device, receive notifications, disconnect, connect to different device, read notifications, disconnect, I am able to receive notifications from the original device again. Sometimes repeating to the same device repeatedly will result in a notification being read, but it seems to be slightly random when it will work.

  • After successfully connecting and disconnecting from a device, the next connection attempt often occurs in a le-connection-abort-by-local error.

This was tested on multiple devices, and occurred on both although not consistently. That being said, I get this on the release version, just not as often - so potentially a different problem that this fix is revealing?

@Elara6331
Copy link
Contributor Author

Elara6331 commented Apr 9, 2024

EnableNotifications no longer works on repeated connections

Hmm, that's weird. I've specifically tested that case several times and it seems to work for me. I also see no reason that this PR would break that, since the DBus match rules and signal channel are set up again when you re-enable notifications.

Here's a simplified reproduction that I used to test this:

main.go
package main

import (
	"log"
	"time"
	
	"tinygo.org/x/bluetooth"
)

const address = "ED:47:AC:47:F4:FB"

var (
	serviceUUID = mustParse("00000000-78fc-48fe-8e23-433b3a1942d0")
	eventCharUUID = mustParse("00000001-78fc-48fe-8e23-433b3a1942d0")
)

func main() {
	adapter := bluetooth.DefaultAdapter
	err := adapter.Enable()
	if err != nil {
		panic(err)
	}

	device := connectDevice(adapter)
	char := getChar(device)

	log.Println("First connection complete")
	
	err = char.EnableNotifications(func(buf []byte) {
		log.Printf("Received notification: %x\n", buf)
	})
	if err != nil {
		panic(err)
	}

	log.Println("Notifications enabled")


	time.Sleep(2*time.Second)

	err = device.Disconnect()
	if err != nil {
		panic(err)
	}

	log.Println("Device disconnected")

	time.Sleep(2*time.Second)

	device = connectDevice(adapter)
	char = getChar(device)

	log.Println("Second connection complete")
	
	err = char.EnableNotifications(func(buf []byte) {
		log.Printf("Received notification: %x\n", buf)
	})
	if err != nil {
		panic(err)
	}

	log.Println("Notifications re-enabled; sleeping forever")

	select {}
}

func connectDevice(adapter *bluetooth.Adapter) bluetooth.Device {
	var (
		err    error
		device bluetooth.Device
	)

	err = adapter.Scan(func(adapter *bluetooth.Adapter, sr bluetooth.ScanResult) {
		if sr.Address.String() == address {
			device, err = adapter.Connect(sr.Address, bluetooth.ConnectionParams{})
			if err != nil {
				panic(err)
			}
			adapter.StopScan()
		}
	})
	if err != nil {
		panic(err)
	}

	return device
}

func getChar(device bluetooth.Device) bluetooth.DeviceCharacteristic {
	services, err := device.DiscoverServices([]bluetooth.UUID{serviceUUID})
	if err != nil {
		panic(err)
	}

	chars, err := services[0].DiscoverCharacteristics([]bluetooth.UUID{eventCharUUID})
	if err != nil {
		panic(err)
	}

	return chars[0]
}

func mustParse(s string) bluetooth.UUID {
	uuid, err := bluetooth.ParseUUID(s)
	if err != nil {
		panic(err)
	}
	return uuid
}

After successfully connecting and disconnecting from a device, the next connection attempt often occurs in a le-connection-abort-by-local error.

I've also had this happen intermittently with muka/go-bluetooth and bluetoothctl (mostly on Intel Bluetooth adapters, but on others as well), so I don't think this is a problem with this library, it's most likely a bug in BlueZ or somewhere else in Linux's bluetooth stack

@Elara6331
Copy link
Contributor Author

Elara6331 commented Apr 9, 2024

I did find a different problem though:

func (c DeviceCharacteristic) EnableNotifications(callback func(buf []byte)) error {

The problem here is that the receiver is not a pointer, so the DeviceCharacteristic struct is copied. That means when EnableNotifications is called and it sets some struct fields here:

bluetooth/gattc_linux.go

Lines 250 to 253 in 4557b5d

c.property = make(chan *dbus.Signal)
c.adapter.bus.Signal(c.property)
c.propertiesChangedMatchOption = dbus.WithMatchInterface("org.freedesktop.DBus.Properties")
c.adapter.bus.AddMatchSignal(c.propertiesChangedMatchOption)

those changes don't persist, which means running EnableNotifications(nil) later has no effect (though disconnect events do still have an effect because the goroutine captures the receiver variable), and c.property is always nil, so errDupNotif never gets returned if someone tries to add multiple notification handlers.

There are two ways to fix this. The cleaner way would be to change the EnableNotifications function to use a pointer receiver, but that would be a breaking change and it would be inconsistent with the other implementations. The other way, which I don't really like but would avoid a breaking change, is to store a pointer to another struct in DeviceCharacteristic, with all the fields that need to be changed.

I'm not sure which approach this library would rather go with, so I won't change this until a maintainer of the project weighs in.

@jasday
Copy link

jasday commented Apr 9, 2024

Hmm, that's weird. I've specifically tested that case several times and it seems to work for me. I also see no reason that this PR would break that, since the DBus match rules and signal channel are set up again when you re-enable notifications.

Here's a simplified reproduction that I used to test this:
main.go

That's interesting, I tried with your code and got this output log:

 First connection complete
 Notifications enabled
 Received notification: %x
 [6 109 1 0 255 219 7 1 21 20 27 0 1]
 Device disconnected
 Second connection complete
 Notifications re-enabled; sleeping forever

I was taking readings (using a health thermometer device) and expected notifications to appear after the second notification re-enabled, but received nothing. I'm assuming I should be seeing notifications appear here as long as the device is still active and taking measurements?

Doing a little more testing, I have two devices with the same service/characteristic and one different. On the first connection of one of the matching pair, I receive a single notification with value, and then no more. I can then connect the other of the pair, and get the same behavior. If I then connect a different type of device, with a different service/characteristic, I can get only the first values from that - however connecting this new service seems to do some type of 'reset' and I can then receive the first value from both the original devices again.

The only real difference I'm seeing between your example code and mine is that I don't stop scanning - I need to be able to listen for devices whilst I'm handling a scan result, but adding a stopScan to mine still produces the same results.

@Elara6331
Copy link
Contributor Author

Elara6331 commented Apr 9, 2024

That's interesting, I tried with your code and got this output log:

That's weird. It's been working consistently for me. I've tried it on Intel, MediaTek, and Realtek bluetooth adapters, and with different peripherals. Here's my log:

2024/04/09 09:41:30 First connection complete
2024/04/09 09:41:31 Notifications enabled
2024/04/09 09:41:33 Received notification: 03
2024/04/09 09:41:33 Received notification: 03
2024/04/09 09:41:33 Received notification: 01
2024/04/09 09:41:34 Received notification: 04
2024/04/09 09:41:35 Device disconnected
2024/04/09 09:41:38 Second connection complete
2024/04/09 09:41:38 Notifications re-enabled; sleeping forever
2024/04/09 09:41:39 Received notification: 01
2024/04/09 09:41:39 Received notification: 04
2024/04/09 09:41:40 Received notification: 03
2024/04/09 09:41:40 Received notification: 01
2024/04/09 09:41:41 Received notification: 01

@jasday
Copy link

jasday commented Apr 11, 2024

I'm testing on an NXP IW416 chip - not sure if this would cause any differences though.

Firmware is up to date, and the code looks like it should be working. Could this specific chip be sending extra signals that is causing notifications to be disabled?

I've attached my bluez logs in case it's a possibility. The device I'm testing with is dev_C0_26_DA_1A_FE_1A

log.txt

@jasday
Copy link

jasday commented Apr 22, 2024

I've done a bit of testing on a local branch, and seem to have got it to work on my device.

It only needs a small change to the nil case in the same switch statement, adding

err := c.characteristic.Call("org.bluez.GattCharacteristic1.StopNotify", 0).Err
		if err != nil {
			return err
		}

to make the entire case

case nil:
		if c.property == nil {
			return nil
		}

		err := c.characteristic.Call("org.bluez.GattCharacteristic1.StopNotify", 0).Err
		if err != nil {
			return err
		}

		err = c.adapter.bus.RemoveMatchSignal(c.propertiesChangedMatchOption)
		c.adapter.bus.RemoveSignal(c.property)
		close(c.property)
		c.property = nil
		return err

I'm not entirely sure the side affect of this, if we error when calling stop notify this may cause the signals to not be removed properly (or, on the other hand, will this mean that it will just try again if it fails?) - @Elara6331 do you have any thoughts?

If this change could be tested/confirmed and then added to the PR if proven working, that would be great.

Copy link
Member

@aykevl aykevl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good idea.

@@ -345,6 +345,7 @@ func (a *Adapter) Connect(address Address, params ConnectionParams) (Device, err
// were connected between the two calls the signal wouldn't be picked up.
signal := make(chan *dbus.Signal)
a.bus.Signal(signal)
defer close(signal)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this close needed?
Note that the only thing closing a channel does is signal to receivers that the channel is closed (and makes sending on a channel panic). It is not needed to recover resources: a channel will be garbage collected like any other object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this close is to end the range loop inside the goroutine and let it exit

@@ -280,6 +293,7 @@ func (c DeviceCharacteristic) EnableNotifications(callback func(buf []byte)) err

err := c.adapter.bus.RemoveMatchSignal(c.propertiesChangedMatchOption)
c.adapter.bus.RemoveSignal(c.property)
close(c.property)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This close also exits the goroutine

@Elara6331
Copy link
Contributor Author

Elara6331 commented May 3, 2024

I do have a question about this:

func (c DeviceCharacteristic) EnableNotifications(callback func(buf []byte)) error {

Since the method doesn't use a pointer receiver, the following lines operate on a copy of the characteristic and make EnableNotifications(nil) useless since those changes don't persist and c.property will always be nil:

c.property = make(chan *dbus.Signal)

c.propertiesChangedMatchOption = dbus.WithMatchInterface("org.freedesktop.DBus.Properties")

I'm not sure what the best way to solve this would be, since making it a pointer receiver would be a breaking change and inconsistent with the other implementations.

@deadprogram
Copy link
Member

Pinging @aykevl for an opinion on this, please.

@jasday
Copy link

jasday commented Aug 20, 2024

Have there been any more thoughts on this? Along with the stop notifiy addition? @aykevl @Elara6331

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Characteristic notification handler goroutine leak
4 participants