
[WIP]Add lock processing for queue purge #22

Open · wants to merge 2 commits into base: devel
Conversation

cdyangzhenyu (Collaborator)

At present, some operations require a locking mechanism under concurrent requests. This patch implements the basic logic of the lock. When the queue is being purged, sending a message to that queue will fail, and the failure will be recorded in the log.

Redmine: http://192.168.15.2/issues/11196
LOG.error('PURGE LOCK: The queue: %s in project: %s '
          'is purging, so can not post.',
          queue_name, project)
return []

Wouldn't it be better to raise an error here directly? Otherwise the message was not actually published, yet the HTTP response still returns 201?

Collaborator Author

The logic here is that when the queue is purged, messages sent during the purge should also be cleared. Mongo cannot guarantee that, so we simply block those messages from being written at all, which achieves the same effect indirectly. There is therefore no need to raise an exception; failing to post is the expected behaviour.


@huntxu left a comment

The lock mechanism is incomplete and does not fully solve the problem.

From what I learned earlier, Mongo itself takes an exclusive lock on writes. Assuming the removal used by purge is also a kind of write, the purge and the insertion of a new message should never happen at the same time.

From the bug description, it cannot be confirmed that the message post happened exactly during the queue purge. So I am inclined to think there is no bug here.

@@ -658,6 +666,16 @@ def post(self, queue_name, messages, client_uuid, project=None):
        # The worst-case scenario is that we'll increase the counter
        # several times and we'd end up with some non-active messages.

        # Check whether a purge lock is held for this queue.
        is_locked = self._lock_ctrl.is_locked(queue_name, 'queues',
                                              project, 'purge')

The lock mechanism is still problematic: this check may find no lock, yet the lock may be taken just before the insert below runs. Unless posting each message also takes a lock that blocks purge.

Collaborator Author

That case is equivalent to a message sent just before the purge, which in theory can legitimately succeed.
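The race the reviewer describes is a classic check-then-act window: the `is_locked` check and the subsequent insert are not atomic. A minimal sketch of the interleaving, where a plain set stands in for the Mongo lock collection (all names here are illustrative, not the patch's actual code):

```python
# Hypothetical sketch of the check-then-act race discussed above.
# `locks` stands in for the Mongo lock collection.
locks = set()

def is_locked(queue, project, operation):
    return (queue, project, operation) in locks

def post_message(queue, project, messages, store):
    # Step 1: the purge-lock check.
    if is_locked(queue, project, 'purge'):
        return []
    # Race window: another request could acquire the purge lock right
    # here, after the check above but before the insert below.
    store.setdefault(queue, []).extend(messages)
    return messages
```

With no lock held the messages are stored; once the purge lock is taken, posts return an empty list — but nothing prevents the lock from appearing inside the marked window.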


In that case, messages are effectively blocked only while the purge operation itself is executing. But the collection.remove called by purge uses w:0, which presumably makes it a kind of asynchronous operation too, so the purge itself should complete quickly anyway.

        except Exception as ex:
            LOG.debug(ex)
            return False

        return True


collection.update should not readily raise an exception; the intent here is presumably to return False when the update did not succeed. collection.update returns a WriteResult, which in pymongo should be a dict, so you need to inspect the returned result to decide whether the lock was acquired.
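A hedged sketch of what that result check might look like. In pymongo's legacy (pre-3.x) API, `update()` returns a dict-like WriteResult; the field names below (`updatedExisting`, `upserted`, `n`) are taken from that legacy API and should be verified against the pymongo version actually in use:

```python
def lock_write_succeeded(write_result):
    """Decide whether an update(..., upsert=True) call took effect,
    based on a legacy pymongo WriteResult dict: 'updatedExisting' is
    True when an existing document matched the filter, and 'upserted'
    is present when the upsert inserted a new document."""
    if write_result.get('updatedExisting'):
        return True
    return 'upserted' in write_result
```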

Collaborator Author

There should be only one failure case here: no document matches the filter, and the subsequent insert fails (duplicate key), which means the lock is already held.

@zhaochao Nov 17, 2017

Even if no document matches the filter, no exception is raised; collection.update still returns a WriteResult, telling the client that nothing was updated.

I misunderstood upsert: a unique index can indeed guarantee that there is only one failure case here.


Then the except here should not be so broad; otherwise the utils.raises_conn_error decorator above it becomes meaningless.

Collaborator Author

Right, that makes sense; we only need to catch the duplicate-key error.

Collaborator Author

@zhaochao With upsert=True here, if no matching doc is found to update, a new document is inserted; the error reported is a duplicate-key error on that insert. I have tested this, and it does behave that way.
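The pattern the author describes, acquiring a lock via an upsert whose filter requires the lock to be free, under a unique index on the lock key so that the only failure mode is a duplicate-key error, can be sketched with an in-memory stand-in. `DuplicateKeyError` below models `pymongo.errors.DuplicateKeyError`; all class and function names are illustrative:

```python
class DuplicateKeyError(Exception):
    """Stands in for pymongo.errors.DuplicateKeyError."""

class FakeLockCollection:
    """In-memory model of a collection with a unique index on the lock
    key, mimicking update({key, locked: False}, ..., upsert=True)."""
    def __init__(self):
        self._locks = {}  # key -> locked? (True/False)

    def try_lock(self, key):
        if self._locks.get(key):
            # The filter {key, locked: False} matches nothing, so the
            # upsert tries to insert a fresh doc and collides with the
            # unique index on the key.
            raise DuplicateKeyError(key)
        # Either an unlocked doc matched (update) or none existed (upsert).
        self._locks[key] = True

def acquire(coll, key):
    """Only a duplicate-key error means the lock is already held."""
    try:
        coll.try_lock(key)
    except DuplicateKeyError:
        return False
    return True

def release(coll, key):
    coll._locks[key] = False
```

Catching only `DuplicateKeyError`, as agreed above, keeps the broad connection-error handling to the `utils.raises_conn_error` decorator.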


@cdyangzhenyu (Collaborator Author)

@huntxu I also lean toward this not being a bug.
Mongo's remove does take a write lock, but since Mongo's write lock is database-level, it has a yielding mechanism: for example, a bulk delete of many documents that would take 10 minutes in total cannot hold the database lock for the whole 10 minutes; instead it roughly locks, deletes, and releases per document, so new data can still be written in between.
As for the bug description, I think the test method is flawed, because it cannot guarantee the message was sent during the purge.

@cdyangzhenyu commented Nov 17, 2017

@zhaochao I had indeed not paid much attention to w=0; it seems this problem really cannot be solved this way. The operation is asynchronous at the business level anyway, so locking is pointless. I think this issue can be sent back; I will discuss it with the product team again. @huntxu

@huntxu commented Nov 17, 2017

Indeed, it is hard to determine which messages arrive exactly during the purge.

@cdyangzhenyu (Collaborator Author)

@huntxu @zhaochao Although this change cannot solve the problem (or the problem should not be handled at all), this kind of lock mechanism is indeed a way to implement transaction-like handling on top of Mongo (Mongo itself has no transactions).

@cdyangzhenyu changed the title Add lock processing for queue purge → [WIP]Add lock processing for queue purge, Nov 17, 2017
@zhaochao

> @huntxu @zhaochao Although this change cannot solve the problem (or the problem should not be handled at all), this kind of lock mechanism is indeed a way to implement transaction-like handling on top of Mongo (Mongo itself has no transactions).

Yes, I looked into it briefly, and that does seem to be the case.

3 participants