Hanging when executing webdriver-related commands in some cases #947
Comments
There is no DB related activity happening in the link counting command however. |
Also
shouldn't affect how any of this works. |
However this command also doesn't do anything useful for larger crawls and maybe we should encourage removing it then. |
Yes, there is no DB-related command in custom_command.py, but I thought the database recording might be happening in another thread or process, and that this causes the problem. |
This issue also happens with an unmodified GetCommand; I used this scenario as the MWE so you can reproduce it. |
I can reproduce this issue for http://www.marriott.com. The relevant log output:
@vringar can you reproduce this? |
I can reproduce it but have no idea why this might be happening. |
Just me brainstorming: Is it really the webdriver becoming unresponsive and if so, how could that happen? It's unlikely to be anything in OpenWPM, since there was no change to that part of the code base recently. |
Disabling openwpm.xpi fixes this issue. Since openwpm.xpi is responsible for communication between the DB and the browser instances, I suggest looking into those parts of the project to find the problem, i.e. the instrumentation parts. |
Could either of you add another webdriver action right after the last action in the GetCommand and see if it starts hanging there instead of in the Finalize Command? I'll likely only get to work on this next Friday so any information gathered until then would reduce the delay in getting this fixed. |
I already did; yes it will hang! |
Thanks! Okay, then I'm going to see when we last updated geckodriver, see if we need to update our copy of their prefs and see if I need to file an upstream bug. |
Nope, GeckoDriver hasn't been updated in forever. https://github.com/mozilla/OpenWPM/blame/master/environment.yaml#L11 |
Hmm, I'm very confused. |
Yes, it should have. |
I don't think this is related to this issue, or if it is then it's related via a common root cause. |
Just guessing, because this log message appears several times (between 1 and 10; it varies from run to run) right before the timeout log. |
@vringar There is no need to disable the whole openwpm.xpi; just disabling js_instrument (and accordingly callstack_instrument #557 ) fixes the problem. js_instrument clearly produces a huge number of records for this single website, so, as I already mentioned, the problem is likely related to the burst of DB I/O. |
Could you try just disabling the callstack Instrument and see if that also resolves the problem? The callstack Instrument messes with Firefox internals, so I'd think it's more likely to cause breakage. |
I disabled each of the instruments one by one and it just worked when I did it for js_instrument. |
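The one-by-one check described above can be sketched as a small loop. This is only an illustration under stated assumptions: `run_crawl` is a hypothetical callback that launches a crawl with the given flags and reports whether it hung, `fake_run_crawl` simulates the behaviour reported in this thread, and the flag names other than js_instrument and callstack_instrument are assumed for the example.

```python
from typing import Callable, Dict, List

# js_instrument and callstack_instrument are named in this thread;
# the remaining flag names are assumptions for illustration.
INSTRUMENTS = [
    "http_instrument",
    "cookie_instrument",
    "navigation_instrument",
    "js_instrument",
    "callstack_instrument",
]

# Per the thread, disabling js_instrument also means disabling callstack_instrument.
DEPENDENTS = {"js_instrument": ["callstack_instrument"]}


def find_culprits(
    run_crawl: Callable[[Dict[str, bool]], bool],
    instruments: List[str] = INSTRUMENTS,
) -> List[str]:
    """Disable one instrument at a time and collect those whose removal stops the hang.

    run_crawl is hypothetical: it takes the flag mapping and returns True if the crawl hung.
    """
    culprits = []
    for name in instruments:
        flags = {instrument: True for instrument in instruments}
        flags[name] = False
        for dependent in DEPENDENTS.get(name, []):
            flags[dependent] = False
        if not run_crawl(flags):
            culprits.append(name)
    return culprits


def fake_run_crawl(flags: Dict[str, bool]) -> bool:
    # Simulated behaviour only: the crawl hangs whenever js_instrument is enabled.
    return flags["js_instrument"]


print(find_culprits(fake_run_crawl))
```

With the simulated callback, only js_instrument is reported, which matches the result described in this comment.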
Thank you for doing this investigation! |
While crawling some websites, OpenWPM freezes when it reaches a webdriver-related command. For example, when running "https://www.marriott.com" in the official demo.py, the script hangs on this line of custom_command.py:
link_count = len(webdriver.find_elements(By.TAG_NAME, "a"))
I think this could be related to the DB recording process, since this issue usually happens for websites that set a relatively large number of cookies.
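To narrow down which call blocks, one option is to wrap the suspect call in a watchdog that reports a timeout instead of hanging the whole script. The sketch below is not part of OpenWPM; `call_with_timeout` is a hypothetical helper built only on the standard library, and it cannot stop the blocked call, it only detects that the call did not return in time.

```python
import threading
from typing import Any, Callable, Tuple


def call_with_timeout(fn: Callable[[], Any], timeout_s: float) -> Tuple[bool, Any]:
    """Run fn in a daemon thread and wait at most timeout_s seconds.

    Returns (True, result) if fn finished in time, (False, None) otherwise.
    Hypothetical helper: the worker thread keeps running after a timeout.
    """
    box = {}

    def target() -> None:
        try:
            box["value"] = fn()
        except BaseException as exc:  # hand the error back to the caller
            box["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        return False, None
    if "error" in box:
        raise box["error"]
    return True, box.get("value")
```

In a custom command, the line above could then be called as `call_with_timeout(lambda: len(webdriver.find_elements(By.TAG_NAME, "a")), 30)`, where the 30-second limit is an arbitrary choice, and a `False` first element would indicate that the webdriver call itself never returned.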