Jupyter notebook kernel can go idle during long processing, losing cell results and preventing completion of notebook #291
Comments
Hi @cybersam, Thanks for bringing this to our attention. The progress of a link prediction pipeline is not linear, so it may well be that there are substantial chunks of time where the algorithm has not reported progress in terms of %, even though it is still running. Is your Jupyter environment by any chance running an idle culler? If so, have you tried configuring the idle culler according to your needs? https://tljh.jupyter.org/en/latest/topic/idle-culler.html Adam
As far as I know, I am using the default Jupyter configuration, in which culling is supposed to be disabled. The config files do not set any culling values.
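For reference, if culling were configured, settings along these lines would appear in a `jupyter_server_config.py` (a sketch of the standard `MappingKernelManager` culling traits; the values shown are illustrative, and the default `cull_idle_timeout = 0` means culling is disabled):

```python
# jupyter_server_config.py -- illustrative kernel-culling settings
c = get_config()  # provided by Jupyter when it loads this config file

# Seconds of kernel inactivity before a kernel is considered for culling.
# The default of 0 disables culling entirely.
c.MappingKernelManager.cull_idle_timeout = 0

# How often (in seconds) to check for idle kernels.
c.MappingKernelManager.cull_interval = 300

# Whether to cull kernels that still have an open browser connection.
c.MappingKernelManager.cull_connected = False

# Whether to cull kernels that are busy (e.g. executing a long-running cell).
c.MappingKernelManager.cull_busy = False
```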
Also, it turns out my idle kernel is not culled, even after a long time. It still remembers the state before the cell that "died".
Some background: the cell in question stores the prediction result in a 'result' variable. I tried an experiment, and the results are very interesting: when the kernel goes idle, it is apparently still able to obtain the final results. But the cell output is messed up, and subsequent cells do not execute when they are supposed to.
Describe the bug
When running `.predict` on a model in a Jupyter notebook cell, the intervals between progress bar updates can become so long that the Jupyter kernel decides the cell has finished, which can put the kernel in an `idle` (basically, "dead") state after enough inactivity. This is very possible for long-running predictions (say, running overnight), where the user steps away and does not touch Jupyter for many hours. When the Neo4j server finally finishes the prediction, the results (say, from `.predict.stream`) of the many hours of processing are lost, since the notebook is dead. I suspect the same problem can occur with other long-running GDS operations.
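For context, the call in question looks roughly like this (a sketch assuming the graphdatascience Python client; the connection details, graph name, model name, and `topN` value are placeholders):

```python
from graphdatascience import GraphDataScience

# Connect to the Neo4j server (URI and credentials are placeholders).
gds = GraphDataScience("bolt://localhost:7687", auth=("neo4j", "password"))

G = gds.graph.get("my-graph")               # a previously projected graph
model = gds.model.get("lp-pipeline-model")  # a previously trained link prediction model

# This single call can block for many hours. The client only prints progress
# when the server reports a new percentage, so the cell can sit with no new
# output for a very long time.
result = model.predict_stream(G, topN=10)
```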
To Reproduce
I don't know if the eventual slow-down in progress bar output happens for all long-running use cases or server configurations. In my case, it usually happens, but not always.
GDS version: 2.5.3
Neo4j version: 5.11.0
Operating system: Amazon Linux
My specific Jupyter environment: JupyterLab 4.0.8, Python 3 (ipykernel) kernel, on AWS EC2 with Amazon Linux
Steps to reproduce the behavior:
1. Start a long-running `.predict.stream` operation in a Jupyter cell, and do not touch Jupyter the whole time.
2. After enough inactivity, the kernel goes into the `idle` state.
3. If you run `CALL gds.listProgress` in the Browser, you will see that the prediction is still running (if it has not yet completed); the sketch below shows the same check from Python.
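A sketch of that progress check via the client's `run_cypher` helper (assuming `gds` is the connected GraphDataScience object from the earlier sketch):

```python
# Equivalent to running CALL gds.listProgress in the Neo4j Browser:
# returns a table of currently running GDS tasks and their reported progress.
progress = gds.run_cypher("CALL gds.listProgress()")
print(progress)
```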
Expected behavior
The Jupyter notebook should never go `idle` while any long-running GDS operation is still in progress. Probably the client just needs to ensure that output is regularly produced (say, every X minutes).
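Until the client does this itself, one possible client-side workaround (a sketch, not a GDS feature, and its effectiveness against a given idle-detection mechanism would need to be verified) is to print a heartbeat from a background thread so the cell keeps producing output while the blocking call runs:

```python
import threading
import time

def heartbeat(stop_event, interval_seconds=300):
    # Print a timestamped line every interval_seconds until stopped.
    while not stop_event.wait(interval_seconds):
        print(f"still running at {time.strftime('%Y-%m-%d %H:%M:%S')}", flush=True)

stop = threading.Event()
worker = threading.Thread(target=heartbeat, args=(stop,), daemon=True)
worker.start()
try:
    result = model.predict_stream(G, topN=10)  # the long-running call from the earlier sketch
finally:
    stop.set()
    worker.join()
```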