Loading Platform/Scenario from large local database uses too much RAM/CPU/time #433
Thanks for the detailed report! I also experience similar issues, though not as extreme: on the order of tens of seconds waiting for a local platform to load.

These performance issues arise from the Java code underlying ixmp's JDBCBackend: either the code itself (which is in a private repository called ixmp_source) or the libraries it uses, e.g. HyperSQL for local databases. This code is not actively maintained. Work has started to add a new, pure-Python backend to ixmp (addressing #78), but I am not aware of a roadmap or target timing for its release.

One of the goals for that new backend should be to provide "good" or "acceptable" performance, including for the items you mention. These should be quantified, e.g. "Loading a platform with A data points in B scenarios should use less than C GB of RAM and take less than D seconds on such-and-such hardware." See also #215, which is about specifying and confirming performance metrics.

For both items (1) and (2), what would you consider acceptable performance for your usage of ixmp?
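A metric like "less than C GB of RAM and D seconds" could be checked with a small stdlib harness. The sketch below is illustrative only: the `measure` helper and the dummy `load_platform` workload are my own names, not part of ixmp, and `tracemalloc` only sees Python-heap allocations, so memory held by the JVM behind JDBCBackend would not be counted (an external tool such as `/usr/bin/time -v` would be needed for that).

```python
import time
import tracemalloc

def measure(task, *args, **kwargs):
    """Run *task* and report wall-clock seconds and peak Python-heap bytes."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = task(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) bytes
    tracemalloc.stop()
    return result, elapsed, peak

# Hypothetical usage: 'load_platform' stands in for e.g. ixmp.Platform(...).
def load_platform():
    return list(range(1_000_000))  # dummy workload

_, secs, peak_bytes = measure(load_platform)
print(f"{secs:.2f} s, peak {peak_bytes / 1e6:.1f} MB")
```

A harness like this could back the acceptance tests discussed in #215, with thresholds per reference machine.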
Acceptable-performance philosophy: the current setup works great if we start with a fresh DB and only run the final scenarios from it. That would entail doing our tests in a separate DB and committing only the final changes to the main DB. I think this is a bit tough to manage, so here is what I think acceptable usage should be:

Item 1: initialising the ixmp platform should take 2-5 minutes, as this task is performed whenever I launch a notebook and whenever errors occur during my scenario development. If I want to try coding 10 different items (sensitivity on variables, history calibration, technology updating), 5 minutes of ixmp load time x 10 notebook launches = 50 minutes spent waiting for the platform to load, which is OK, instead of the 200+ minutes currently. At the same time, when I monitor energy draw, I am consuming 200+ watts for those 20 minutes every time I load ixmp. Short answer: 2-5 minutes instead of 20+ minutes.

Item 2: when I load prebuilt data via an Excel sheet it takes 2-3 minutes, but if I read a parameter from the base scenario and manipulate it on the fly it takes 20-25 minutes. This example entails updating and loading new technologies, creating a new grid structure, and doing historical calibration in a single go. My deduction is that reading a variable, manipulating it on the fly, and committing it adds a high penalty compared to just loading prebuilt structures from Excel. Given that the total data we are manipulating is <50 MB, this much penalty should not occur (I could be wrong here, as I have not used Java-based workflows before). Short answer: 5-7 minutes instead of 20+ minutes.

My experience with databases is limited to Teradata and GIS (SQL/SQLite/GPKG) datasets, which do not incur such high penalties with increasing DB size. Maybe there is a tweak present inside ixmp that I am missing? [https://github.com//issues/41]
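One generic factor, not specific to ixmp (whose internals are not shown in this thread), that often produces exactly this "on-the-fly edits are far slower than bulk loads" pattern in SQL stores: committing after every small change forces a durable transaction (often an fsync) per change, while a bulk load batches everything into one transaction. The sketch below demonstrates the effect with stdlib sqlite3; the `timed_inserts` helper and table name are hypothetical, purely for illustration.

```python
import os
import sqlite3
import tempfile
import time

def timed_inserts(rows, commit_each):
    """Insert *rows* into a file-backed SQLite DB; return (seconds, rowcount)."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE param (key TEXT, value REAL)")
        t0 = time.perf_counter()
        for r in rows:
            conn.execute("INSERT INTO param VALUES (?, ?)", r)
            if commit_each:
                conn.commit()  # one durable transaction per row
        if not commit_each:
            conn.commit()      # a single transaction for all rows
        elapsed = time.perf_counter() - t0
        n = conn.execute("SELECT COUNT(*) FROM param").fetchone()[0]
        conn.close()
        return elapsed, n
    finally:
        os.remove(path)

rows = [(f"k{i}", float(i)) for i in range(500)]
slow, n1 = timed_inserts(rows, commit_each=True)
fast, n2 = timed_inserts(rows, commit_each=False)
print(f"per-row commits: {slow:.3f}s, single commit: {fast:.3f}s")
```

On a file-backed database the per-row-commit variant is typically orders of magnitude slower; whether JDBCBackend behaves this way is a question for the ixmp developers.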
Excellent, thanks for sharing these expectations as a user, which are useful to developers. I agree that, given the performance of other software that works with much larger data, it should be possible for our new Python backend to meet or exceed these metrics. Again, the Java code will not be further developed, so we won't look too closely at why it's performing poorly.
There's none, no.
Thanks for sharing this feedback. All the best with the Python port.
Let's keep it open until we've verified that the new backend meets these requirements.
iiasa/message_ix#617 reports a different error message that may have a similar cause:
That occurs on a macOS system with 8 GB RAM.
Hi,
I am experiencing slowdowns in the following tasks when using ixmp:
1.) When loading the ixmp platform, the load process gets slower with each new scenario/sensitivity run. Currently the DB size on my system is 20 GB, with a platform load time of >15 minutes and up to 40 minutes. RAM usage goes up to 18 GB, with the CPU pinned at 80-100% during this load process. With a fresh ixmp DB, load times are <1 minute and RAM usage is <5 GB.
2.) When loading the model from the ixmp platform and making changes to it, e.g. adding new technologies. Here the processor is again pinned at 80-100% and RAM usage goes up to 40 GB (manually allocated so the global model can be manipulated without an OOM error), taking around 20 minutes. The same task on a fresh DB takes <7 minutes with RAM usage <15 GB.
Additional note: I always use mp.close_db() to tidy up the ixmp DB after each run. I am using a high-speed SSD, 64 GB RAM, and a 12-core i9 CPU (all cores at 4.2 GHz) for these tests.
Best,
Sid
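The `mp.close_db()` habit mentioned above can be made exception-safe with a small context-manager wrapper, so the connection is closed even when scenario code raises mid-notebook. This is an ixmp-agnostic sketch: `managed_db` and `open_fn` are illustrative names, not ixmp API.

```python
from contextlib import contextmanager

@contextmanager
def managed_db(open_fn, close_name="close_db"):
    """Open a DB-like object via *open_fn* and guarantee its close method runs."""
    db = open_fn()  # e.g. open_fn=lambda: ixmp.Platform(...), assuming that API
    try:
        yield db
    finally:
        getattr(db, close_name)()  # always tidy up, even on error
```

With ixmp one would pass something like `lambda: ixmp.Platform(...)` as `open_fn` and use the platform normally inside the `with` block; a crashed cell then cannot leave the DB connection open.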