Importing large maps #43
Comments
One possible way to go would be to use |
Does
Indeed, but I would have expected the memory complexity to be lower. Since we use the callback API, it should not have to represent the full XML at any time. So what is taking all the memory?
julia> d = @time get_map_data("/home/blegat/Downloads/andorra-latest.osm", use_cache=false);
1.666097 seconds (10.15 M allocations: 805.471 MiB, 19.43% gc time)
julia> d = @time get_map_data("/home/blegat/Downloads/andorra-latest.osm", use_cache=false);
1.444814 seconds (10.15 M allocations: 805.486 MiB, 8.79% gc time)
julia> d = @time get_map_data("/home/blegat/Downloads/andorra-latest.osm", use_cache=false);
1.638368 seconds (10.15 M allocations: 805.471 MiB, 18.30% gc time)
julia> d = @time get_map_data("/home/blegat/Downloads/andorra-latest.osm", use_cache=false);
1.446085 seconds (10.15 M allocations: 805.477 MiB, 10.57% gc time)
julia> Base.summarysize(d)
5150378
julia> d = @time get_map_data("/home/blegat/Downloads/andorra-latest.osm");
[ Info: Read map data from cache /home/blegat/Downloads/andorra-latest.osm.cache
0.058867 seconds (280.02 k allocations: 15.695 MiB)
|
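A note on reading these numbers: the figure printed by `@time` is the cumulative number of bytes allocated during the call, not the peak memory in use, so a large gap between it and `Base.summarysize` of the result points at short-lived temporaries. The sketch below illustrates that gap with a hypothetical stand-in function (`parse_ids` is not part of OpenStreetMapX or LibExpat) that creates many temporary strings but keeps only a small vector.

```julia
# Hypothetical stand-in for a parser: it builds many short-lived strings
# but keeps only a vector of integers.
function parse_ids(n::Integer)
    ids = Int[]
    for i in 1:n
        s = string("node-", i)             # temporary String, garbage after this iteration
        push!(ids, parse(Int, s[6:end]))   # s[6:end] allocates another temporary String
    end
    return ids
end

parse_ids(10)                              # warm up so compilation is not measured

ids = parse_ids(100_000)
allocated = @allocated parse_ids(100_000)  # cumulative bytes allocated during one call
retained = Base.summarysize(ids)           # bytes reachable from the returned value

println("allocated during the call: ", allocated, " bytes")
println("retained by the result:    ", retained, " bytes")
```

Neither number is the peak resident memory of the process; that would have to be observed from outside Julia, for example with the operating system's process monitor.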
There are two versions of the map parser, routing-oriented and raw.
Routing-oriented (does additional processing):
Raw version (25% lighter):
The code for collecting elements can be found at the beginning of the parseMap.jl file. I actually ran the profiler:
If you try running it, you can see that around 20% of the time is spent in OSMX while the rest is in LibExpat. So perhaps one option would be to try a faster XML parser. Looking at the number of allocations, it seems that LibExpat.jl operates on Strings (rather than much faster Symbols) and is inefficient for large files. |
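To make the Strings versus Symbols remark concrete: Julia Symbols are interned, so two Symbols with the same name are the same object and comparing them is an identity check rather than a byte-by-byte comparison. The following is only a sketch with hypothetical names (`COUNTS`, `count_element!`); it does not reproduce how LibExpat.jl or OpenStreetMapX handle element names.

```julia
# Hypothetical counter keyed by element name; the names are stored as Symbols.
const COUNTS = Dict{Symbol,Int}(:node => 0, :way => 0, :relation => 0)

# Increment the counter for a known element name and ignore everything else.
function count_element!(counts::Dict{Symbol,Int}, name::Symbol)
    if haskey(counts, name)
        counts[name] += 1
    end
    return counts
end

# Illustrative element names, as a parser callback might receive them.
tags = ["node", "node", "way", "bounds", "relation", "node"]
for t in tags
    count_element!(COUNTS, Symbol(t))   # convert once, then work with the Symbol
end

# Symbols with the same name are identical objects.
same_object = Symbol("node") === :node
```

Converting a String to a Symbol has a cost of its own, so this pays off mainly when the same few names (here `node`, `way`, `relation`) are compared many times.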
One more test:
Hence the XML parser is currently the major source of problems. At the time we started repairing https://github.com/tedsteiner/OpenStreetMap.jl, LibExpat.jl was the best we could get; there were not many good stream-based XML parsers for Julia at that time. Perhaps EzXML.jl could be a good new choice? |
Yes, I think moving to EzXML might help. |
Hi, thanks for the *.pbf support! I have also updated the tests since they were relying on the RNG and this has changed in Julia 1.6. Now all tests pass locally on the current Julia version. I can also see that Travis migrated their servers from travis-ci.org to travis-ci.com. Somehow I am not able to change the unit testing mechanism from *.org to *.com. Travis.com seems not to be aware of OpenStreetMapX (I just do not see the project in the Travis list); I still need to sort that out. |
I managed to reconfigure Travis and get everything to work, so now we have a new OpenStreetMapX release with pbf support! Should you need other functionality for your project (perhaps with some support on my side), please let me know. Thank you. |
Thanks! I made a few changes that are mixed up in a branch of my fork: https://github.com/blegat/OpenStreetMapX.jl/tree/mixed_changes |
Just started to import Belgium. belgium-latest.osm.bz2 is 750 MB, and the unpacked belgium-latest.osm takes 8.6 GB. I tried get_map_data("belgium-latest.osm"); it ran for some time until my computer ran out of its 16 GB of RAM and 6 GB of swap, and then the Julia program was killed.
I'm wondering if it would be possible to load such a map given enough time, e.g. by storing things on disk. I'm wondering if that's what graphhopper does with the _gh directory.
Another solution that might help for medium-size osm files would be to support .pbf. Is that feature planned or in the scope of OpenStreetMapX?
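As a rough illustration of the "storing things on disk" idea, Julia's `Mmap` standard library can back an array with a file, so the operating system pages data in and out instead of the whole array living in RAM. This is only a sketch under assumptions: the helper names (`write_coords`, `read_coords`), the 2 x n layout and the sample coordinates are hypothetical, and nothing here is claimed to match what graphhopper does or what OpenStreetMapX implements.

```julia
using Mmap

# Hypothetical layout: a 2 x n matrix of Float64, one column (lat, lon) per node.
function write_coords(path::AbstractString, coords)
    n = length(coords)
    open(path, "w+") do io
        A = Mmap.mmap(io, Matrix{Float64}, (2, n))  # grows the file to 2n * 8 bytes
        for (i, (lat, lon)) in enumerate(coords)
            A[1, i] = lat
            A[2, i] = lon
        end
        Mmap.sync!(A)                               # flush modified pages to disk
    end
    return n
end

# Map the file back in; pages are read lazily when elements are accessed.
function read_coords(path::AbstractString, n::Integer)
    open(path, "r") do io
        Mmap.mmap(io, Matrix{Float64}, (2, n))
    end
end

# Illustrative values only (roughly Brussels, Antwerp, Liège).
path = tempname()
coords = [(50.85, 4.35), (51.22, 4.40), (50.63, 5.57)]
n = write_coords(path, coords)
B = read_coords(path, n)
println("node 2: lat = ", B[1, 2], ", lon = ", B[2, 2])
```

A fixed-width numeric layout like this suits node coordinates; variable-length data such as tags or way member lists would need a different on-disk format.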