Easier Way To Download!? #3
Replies: 5 comments 18 replies
-
The way I did it is this. Note that I have no AWS account. Download the Amazon AWS command line tool, then run the download command with it. It will download the datasets with no further prompting. The entire dataset is 215.5 GB as of this writing.
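For anyone wondering how this works without an account: the AWS CLI has a `--no-sign-request` flag that skips credential signing for public buckets. Below is a minimal sketch that only builds such a command. The bucket and prefix are hypothetical placeholders, not the real Overture location, and the helper name `build_anonymous_s3_copy` is made up for illustration.

```python
import shlex


def build_anonymous_s3_copy(source_uri: str, destination: str) -> list[str]:
    """Build an `aws s3 cp` command that copies a public prefix without credentials."""
    return [
        "aws", "s3", "cp",
        "--no-sign-request",  # skip request signing, so no AWS account is needed
        "--recursive",        # copy every object under the prefix
        source_uri,
        destination,
    ]


# Hypothetical placeholders: substitute the real bucket and release prefix.
cmd = build_anonymous_s3_copy("s3://<bucket>/<release-prefix>/", "./overture")
print(shlex.join(cmd))
```

Once the source URI is filled in, the list can be passed to `subprocess.run(cmd, check=True)` or the printed line pasted into a terminal.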
-
This can be done easily with the GeoParquet version of the Overture Maps data produced with Apache Sedona. Please refer to my blog post: https://medium.com/@dr.jiayu/harnessing-overture-maps-data-apache-sedonas-journey-from-parquet-to-geoparquet-d99f7767a499

Wherobots uses Sedona to generate these data and provides them for free.

Buildings:

Method 1: Use Sedona to download data according to the exact shape of a region

Sedona supports geospatial filter pushdown on GeoParquet, so you can easily download the data of a country using its real shape. Example spatial SQL queries:

Try it now. This query on the 120 GB Building dataset takes about 3 minutes to finish on the Docker image.

The Jupyter notebook will be available at https://localhost:8888/

Method 2: Use the AWS CLI to download a specific region according to its GeoHash ID

All GeoParquet data provided by Wherobots are partitioned by GeoHash ID with 2-character precision. See below:

You can download a specific region as follows:
You can find the GeoHash ID of your target geospatial region using this online tool: https://www.movable-type.co.uk/scripts/geohash.html
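If you would rather compute the 2-character GeoHash ID locally than use the online tool, here is a minimal sketch of the standard geohash encoding in pure Python. The function name `geohash_encode` is made up for illustration, and how the ID maps onto the partition folder names is not shown here, so check the actual bucket layout.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int = 2) -> str:
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(40.7128, -74.0060))  # New York City
```

A 2-character cell is large, so a country or region may span several IDs. Encoding a few points along its boundary is a quick way to find which ones you need.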
-
@jiayuasu I am trying to run the docker image:
but when I go to https://localhost:8888/ it says "Token authentication is enabled", and I don't see any token.
-
@d3netxer, @jiayuasu I'm working on simplifying the data download process in a fork of the @wherobots repository, which you can find here: https://github.com/Youssef-Harby/OvertureMaps. Once I've completed the changes, I plan to submit a pull request to the original repository at https://github.com/wherobots/OvertureMaps.

To test it out, you can use the Docker Compose file included in my fork. After cloning the repo and changing into its directory, run: docker compose up -d

After you run the part that gets the df, you can save the Sedona DataFrame. For example, to save the building df to offline parquet (it will go directly to your current project folder, because that folder is mounted as a volume from the host directory): df_building.write.format("geoparquet").save("/opt/workspace/building.parquet")

To convert it, you can use, for example: ogr2ogr -f GPKG output.gpkg building.parque
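One thing to keep in mind when looking for the saved output: Spark writers normally produce a directory of part files rather than a single file, so `building.parquet` is likely to be a folder on the host. Assuming that holds for this setup, a small sketch to list what was written (the helper name `list_parquet_parts` is made up for illustration, and the relative path assumes you run it from the mounted project folder):

```python
from pathlib import Path


def list_parquet_parts(output_dir: str) -> list[Path]:
    """Return the Parquet part files inside a Spark-style output directory."""
    return sorted(Path(output_dir).glob("part-*.parquet"))


# Assumes the current directory is the project folder mounted into the container.
for part in list_parquet_parts("building.parquet"):
    print(part.name, part.stat().st_size, "bytes")
```

If nothing is printed, the directory is missing or empty, which usually means the save step did not finish.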
-
Unfortunately, it is not working for me, @Youssef-Harby. I am getting an error when trying the Docker Compose command:
-
Is there an easier way to download these? I understand the data is massive, but the few ways that are set up are going to be very difficult for the masses to access.
I'd appreciate any other options your teams can put out for the average user.