David Zhao
3 min read · Aug 30, 2020
openmaptiles.org

OpenMapTiles is a great project by the team over at maptiler.com that defines and provides a set of tools for generating and serving maps. Their main repository comes with an easy-to-use set of scripts and Docker configurations that lets you quickly download and process map data into a PostgreSQL database and generate vector map tiles. However, it is (purposefully) limited to small runs and not meant for large maps (e.g. continent or planet-size).

At WOWA.ca, we needed to generate a set of vector map tiles for North America (our main customer base). I used an AWS EC2 m5ad.24xlarge instance with 96 vCPUs, 384 GiB of RAM, and 4 x 900 GiB NVMe drives in software RAID 0, running Amazon Linux 2. Everything went smoothly until I got to generating the map tiles.

Configuration of openmaptiles.yaml
My openmaptiles.yaml for North America

16 days. That’s how long it was expected to take. I took a peek at docker stats to see how my PostgreSQL instance and the tile generation instance were doing. It was showing a lame 500% total usage, the equivalent of 5 fully busy vCPUs out of 96. No wonder it was taking so long!
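To put a number on that observation, the per-container CPU figures can be summed and converted to cores. The sketch below assumes the output of docker stats --no-stream --format "{{.Name}} {{.CPUPerc}}" has been captured as text. The container names and percentages in the sample are made up for illustration and are not taken from my run.

```python
def busy_vcpus(stats_text: str) -> float:
    """Sum the CPU% column of captured `docker stats` output, expressed in vCPUs.

    Expects one container per line in the form "<name> <cpu>%", as printed by
    docker stats --no-stream --format "{{.Name}} {{.CPUPerc}}".
    """
    total_percent = 0.0
    for line in stats_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The CPU figure is the last whitespace-separated field, e.g. "310.00%".
        cpu_field = line.rsplit(None, 1)[-1]
        total_percent += float(cpu_field.rstrip("%"))
    # docker stats reports 100% for each fully busy core.
    return total_percent / 100.0


# Hypothetical sample: names and values are illustrative only.
sample = """
postgres 310.00%
generate-vectortiles 190.00%
"""
print(f"{busy_vcpus(sample):.1f} of 96 vCPUs busy")
```

If the result is far below the number of vCPUs on the machine, the tile generation is not yet CPU-bound and there is room to raise concurrency.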

COPY_CONCURRENCY

By default, tilelive-copy (run by make generate-tiles) uses 10 threads. This is controlled by the COPY_CONCURRENCY environment variable. To make full use of your resources, start by setting it to half the number of vCPUs if your database is hosted on the same instance. PostgreSQL also needs CPU time, and opening more connections than necessary will likely slow things down rather than speed them up. You can always increase the value if you haven’t reached full load.
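As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name is hypothetical, and the branch for a database on a separate host (use all vCPUs) is my assumption, since only the same-instance case is covered above.

```python
import os


def suggest_copy_concurrency(vcpus: int, db_on_same_host: bool = True) -> int:
    """Return a starting value for COPY_CONCURRENCY.

    Half the vCPUs when PostgreSQL shares the machine, so the database keeps
    CPU time for itself. Using every vCPU when the database lives elsewhere
    is an assumption, not something measured in this post.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if db_on_same_host:
        return max(1, vcpus // 2)
    return vcpus


print(suggest_copy_concurrency(96))                   # 96 vCPUs, shared with the DB
print(suggest_copy_concurrency(os.cpu_count() or 1))  # this machine
```

Treat the result as a starting point only: watch docker stats and raise the value if the machine is still not fully loaded.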

docker-compose.yml configuration
Add both COPY_CONCURRENCY and UV_THREADPOOL_SIZE to docker-compose.yml

You can add the parameter as an environment variable in docker-compose.yml, under the configuration of the generate-vectortiles service.

UV_THREADPOOL_SIZE

Node.js uses a libuv thread pool to handle many of its I/O operations. The default pool size is 4 threads, and it can be raised to a maximum of 1024. At minimum, set it to the number of threads you defined with COPY_CONCURRENCY. Since it’s a limit rather than a definite amount, you can set it all the way up to 1024 without any significant negative impact.
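The bounds described above can be written down as a small helper. This is only a sketch: the function name and the use_maximum flag are hypothetical, and the two constants simply restate the default of 4 and the ceiling of 1024.

```python
LIBUV_DEFAULT_THREADS = 4    # libuv's default thread pool size
LIBUV_MAX_THREADS = 1024     # upper bound accepted for UV_THREADPOOL_SIZE


def suggest_uv_threadpool_size(copy_concurrency: int, use_maximum: bool = False) -> int:
    """Return a value for UV_THREADPOOL_SIZE.

    Never lower than COPY_CONCURRENCY (or the libuv default) and never higher
    than the libuv maximum. With use_maximum=True, go straight to 1024.
    """
    if copy_concurrency < 1:
        raise ValueError("copy_concurrency must be at least 1")
    if use_maximum:
        return LIBUV_MAX_THREADS
    return min(max(copy_concurrency, LIBUV_DEFAULT_THREADS), LIBUV_MAX_THREADS)


print(suggest_uv_threadpool_size(48))
print(suggest_uv_threadpool_size(48, use_maximum=True))
```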

.env configuration
My final .env for m5ad.24xlarge

The setting goes in the same docker-compose.yml, under the configuration of the generate-vectortiles service.
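For readers who cannot see the screenshots, the sketch below prints what such a fragment might look like. It assumes the Compose mapping syntax for environment variables. The values 48 and 1024 are illustrative for a 96 vCPU machine and are not copied from my configuration.

```python
def render_environment(copy_concurrency: int, uv_threadpool_size: int, indent: int = 4) -> str:
    """Return YAML lines to place under the tile generation service in docker-compose.yml.

    Values are quoted so that Compose passes them through as plain strings.
    """
    pad = " " * indent
    lines = [
        f"{pad}environment:",
        f'{pad}  COPY_CONCURRENCY: "{copy_concurrency}"',
        f'{pad}  UV_THREADPOOL_SIZE: "{uv_threadpool_size}"',
    ]
    return "\n".join(lines)


# Illustrative values only.
print(render_environment(48, 1024))
```

If the service already has an environment key, add the two entries to the existing mapping instead of declaring a second key.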

Results: From 16 Days to 24hrs

With no PostgreSQL optimizations, the new configuration cut my expected time from 16 days to 24 hours, which is 16 times faster than the default.
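The arithmetic behind that figure is simple enough to check in a few lines. The helper name is hypothetical, and the inputs are the two estimates quoted above.

```python
def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times faster the new run is than the old one."""
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("durations must be positive")
    return before_hours / after_hours


before = 16 * 24   # original estimate: 16 days, in hours
after = 24         # new estimate: 24 hours

factor = speedup(before, after)
saved_percent = (1 - after / before) * 100
print(f"{factor:.0f}x faster, {saved_percent:.2f}% less wall-clock time")
```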

Middle of my run. Now it’s expected to take less than a day in total!

Upon closer inspection, PostgreSQL is now the bottleneck in the process. I have not made any changes to postgresql.conf to take advantage of my memory (only 9 GB reserved out of 384 GiB!).

“docker stats” shows most of the CPU usage is actually used by PostgreSQL
htop output showing all 96vCPUs. Memory usage is definitely not optimized.

With some more work, I could likely speed up the process even further. With the amount of free memory I still have, caching part of the database in RAM would likely reduce the PostgreSQL (and I/O) bottleneck. However, I only needed to do a single run and I could live with waiting for a day. A 16x speedup is good enough for me.
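I have not tested any of this, but as a starting point for anyone who wants to try: the PostgreSQL documentation suggests roughly 25% of system memory as a reasonable initial shared_buffers value on a dedicated database server. The sketch below only does that arithmetic. The function name is hypothetical, and the fraction probably needs to be lower on a machine that also runs the tile generation.

```python
def suggested_shared_buffers_gib(total_ram_gib: float, fraction: float = 0.25) -> float:
    """Return a starting shared_buffers size in GiB as a fraction of total RAM.

    The 25% default is the PostgreSQL documentation's starting point for a
    dedicated server, not a value validated on this workload.
    """
    if total_ram_gib <= 0:
        raise ValueError("total_ram_gib must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return total_ram_gib * fraction


ram_gib = 384  # memory of the instance used in this post
# postgresql.conf treats the GB unit as 1024^3 bytes.
print(f"shared_buffers = {suggested_shared_buffers_gib(ram_gib):.0f}GB")
```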

Open for Suggestions

If you’ve worked with OpenMapTiles before, have any suggestions for PostgreSQL config, or any other tips that you think would be helpful, leave a comment!

Written by David Zhao

Full-Stack Engineer at Stripe