If buildings are to be placed on smaller-scale maps, they must be prepared: simplified, then typified and finally aggregated/amalgamated.
Building simplification is not the same as generic line/polygon simplification (done with the DP or VW algorithms). When simplifying a building, you want its characteristic details to remain: for example, most buildings have square corners, and these must remain in the simplified version.
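For comparison, this is what generic simplification looks like in PostGIS (the table/column names and tolerance values below are only placeholders); neither function knows anything about right angles:

```sql
-- Generic simplification as exposed by PostGIS; placeholder table/column names:
SELECT ST_Simplify(way, 5)    AS way FROM buildings;  -- Douglas-Peucker (DP), 5 m distance tolerance
SELECT ST_SimplifyVW(way, 25) AS way FROM buildings;  -- Visvalingam-Whyatt (VW), 25 m² area tolerance
```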
Example of building simplification:

Here the dashed polygon is the original building and the yellow one is the building simplified to the specified amount.
As you can see, square angles have been preserved, as well as larger details, while smaller details have been removed.
The amount of simplification depends on the resolution of the screen/printer (if a pixel covers 10 meters, there is no point in trying to depict details smaller than 10 meters) as well as on legibility requirements - when too much detail is displayed, the map reader cannot clearly read the map. If unsimplified buildings are placed on a small-scale printed map, they appear as a kind of sand-like pattern rather than as buildings.
Such building simplification helps not only with legibility, one of the most important criteria in cartography, but also with technical aspects: building polygons have fewer vertices and are therefore rendered faster. While the legacy technology of raster tiles does not suffer much from more complex geometries, this matters a lot with the vector tiles used today - buildings take up much less space (sometimes up to 50% less), so they are faster to transfer and need less CPU (and battery on mobile) to render.
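If the simplified geometry is stored alongside the original, the saving is easy to check (hypothetical column names):

```sql
-- Compare vertex counts before and after simplification (hypothetical columns):
SELECT sum(ST_NPoints(way))            AS original_vertices,
       sum(ST_NPoints(simplified_way)) AS simplified_vertices
FROM buildings;
```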
You can check the difference in building shape in a live OpenStreetMap topo map (simplification kicks in at zoom levels below 14).
Discussion
Comment from mboeringa on 31 July 2019 at 15:56
Hi Tomas,
Interesting work you did, and nice results! Especially maintaining square corners in the generalized building geometries must be tough to implement and get right.
For my own renderer, I took a far less sophisticated route, although I do think it is still a valid compromise between no generalization at all, or an advanced algorithm as you developed.
In my case, although this is somewhat simplified, I simply chose to generalize all buildings using a 9 m2 tolerance for the PostGIS ‘ST_SimplifyVW’ command (for ‘ST_SimplifyPreserveTopology’ it would be a tolerance of 3 m).
Additionally, to prevent collapsing already simple building structures with few vertices into something like undesirable triangles, I use a minimum threshold of 9 vertices: if the geometry has fewer than that number of vertices, it won’t be generalized. This is to prevent e.g. ‘horseshoe’ shaped buildings from collapsing into something unrecognizable.
Overall, this seems to give a nice ‘compromise’ result: it doesn’t distort buildings too much due to the relatively low tolerance used (although square corners may be lost, this is virtually indiscernible with this tolerance when viewed at appropriate scales), but it still manages to cut away a lot of clutter in complex buildings digitized in great detail.
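In a nutshell, the rule boils down to roughly this (simplified, with placeholder table/column names):

```sql
-- Keep very simple shapes untouched, generalize the rest with Visvalingam-Whyatt:
SELECT CASE
         WHEN ST_NPoints(way) < 9 THEN way   -- below the 9-vertex threshold: leave as-is
         ELSE ST_SimplifyVW(way, 9)          -- 9 m² area tolerance
       END AS way
FROM buildings;
```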
Again, definitely not as sophisticated as your approach, but I thought it nice to share for others struggling with similar issues.
Comment from Tomas Straupis on 31 July 2019 at 17:06
Hi mboeringa,
For a simple way of doing building generalisation you can try ST_Buffer with join=mitre. Try different combinations of +-, +-+ and -+- (positive and negative buffer distances applied in sequence) and see what works best for your required scale. It will preserve square corners most of the time and will even remove small inner rings. BTW, all code used in creating *.openmap.lt is on GitHub.
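For example, the "+-" variant (grow, then shrink back) could look like this; the 5 m distance is only an example to tune for your scale, and the table/column names are placeholders:

```sql
-- Positive then negative mitre-joined buffer: fills notches and small inner rings,
-- then restores the overall footprint while keeping square corners.
SELECT ST_Buffer(
         ST_Buffer(way, 5, 'join=mitre mitre_limit=5'),
         -5, 'join=mitre mitre_limit=5'
       ) AS way
FROM buildings;
-- "+-+" and "-+-" are the same idea with three buffer steps in the corresponding signs.
```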
Comment from mboeringa on 3 August 2019 at 08:19
Adamant1,
Tomas has to answer himself of course, but to clear up a misunderstanding: you do not “lose information” when generalizing. Generalized data is for display at scales where that “information” is, for all intents and purposes, of no use at all, because it would get lost in a mist of tiny, essentially invisible objects anyway.
We’re talking especially about the scale range of 1:10k - 1:50k for details like buildings and other stuff in city centers. At 1:50k, 2 cm on the map is 1 km and 1 mm on the map is already 50 m in reality. You can’t realistically display a visible 5 x 5 m patch of grass at that scale; it would be 0.1 x 0.1 mm on the map…
And when you do want to see that level of detail, you zoom in on a web map, even if it has generalized data for smaller, zoomed out, scales.
All current vector tile implementations use heavy levels of generalization for smaller scales.
Comment from mboeringa on 3 August 2019 at 08:23
Tomas,
Interesting, thanks for sharing that suggestion. I would never have thought you were using ST_Buffer for this. This option is also not clearly documented on the PostGIS website for ST_Buffer, but I guess it is also a mild “hack”…
Nonetheless, the result is still impressive.
Comment from Tomas Straupis on 3 August 2019 at 09:35
mboeringa, The current building simplification uses a much more complex algorithm than buffering (well, even typification uses a more complex algorithm). I was just suggesting ST_Buffer instead of ST_Simplify because it should give better results, but as ST_Buffer is not intended for that, you have to check whether it works in your particular use case.
Comment from mboeringa on 3 August 2019 at 09:41
Thanks for the additional info.
Admittedly, I haven’t had a look at the code repository yet, so I’m not yet aware of the exact process you used, but yes, I appreciate it is likely a lot more complex than just buffering.
Comment from Tomas Straupis on 3 August 2019 at 09:46
Adamant1,
I guess mboeringa has already answered the main part of your question.
I will just add that generalisation usually has more than one step; simplification is just one of them. Another thing you would want to do is amalgamation (merging polygons which are “close enough” into one polygon).
As a practical example: I’m currently playing with the following building generalisation sequence: amalgamate and then simplify at 5 m, simplify and typify at 10 m, simplify and typify at 20 m, simplify and typify at 40 m.
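For the amalgamation step, a simple buffer-based sketch (just to illustrate the idea; the 5 m gap threshold and table/column names are only placeholders, and my actual code is in the repository) would be:

```sql
-- Merge buildings closer than ~5 m: grow each by 2.5 m, union the overlapping pieces,
-- split the result back into single polygons, then shrink each by 2.5 m.
SELECT ST_Buffer(geom, -2.5, 'join=mitre') AS way
FROM (
  SELECT (ST_Dump(ST_Union(ST_Buffer(way, 2.5, 'join=mitre')))).geom AS geom
  FROM buildings
) AS merged;
```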
If you want to see how the gods of cartography do it, look at the SwissTopo map: https://map.geo.admin.ch
Comment from Tomas Straupis on 3 August 2019 at 09:55
mboeringa,
The main part of the simplification algorithm is here:
https://github.com/openmaplt/vector-map/blob/master/db/func/stc_simplify_building_line.sql
And it is far from finished: there are a lot of TODOs left, and preservation of the building area is still to be implemented. But in some cases it probably already does better than the A** algo:
Here:
dark polygon - the original building,
light polygon - PostGIS,
purple outline - A (you can see that A preserves the building area, so the outline is moved outwards in some places).
Comment from mboeringa on 3 August 2019 at 10:43
Impressive. Do you have any figures on the performance of this? How much processing time for X million buildings?
Comment from mboeringa on 3 August 2019 at 10:47
And does your code run properly for both lat/long (4326) and Web Mercator (3857)?
Comment from Tomas Straupis on 3 August 2019 at 11:27
Nothing has been done about optimisation yet.
Lithuania has ~1M buildings. Amalgamation+Simplification for 10m takes ~15min.
I have not tested it on other projections.
For proper use on larger areas the problem of segmentation has to be solved: recalculating only the impacted area, which is not straightforward because surrounding geometries influence the result.
Comment from mboeringa on 3 August 2019 at 19:39
Hi Tomas,
With “Simplification for 10m” do you mean you use a 10 meter tolerance?
One other, unrelated question, since you seem very experienced with PostGIS: have you ever had the situation where “ST_CollectionExtract(ST_MakeValid(way),3)” returned invalid polygon geometries?
I am kind of at a loss at the moment with a problem I have: despite running the above command after a custom generalization process I developed, which even includes a test for a polygon area of 0 (so collapsed geometries are dropped) and an “ST_Buffer(way,0)” step to potentially fix any other bad stuff, I still get an occasional bad geometry that causes an error in “A” ;-).
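Roughly, the chain boils down to something like this in one query (a simplified reconstruction with a placeholder table name; the real process has more steps):

```sql
-- Attempt to repair, keep only the polygonal parts, and drop collapsed geometries:
SELECT ST_CollectionExtract(ST_MakeValid(ST_Buffer(way, 0)), 3) AS way
FROM forest_polygons
WHERE ST_Area(way) > 0;   -- drop geometries collapsed to zero area
```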
Have you ever experienced similar issues, and how do you guarantee valid geometries after some complex geometric processing?
Comment from Tomas Straupis on 3 August 2019 at 20:05
Yes, “Simplification for 10m” means 10m tolerance.
I'm not too experienced with PostGIS and I'm not a coder. I think in terms of general geometry, and luckily PostGIS has functions for all the geometry operations I need. So I cannot tell you whether ST_CollectionExtract may produce invalid geometries, but the first thing that comes to my mind is that overlapping buildings could cause problems. You see, I'm only working with Lithuanian data, and here we're using a loooot of additional QA tests daily. One of those is a building overlap check.
The current implementation of simplification is part of my "research" (hobby level). So on each iteration of building simplification I check whether the resulting geometry is valid (ST_IsSimple); if not, simplification is stopped and a warning is raised so that I can check it in detail later: either additional QA rules have to be added (input data fixed) or the simplification code has to be adjusted (algorithm fixed).
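Schematically the guard looks something like this (a simplified sketch with made-up names, and ST_SimplifyVW only stands in for the real custom step; the actual code is in stc_simplify_building_line.sql):

```sql
CREATE OR REPLACE FUNCTION guarded_simplify_step(ring geometry, tolerance float)
RETURNS geometry LANGUAGE plpgsql AS $$
DECLARE
  candidate geometry := ST_SimplifyVW(ring, tolerance);  -- stand-in for one simplification step
BEGIN
  IF NOT ST_IsSimple(candidate) THEN
    -- stop simplification and flag the case for later inspection
    RAISE WARNING 'ring became non-simple, keeping the previous version';
    RETURN ring;
  END IF;
  RETURN candidate;
END;
$$;
```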
Have you traced it to the specific bad geometry? That is, can you display it in QGIS or at least extract ST_AsText to check it manually? This would allow tracing back which input data is causing it.
Comment from mboeringa on 3 August 2019 at 20:25
Well, for someone who claims to not be a coder, you write pretty advanced code ;-)
In my case, the data I was writing about is actually not buildings (although I do use them as well), but processed forest polygons. Not that it really matters; the problems of validity checking are the same.
However, this is what still confuses me:
This page:
https://postgis.net/docs/using_postgis_dbmanagement.html#OGC_Validity
of the PostGIS online manual says:
“By definition, a POLYGON is always simple”
This text suggests it does not really make sense to check polygons for simplicity, does it? ;-)
Yet, you (only) use ST_IsSimple? Or do you use both?
My code currently only checks validity using ST_IsValid in some steps, and the commands I wrote about before.
No, I haven’t yet attempted to track down the exact geometry causing the issue. It is also kind of hard, because I know the original geometries are fine; they work well in a GIS. It is only after my processing that a few features fail (which is unfortunately a big issue, as a GIS generally requires all geometries to be fully valid). Still working on getting it right…
Comment from Tomas Straupis on 4 August 2019 at 07:29
When moving the vertices of a polygon you can get into a situation where a ring of the polygon crosses itself and is then "not simple". This is what ST_IsSimple detects in my case in the inner procedure, where only one ring is being processed.
But there is also an ST_IsValid check, with ST_MakeValid executed where we have the full multipolygon with possibly multiple rings.
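A toy illustration of the difference (made-up coordinates, nothing to do with the actual map data):

```sql
-- A ring whose segments cross is "not simple" as a linestring,
-- and the polygon built from it is not valid either.
SELECT ST_IsSimple('LINESTRING(0 0, 4 0, 0 4, 4 4, 0 0)'::geometry);  -- false
SELECT ST_IsValid('POLYGON((0 0, 4 0, 0 4, 4 4, 0 0))'::geometry);    -- false
-- ST_MakeValid repairs the self-intersecting polygon into a valid multipolygon:
SELECT ST_AsText(ST_MakeValid('POLYGON((0 0, 4 0, 0 4, 4 4, 0 0))'::geometry));
```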