I was finally able to start reconciling the Vancouver Storefronts Inventory (VSI from now on) with the OSM nodes. VSI has 578 coffee/café matches, OSM has 574. The numbers are so close that it gives me hope.
Searching for OSM nodes that have a nearby (<10 m) node in VSI returns 54 results. Of those, 51 are perfect matches (the business name in OSM is the same as in VSI, except for things like “Starbucks” in OSM vs “Starbucks Coffee” in VSI). This isn’t too thrilling, but honestly, close to 9% perfect matches (51 out of roughly 575) from the get-go is pretty sweet.
A 10 m radius is pretty strict, so I’ll experiment a bit to find a healthy threshold that gives me more matches without yielding too many false ones. A 25 m radius already jumps to 391 matches, and a 50 m radius gives 705, which is obviously too many: that’s more than the number of cafés in either dataset.
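For reference, here is a minimal sketch of what that radius search could look like in Python. It is not the exact query I ran: the function names, the dict layout and the sample coordinates are all made up for illustration, and it assumes both datasets have been reduced to lists of records with a name, a latitude and a longitude.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_pairs(osm_nodes, vsi_entries, radius_m):
    """Return every (OSM name, VSI name, distance) pair closer than radius_m.

    Brute force over all pairs, which is fine for a few hundred points each.
    """
    pairs = []
    for osm in osm_nodes:
        for vsi in vsi_entries:
            d = haversine_m(osm["lat"], osm["lon"], vsi["lat"], vsi["lon"])
            if d < radius_m:
                pairs.append((osm["name"], vsi["name"], d))
    return pairs


# Hypothetical sample data, not taken from either dataset.
osm_sample = [{"name": "Starbucks", "lat": 49.28000, "lon": -123.12000}]
vsi_sample = [
    {"name": "Starbucks Coffee", "lat": 49.28005, "lon": -123.12000},
    {"name": "Some Other Cafe", "lat": 49.28020, "lon": -123.12000},
    {"name": "Far Away Cafe", "lat": 49.28100, "lon": -123.12000},
]

for radius in (10, 25, 50):
    print(radius, len(nearby_pairs(osm_sample, vsi_sample, radius)))
```

Note that this counts pairs, so a single OSM node can show up several times once the radius is wide enough to reach more than one storefront. If the real counts are also pair counts, that would be one way a wide radius ends up with more matches than there are cafés.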
If I have the time, I should probably also start getting fancy with fuzzy matching on business names, to get the obvious non-identical matches out of the way so I can investigate the proper mismatches.
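A rough idea of how the fuzzy name matching could start, using only Python's standard library. This is a sketch, not something I've run against the real data: the list of generic words and the helper names are my own guesses, and any cut-off score would need tuning by eye.

```python
import difflib
import re

# Hypothetical list of words that carry little information in a café name.
GENERIC_WORDS = {"coffee", "cafe", "café", "the", "co", "company"}


def normalize(name):
    """Lowercase the name, drop punctuation and generic words."""
    words = re.findall(r"\w+", name.lower())
    kept = [w for w in words if w not in GENERIC_WORDS]
    # Fall back to the full word list if everything was generic.
    return " ".join(kept or words)


def name_similarity(a, b):
    """Similarity score between 0 and 1 for two business names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


print(name_similarity("Starbucks", "Starbucks Coffee"))
print(name_similarity("Tim Hortons", "Starbucks Coffee"))
```

With this normalization, “Starbucks” and “Starbucks Coffee” reduce to the same string, so pairs like that would stop showing up as mismatches. Pairs that still score low after normalization are the ones worth looking at by hand.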
