[HRT-72] Fix Overpass OSM scraper: bounding box + Content-Type + User-Agent #9
Context
Fixes the 3 bugs identified in leadhunter_scraper.py that caused the scraper to return 0 OSM results.
Paperclip issue: HRT-72
Changes
Bug 1: area query → bounding box (lines 65–74)
Before: area["name"="Métropole Européenne de Lille"], which fails silently.
After: a direct bounding box (50.4,2.8,50.8,3.3), which is deterministic and reliable.
Bug 2: explicit Content-Type header (line 282)
Added an explicit Content-Type: application/x-www-form-urlencoded header to the requests.post() call.
Bug 3 (discovered during testing): User-Agent blocked
overpass-api.de returns 406 for User-Agent: python-requests/*.
Fix: send the User-Agent H3R7Tech-LeadHunter/1.0 (contact@h3r7tech.fr).
Local test
Result: 5 leads returned (Crêperie La Vieille Bourse, Pinocchio, L'Opera Corner, Le Royal, Crêperie Saint-Georges).
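The three fixes can be sketched together as follows. This is a minimal illustration, not the code in leadhunter_scraper.py. The function names, the amenity filter, the timeout value, and the /api/interpreter endpoint path are assumptions made for the example. It builds the request with the standard library's urllib rather than requests, so it runs without third-party packages and without touching the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint path; the PR only names the host overpass-api.de.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Bounding box from the PR. Overpass QL expects (south, west, north, east).
BBOX = (50.4, 2.8, 50.8, 3.3)

# Bug 2 and Bug 3: explicit Content-Type and a custom User-Agent.
HEADERS = {
    "Content-Type": "application/x-www-form-urlencoded",
    "User-Agent": "H3R7Tech-LeadHunter/1.0 (contact@h3r7tech.fr)",
}


def build_query(bbox=BBOX, amenity="restaurant", timeout=25):
    """Return an Overpass QL query limited to a bounding box (Bug 1).

    The amenity value and timeout are hypothetical; the real scraper
    may filter on different tags.
    """
    south, west, north, east = bbox
    return (
        f"[out:json][timeout:{timeout}];\n"
        # [!"website"] keeps only nodes that have no website tag.
        f'node["amenity"="{amenity}"][!"website"]'
        f"({south},{west},{north},{east});\n"
        "out body;"
    )


def build_request(query):
    """Prepare (but do not send) the POST request to the Overpass API."""
    body = urlencode({"data": query}).encode("utf-8")
    return Request(OVERPASS_URL, data=body, headers=HEADERS, method="POST")


if __name__ == "__main__":
    request = build_request(build_query())
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

With requests, the same HEADERS dictionary and a {"data": query} payload would be passed as requests.post(OVERPASS_URL, data=..., headers=HEADERS).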
Backup
leadhunter_scraper.py.backup_20260427_221429

Bug 1: Replace area["name"="..."] query with direct bounding box (50.4,2.8,50.8,3.3)
- Area resolution fails silently on public Overpass API depending on server version.
- Direct bbox is deterministic and reliable for MEL coverage.
- Also simplify website filter to use [!"website"] tag negation syntax.

Bug 2: Add explicit Content-Type: application/x-www-form-urlencoded header
- Some network configs/proxies strip the implicit header set by requests.post(data={}).
- Explicit header is best practice per Overpass API docs.

Bug 3 (discovered during test): Add User-Agent header
- overpass-api.de returns 406 Not Acceptable for User-Agent: python-requests/*.
- Fix: send H3R7Tech-LeadHunter/1.0 as custom User-Agent.
- Tested: 5 OSM leads returned from Lille center bounding box.

Backup: leadhunter_scraper.py.backup_20260427_221429

Co-Authored-By: Paperclip <noreply@paperclip.ing>