DevOps Engineer 86e85aa1c6 fix(HRT-72): fix Overpass OSM scraper — bounding box + Content-Type + User-Agent
Bug 1: Replace area["name"="..."] query with direct bounding box (50.4,2.8,50.8,3.3)
  — area resolution fails silently on public Overpass API depending on server version.
  — Direct bbox is deterministic and reliable for MEL coverage.
  — Also simplify website filter to use [!"website"] tag negation syntax.

Bug 2: Add explicit Content-Type: application/x-www-form-urlencoded header
  — Some network configs/proxies strip the implicit header set by requests.post(data={}).
  — Explicit header is best practice per Overpass API docs.

Bug 3 (discovered during test): Add User-Agent header
  — overpass-api.de returns 406 Not Acceptable for User-Agent: python-requests/*.
  — Fix: send H3R7Tech-LeadHunter/1.0 as custom User-Agent.
  — Tested: 5 OSM leads returned from Lille center bounding box.

Backup: leadhunter_scraper.py.backup_20260427_221429

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 22:19:10 +02:00
Description
Turf SaaS platform with ML ensemble predictions
2.9 MiB
Languages
Python 63.7%
HTML 35.1%
Shell 0.8%
JavaScript 0.2%
Dockerfile 0.1%