Methodology & Caveats
How threat data is collected, geolocated, and aggregated
Data sources
| Source | What | License | Refresh |
|---|---|---|---|
| Abuse.ch FeodoTracker | Active botnet C2 IPs (Emotet, Dridex, TrickBot, Qakbot, etc.) | CC0 | Continuous; we snapshot once daily |
| Abuse.ch ThreatFox | Recent IoCs incl. C2 IPs and ports | CC0 | Recent CSV (rolling window); we snapshot once daily |
| ip-api.com | IP→country/region/city/lat-lon/AS | Free non-commercial | On-demand for new IPs only |
| CISA KEV | Known-exploited CVE catalog | Public domain (US gov) | When CISA updates |
| FIRST EPSS | Daily exploit-probability per CVE | CC-BY-SA | Daily |
| NVD | CVSS scores per CVE | Public domain (US gov) | On-demand for new KEV CVEs |
All sources permit non-commercial use. ip-api.com’s free tier is explicitly restricted to non-commercial use; this site has no ads, no products, and no paid content. Every page that displays data from a source carries that source’s attribution.
Cache architecture
Daily snapshots accumulate so the site can show both:
- “as of right now” (latest snapshot)
- “accumulated over N days” (union/sum of last N snapshots)
data/cybersecurity/
cache/
feodo_YYYY-MM-DD.json     daily Abuse.ch FeodoTracker dump
threatfox_YYYY-MM-DD.csv  daily ThreatFox export
kev_YYYY-MM-DD.json       daily CISA KEV dump
epss_YYYY-MM-DD.csv.gz    daily EPSS snapshot
epss_history.csv          persistent: first-seen + current per CVE
nvd_cvss.csv              persistent: CVSS scores per CVE (lookup once)
ip_geolocation.csv        persistent: IP → country/region/city/lat/lon
current_threats.csv       latest snapshot, joined with geo
current_botnets.csv       FeodoTracker subset of above
province_daily.csv        per-day per-province IP counts
malware_family_daily.csv  per-day per-malware counts
threats_summary.csv       headline figures for index page
cves_kev.csv              KEV catalog with EPSS + CVSS joined
cves_summary.csv          headline figures for CVE page
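A minimal sketch of the “accumulated over N days” view, assuming each daily snapshot has already been parsed into a set of IP strings. The function name `accumulate`, the in-memory `snapshots` mapping, and the example addresses (taken from documentation ranges) are illustrative and not the site's actual code.

```python
from datetime import date, timedelta


def accumulate(snapshots: dict[date, set[str]], as_of: date, days: int) -> set[str]:
    """Union of the IPs seen in the `days` daily snapshots ending at `as_of`."""
    seen: set[str] = set()
    for offset in range(days):
        day = as_of - timedelta(days=offset)
        # A missing day (e.g. a failed fetch) contributes nothing instead of raising.
        seen |= snapshots.get(day, set())
    return seen


# Hypothetical parsed snapshots keyed by snapshot date.
snapshots = {
    date(2024, 5, 1): {"192.0.2.10", "192.0.2.11"},
    date(2024, 5, 2): {"192.0.2.11"},
    date(2024, 5, 3): {"192.0.2.11", "198.51.100.7"},
}

latest = accumulate(snapshots, date(2024, 5, 3), days=1)      # "as of right now"
last_three = accumulate(snapshots, date(2024, 5, 3), days=3)  # accumulated view
```

With `days=1` the result is just the latest snapshot, so both views can share one code path.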
Provincial geolocation
Threat IPs are geolocated to city/province level via ip-api.com’s batch endpoint (100 IPs per request, 2-second pause between batches). Each unique IP is looked up once and the result is stored in ip_geolocation.csv. Subsequent runs reuse cached geolocations, so a typical daily refresh sends 0–50 lookup requests.
Province/region accuracy varies by country and ISP. IPs belonging to major hosting and CDN providers (AWS, Azure, GCP, Cloudflare, OVH) often resolve to the provider’s primary data-center region, which may differ from where the underlying VM is physically running.
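The cached batch lookup described above could be sketched roughly as follows. This is an illustration under assumptions: `geolocate_new`, the `Geo` record shape, and the injected `lookup_batch` callable (standing in for the real HTTP request to the batch endpoint) are hypothetical names, and the cache is modelled as a plain dict in place of ip_geolocation.csv.

```python
import time
from typing import Callable, Iterable

Geo = dict[str, str]  # hypothetical record shape, e.g. {"country": ..., "region": ...}


def geolocate_new(
    ips: Iterable[str],
    cache: dict[str, Geo],
    lookup_batch: Callable[[list[str]], list[Geo]],
    batch_size: int = 100,
    pause_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Look up only IPs missing from `cache`; return the number of requests sent."""
    # dict.fromkeys de-duplicates while preserving first-seen order.
    pending = [ip for ip in dict.fromkeys(ips) if ip not in cache]
    requests_sent = 0
    for start in range(0, len(pending), batch_size):
        if requests_sent:
            sleep(pause_seconds)  # pause between batches, not before the first
        batch = pending[start:start + batch_size]
        for ip, geo in zip(batch, lookup_batch(batch)):
            cache[ip] = geo
        requests_sent += 1
    return requests_sent


def fake_lookup(batch: list[str]) -> list[Geo]:
    """Stand-in for the network call so the sketch runs offline."""
    return [{"country": "ZZ", "region": "unknown"} for _ in batch]


cache = {"192.0.2.1": {"country": "ZZ", "region": "unknown"}}
ips = [f"198.51.100.{i}" for i in range(150)] + ["192.0.2.1"]
sent = geolocate_new(ips, cache, fake_lookup, sleep=lambda s: None)
```

Here 150 uncached IPs produce two requests; running the same call again sends none, because every IP is now in the cache.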
Caveats
Hosting location ≠ attacker location. Attackers routinely use rented hosting in countries with weak attribution or extradition. Maps show infrastructure, not perpetrators.
Geolocation is approximate. Region/province accuracy depends on the IP’s WHOIS records and the geolocation provider’s heuristics. Mobile, VPN, and CDN addresses often resolve to incorrect locations.
Snapshot, not stream. We sample once per 24 hours. Threats that come online and disappear within a single day may be missed. The accumulated view captures threats that persist or recur.
Daily geolocation budget. ip-api.com’s free tier limits us to ~64,000 lookups per day. Typical fresh-IP volume is 50–500 per day, so we operate well under the limit, but a sudden surge (e.g. a botnet takedown revealing 10,000 new C2s in a day) could exceed it. The fetcher processes IPs in priority order and, if the budget runs out, logs a warning instead of failing.
EPSS interpretation. EPSS scores are probabilities, not binary judgments. EPSS = 0.95 means “95% probability of exploit observation in the next 30 days,” not “95% severe.”
KEV is conservative. A CVE not in KEV is not necessarily safe: CISA adds a CVE only when it has reliable evidence of active exploitation and a clear remediation action is available. Many CVEs are exploited in the wild without ever appearing in KEV.
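As an illustration of the geolocation-budget caveat above, the “priority order, warn rather than fail” behaviour might look like the sketch below. The function `select_within_budget`, the `feed_rank` ordering (FeodoTracker before ThreatFox), and the tiny budget in the example are assumptions made for demonstration; the source does not specify how priority is assigned.

```python
import warnings

DAILY_BUDGET = 64_000  # approximate free-tier figure quoted in the caveat


def select_within_budget(new_ips, priority, budget=DAILY_BUDGET):
    """Return the highest-priority IPs that fit in the budget, warning on overflow."""
    ranked = sorted(new_ips, key=priority)  # lowest key value is looked up first
    if len(ranked) > budget:
        warnings.warn(
            f"{len(ranked) - budget} IPs deferred: over the daily lookup budget of {budget}"
        )
    return ranked[:budget]


# Hypothetical priority: active botnet C2s before other IoCs.
feed_rank = {"feodo": 0, "threatfox": 1}
new_ips = [
    ("203.0.113.5", "threatfox"),
    ("203.0.113.9", "feodo"),
    ("203.0.113.7", "threatfox"),
]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    chosen = select_within_budget(new_ips, priority=lambda item: feed_rank[item[1]], budget=2)
```

IPs that do not fit are simply left out of the returned list, so they remain uncached and would be picked up again on a later run.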
Privacy note on per-IP display
Threat IPs are not displayed individually in the site’s main views. Aggregations (province, country, AS, malware family) are shown instead. The raw IPs are present in the downloadable CSVs for transparency and reproducibility; the same IPs appear in the upstream Abuse.ch feeds, which are themselves public.
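A rough sketch of the kind of aggregation the main views rely on, assuming each threat row is a dict with `ip`, `country`, and `region` keys. These column names and the function `province_counts` are hypothetical; the actual CSV schema may differ.

```python
from collections import Counter


def province_counts(rows):
    """Count distinct threat IPs per (country, region), dropping the IPs themselves."""
    seen = set()
    counts = Counter()
    for row in rows:
        if row["ip"] in seen:
            continue  # the same IP may be reported by more than one feed
        seen.add(row["ip"])
        counts[(row["country"], row["region"])] += 1
    return counts


# Example rows using documentation-range addresses.
rows = [
    {"ip": "192.0.2.10", "country": "CA", "region": "Ontario"},
    {"ip": "192.0.2.11", "country": "CA", "region": "Ontario"},
    {"ip": "192.0.2.10", "country": "CA", "region": "Ontario"},  # duplicate report
    {"ip": "198.51.100.7", "country": "CA", "region": "Quebec"},
]
counts = province_counts(rows)
```

Only the counts leave this function, which is what keeps individual addresses out of the displayed views.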
Code license
MIT — see the LICENSE file.