This is an actively updated (last updated 05/14/2020) list of the organizations that offer Internet scanning data for sale, or in some cases, for free. If you have additional information that would be helpful in determining coverage, please contact us via [email protected]!
Below, find a list in alphabetical order, with the public coverage metrics they publish.
BinaryEdge – Currently 240 ports. They also offer on-demand scanning.
Censys – Currently 1045+ Ports. They also offer on-demand scanning.
Rapid7 (Project Sonar) – Rapid7’s data is open, and while they don’t advertise the number, it’s pretty simple to grep out the per-port coverage. Currently they cover 21 UDP and 133 TCP ports on a monthly basis.
SecurityTrails – Advertises 50+ of the most important ports scanned weekly.
Shodan – The grandaddy of Internet scanning, Shodan does not publish coverage numbers. They also offer on-demand scanning.
Spyse – Relative newcomer to the scene, they advertise coverage of 55 ports and growing.
ZoomEye – Also a relative newcomer, their coverage is currently unknown.
There are other folks that perform Internet scanning as part of their core offering, but don’t sell the data standalone, and for that reason, we’ve not included them in this list.
The takeaway? No single organization offers exactly the same data, and in order to get the most complete coverage of what’s happening on the internet, you might need to combine multiple to get to the the level of visibility needed.
We’ve had a few requests in the Slack about getting started on Digital Ocean. It’s as simple as starting a droplet and running the Docker One-Liner, but i’ll document the exact steps below.
First log into Digital Ocean. The interface is pretty straightforward, but first we need to create a project:
No need to move existing resources in, we’re starting a new project, so we’ll click “Skip for now”
Now, we have a project, so let’s create a new Droplet inside it.
And let’s make sure to give it at least 16GB of RAM:
Everything else can be standard, so let’s scroll down. Obviously make sure you set up / use an SSH key you have access to so you can log in. In this case, I’m using a pre-generated SSH key on my local system:
And then click “Create Droplet”
Now, give DO a few seconds, and we can browse to Droplets and get our IP Address
Then, we’ll use our SSH key to log in:
And first things first, let’s update the system and install Docker with the following command:
This will generate a dynamic (and random) password, so make sure to make note of that:
Now, back on your laptop or PC, simply connect to the Digital Ocean host on https://YOUR_DROPLET_IP:7777, use the password provided in the startup output, and you’re live! No need to configure firewall or any other specifics for this droplet.
Oh. Hey! Wow. You look, better, even, … i mean … you’re practically glistening. It’s been a year, hasn’t it? You must be working out. What have we been up to? Oh. I’m glad you asked! (PS – if you want to get straight to the goods, go here.)
Ready to go? Let’s dig in.
One underlying but prevailing theme of this release is “scaling up”. As we operationalized more engines over the last year to support our efforts in the Intrigue.io service, we needed a proper process management system, and to split out supporting components to their own managed processes. These are services such as:
Apache Tika (for parsing pretty much every file format on the planet, safely)
An EventMachine-based DNS Resolver for super fast resolution
And once these components were properly managed, database optimization became a focus, getting into the gory guts of Postgres and finally driving the ability to past a million entities per project. (Try the machines feature after running “create_entity” on your domain of choice :], and you’ll see.)
So with that work in place, we focused on a new and improved “Standalone” Docker Image as part of this release, which (finally, we know!) uses Docker’s volume support to allow you to save your data locally. No more starting from scratch each time you spin up an image!
Another key feature of this release is the all-new issue tracking system. Issues are now a first class object – like Entities – and are our way to capture vulnerabilities, misconfigurations, and other findings which should be brought to the attention of an analyst.
This release also adds some other oft-requested features including SSL by default and a much more in-depth automated scoping engine. More on that below.
Even with several major new features in this release, it’s hard to overstate how much has changed in the codebase over last 12 months. And we’re not slowing down. As always with a new release, this one brings tons of new tasks, entities, improvements and bugfixes (…read on for details)
One major feature since the last release that will be very visible when you use a Machine is the automated scoping functionality.
Scoping takes the seeds (Entities) you provide and uses them as guidance for what to scan, and more importantly, what NOT to scan. In previous versions, thiis was a blacklist, but now there’s some smarts behind it.
You’ll notice right away on the entities page that some are hidden by default. This is the scoping engine.
You can view them by selecting “Show unscoped” and “Show hidden” on the search screen.
Give it a try and let us know what you think!
New Discovery Tasks
Okay, so this bit is going to get a little long. And, while it’s been a year, many of these tasks were built and refined over just the last 3 months, thanks in no small part to @anasbensalah who joined as a committer this year.
This v0.7 release includes 23 shiny new tasks, bringing the current total to 124 discovery tasks.
Ready to dig in? The new tasks are in alphabetical order below and each individually linked to their implementation for those brave enough to dive into the codebase.
dns_lookup_dkim – Attempts to identify all known DKIM records by iterating through known selectors.
dns_morph – Uses the excellent DNSMORPH to find permuted domains.
email_brute_gmail_glxu – Uses an enumeration bug in the mail/glxu endpoint on gmail to check account existence.
gitrob – Uses the excellent Gitrob to search a given GithubAccount for committed secrets.
As if that wasn’t enough, the following new tasks help determine if a given Domain or DnsRecord is compromised or otherwise blocked by utilizing the content blocking provided in the respective provider’s DNS service. They’re all very similar in implementation, but may provide different results depending on the provider. These is more great work from Anas.
Reading carefully above, you might notice some of the tasks are introducing new entity types, and for that matter, new use cases.
This release brings two new entities. First, the “AnalyticsId” which represents an id from an analytics provider like NewRelic or Google. Secondly the “FileHash” entity brings us the ability to represent an md5 or sha1 hash as an entity.
Definitely check the tasks creating these entities (search_spyonweb, and search_alienvault_otx_hashes, respectively) above and have a play around with them. Feedback is very welcome. If you find them useful or have ideas on ways we could improve, let us know and we’ll add support for more providers and hash types.
Major Improvements to Tasks
The following were significantly overhauled during the course of this release, and worth checking out again if you have tried the task previously. These now have a lot more functionality.
Luckily we had no bugs in the last release, so this one will continue that tradition. (Just kidding, there were simply way too many to mention. You know how to find them.)
New Vuln Checks
If you were following along over the last year, you probably noticed a significant amount of effort went into testing for vulnerabilities and misconfigurations.
The 0.7 release brings 9 new vuln check tasks, each linked below.
Now that we have a better system for finding and reporting them (blog post forthcoming), you can expect to see more of this kind of shiny goodness in the future.
This release has been well over a year in the making and would not have been possible without the following contributors of code, ideas, and bugs. Make sure to say thank you the next time you see these fine folks.
@kennasecurity (for being, well, awesome and helping this project grow)
So that’s it you say? Well, it’s as much as we could recollect of the blur that was 2019. There’s surely a bunch of neat stuff that we’ve forgotten and you’ll discover when you get started. So with that, go get started now!
Try it out and send feedback via Email, Slack or Twitter. Have fun, and let’s not let it go another year before we do this again!
Here’s a clip of an interview with @nahamsec and @th3g3nt3lman talking about how Intrigue Core can help bug bounty hunters and internal security teams. If you haven’t yet seen Nahamsec’s Twitch.tv channel, it’s a good source of techniques and fun to watch.
The clip talking about Intrigue Core starts at the 25:00 min mark. Check it out!
Gitrob is a handy open source utility by Michael Hendrickson to find secrets in public Github repositories. Gitrob works by downloading all repositories under a given Github account, and then scanning for strings that might be an accidental leak. Even if a given line or file has been investigated, it may still be in the commit log, so Gitrob will check all commits for these potential leaks. Learn more about Gitrob.
This new Core integration makes it simple to spin up Gitrob every time we find a Github repository, and by combining it with the search_github task, we can now scale our search for leaked secrets very quickly!
Recently, Rob Graham shared a post detailing how he used Masscan + RDPscan to check for vulnerable hosts, finding that over 1 million hosts were vulnerable to BlueKeep (CVE-2019-0708). I was curious how many of these systems were corporate or enterprise systems, given that the awareness is often higher in organizations with dedicated patch and vulnerability management teams.
To explore this, using scan data gathered on the Fortune 500 from Intrigue.io, I pulled all systems with port 3389 open, finding a total of 1140 systems.
Using the same tooling (rdpscan) as Rob, i then checked to see if these hosts were were still exposed to Bluekeep. When attempting to connect to these systems to verify that they were in fact RDP, we found that only 286 responded with an RDP protocol . The difference can probably be attributed to firewalls and other network security devices that respond automatically (and erroneously) when scanned.
So, using the set of 286 systems verified to be RDP and returning results from RDPscan, we found that across 49 unique F500 organizations exposing a system, they could be broken into the following statuses:
This is pretty good, in my opinion. Given that the vulnerability was announced on 05/14/2019, and this check was run on 05/31/2019, two weeks to patch or mitigate 75% of the vulnerable systems is incredible. I’d attribute this to the fact that there are often dedicated teams inside these large organizations that pay special attention to externally accessible systems, and often will apply a patch “out of cycle” in cases like this.
Of those 71 vulnerable systems, they were spread across 17 organizations in the following sectors:
The organization with the most publicly-exposed RDP services was an Oil and Gas company, and it was interesting to see systems attributed to the same organization that were only partially patched or mitigated, with many still vulnerable. Patching, even in this case, where the update is available, and could theoretically be automatically applied, is still a time consuming and change-controlled process in larger organizations. Their systems were about 2/3 patched or mitigated, with 34 systems still externally exposed and vulnerable:
Wrapping up, this was a quick look from a different perspective around this vulnerability, in an attempt to see how many of those million systems were “managed” systems, attributable to an organization. As suspected, there were few externally accessible F500 systems still vulnerable to Bluekeep two weeks out from the announcement of the vulnerability. This speaks to the processes inside these organizations to manage and remediate important vulnerabilities such as BlueKeep.
This data was gathered per-organization using Intrigue Core based on a set of “seeds” attributed to each organization, and thus may not be 100% complete. It does not attempt to account for internal hosts, where an RDP worm would likely wreak havoc in most organizations. I strongly suggest following Microsoft’s guidance and applying the patch, even if this requires an out of band update. Given that real attack surface is the internal corporate network, it’s likely we’ll see this vulnerability weaponized as part of a multi-tier attack, similar to how EternalBlue has been being used.