Internet-Wide Scan Data – Port Coverage

This is an actively updated (last updated 05/14/2020) list of the organizations that offer Internet scanning data for sale, or in some cases, for free. If you have additional information that would be helpful in determining coverage, please contact us via [email protected]!

Below is the list, in alphabetical order, along with the public coverage metrics each organization publishes.

BinaryEdge – Currently 240 ports. They also offer on-demand scanning.

Censys – Currently 1,045+ ports. They also offer on-demand scanning.

Onyphe – Currently 100 ports.

Rapid7 (Project Sonar) – Rapid7’s data is open, and while they don’t advertise the number, it’s pretty simple to grep out the per-port coverage. Currently they cover 21 UDP and 133 TCP ports on a monthly basis.
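If you want to verify that yourself, here’s a rough sketch of the grep, assuming you’ve saved a listing of Sonar’s TCP study filenames to a local file (the filename pattern below is illustrative – check the Open Data site for the actual naming scheme):

# Pull port numbers out of study filenames and count the unique ports covered
grep -oE 'tcp_syn_[0-9]+' sonar_tcp_listing.txt | grep -oE '[0-9]+' | sort -nu | wc -l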

SecurityTrails – Advertises 50+ of the most important ports scanned weekly.

Shodan – The granddaddy of Internet scanning, Shodan does not publish coverage numbers. They also offer on-demand scanning.

Spyse – Relative newcomer to the scene, they advertise coverage of 55 ports and growing.

ZoomEye – Also a relative newcomer, their coverage is currently unknown.

There are other folks that perform Internet scanning as part of their core offering, but don’t sell the data standalone, and for that reason, we’ve not included them in this list.

The takeaway? No two organizations offer exactly the same data, and to get the most complete coverage of what’s happening on the Internet, you may need to combine several sources to reach the level of visibility you need.

Getting Started on Digital Ocean

We’ve had a few requests in the Slack about getting started on Digital Ocean. It’s as simple as starting a droplet and running the Docker One-Liner, but I’ll document the exact steps below.

Start by logging into Digital Ocean. The interface is pretty straightforward, but first we need to create a project:

No need to move existing resources in – we’re starting a new project – so we’ll click “Skip for now”.

Now, we have a project, so let’s create a new Droplet inside it.

And let’s make sure to give it at least 16GB of RAM:

Everything else can be standard, so let’s scroll down. Make sure you set up (or select) an SSH key you have access to, so you can log in. In this case, I’m using a pre-generated SSH key on my local system:
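If you don’t have a key handy, generating one is a one-liner (the filename here is just a suggestion):

ssh-keygen -t ed25519 -f ~/.ssh/do_droplet -C "digitalocean"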

And then click “Create Droplet”.

Now, give DO a few seconds, and we can browse to Droplets and get our IP address.

Then, we’ll use our SSH key to log in:
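Something like the following, substituting your key path and the droplet’s IP address (droplets allow root login by default):

ssh -i ~/.ssh/do_droplet root@YOUR_DROPLET_IP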

And first things first, let’s update the system and install Docker with the following command:

apt-get -y update && apt-get -y upgrade && apt-get -y install docker.io  
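Before moving on, it’s worth a quick sanity check that Docker installed and the service is running:

docker --version && systemctl is-active docker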

Then, simply run the docker one-liner command, found here.
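For reference, it looks roughly like the following – the image name and flags here are from memory, so treat the one-liner in the repository as the source of truth:

docker pull intrigueio/intrigue-core && docker run -p 7777:7777 -i -t intrigueio/intrigue-core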

This will generate a dynamic (and random) password, so make sure to make note of that:

Now, back on your laptop or PC, simply connect to the Digital Ocean host at https://YOUR_DROPLET_IP:7777, use the password provided in the startup output, and you’re live! No need to configure a firewall or any other specifics for this droplet.

If you have any problems, feel free to file an issue on the Github repository.

Detecting Backported Software (Versions) with Ident

One common challenge with version detection and inference-based vulnerability analysis – the kind we do in Ident and Core – is gracefully handling software that appears vulnerable based on its version string, but has in reality been patched via backporting.

What do we mean by backporting? Well, some operating systems (*ahem* Red Hat *ahem*) will apply security fixes and patches to previous versions of a software package without updating the software’s version. Instead, they bump their own package version. The software keeps reporting the vulnerable version, but users are protected. This is commonly referred to as backporting. For example, a fully patched RHEL 7 system can still report Apache 2.4.6, even though its httpd package includes fixes for many later CVEs.

This is generally a good thing for users, but it can result in false positives when performing unauthenticated vulnerability assessments, so we need to handle it. To do so, we added a bit of sophistication to the Ident library that detects this behavior in cases where we know the operating system, and responds accordingly.

Below, you can see how our dynamic version detection for PHP, OpenSSL, and Apache handles this by looking for the presence of RHEL, Red Hat, or CentOS, and appending “(Backported)” to the version string. Note that this might result in some false negatives, so we retain the version string for users.

Version detection now handles backporting gracefully

You can see the full set of changes in the Ident repository on Github.

When using the Ident CLI utility, you’d previously have seen results like this, which were false positives:

Now, we politely decline to infer these results, while retaining the version information and appending “(Backported)” to the version string. That results in the following:

Updated results showing no vulnerability inference on Backported software

Core now gives the same result, as it uses the Ident project as a library and benefits from changes in fingerprinting capabilities automatically.

The current Gem version is 0.92 and can be installed directly from Github. This change is also now live on Ident’s master branch, and you can quickly test it out using the pre-built docker image.
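If you just want to see the new behavior, point the pre-built image (the same Docker one-liner described elsewhere on this blog) at a host you suspect is backported – the hostname below is a placeholder:

docker pull intrigueio/intrigue-ident && docker run -t intrigueio/intrigue-ident --url https://your-rhel-host.example.com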

Intrigue Core v0.7 Released!

Oh. Hey! Wow. You look better, even… I mean… you’re practically glistening. It’s been a year, hasn’t it? You must be working out. What have we been up to? Oh, I’m glad you asked! (PS – if you want to get straight to the goods, go here.)

Ready to go? Let’s dig in.

One underlying theme of this release is “scaling up”. As we operationalized more engines over the last year to support our efforts in the Intrigue.io service, we needed a proper process management system, and to split supporting components out into their own managed processes. These include services such as:

  • Headless Chrome (for screen grabs, fingerprinting JS, etc)
  • Apache Tika (for parsing pretty much every file format on the planet, safely)
  • An EventMachine-based DNS Resolver for super fast resolution

And once these components were properly managed, database optimization became a focus – getting into the gory guts of Postgres and finally making it possible to scale past a million entities per project. (Try the machines feature after running “create_entity” on your domain of choice :], and you’ll see.)

So with that work in place, we focused on a new and improved “Standalone” Docker Image as part of this release, which (finally, we know!) uses Docker’s volume support to allow you to save your data locally. No more starting from scratch each time you spin up an image!
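For the curious, the invocation looks something like this sketch – the host-side path is yours to choose, and the container-side mount point is an assumption, so check the README for the exact command:

docker run -v ~/intrigue-core-data:/data -p 7777:7777 -i -t intrigueio/intrigue-core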

Another key feature of this release is the all-new issue tracking system. Issues are now first-class objects – like Entities – and are our way to capture vulnerabilities, misconfigurations, and other findings that should be brought to the attention of an analyst.

This release also adds some other oft-requested features including SSL by default and a much more in-depth automated scoping engine. More on that below.

Even with several major new features in this release, it’s hard to overstate how much has changed in the codebase over the last 12 months. And we’re not slowing down. As always with a new release, this one brings tons of new tasks, entities, improvements, and bugfixes (…read on for details).

Automated Scoping

One major feature since the last release that will be very visible when you use a Machine is the automated scoping functionality.

Scoping takes the seeds (Entities) you provide and uses them as guidance for what to scan and, more importantly, what NOT to scan. In previous versions, this was a simple blacklist, but now there are some smarts behind it.

Try it by using the “create_entity” task and running a machine for a few iterations.

You’ll notice right away on the entities page that some are hidden by default. This is the scoping engine.

You can view them by selecting “Show unscoped” and “Show hidden” on the search screen.

Automated scoping results in amazonaws.com entities (and others) being hidden by default…

Give it a try and let us know what you think!

New Discovery Tasks

Okay, so this bit is going to get a little long. And, while it’s been a year, many of these tasks were built and refined over just the last 3 months, thanks in no small part to @anasbensalah who joined as a committer this year.

This v0.7 release includes 23 shiny new tasks, bringing the current total to 124 discovery tasks.

Ready to dig in? The new tasks are in alphabetical order below and each individually linked to their implementation for those brave enough to dive into the codebase.

  • dns_lookup_dkim – Attempts to identify all known DKIM records by iterating through known selectors.
  • dns_morph – Uses the excellent DNSMORPH to find permuted domains.
  • email_brute_gmail_glxu – Uses an enumeration bug in the mail/glxu endpoint on gmail to check account existence.
  • gitrob – Uses the excellent Gitrob to search a given GithubAccount for committed secrets.
  • saas_google_calendar_check – Checks to see if public Google Calendar exists for a given user.
  • search_alienvault_otx_hashes – This task searches AlienVault OTX via API and checks for information related to a FileHash.
  • search_binaryedge – This task hits the BinaryEdge API for a given IpAddress, DnsRecord, or Domain, and creates new entities such as NetworkServices and Uri, as well as associated host details.
  • search_binaryedge_risk_score – This task hits the BinaryEdge API and provides a risk score detail for a given IPAddress. It can optionally create an issue for high risk IPs.
  • search_binaryedge_torrents – This task hits the BinaryEdge API for a given IPAddress, and tells us if the address has seen activity consistent with torrenting.
  • search_dehashed – This task hits the Dehashed API for leaked accounts.
  • search_grayhat_warfare – This task hits the Grayhat Warfare API and finds AwsS3Buckets.
  • search_hunter_io – This task hits the Hunter.io API. EmailAddresses are created for a given domain.
  • search_spyonweb – This task hits the SpyOnWEB API for hosts sharing the same IPAddress, Domains, or AnalyticsId.
  • uri_brute_focused_content – Check for juicy content based on the site’s technology stack (This is a special task, part discovery and part vuln check, so it’s listed below, as well).
  • uri_check_subdomain_hijack – Checks for a specific string on a given Uri, and creates a hijackable-subdomain issue if it matches.
  • well_known_gather_and_parse – Checks for files in the /.well-known/ directory, as defined in RFC5785.
  • wordpress_enumerate_plugins – If the provided Uri is running WordPress (as fingerprinted by Ident), this’ll enumerate the plugins.
  • wordpress_enumerate_users – If the target’s running WordPress, this’ll enumerate the users.

As if that wasn’t enough, a set of new tasks helps determine if a given Domain or DnsRecord is compromised or otherwise blocked, by utilizing the content blocking provided in each respective provider’s DNS service. They’re all very similar in implementation, but may provide different results depending on the provider. This is more great work from Anas.

New Entity Types

Reading carefully above, you might notice that some of the tasks introduce new entity types – and, for that matter, new use cases.

This release brings two new entities. First, the “AnalyticsId”, which represents an ID from an analytics provider like NewRelic or Google. Second, the “FileHash” entity, which gives us the ability to represent an MD5 or SHA1 hash as an entity.

Definitely check the tasks creating these entities (search_spyonweb, and search_alienvault_otx_hashes, respectively) above and have a play around with them. Feedback is very welcome. If you find them useful or have ideas on ways we could improve, let us know and we’ll add support for more providers and hash types.

Major Improvements to Tasks

The following tasks were significantly overhauled during the course of this release, and are worth checking out again if you’ve tried them previously. They now have a lot more functionality.

  • search_have_i_been_pwned
  • search_phishtank
  • search_shodan
  • search_certspotter
  • scrape_publicwww
  • search_alienvault_otx
  • import/umbrella_top_sites

Bugfixes

Luckily we had no bugs in the last release, so this one will continue that tradition. (Just kidding, there were simply way too many to mention. You know how to find them.)

New Vuln Checks

If you were following along over the last year, you probably noticed a significant amount of effort went into testing for vulnerabilities and misconfigurations.

The 0.7 release brings 9 new vuln check tasks, each linked below.

Now that we have a better system for finding and reporting them (blog post forthcoming), you can expect to see more of this kind of shiny goodness in the future.

Thank You!!!

This release has been well over a year in the making and would not have been possible without the following contributors of code, ideas, and bugs. Make sure to say thank you the next time you see these fine folks.

So that’s it, you say? Well, it’s as much as we could recollect of the blur that was 2019. There’s surely a bunch of neat stuff we’ve forgotten that you’ll discover when you get started. So with that, go get started now!

Try it out and send feedback via Email, Slack or Twitter. Have fun, and let’s not let it go another year before we do this again!

Nahamsec interview with @th3g3nt3lman

Here’s a clip of an interview with @nahamsec and @th3g3nt3lman talking about how Intrigue Core can help bug bounty hunters and internal security teams. If you haven’t yet seen Nahamsec’s Twitch.tv channel, it’s a good source of techniques and fun to watch.

The clip talking about Intrigue Core starts at the 25:00 mark. Check it out!

@nahamsec talking bug bounty recon with @th3g3nt3lman

Ident Docker One-Liner

On a pentest or in a hurry and want to try out ident to fingerprint an application quickly? Use this one-liner which pulls the latest build from dockerhub and kicks off a run:

docker pull intrigueio/intrigue-ident && docker run -t intrigueio/intrigue-ident --url YOUR_URL_HERE

Also handy: add a -v to check for top vulnerabilities in a given technology (as long as we can determine a version, we’ll match it to known CVEs).

If you’re interested in the details of how it works, add a -d to see debug output!
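Putting those pieces together, a fingerprint run with vuln matching and debug output looks something like this:

docker run -t intrigueio/intrigue-ident --url YOUR_URL_HERE -v -d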

See the checks in all their glorious detail on Github. We’re well over 500 and adding more on a regular basis. If you don’t see a technology you’d like fingerprinted, create an issue or send us a pull request!

Gitrob Integration

Gitrob is a handy open source utility by Michael Henriksen to find secrets in public Github repositories. Gitrob works by downloading all repositories under a given Github account and scanning them for strings that might be an accidental leak. Even if a given line or file has since been removed, it may still be in the commit log, so Gitrob checks all commits for these potential leaks. Learn more about Gitrob.

This new Core integration makes it simple to spin up Gitrob every time we find a Github repository, and by combining it with the search_github task, we can now scale our search for leaked secrets very quickly!

This integration and task are now on the develop branch. To use it immediately, build a local Docker image.
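A sketch of that local build – the clone URL and the Dockerfile’s location at the repository root are assumptions here, so adjust to match the repo:

git clone https://github.com/intrigueio/intrigue-core
cd intrigue-core && git checkout develop
docker build -t intrigue-core-local .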