Using uri_spider to parse file metadata

The uri_spider task, when given a Uri entity, will spider a site to a specified depth (max_depth), up to a specified maximum number of pages (limit), and, if configured, restricted to a specified URL pattern (spider_whitelist). When configured – and by default – it will extract DnsRecord, PhoneNumber and EmailAddress entities from the content of each page. All spidered Uris can be created as entities using the extract_uris option.
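To make the interplay of max_depth, limit and spider_whitelist concrete, here is a minimal Python sketch of a bounded breadth-first crawl. It is not Intrigue Core's implementation: the spider function, the way it interprets each option, and the in-memory SITE link graph are all assumptions made for illustration, and no network requests are made.

```python
import re
from collections import deque


def spider(start, links, max_depth=2, limit=100, whitelist=None):
    """Breadth-first crawl over an in-memory link graph (hypothetical sketch)."""
    pattern = re.compile(whitelist) if whitelist else None
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < limit:  # "limit" caps the number of pages
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # "max_depth" stops us following links any deeper
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            if pattern and not pattern.search(nxt):
                continue  # the whitelist pattern filters which links are followed
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return visited


# Stand-in for a real site: each page maps to the links found on it.
SITE = {
    "http://example.test/": [
        "http://example.test/a",
        "http://example.test/b",
        "http://other.test/",
    ],
    "http://example.test/a": ["http://example.test/a/deep"],
    "http://example.test/b": [],
    "http://example.test/a/deep": ["http://example.test/a/deeper"],
}

pages = spider("http://example.test/", SITE, max_depth=1, limit=10,
               whitelist=r"example\.test")
capped = spider("http://example.test/", SITE, max_depth=5, limit=2)
```

With max_depth=1 the crawl stops at the pages linked directly from the start page, and the whitelist keeps it from wandering onto other.test. With limit=2 it stops after two pages regardless of depth.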

Further, the spider will identify any files of the types listed below, and parse their content and metadata for the same types. Because this file parsing uses the excellent Apache Tika under the hood, the number and type of supported file formats is huge – over 300 file formats are supported including common formats like doc, docx and pdf – as well as more exotic types like application/ogg and many video formats. To enable this, simply enable the parse_file_metadata option.

Below, see a screenshot of the task’s configuration:

[Screenshot: uri_spider task configuration]

Note that you can also take advantage of Intrigue Core’s file parsing capabilities on a Uri-by-Uri basis by pointing the uri_extract_metadata task at a specific Uri with a file you’d like parsed.


Docker One-Liner

Assuming you have Docker installed, this will pull the latest standalone image from DockerHub, and start it listening on :7777.

# The following command will pull and run the latest image.
# Set IMAGE to the name of the image you want to run first.
# Feel free to remove the -v option if you do not need to preserve
# your projects between runs.
# -p publishes port 7777 (the port mentioned in this guide) on the host.
docker pull "$IMAGE" && \
docker run -e LANG=C.UTF-8 \
--memory=8g \
-v ~/intrigue-core-data:/data \
-p 7777:7777 \
-it "$IMAGE"

TO PRESERVE DATA BETWEEN RUNS: We’ve now added the ability to preserve your instance data between container runs, thanks to Docker’s volume support. If you’d like to adjust where the data is stored, simply adjust the -v option in the docker run command above, substituting the “~/intrigue-core-data” folder with wherever you’d like to store the data on your host system.

PERMISSION ISSUES?: If you’re running on a system as a user other than root, you may need to add the --privileged flag.

Once downloaded, you’ll see the startup happen automatically, and a password will be generated for you. You’ll be able to access the interface on :7777.

Now that you have an instance running, check out the Up and Running with Intrigue Core guide.

Using Intrigue Ident for Application Fingerprinting

Chatting with folks at RSA and BsidesSF, I realized it’d be helpful to share more information about the new application fingerprinter behind Intrigue Core, Ident.

Ident is a new standalone project, with a clear focus: be the most complete, flexible and most extensible software for fingerprinting application layer technologies and vulnerabilities. Given that it’s launching with over 300 checks reflecting current and widespread technology, and it’s simple to craft a new check (see details below), it’s on the way toward fulfilling this mission.

You might wonder… why not just integrate with nmap, recog, wappalyzer, or others? A couple key reasons… 1) A focus on freedom (the code is BSD licensed) and 2) Razor sharp focus on app-layer technology. To give you a more detailed view, here’s what I highlighted at BsidesSF this year – these are the key qualities I was looking for:

Compare this with Recog, another excellent fingerprinting library, which is more static in its format (a strength, but not a good fit for Ident) and focused on infrastructure. Recog’s focus on infrastructure actually makes it a great complement to Ident, and thus it’s been integrated into Intrigue Core as well.

And while there are many tools and libraries out there, each had a licensing or technology limitation against these criteria, making them incompatible with the focus of the project.

In addition, spinning up a standalone fingerprinter has a number of benefits:

  • It makes it easier to use, and to contribute to, the overall project. Checks are pretty simple to create and test.
  • It opens up new use cases … If you have a set of known applications, but want to know if they’re running a given version, or if they’re configured properly, Ident’s CLI can be an excellent fit. You can just run it against a list of urls (see below).
  • Automation of Ident can be a lot easier than automating against the whole Intrigue Core platform. Feel free to drop the library into your project, and reach out if we can help you do so!
  • By building from the ground up, we can integrate CPE support, ensuring vulnerability inference vs the CVE database “just works” and we don’t need to do anything special to determine vulnerabilities for a given version.
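To illustrate why CPE support makes vulnerability inference straightforward, here is a small Python sketch. It is not Ident's code: the cpe23 and infer_cves helpers, the vendor and product names, and the placeholder CVE identifiers are all hypothetical. Only the overall shape of a CPE 2.3 formatted string (part, vendor, product, version, then seven further attributes) comes from the CPE specification, and real CPE values need more careful escaping than this sketch does.

```python
def cpe23(vendor, product, version, part="a"):
    """Build a CPE 2.3 formatted string, wildcarding the trailing attributes."""
    def norm(value):
        # Simplified normalisation; real CPE values need proper escaping.
        return value.strip().lower().replace(" ", "_")

    fields = ["cpe", "2.3", part, norm(vendor), norm(product), norm(version)]
    fields += ["*"] * 7  # update, edition, language, sw_edition, target_sw, target_hw, other
    return ":".join(fields)


def infer_cves(cpe, known):
    """Look up a CPE in a mapping of CPE string -> CVE identifiers."""
    return sorted(known.get(cpe, []))


# Placeholder data standing in for a CVE database keyed by CPE.
KNOWN = {
    "cpe:2.3:a:examplecorp:example_cms:1.2.3:*:*:*:*:*:*:*": [
        "CVE-XXXX-0002",
        "CVE-XXXX-0001",
    ],
}

cpe = cpe23("ExampleCorp", "Example CMS", "1.2.3")
cves = infer_cves(cpe, KNOWN)
patched = infer_cves(cpe23("ExampleCorp", "Example CMS", "1.2.4"), KNOWN)
```

Once a fingerprint yields a vendor, product and version, the lookup is a plain dictionary access, which is the sense in which nothing special needs to be done per version.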

To give you a quick run through and some examples of what it can do, here’s an example of the CLI running against a single URL:

And here’s one against a file of URLs (one per line), which automatically saves results into a CSV file:

A cool thing about the CLI tool is how it handles “content checks” – a special type of check that will always run and print output – vs “fingerprint checks”, which will also run, but will only ever show up in output if they match. The CLI generates an output.csv file that makes each content check a column, and is smart enough to pick up a newly added check! Simply drop a new check into the “checks/content” folder if you want to get its output in the CSV.
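The “one column per content check” idea can be sketched in a few lines of Python. This is an illustration only, not the Ident CLI: the write_results function, the result dictionaries, and the check names are assumptions. The point is that the column set is derived from whatever checks produced output, so a new check shows up as a new column without any other change.

```python
import csv
import io


def write_results(results, out):
    """Write one row per URL and one column per content check seen in the results."""
    columns = sorted({name for r in results for name in r["content"]})
    writer = csv.DictWriter(out, fieldnames=["url"] + columns, restval="")
    writer.writeheader()
    for r in results:
        writer.writerow({"url": r["url"], **r["content"]})


# Hypothetical results; the second URL was tested after a new check was added.
RESULTS = [
    {"url": "http://a.test", "content": {"Directory Listing Detected": False}},
    {"url": "http://b.test", "content": {"Directory Listing Detected": True,
                                         "Form Detected": True}},
]

buf = io.StringIO()
write_results(RESULTS, buf)
rows = buf.getvalue().splitlines()
```

Rows that predate a check simply get an empty cell in that check's column.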

Here’s an example content check, this one checks for directory indexing in the content of the tested page:

Fingerprint checks are also pretty simple to write. This one matches Axis Webcams, and as you can see, it checks the body contents for a unique string. You can regex against the contents of the body, headers, cookies, title, and generator.
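As a rough picture of how regex-based fingerprint checks behave, here is a Python sketch. It is not Ident's check format: the CHECKS structure, the field names, the product names and the fingerprint function are hypothetical. It only shows the general technique of running a pattern against one part of a response and reporting a match, optionally with a captured version.

```python
import re

# Hypothetical checks: each names the part of the response it inspects.
CHECKS = [
    {
        "name": "Example Webcam",
        "match_type": "body",
        "pattern": re.compile(r"ExampleCam Live View", re.I),
    },
    {
        "name": "Example Server",
        "match_type": "headers",
        # The capture group, when present, is treated as the version.
        "pattern": re.compile(r"^server:\s*examplehttpd/(\d+(?:\.\d+)*)", re.I | re.M),
    },
]


def fingerprint(response, checks=CHECKS):
    """Return only the checks that match; non-matching checks produce no output."""
    matches = []
    for check in checks:
        target = response.get(check["match_type"], "")
        m = check["pattern"].search(target)
        if m:
            version = m.group(1) if m.groups() else None
            matches.append({"name": check["name"], "version": version})
    return matches


RESPONSE = {
    "body": "<title>ExampleCam Live View</title>",
    "headers": "Server: ExampleHTTPd/2.1.4\nContent-Type: text/html",
}

found = fingerprint(RESPONSE)
nothing = fingerprint({"body": "<h1>hello</h1>", "headers": "Server: other"})
```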

Ident is also tightly integrated with Google’s “Chrome Headless”, so if you add the ‘-b’ flag, you’ll notice that some additional checks are run (and it may run a little more slowly … ~10s per url on a recent machine), but this is because it’s parsing and fingerprinting against the full DOM. Very handy!

In order to keep the library speedy and minimize the number of queries that are made as a given application is fingerprinted, each Ident check takes a “paths” parameter that is used as a template for the requested pages, and this is pre-processed at runtime to ensure only ONE request is made for each unique path. This keeps the fingerprinting FAST and so we will endeavor to minimize the number of unique paths going forward! Fortunately the standard “#{url}” path is often still VERY verbose about the running software.
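The request de-duplication described above can be sketched as follows. This is a Python illustration under stated assumptions, not Ident's code: plan_requests, run_checks, the check dictionaries and the fake fetch function are hypothetical. Only the “paths” parameter and the “#{url}” template come from the text.

```python
def plan_requests(url, checks):
    """Expand each check's path templates and group check names by unique path."""
    by_path = {}
    for check in checks:
        for template in check["paths"]:
            path = template.replace("#{url}", url)
            names = by_path.setdefault(path, [])
            if check["name"] not in names:
                names.append(check["name"])
    return by_path


def run_checks(url, checks, fetch):
    """Fetch every unique path exactly once and share the body between checks."""
    bodies = {}
    for path in plan_requests(url, checks):
        bodies[path] = fetch(path)
    return bodies


CHECKS = [
    {"name": "A", "paths": ["#{url}"]},
    {"name": "B", "paths": ["#{url}"]},
    {"name": "C", "paths": ["#{url}", "#{url}/robots.txt"]},
]

calls = []


def fake_fetch(path):
    # Stand-in for an HTTP request; records each call instead of using the network.
    calls.append(path)
    return "<html>%s</html>" % path


plan = plan_requests("http://example.test", CHECKS)
bodies = run_checks("http://example.test", CHECKS, fake_fetch)
```

Three checks reference four paths in total, but only two requests are made because three of those paths expand to the same URL.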

If you’d like to get started using it right away, you can pull the latest Dockerhub image and start testing using the following command:

docker pull intrigueio/intrigue-ident && docker run -t intrigueio/intrigue-ident --url [YOUR URL HERE]

That’s it for now. Reach out if you have questions! You can browse the full set of checks in the project’s “checks” folder. If you’re interested in helping out, or have ideas on how to improve the project, certainly pop into the Slack channel and say hello, or reach out on Twitter: @intrigueio.

Intrigue Core v0.6 Released!

Today marks the release of Intrigue Core v0.6, bringing a bunch of new functionality including: automatic inference of CVEs and vulnerabilities on discovered applications, new entities such as “Finding” and “Domain” types, and usability features such as the ability to import a list and new analysis views to easily see expired certs or out-of-compliance cipher suites.

See below for details on how to get it, and enjoy!

Major Features 

  • Added a vulnerability inference capability based on ident’s fingerprinting
  • Added support for a “Domain” entity, representing a top-level domain (vs a standard DnsRecord)
  • Added initial support for “Finding” entity, enabling tasks to easily surface actionable findings
  • Added the ability to import and run a set of tasks on a list (thanks @hollywoodmarks!)
  • Added new analysis views (ciphers, javascript, cves)
  • Added support for task “Notifiers” & an initial (Slack) notifier
  • Adjusted application fingerprinting to a new standalone library, “intrigue-ident”
  • Adjusted Enrichment tasks to run in-line, eliminating a variety of race conditions when running machines

Minor Features

  • Added support for go-based utilities in the image via util/ setup script
  • Added support for RDAP, enabling new RIR Whois lookups (Afrinic, Lacnic, Apnic)
  • Adjusted handling of network services – all types are now subtypes of “NetworkService”
  • Adjusted handling of saas services – all types are now subtypes of “WebAccount”
  • Adjusted base image to Ubuntu 16.04 (util/ setup)
  • Upgraded to Bundler 2
  • Upgraded to latest GeoLite2-City

Major Bugs

  • Fixed bug causing the system to throw a runtime error when an API key is missing
  • Fixed bug in util/ script that would cause a hang due to grub-pc (thanks @bpmcdevitt!)
  • Fixed bug that would cause memory leaks in Chrome headless browser teardown
  • Mounted an out of control rollercoaster of regex bugs and arrived victorious
  • … literally hundreds of other minor bugs

New Tasks:

A huge thanks to the following folks who submitted PRs and/or contributed to this release:

You can download and run Intrigue Core v0.6 immediately using one of the following guides:

If you’re interested in contributing to the effort to make Intrigue Core the best OSINT and security intelligence gathering framework around, please jump in our chat and say hello!

Intrigue Core v0.5 Released!

Announcing the immediate availability of Intrigue Core v0.5. This release is heavy with system level improvements and bugfixes. There are also key investments made in this release with the removal of (the now-deprecated) PhantomJS and the integration of Chrome Headless browser for screenshots and javascript fingerprinting.

A number of security-related improvements were made as well, with all platforms (EC2, Docker, Vagrant) now relying on the same core system setup scripts and no longer using “rbenv sudo” to run as the root user.

This release continues to make progress on fingerprinting with a number of new fingerprints being added.

Enjoy, and Happy 4th!

– jcran

New Functionality: 

  • Update to Ruby 2.5.1
  • Moved headless browsing components from PhantomJS to Chrome Headless
  • Consolidated & simplified system bootstrap for Docker / Vagrant / AMI
  • Removal of rbenv sudo
  • Support for dynamic task queues
  • Lots of new fingerprints (pfsense, telerik, atlassian, etc)
  • Faster exports (streaming JSON) & dynamically generated export UI

New Tasks:

How to get it 

You can download and run Intrigue Core 0.5 immediately on the following platforms.

Announcing … Intrigue Core v0.4!

Announcing the immediate release of Intrigue Core v0.4!

In this release, you’ll find:

If that weren’t enough, we added a total of 19 new modules:

This release also had a ton of work over the last few weeks as we prepared for RSA 2018. At RSA, Ed Bellis & I discussed “Recon for Defenders” and offered up a few specific CVEs and software that defenders must be very quick to patch – particularly when it’s available for scanning.

As part of that work, we spun up over 100 simultaneous instances of Intrigue Core, and used these instances to scan the F500 using the “org_asset_discovery_active” strategy and a single domain seed. After running for 10 hours total, we had the world’s first ~complete attack surface scan of the entire F500. Pretty sweet.

We then anonymized and released the data from those tests. As you dig into them, you’ll notice a large number of servers and applications exposed at the perimeter that were still running vulnerable versions of this software at the time of testing.

Digging through the results, I realized that Core’s fingerprinting capabilities needed a lot of work, and so shortly after the talk, I sat down and overhauled the application fingerprinter, creating a pluggable system. Now, for each URI that the system wants to fingerprint, any piece of software can plug in a set of checks. This architecture allows us to minimize the number of HTTP requests we make, while still supporting a large number of fingerprints.

Now that v0.4 is available, you can immediately download and run Core through the normal AMI, Dockerfile, or (new in this release) in a local or remote VM using Vagrant!