Data Discovery Tools

Compare All Data Discovery Tools

Sort by

Recommendations: Sorts listings by the number of recommendations our advisors have made over the past 30 days. Our advisors assess buyers’ needs for free and only recommend products that meet buyers’ needs. Vendors pay Software Advice for these referrals.
Reviews: Sorts listings by the number of user reviews we have published, greatest to least.
Sponsored: Sorts listings by software vendors running active bidding campaigns, from the highest to lowest bid. Vendors who have paid for placement have a ‘Visit Website’ button, whereas unpaid vendors have a ‘Learn More’ button.
Avg Rating: Sorts listings by overall star rating based on user reviews, highest to lowest.
A to Z: Sorts listings by product name from A to Z.
Showing 1 - 20 of 69 products

Hexomatic

Hexomatic is a no-code, work automation platform that enables businesses to harness the internet as their own data source and leverage ready-made automations to scale time-consuming tasks. This platform allows users to scrape the ...Read more

4.70 (20 reviews)

ManageEngine ADAudit Plus

ManageEngine ADAudit Plus is a Windows auditing, security, and compliance solution. Key features include comprehensive logon auditing, detailed change monitoring, real-time risk alerting, and streamlined compliance reporting for A...Read more

4.26 (38 reviews)

Tableau

Tableau is an integrated business intelligence (BI) and analytics solution that helps to analyze key business data and generate meaningful insights. The solution helps businesses to collect data from multiple source points such as...Read more

Phocas Analytics

Phocas is a team of passionate professionals who are committed to helping people feel good about their data. Our software brings together organizations’ most useful data from an ERP and other business systems and presents it in a ...Read more

4.79 (98 reviews)

Altair Monarch

Altair Monarch is a self-service, web-based data preparation solution that helps businesses extract data from reports such as HTML, PDF and XPS. The platform can access data from customer lists, sales reports, logs, inventories, i...Read more

4.65 (20 reviews)

Qlik Sense

Qlik Sense is a business intelligence (BI) and visual analytics platform that supports a range of analytic use cases. Built on Qlik’s unique Associative Engine, it supports a full range of users and use-cases across the life-cycle...Read more

SAS Visual Analytics

SAS Business Intelligence is a cloud-based enterprise analysis tool that helps users monitor metrics and manage interactive reports. Designed for large businesses, the platform’s features include customizable dashboard, marketing ...Read more

4.31 (35 reviews)

Funnel

Funnel Dashboards & Reports is a cloud-based marketing analytics and reporting software for online advertisers and e-commerce companies. The software automatically collects data from all advertising platforms and allows marketers ...Read more

4.69 (16 reviews)

AnswerRocket

AnswerRocket uses natural language processing and AI-driven automation to generate in-depth insights and visualizations in seconds. AnswerRocket is built for businesses needing direct access to their enterprise data. Our powerful...Read more

4.64 (14 reviews)

1 recommendation

Lumenore

Discover actionable insights in your data silos! Lumenore democratizes business intelligence with no-code analytics. Empower your entire team to derive insights from data - giving you a transparent view of your operations and hel...Read more

No reviews yet

Software pricing tips

Subscription models

  • Per employee/per month: This model allows you to pay a monthly fee for each of your employees.
  • Per user/per month: You pay a monthly fee for each user (normally administrative users) rather than for all employees.

Perpetual license

  • This involves paying an upfront sum for the license to own the software and use it indefinitely.
  • This is the more traditional model and is most common with on-premise applications and with larger businesses.
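
To compare these models for your own headcount, a rough calculation can help. The sketch below is illustrative only: the prices, seat counts and function names are hypothetical, and it ignores maintenance fees, discounts and upgrade costs that real contracts often include.

```python
import math


def subscription_cost(seats, price_per_seat_per_month, months):
    """Total subscription spend for a given number of seats over a period."""
    return seats * price_per_seat_per_month * months


def break_even_months(perpetual_price, seats, price_per_seat_per_month):
    """First month in which cumulative subscription spend reaches the perpetual price."""
    monthly = seats * price_per_seat_per_month
    return math.ceil(perpetual_price / monthly)


# Hypothetical figures: 50 employees, 5 of whom are administrative users.
per_employee_year = subscription_cost(seats=50, price_per_seat_per_month=10, months=12)
per_user_year = subscription_cost(seats=5, price_per_seat_per_month=60, months=12)
months_to_break_even = break_even_months(
    perpetual_price=9000, seats=5, price_per_seat_per_month=60
)

print(per_employee_year, per_user_year, months_to_break_even)
```

With these made-up numbers, the per-user plan is cheaper over one year, and a perpetual license would only pay for itself after the break-even month shown.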

Rated best value for money

Manta

MANTA’s unified lineage platform automatically scans a data environment to build a map of data flows and deliver it through a native UI and other channels to both technical and non-technical users. With MANTA, users get visibilit...Read more

4.10 (10 reviews)

Golden

Golden Research Engine is a cloud-based data discovery solution that helps small to large enterprises retrieve information on queries by accessing the in-built knowledge base. The centralized platform provides administrators with ...Read more

No reviews yet

Lucidworks Fusion

Lucidworks Fusion is a cloud-based solution designed to help IT teams manage data discovery through natural language processing (NLP), query intent classification, information clustering and ranking algorithms. Key features includ...Read more

4.00 (1 review)

OpenText Magellan

OpenText Magellan is a predictive analytics platform powered by artificial intelligence (AI) and machine learning. The platform is designed to help businesses across various industries make data-driven decisions by combining self-...Read more

0.00 (1 review)

Wolfram Mathematica

Wolfram Mathematica is a technical computing solution that provides businesses of all sizes with tools for image processing, data visualization and theoretic experiments. The notebook interface enables users to organize documents ...Read more

JMP

JMP is an on-premise data analytics solution that helps scientists, engineers and data explorers understand complex data relationships and visualize them via interactive dashboards. The data acquisition and cleanup functionalities...Read more

4.62 (42 reviews)

Bold BI

Bold BI is an on-premise and cloud-based software that enables businesses in construction, education, energy, healthcare, insurance and other industries to process, combine and analyze collected data on a unified platform. With it...Read more

4.83 (6 reviews)

Atlan

The Atlan Collect platform helps businesses collect and track high-quality customer experience data. Also available on an easy to use mobile app, Atlan Collect is designed to work anywhere. The intuitively designed dashboard helps...Read more

4.50 (2 reviews)

Intelligize

Intelligize is a web-based research management tool that helps educational institutions, accounting and consulting firms and corporate businesses extract, collect and analyze regulatory data to streamline legal research processes....Read more

No reviews yet

Nightfall DLP

Nightfall DLP is a cloud-based data loss prevention software which helps businesses classify and protect sensitive data using APIs. Key features include behavioral analytics, application security, sensitive data identification, in...Read more

5.00 (2 reviews)

Buyers Guide

Last Updated: March 11, 2022

What is data discovery software?

Data discovery software is a tool that helps you to collect and combine data from multiple sources and identify patterns and trends in them. Data preparation, data modeling, visual analysis, and advanced statistical analysis are the key functions of data discovery software. Data discovery tools are primarily available as a part of business intelligence software solutions.


Data discovery is one of the fastest-growing and most rapidly changing segments of the BI market. These tools differ dramatically from the traditional systems of record that enable IT to push reports and dashboards out to the rest of the organization.

In many cases, data discovery tools are purchased by organizations that have already deployed traditional BI systems, in order to solve issues with data access, data preparation and data exploration. Data discovery solutions have also been a godsend for small businesses that can’t afford complex data warehouses and lack the expertise to build them.

The market for data discovery software is complex and highly fragmented. There are a number of different “flavors” of data discovery, and a variety of use cases in which one flavor works better than another.

In this Buyer’s Guide, we’ll explain how data discovery software differs from traditional BI and describe the categories into which these tools break down.

Here’s what we’ll cover:

How Do Data Discovery Tools Differ From Traditional BI Systems?
Capabilities of Data Discovery Software
Types of Data Discovery Tools

How Do Data Discovery Tools Differ From Traditional BI Systems?

An easy way to understand this difference is to look at the history of BI solutions.

Traditional BI systems were an attempt to solve the difficulty of writing SQL queries to retrieve data (such as sales information, customer information and shipping records) stored in multiple relational databases. Before BI, users had to be highly familiar with SQL to get the data they needed out of such databases.

Thus, traditional BI systems mapped a layer of familiar business terms (known as a semantic layer) onto the relational databases’ storage schemas, thereby allowing users to retrieve and combine data without knowing SQL at all.

Traditional BI Semantic Layer

The semantic layer is a way of expressing a data model, or a schematic representation of the relationships between data in one or multiple datasets. In particular, the semantic layer schematizes the relationships between data residing in different data sources/databases. For instance, the dimension “customer” in the semantic layer may be defined as grouping together information from both the “sales orders” database as well as the “customer records” database.
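
To make the idea concrete, here is a minimal sketch of a semantic layer as a lookup from a business term to the SQL that defines it. It is an illustration, not how any particular BI product works: the table names, column names and the `query_term` helper are hypothetical, and the two "databases" are modeled as two tables in a single in-memory SQLite database for simplicity.

```python
import sqlite3

# Hypothetical semantic layer: each business term maps to the SQL that
# pulls it together from the underlying tables. All names are illustrative.
SEMANTIC_LAYER = {
    "customer": (
        "SELECT c.customer_id, c.name, SUM(o.amount) AS total_sales "
        "FROM customer_records AS c "
        "JOIN sales_orders AS o ON o.customer_id = c.customer_id "
        "GROUP BY c.customer_id, c.name "
        "ORDER BY c.customer_id"
    ),
}


def query_term(conn, term):
    """Look up a business term and run the SQL the semantic layer defines for it."""
    return conn.execute(SEMANTIC_LAYER[term]).fetchall()


def build_demo_db():
    """Create a tiny in-memory database standing in for the operational systems."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer_records (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customer_records VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO sales_orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
        """
    )
    return conn


# The analyst asks for "customer" and never sees the join underneath.
rows = query_term(build_demo_db(), "customer")
print(rows)
```

The analyst only needs to know the term "customer"; the join between the two tables stays hidden in the mapping.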

BusinessObjects—later acquired by SAP—was the first BI vendor to use the semantic layer model, and remains one of the most popular semantic layer-based solutions. The semantic layer model is still suitable for large enterprises that need unified access to data stored in numerous operational databases.

The problem with this model is that the semantic layer needs to be standardized across the organization. In other words, various business units must agree on which databases and tables in these databases the dimension “customer” will pull from. Moreover, once the semantic layer has been standardized, it remains under IT control.

As you can see in the above diagram, traditional tools for ad hoc queries pass analysts’ queries through the semantic layer, which automatically translates them into SQL queries to retrieve data from SQL databases and other data sources that support SQL querying. Thus, traditional querying tools can only work with data sources that have already been integrated into the semantic layer.

Data sources outside the semantic layer (a spreadsheet sent in an email, a public data source on the web, 500,000 Tweets about a product recall etc.) can’t be easily integrated with the semantic layer unless IT develops new processes. And, of course, IT can’t develop a process for every new data source.

When the semantic layer is standardized across the organization, the paths that analysts follow to retrieve and combine data get frozen into place. For instance, if the organization defines “store” as a subcategory of “branch,” and “branch” as a subcategory of “sales region,” while neglecting to slot “customer” somewhere into this hierarchy, blended analysis of sales and customer data can become overly complex.

Business terms mapped to operational data in SAP BusinessObjects

Data discovery tools remedy this situation by providing direct access to the operational databases shown in our chart, instead of going through a semantic layer. This allows users to combine spreadsheets and other data sources outside the semantic layer with operational data.

Any data preparation work that needs to be done to combine data sources (e.g., converting “customer_ID” to “customer”) is done on the fly, instead of forcing IT to standardize terminology across the organization.

Additionally, users can develop their own data models during analysis, instead of being bound to the data model encoded in the semantic layer. This allows greater flexibility for sophisticated queries that depend on blending data from multiple sources.
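
As a rough illustration of this kind of on-the-fly preparation, the sketch below renames a spreadsheet column from "customer_ID" to "customer" while loading it, then blends it with totals queried directly from an operational table. The spreadsheet contents, table layout and helper names are all hypothetical, and an in-memory SQLite database stands in for the operational system.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export that lives outside the semantic layer.
SPREADSHEET = "customer_ID,satisfaction\n1,4.5\n2,3.0\n"


def load_spreadsheet(text, renames):
    """Read CSV rows, renaming columns on the fly to match the operational data."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({renames.get(key, key): value for key, value in row.items()})
    return rows


def blend(conn, survey_rows):
    """Join spreadsheet rows with sales totals queried directly from the database."""
    totals = dict(
        conn.execute("SELECT customer, SUM(amount) FROM sales_orders GROUP BY customer")
    )
    return [
        {
            "customer": int(r["customer"]),
            "satisfaction": float(r["satisfaction"]),
            "total_sales": totals.get(int(r["customer"]), 0.0),
        }
        for r in survey_rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_orders (customer INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales_orders VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 75.0)]
)

blended = blend(conn, load_spreadsheet(SPREADSHEET, {"customer_ID": "customer"}))
print(blended)
```

The rename happens in the analyst's own session, so nothing about the organization-wide terminology has to change first.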

Capabilities of Data Discovery Software

There’s a wide range of data discovery platforms, so a list of specific features would not apply to all of them. Instead, let’s take a quick look at the broad capabilities that define these solutions:

  • Graphical front end for data manipulation: Allows for data access and manipulation via visualizations of data sources and patterns in data. Instead of writing a query, you can simply click on a wedge of a pie chart to drill down, or choose a heat-map visualization for your data.
  • In-memory processing: Processes data by storing it in RAM (random access memory) instead of writing it to disk. This gives these tools the processing power to blend massive data sets on a user’s laptop, instead of doing the blends in the database as traditional BI tools do. See our data blending report for more details.
  • Big data connections: Supports direct connections to data sources, instead of confining access to sources within the semantic layer. Support for flat files (.xlsx, .csv etc.) is nearly universal, as is support for SQL databases. Beyond that, the range of data sources a tool can connect to is generally a point of competitive differentiation.
  • Data cleaning/preparation: Offers features for cleaning and preparing data, since analysts can’t rely on pre-integration of data sources via a semantic layer. These features handle tasks such as normalizing dimensions, removing trailing spaces and testing the accuracy of joins on the fly.
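
The data cleaning capability is the easiest of these to show in a few lines. The following sketch, with hypothetical store data and helper names, trims stray spaces, normalizes case, and then reports which rows failed to match so the accuracy of the join can be checked. Real tools do this through a visual interface; the logic underneath is assumed to be broadly similar.

```python
def normalize_key(value):
    """Strip surrounding whitespace and lowercase so equivalent keys compare equal."""
    return value.strip().lower()


def join_with_report(left, right, key):
    """Inner-join two lists of dicts on a normalized key and list unmatched left keys."""
    right_index = {normalize_key(r[key]): r for r in right}
    joined, unmatched = [], []
    for row in left:
        k = normalize_key(row[key])
        if k in right_index:
            # Merge both rows and keep the normalized key value.
            joined.append({**right_index[k], **row, key: k})
        else:
            unmatched.append(row[key])
    return joined, unmatched


# Hypothetical inputs with inconsistent spacing and capitalization.
sales = [
    {"store": "Austin ", "sales": 1200},
    {"store": "boston", "sales": 900},
    {"store": "Denver", "sales": 400},
]
payroll = [
    {"store": "austin", "payroll": 300},
    {"store": " Boston", "payroll": 250},
]

joined, unmatched = join_with_report(sales, payroll, "store")
print(joined)
print(unmatched)
```

The list of unmatched keys is the useful part: it tells the analyst which rows dropped out of the blend before any chart is drawn.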

Note: Several of these definitions of data discovery capabilities were adapted from Gartner research reports, specifically What Data Discovery Means for You by Joao Tapadinhas and Dan Sommer (available to Gartner clients).

Types of Data Discovery Tools

Data discovery has been an emerging market for at least a decade, but instead of solidifying around a core set of concepts and features, the market has continued to evolve.

Data discovery functionality has also been added to traditional systems that use semantic layers, though such systems will still be overkill for many small businesses.

There are essentially three categories of data discovery solutions currently on the market:

  • “Search engine”-like tools for textual searches of data
  • Visual interaction tools that provide a graphical front-end for data manipulation
  • “AI”-based tools that do the bulk of the pattern recognition for you

Visual data interaction tools are analytics tools that directly access data sources instead of going through a semantic layer. They allow users to process massive datasets on their laptops (via in-memory caching engines) and spot patterns using a visual interface.

Data visualizations in data discovery tool Tableau

The point of a visual data discovery tool isn’t simply to crunch numbers and then output pretty charts and graphs, which can easily be done with Excel and PowerPoint. Instead, these tools are for interactive manipulation of data via visualizations.

For example, you can click on a particular city in a heat-map to begin analyzing sales just within that city’s stores. You can then add another dimension to your map—say, aggregate payroll expenses per store—to blend sales and payroll data and spot new patterns.

As you click on visualization elements and drag and drop dimensions and measures into your visualizations, an engine within the data discovery tool translates your gestures into SQL queries. Changing the visualization automatically refreshes it with newly processed data from your databases.
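
The translation step can be pictured roughly as follows. This is a simplified sketch rather than the engine of any actual product: the `build_query` function, the table and the column names are hypothetical, and identifiers are assumed to come from a trusted schema rather than from free-typed text.

```python
import sqlite3


def build_query(table, dimensions, measures, filters=None):
    """Turn a visualization's state into a parameterized GROUP BY query.

    dimensions: column names dragged onto the chart
    measures: (aggregate, column) pairs such as ("SUM", "sales")
    filters: {column: value} produced by clicking chart elements
    """
    select = list(dimensions) + [
        f"{agg}({col}) AS {agg.lower()}_{col}" for agg, col in measures
    ]
    sql = f"SELECT {', '.join(select)} FROM {table}"
    params = []
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
        params = list(filters.values())
    if dimensions:
        sql += " GROUP BY " + ", ".join(dimensions)
        sql += " ORDER BY " + ", ".join(dimensions)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_sales (city TEXT, store TEXT, sales REAL, payroll REAL)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?, ?, ?, ?)",
    [
        ("Austin", "A1", 100.0, 40.0),
        ("Austin", "A1", 50.0, 10.0),
        ("Austin", "A2", 70.0, 30.0),
        ("Boston", "B1", 90.0, 20.0),
    ],
)

# The user clicks "Austin" on a heat-map, then drags payroll in next to sales.
sql, params = build_query(
    "store_sales",
    dimensions=["store"],
    measures=[("SUM", "sales"), ("SUM", "payroll")],
    filters={"city": "Austin"},
)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Each new click or drag would rebuild the query and rerun it, which is what refreshing the visualization amounts to in this simplified picture.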

These tools thus allow for highly interactive and sophisticated database querying without forcing users to learn SQL. Moreover, they allow users to access and blend data from multiple data sources that haven’t been integrated via a semantic layer.

Visual data interaction tools are thus known as “self-service” BI tools, since business analysts can get the data they need and analyze it in the ways they want without involving IT in the workflow.

Originally, visual data interaction tools were designed to supplement the capabilities of an existing BI system. As they’ve evolved, however, they’ve incorporated more and more of the capabilities that used to be found only in traditional systems. Many organizations—especially smaller ones—are now exclusively relying on this form of data discovery as their dominant analytics platform.

Visual data interaction tools make up the bulk of the data discovery market, and frequently data discovery is used as a synonym for business analytics via interactive visualizations.

“Search engine-like” tools are a niche category in data discovery. They’re specifically for performing keyword searches of large collections of files, and they feature an interface similar to that of web search engines such as Google and Bing. Search-based tools harness text mining technology to allow users to search keywords within files and documents:

Data discovery using keyword searches and word clouds in WebFOCUS

Search-based tools are clearly not the best choice for dealing with numerical values, which are, of course, absolutely central to business analysis. Instead, this form of data discovery is used by organizations with massive collections of unstructured textual data (surveys, documents, presentations, product literature etc.) sitting in numerous data silos.

Without search-based data discovery, employees may never be able to track down the documents they need on their own. These tools thus enable better information-sharing, at the same time cutting down on the time that information “gatekeepers” have to spend tracking down documents for co-workers. Most small businesses won’t need them.
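
At its simplest, keyword search over a document collection can be sketched as an inverted index that maps each word to the files containing it. The example below is a deliberately small stand-in for the text mining these products perform; the file names, contents and function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query, sorted by name."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical files scattered across different departments.
documents = {
    "survey_2021.txt": "Customers praised the recall response",
    "brochure.txt": "Product literature for the new blender",
    "memo.txt": "Product recall notice for the blender",
}

index = build_index(documents)
print(search(index, "product recall"))
print(search(index, "blender"))
```

Commercial tools add ranking, stemming and connectors to many file types, but the core lookup resembles this index.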

“AI”-based tools. Visual data interaction tools can be used to support pattern recognition via machine learning (or “AI” in layman’s terms). Generally this requires integration with a variety of other tools and technologies ranging from the statistical programming language “R” to Apache Spark (a framework for programming machine-learning algorithms in cluster computing environments).

“AI-based” data discovery tools directly leverage machine learning to spot patterns for users, instead of enabling users to spot patterns themselves through visual analysis. These tools then output visualizations and can even express the patterns they find in narrative form for users (for example, they can output a sentence stating “Q4 revenue down 2.1 percent in Kentucky branch stores served by X, Y and Z distributors.”).

Don’t assume that a HAL 9000 will replace your analysts anytime soon, however. Human beings still need to vet the patterns to make sure that they’re truly significant, and once a pattern has been spotted, users can continue to refine the analysis by asking new questions of the tool, similar to the workflow in a visual data interaction tool.
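
To show what turning a pattern into a sentence can look like, here is a toy sketch. It does not use machine learning: it is a simple rule that picks the segment with the largest quarter-over-quarter decline and fills in a sentence template. The revenue figures, segment names and function names are hypothetical.

```python
def largest_decline(revenue):
    """Return the segment with the lowest percent change and that change.

    revenue maps segment -> (previous_quarter, current_quarter).
    Segments with a zero previous quarter are skipped.
    """
    changes = {
        segment: (current - previous) / previous * 100.0
        for segment, (previous, current) in revenue.items()
        if previous
    }
    segment = min(changes, key=changes.get)
    return segment, changes[segment]


def narrate(segment, pct, period="Q4"):
    """Express a percent change as a short sentence."""
    direction = "down" if pct < 0 else "up"
    return f"{period} revenue {direction} {abs(pct):.1f} percent in {segment}."


# Hypothetical revenue by region for two consecutive quarters.
revenue = {"Kentucky": (1000.0, 979.0), "Ohio": (800.0, 820.0)}

segment, pct = largest_decline(revenue)
sentence = narrate(segment, pct)
print(sentence)
```

A sentence like this is only a starting point; an analyst still has to decide whether the change is significant.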

Examples of “AI”-based data discovery tools include IBM Watson and Salesforce BeyondCore. This is still an emerging market, and while promising, these solutions are too expensive and technologically immature for SMB users at present. Most SMBs will be better served exploring the wide range of visual data interaction tools on the market.

Note: Several of these definitions of categories in the data discovery market were adapted from Gartner research reports, specifically What Data Discovery Means for You by Joao Tapadinhas and Dan Sommer (available to Gartner clients).