9 Data Terms Explained: Making Sense of a Hot Commodity


by Margery Murphy on February 11, 2019

We hear about data analytics and data-driven decision-making in just about every industry from health care to education to manufacturing. In the world of B2B marketing and sales, companies turn to data to create buyer profiles and personas, promote new products and services, and target the best prospects. Data, and the ability to analyze and leverage it, is one of the most valuable resources companies have.

At its simplest, data is information. Even paper accounting ledgers and phone books count as data, but as computing power and the Internet have grown, most of what we think of as data today is digital information created by humans, computers, or other machines. That includes all of the websites, searches, text and email messages, photos, videos, social media, and smart devices out there. In fact, the International Data Corporation (IDC) predicts that the “Global Datasphere” will grow from 33 zettabytes (ZB) in 2018 to 175 ZB by 2025 (for context, 1 ZB = 1 × 10^12 GB, or one trillion gigabytes).
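To put the IDC figures in perspective, here is a small back-of-the-envelope calculation. It assumes decimal units (1 GB = 10^9 bytes, 1 ZB = 10^21 bytes), and the annual growth rate is simply what the two IDC numbers imply if growth compounds evenly; it is not a figure quoted from IDC.

```python
# Assumes decimal (SI) units: 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes.
GB_PER_ZB = 10**12

datasphere_2018_zb = 33   # IDC estimate for 2018
datasphere_2025_zb = 175  # IDC prediction for 2025

# Overall growth between the two years.
growth_factor = datasphere_2025_zb / datasphere_2018_zb

# Implied compound annual growth rate (our own derivation, not an IDC number).
years = 2025 - 2018
annual_growth = growth_factor ** (1 / years) - 1

print(f"2018: {datasphere_2018_zb * GB_PER_ZB:,} GB")
print(f"2025: {datasphere_2025_zb * GB_PER_ZB:,} GB")
print(f"Growth: {growth_factor:.1f}x overall, about {annual_growth:.0%} per year")
```

In other words, the prediction amounts to roughly a fivefold increase in seven years.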

If all of this sounds overwhelming, or even incomprehensible, read on as we decode some common data-related terms and concepts.

Big data

Big data refers to data sets that are too large for one person or one software program to analyze, sort, and make use of. Something of an umbrella term for all types of digital information, it is often divided into structured data (e.g. spreadsheets, relational databases) and unstructured data (e.g. documents, images, Google search strings, social media posts, sensor data). By a commonly cited estimate, about 80 percent of all data is unstructured.

Open/publicly available data

Open or public information is free, without copyright, patent, or encryption protection; it does not require registration and is not available on a limited-time basis. Much of this data is from government, non-profit, and NGO sources, such as the US Census Bureau and the Securities and Exchange Commission. Other sources of free data include directories such as the Thomas directory, professional associations, trade publications, media outlets, websites and blogs, and public documents.

Marketers find these sources useful for seeing the big picture and broad trends. They’re free, but may require time, knowledge, and computing power beyond your resources to locate or interpret data. Some sources allow free access to a limited number of articles or downloads, then require registration or a fee to access “premium” content.

Private/secure data

Private data is stored “in the cloud,” on remote server computers, or on an internal network, and is protected by passwords, firewalls, or encryption. Examples include Google Docs, photos uploaded to password-protected commercial sites, equipment data stored on a secure remote server, and proprietary or personnel information stored on a corporate intranet. While this data is generally considered inaccessible without the correct credentials, clever data thieves can and do breach security regularly.

Fee-based/limited access data sources

Some websites or subscription-based databases require registration or a fee for access. For marketers, these include sites that provide demographic, survey, or market info to give context and background insight. The data they offer includes media market information, company information, market share and competitor details, and in some cases contact information. Contact details can include addresses, phone numbers, and email addresses, but may be limited to higher-level employees (e.g. executives or directors). Examples include Hoovers, LinkedIn Sales Navigator, some professional organizations, and Nielsen ratings.

These sources are helpful because they usually provide data that’s unavailable from free sources. They also do the work of combing through public information, company websites, and other documents to aggregate data. The downside is that they can be extremely expensive, and it is often unclear where the data comes from.

Purchased/rented contact lists

You’ve probably heard of companies that compile email or general contact lists and sell or rent the information to you. These lists can be useful for acquiring a large number of contacts quickly. However, you will not be able to verify whether the seller follows the opt-in guidelines most email marketing services require, and there is no way to know how accurate the information is or where it came from. Generally, these lists are not fine-tuned to your product or niche, and emails you send to these contacts could be regarded as spam by recipients, potentially harming your reputation.

Crowd-sourced data

Crowd-sourced data sources are those where users supply information voluntarily. Examples include LinkedIn, Glassdoor, Quora, and Wikipedia. They’re useful for finding and connecting with contacts, conducting background research on a prospect or company, or getting an overview of a topic. They’re often free and include large amounts of data. The major downside is that the data is only as accurate as what users provide, update, or correct. It’s not uncommon to find outdated profiles, broken links, typos, and other unvetted details.

In-house/proprietary data

In-house data is what you gather and catalog in your CRM from talking with your prospects and customers, contact information you obtain through gated content forms, or targeted lists of contacts built to your specifications by a reputable marketing company. These lists are useful for drilling down to specific titles and roles in companies because you can easily validate and update contacts when you already have a “foot in the door.”

You’ll gain familiarity with your prospects and their pain points by combing through and validating contact information yourself, or you can save time gathering the data by outsourcing the work. Either way, the downside is that list building costs time or money.

Data scraping/web scraping

Data scraping refers to downloading or extracting data from a website to collect in a spreadsheet or other file. It differs from surfing the web to gather information manually because scraping is automated, using an application written for this purpose. It’s useful for gathering contact information and other specific data from many websites quickly. Although scraping publicly available data is often legal, always read the fine print: a website’s terms of service may expressly prohibit it.
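To make the idea concrete, here is a minimal sketch of what a scraping script does, using only Python’s standard library. Everything specific in it is an assumption for illustration: the sample HTML, the names and email addresses, and the helper names `ContactParser`, `scrape_contacts`, and `to_csv` are all made up. To stay self-contained it parses a hard-coded page rather than downloading one; a real scraper would first fetch the HTML from a site whose terms of service allow it.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page standing in for HTML downloaded from a website.
SAMPLE_HTML = """
<ul>
  <li><a href="mailto:ana@example.com">Ana Ruiz</a></li>
  <li><a href="mailto:lee@example.com">Lee Chen</a></li>
  <li><a href="/about">About us</a></li>
</ul>
"""


class ContactParser(HTMLParser):
    """Collects (name, email) pairs from mailto: links."""

    def __init__(self):
        super().__init__()
        self.contacts = []
        self._email = None  # set only while inside a mailto: link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith("mailto:"):
                self._email = href[len("mailto:"):]

    def handle_data(self, data):
        # The link text is treated as the contact's name.
        if self._email and data.strip():
            self.contacts.append((data.strip(), self._email))
            self._email = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._email = None


def scrape_contacts(html):
    """Return every (name, email) pair found in the HTML."""
    parser = ContactParser()
    parser.feed(html)
    return parser.contacts


def to_csv(rows):
    """Format the rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "email"])
    writer.writerows(rows)
    return buffer.getvalue()


contacts = scrape_contacts(SAMPLE_HTML)
csv_text = to_csv(contacts)
print(csv_text)
```

The script skips the “About us” link because it is not a mailto: link, which is the kind of filtering that makes scraping faster than copying details by hand.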

Data cleansing/data scrubbing

This is the process of checking data for errors, duplicates, inconsistencies, outdated information, or poor-fitting contacts, with the goal of creating an accurate dataset. There are many approaches, including sorting and removing duplicates (i.e. deduping), standardizing record formatting and terminology, calling contacts to verify information, or cross-checking with other sources. It’s worth the effort to create a high-quality list, but it can be time-consuming if you do it yourself and expensive if you outsource it or use fee-based sources for validation.
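Two of the approaches above, standardizing formatting and deduping, are easy to sketch in a few lines of Python. The contact records and the helper names `normalize` and `dedupe` are hypothetical, and treating the email address as the unique key is an assumption; your own list may need a different rule for deciding when two records are the same person.

```python
def normalize(record):
    """Return a copy with trimmed whitespace and consistent casing."""
    return {
        # Collapse repeated spaces, then capitalize each word of the name.
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "title": record["title"].strip(),
    }


def dedupe(records):
    """Keep the first record seen for each email address."""
    seen = set()
    unique = []
    for record in records:
        if record["email"] not in seen:
            seen.add(record["email"])
            unique.append(record)
    return unique


# Hypothetical raw list with inconsistent formatting and one duplicate.
raw_contacts = [
    {"name": "  ana  ruiz", "email": "Ana@Example.com ", "title": "VP Sales"},
    {"name": "Ana Ruiz", "email": "ana@example.com", "title": "VP, Sales"},
    {"name": "lee chen", "email": "lee@example.com", "title": " Director"},
]

# Normalize first so that differently formatted duplicates match.
clean_contacts = dedupe([normalize(r) for r in raw_contacts])

for contact in clean_contacts:
    print(contact)
```

Note the order: the two Ana Ruiz records only look like duplicates after the email addresses are trimmed and lowercased. Simple capitalization will also mangle some names (McDonald becomes Mcdonald), which is one reason automated cleansing is usually paired with a manual check.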

Data is an ever-growing resource and businesses are finding new ways to study, manipulate, and apply it each day. Sales and marketing professionals are no exception. Not sure how to get started? Acadia can help with list building, contact validation, and more. Contact us today.

