Unique Company Identifiers

Amazon Family Tree (Source: D&B Hoovers)

Associating company records with a common identifier is critical for Account Based Marketing as well as other sales and marketing methodologies.  Lacking a common identifier makes it difficult to

  • De-duplicate company records.
  • Associate subsidiaries and branches with headquarters.
  • Perform both real-time and batch data enrichment of firmographic, technographic, and social links.
  • Associate company news and sales triggers with key accounts.
  • Tie together company records across multiple platforms.
  • Assess the risk (e.g. credit, supplier, reputational) associated with a business.

The importance of a “unique identifier” was discussed by Owler CEO Jim Fowler in the Harvard Business Review:

The best way to keep data clean is to use a globally known, unique identifier, or a “data backbone.” My company prefers to use URLs as identifiers. They’re free, globally recognizable, high-quality data points that enable you to efficiently gather information on a business’s industry, online activities, and functionality. For example, Cisco is a company that also goes by Cisco Systems, Inc. and Cisco Precision Tools. If sales containers required users to type in one unique URL, http://www.cisco.com/ for all those different branches, it’d be much more difficult to create duplicate accounts, which helps keep data clean. Perhaps more important, URLs facilitate communication between people, systems, and even departments. Whether it’s the customer relationship management platforms used by sales teams, enterprise resource planning software used by purchasing teams, or the account-based marketing technology employed by marketing teams, the business intelligence platform can recognize a unique URL and attach it to clean, usable data. Unique identifiers let you know you’re pulling from the sources and contacts you’ve intended to track.

I agree with 90% of what Fowler states, but disagree with his recommendation that URLs are the best unique identifier for his “data backbone”.  There are a number of reasons that URLs fall short:

  • URLs are not persistent.  If a company is acquired or renames itself, the old identifier (URL) is not retained.  This creates a potential disconnect between the old and new name.
  • URLs have a many-to-one mapping which treats most subsidiary and branch locations the same as the headquarters.  For some companies, mashing together all locations into a single record may be sufficient, but it is a highly flawed approach as it loses much of the nuance concerning companies that operate across multiple sectors and countries (e.g. General Electric).  It also makes it very difficult for sales reps to sell deeper into an organization which lacks linkage data.
  • Conversely, companies with multiple URLs are not tied together.  This could happen due to differing country domains (e.g. .uk, .fr), division names, brand names, and subsidiaries.  In each of these scenarios, related entities are treated as separate businesses.  Amazon has many distinct businesses including Amazon Web Services (aws.amazon.com), Zappos (www.zappos.com), Alexa Internet (www.alexa.com), Audible (www.audible.com), Internet Movie Database (www.imdb.com), and soon Whole Foods (www.wholefoods.com).  URLs do not provide a consistent data backbone when subsidiaries, acquisitions, and branches have different domains.
  • When a division or facility is divested, there is no way to determine which locations have been spun off.
  • Franchises are treated as part of the parent company when they are separate legal entities.
  • Not all companies have websites.
  • URLs can be sold.  They can also be reused if a company goes out of business or abandons a URL.

Finally, business decisions related to logistics, credit, supplier risk, and financing need to understand the underlying structure of companies.  It is not just marketing and sales that are impacted by standardizing on a non-persistent, quasi-unique identifier.

I would therefore recommend looking at credit data companies as a better source of unique identifiers.  Companies such as Dun & Bradstreet, Experian, Equifax, and Infogroup all offer location-level detail and linkage associated with unique identifiers that have been developed over multiple decades.  They offer sophisticated entity matching and enrichment tools such as Dun & Bradstreet’s Optimizer service.  Furthermore, these firms support multiple functions across the organization, assisting with cross-platform entity linking and on-demand decisioning.
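The difference between keying on URLs and keying on a persistent, linkage-aware identifier can be sketched in a few lines of Python.  The records below are invented for illustration; the `gu_id` field is a hypothetical "global ultimate" identifier loosely modeled on credit-bureau-style corporate linkage, not a real D-U-N-S number:

```python
from collections import defaultdict

# Hypothetical CRM records. Names, URLs, and identifiers are illustrative
# only; "gu_id" stands in for a global-ultimate parent identifier of the
# kind credit data vendors maintain (NOT real D-U-N-S numbers).
records = [
    {"name": "Amazon.com, Inc.",    "url": "www.amazon.com", "gu_id": "100"},
    {"name": "Amazon Web Services", "url": "aws.amazon.com", "gu_id": "100"},
    {"name": "Zappos",              "url": "www.zappos.com", "gu_id": "100"},
    {"name": "Cisco Systems, Inc.", "url": "www.cisco.com",  "gu_id": "200"},
]

def group_by(rows, key):
    """Group record names under a shared identifier field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["name"])
    return dict(groups)

by_url = group_by(records, "url")       # four separate "companies"
by_family = group_by(records, "gu_id")  # two corporate families
print(len(by_url), len(by_family))      # 4 2
```

Grouping by URL leaves the Amazon family fragmented into three unrelated accounts, while grouping by a persistent parent identifier ties the subsidiaries back to their ultimate parent.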

ReachForce Unveils 3×360 Lead Hygiene

Data Quality Automation vendor ReachForce unveiled a new technology update it calls “3×360.”  The new release provides improved visibility & control, full-spectrum intelligence, and implementation & integration flexibility.  The MarTech hygiene platform provides real-time data enrichment, cleansing, and updating for Marketo, Eloqua (Oracle), Silverpop (IBM), Hubspot, and Salesforce.  The improved technology enhances both their SmartForms web form enrichment and Continuous Data Management services.

“Marketing technology stack options continue to evolve and change, however, a single fundamental factor remains unchanged in determining the success of any marketing operation – the quality and depth of the marketing lead data that flows through it,” said ReachForce CEO Bob Riazzi. “With our 3X360 update we’ve worked hard to consider multiple aspects of the marketing technologist’s needs and are proud to introduce a full spectrum of new product capabilities that will support them and the way they leverage data through their tech stacks. And why 3X360?…well, we’re a bunch of geeks from Austin.”

A new 360° Console provides a central dashboard for tracking lead enrichment and data quality.  Analytics include

  • The health of web forms that use SmartForms and the enriched leads submitted.
  • Rich drill-down reporting for Submits by form, Abandons by form, Match Rate by Geo, and Marketable Submits.
  • A “Service Snapshot” provides summaries of enriched leads, match rate, and usage, enabling timely tracking and management of the service contract.
ReachForce Service Snapshot for SmartForms

Future Console enhancements include “client-driven configuration management and on-demand file uploads for immediate data quality improvements and enrichment.”

While ReachForce has long provided firmographic enrichment combined with contact validation and verification, they are now supporting contact-level data enrichment.  The new matching capability enriches leads with business card details, job role and function, and a social profile.

SmartForms before and after Contact Enrichment.

ReachForce simplified their SmartForms integration via a “simple, one-line implementation” for one or multiple forms.  The new functionality helps marketers roll out SmartForms “via multiple unique configurations without additional implementation steps.”  Their JavaScript API also “allows developers to integrate SmartForms into dynamic lead form workflows and enables decision-making based on individual SmartForms interactions.”

Trillium Software Sold to SyncSort

“Why Good Data Matters” Statistics from Trillium Software

Harte-Hanks, which has been looking to sell off its Trillium Software data quality division for several quarters, found an acquirer in SyncSort.  The $112 million transaction will combine the data quality and integration capabilities of the two firms with a focus on extending Trillium services into the Hadoop big data platform.  The transaction is subject to regulatory approval.

“With most large enterprises making significant investments in Big Data for business and operational analytics, core data integration and data quality workloads are moving into Hadoop at a rapid pace,” said SyncSort CEO Josh Rogers. “As a pioneer in bringing high-performance data integration software to the Hadoop ecosystem, Syncsort sees an opportunity to extend our unique value with Trillium’s proven, best-of-breed data quality products. Together, we are a clear leader in the data integration and data quality market, and the logical choice for large enterprises seeking to chart a path to Hadoop. We look forward to working closely with Trillium’s customers and investing in the great products they have come to rely on.”

Harte-Hanks has been slimming itself down the past few years.  They sold off their publishing group and Trillium Software, and spun off Aberdeen Services (Aberdeen market research and the Access CI technology database).

“Our announcement today is the result of a comprehensive process to maximize the value of the Trillium Software business in the growing Data Quality and Data Governance segment,” said Harte-Hanks CEO Karen Puckett. “Now Harte Hanks can wholly focus resources on our core strengths and capitalize on our unique combination of marketing strategy, analytics, and execution capabilities. The sale of Trillium Software, along with the cost reduction program we implemented in 2016, provides Harte Hanks with a stronger balance sheet as we move the Company on its path toward revenue stability and historically strong cash flows and improved profitability.”

In a recent Magic Quadrant on Data Quality Tools, Gartner placed Trillium in the Leaders quadrant and gave them high marks for the “strength and stability” of their core profiling, parsing, standardization, and matching functionality.  Trillium was also noted for its “strong mind share and a very long and solid track record of delivering data quality solutions” and its growth in cloud-based deployments.

On the negative side, Gartner noted concerns about pricing, “ease of installation, upgrade, and migration,” and the uncertainty that surrounds the Harte-Hanks divestment (the analysis was prior to the announcement).

Digital Transformation and Sales Intelligence

Data Source: “The 2016 Guide To Digital Predators, Transformers, and Dinosaurs,” Forrester Research, May 2016.

Forrester released a study titled “The 2016 Guide To Digital Predators, Transformers, and Dinosaurs” which argued that companies need to quickly transform themselves into digital businesses.  The study broke businesses into three digital categories (Predator, Transformer, and Dinosaur) and evaluated the percentage of each company’s business that is either delivered as digital services or sold online.

Predators are already generating over 80% of their business digitally and will grow their business to 90% by 2020.  For them, digital is a foundational element of their operations.

Likewise, transformers are quickly evolving into digital businesses while dinosaurs are plodding along.  In 2014, only one in six dollars was generated digitally at transformers, but by 2020, two of every three dollars will be digitally mediated at transformed businesses.

At the dinosaurs, only one in three dollars will be digitally generated in 2020.

Forrester found that transformers are customer-centric in their business strategy and processes.  Customer obsession is part of their corporate DNA:

While all companies profess to put customers first, it’s clear from the data that executives at digital Predators care more passionately about the customer across multiple dimensions: In every customer metric we measured, these executives rated the importance of the customer higher than peers in transformers and dinosaurs – in short, they are not just customer obsessed, they are really, really customer obsessed.

  • Nigel Fenwick, Forrester VP and Principal Analyst

Overall, Forrester found that 29% of current total sales are influenced by digital, but that 47% would be digitally influenced by 2020.  Thus, any business that wishes to remain competitive must have a digital strategy that encompasses sales, marketing, credit decisioning, contracting, and all of the elements across the sales funnel.

My blog focuses on sales intelligence (with some discussion of marketing intelligence and DaaS), so I’m covering a subset of this transformation.  But sales intelligence is a key element of the digital transformation of sales and marketing.  Its goal is to make sales reps more efficient and effective at generating revenue through

  • Improved understanding of customers and prospects.  Whether the company is employing ABM, ABSD, social selling, trigger selling, or other techniques, customer-centricity begins with an understanding of the customer at the contact, company, and industry level.  Sales intelligence vendors go beyond firmographics and contact data to deliver business descriptions, SWOTs, biographies, social posts, industry research, financials, analyst reports, technology platforms, etc.
  • Current Awareness. Improved awareness of changes at customers and prospects helps to improve account planning, messaging, and forecasting.  Where once this intelligence was delivered as generic company news, the sales intelligence vendors have refined their tagging and now provide high precision sales triggers which are accurate at both the company and business topic level.  Some have even begun to integrate sales triggers into their prospecting engines.
  • Reduced busywork + improved data quality.  Sales intelligence vendors cut the time wasted on busywork through the implementation of DaaS enrichment of accounts, contacts, and leads.  Enrichment provides more accurate firmographics, corporate linkage, and contact information which is then propagated to downstream systems.  It also reduces the keying done by prospects on web forms and sales reps in CRMs.  Furthermore, targeting, segmentation, and messaging are much more accurate when the ongoing maintenance of account intelligence is managed by a third party.

Over the past decade, sales intelligence firms have grown from standalone web information portals to integrated workflow services that deliver a broad set of account intelligence to CRMs, marketing automation platforms, sales acceleration (ABSD) services, Google Chrome, web forms, and mobile devices.  Thus, sales intelligence is now becoming available to sales, marketing, and service departments across a broad set of platforms and devices.

If you would like to read more on my thoughts concerning the digital transformation of sales and marketing, I have also discussed the topic on Sparklane and Avention’s blogs.

Data Enrichment Assists Digital Transformation

Blog on the Sparklane UK website discussing how sales and marketing can prepare for Digital Transformation.

In a blog on Sparklane’s website, I had the opportunity to discuss how sales and marketing can digitally transform their departments by focusing on data enrichment and sales intelligence.

Firms have traditionally taken a haphazard approach to data quality, failing to recognize that data quality is a function of both initial data (keyed data, web forms, trade show scans, purchased lists, etc.) and time.  Data is dynamic.  It can be accurate today and inaccurate tomorrow.  That’s why data quality is often broken down into three dimensions: Accuracy, Completeness, and Timeliness.

So not only are firms failing to enrich data in real time as it is acquired (or in batch if purchased), but they are also ignoring the simple facts that

  • Companies relocate
  • Offices are shuttered
  • Execs change companies or positions within companies
  • Corporate URLs and email domains are changed when companies are acquired or renamed
  • Companies grow and shrink

The result has been saw-tooth data quality charts, with quality spiking at each data refresh and then quickly declining.  Both company and contact data are subject to decay, with contact data declining at a rate of 25% per annum (a recent Radius study puts it at 27%).
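A minimal simulation makes the saw-tooth shape concrete.  Using the commonly cited ~2.1% monthly contact decay rate, accuracy under an annual-refresh-only regime looks like this (the simulation is an illustrative sketch, not a model of any vendor’s data):

```python
MONTHLY_DECAY = 0.021  # commonly cited contact decay rate per month

def annual_refresh_accuracy(months=24):
    """Share of accurate contacts over time with a cleanse every 12 months."""
    accuracy, path = 1.0, []
    for month in range(months):
        if month % 12 == 0:
            accuracy = 1.0             # refresh: quality spikes back to 100%
        accuracy *= 1 - MONTHLY_DECAY  # then decays a little each month
        path.append(round(accuracy, 3))
    return path

path = annual_refresh_accuracy()
# Just before the second refresh, accuracy has slid to roughly 78%
# (about a 22% annual loss, in line with the ~25% rate cited above);
# it then snaps back, producing the saw-tooth shape.
print(path[11], path[12])  # 0.775 0.979
```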

To address this problem, firms should evaluate third-party solutions which provide a reference database matched against their sales and marketing datasets.  By standardizing on a reference dataset, sales and marketing operations can deploy a single source of truth across data acquisition (e.g. list loads, prospecting, web forms) and maintenance (ongoing updates to their CRM and Marketing Automation platforms).

There are many benefits to this approach:

  • Web Form and other keyed data is immediately verified and graded.
  • Lead Scoring is based upon richer and more accurate data.
  • Duplicates are detected before being created, allowing leads to be matched to current customers and prospects.
  • Leads from subsidiaries and branches are tied to ABM accounts, ensuring they are properly scored and routed.
  • Addresses, Phones, and other key firmographic and biographic fields are standardized ensuring they are properly segmented, targeted, and routed.
  • Sales and Marketing no longer waste resources targeting individuals who have left an organization.
  • Sales has more complete data for lead qualification, prioritization, and messaging.
  • Higher quality data is propagated to downstream systems, reducing the long-term cost of maintaining those platforms and helping prevent downstream errors and duplicates created by low quality upstream data.

And those benefits are simply those from cleaner data.  That is before we begin to consider the value of sales intelligence platforms in account planning, messaging, current awareness, identifying additional contacts at current accounts and prospects, and opportunity prioritization.

So if you want to begin to improve enterprise decision making and efficiency, an excellent place to start is in improving the data which is the lifeblood of your digital platforms.

Radius: Data Decay Rates

While there is a commonly cited statistic that contact data decays at a 2.1% rate per month, the nature of this decay has been less reported.  Predictive analytics company Radius conducted a study of 10,000 businesses and measured the rate of decay over three months, with data quality assessed by external vendors in May and August 2016.  The Move or Unreachable value of 27% is similar to the often-cited annual decay rate of 25% for contacts.

Radius published only three month decay rates, but I annualized the data using a four-period compounding formula.

Radius three-month data decay rates with imputed annual rates calculated by GZ Consulting.

One statistic that I did not annualize is the “Emails become Invalid” rate.  If 7.6% of contacts are not reachable after three months, then why are only 2.5% of emails becoming invalid?  There are several reasons.  First, approximately 8% of companies set their mail servers to not send bounce messages (or 0.6% of the three-month spread).  Second, most companies do not immediately turn off email addresses when a person leaves the firm.  They generally forward the emails for a period of time to an administrative assistant or the individual who has assumed the departed person’s role.  This tends to be a temporary situation, but it explains the 5% gap between the two rates.  As one would expect companies to eventually decommission old emails, the annual rate of emails becoming invalid should be closer to 25% than the non-displayed CAGR rate of 9.7%.
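The four-period compounding behind the imputed annual rates is straightforward.  A sketch using the Radius three-month figures quoted above:

```python
def annualize(three_month_rate):
    """Annual decay implied by compounding a three-month rate over 4 periods."""
    return 1 - (1 - three_month_rate) ** 4

moved_or_unreachable = annualize(0.076)  # ~0.271, i.e. ~27% per annum
emails_invalid = annualize(0.025)        # ~0.096, near the ~9.7% cited above
print(round(moved_or_unreachable, 3), round(emails_invalid, 3))  # 0.271 0.096
```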

Radius is looking to address the decay problem in its database by leveraging its clients’ second-party data to obtain network effects for augmenting and updating its file.  Customers opt into the network with their data immediately anonymized and aggregated, “providing additional points of validation and verification.”  Customer contributions now cover 70% of the businesses in Radius’ Business Graph, spanning one billion interactions.

Zoominfo has employed a similar model over the past few years for building out their contact file.  Their Community network has lifted their coverage of active US B2B contacts to 80 million.

Radius claims that the network improves the accuracy, comprehensiveness, and freshness of their data.  For example, phone connect rates improve from 84% to 93% when there are at least five data validation points.  Likewise, physical address accuracy improves from 85% to 96% when there are at least five validation points.

The comprehensiveness of firmographic data also improves with additional members.  Without the customer network, only 64% of records had full firmographic or contact attributes.  The population of comprehensive records rises to 81% with fifty network members.

Finally, Radius claims its network is “up to 20 times faster” at updating the Business Graph “than with traditional, manual methods of data collection and validation.”

“Network effects have long been a driver of business value and innovation across many industries, particularly for B2C companies,” said Radius CEO Darian Shirazi. “At Radius we are pushing the envelope on what B2B companies can come to expect from data. Now, leveraging customer network effects opens the door to further transform B2B data and develop new marketing innovations. By tapping into our predictive expertise and already robust data set, customer network effects can help marketers make smarter, faster decisions that drive revenue and growth.”

ABM: The Art of the Start (Avention)

This morning, sales and marketing intelligence vendor Avention unveiled a survey and set of recommendations on implementing Account Based Marketing strategies.  ABM is quickly moving from a buzzword to an actionable strategy for strategically targeting your best customers and prospects.  If you are considering an ABM strategy or researching how to move forward with ABM, Avention’s “Account-Based Marketing: The Art of the Start – Leveraging a Strong Data Foundation to Fuel ABM Success” guide is now available.

The Avention survey of over 100 top-level B2B executives (e.g. CEO, CMO, VP of Sales) found that ABM strategies require “careful data-driven planning, execution and monitoring.”  Furthermore, the lack of data access and quality are “fundamental impediments” that need to be resolved for ABM strategies to succeed.

ABM is based upon strategic targeting of your best accounts and similar companies.  If your underlying data is poor you will have problems with best customer cloning, account messaging, and drilling deeper into organizations for cross-sell and upsell.  Furthermore, “once a program is started, it is essential that account and market news and events be monitored to ensure programs remain relevant.”

Todd Berkowitz, research vice president for Gartner,[1] wrote in a January report, “By getting a better view of customer data, creating predictive models, employing account-based marketing, and creating internal and external-facing content specifically for existing customers, even marketing leaders from smaller providers can increase the likelihood of success.”

90% of the surveyed B2B execs believe that ABM is relevant to their organization and 86% are confident that ABM will drive growth; however, 75% are having trouble finding the appropriate contacts for selling deeper into target organizations, and more than 50% of B2B marketers lack the ability to monitor and adjust programs directed towards ABM accounts due to a lack of real-time intelligence.

ABM Survey of B2B Executives republished with permission from Avention.

Avention noted that ABM is a long-term strategy that requires continuously updated account and contact intelligence if campaigns are to remain relevant.  For example, messages and offers may need to be adjusted due to key events such as executive changes.

Avention CEO Steve Pogorzelski summarized the ABM implementation problems found in the survey:

Almost two-thirds of the marketers responding to our survey report not having access to a single source of truth for customer data.  This obviously impedes starting an ABM program and running it to successful conclusion, as such programs demand access to accurate and continually updated market and account data.  ABM offers enterprises the opportunity to quickly fuel their customer acquisition, growth and retention strategies.

Pogorzelski noted that Avention provides three capabilities which support sales, marketing, and sales operations ABM responsibilities:

  • Marketing: “Consolidation and visualization of in-house customer data for sophisticated segmentation.” (For more on these capabilities, see my February blog on the launch of their DataVision platform).
  • Sales: Strategic intelligence concerning companies and contacts along with predictive indicators.
  • Sales Operations: CRM and Marketing Automation ecosystem connectivity.

Finally, I would note that traditional approaches to marketing data quality which involve annual data cleanses are insufficient to meet ABM and predictive marketing needs.  Marketing data, particularly contacts, ages quickly.  The lack of a continuous data quality strategy will result in a drop-off in sales and marketing productivity as contact and company data decays.

[1] Gartner, Tech-Go-to-Market: Four Ways Marketers can Generate Demand with Existing Accounts, January 29, 2016


Full Disclosure: I broadly advise companies across the sales intelligence space including Avention.  While I periodically write commissioned blogs for Avention, none of my commentary on my own blog or social media accounts is commissioned.

Data Quality Isn’t Glamorous — Now Get over It

The Data Health Scan Report is one of the Optimized Customer Data Services.
A Data Quality report is an excellent way to start a data quality program.  It helps with sizing the problem and providing an initial remediation cost.

Unfortunately, data quality is a boring topic. No new CMO has ever joined a company and said, “First, let’s perform a merge/purge on our account and contact records, standardize the fields, and enrich the records.” (OK, I’m being hyperbolic, there may have been a few). No, they want a shiny new marketing automation platform, new branding, and an advertising campaign that gets the company noticed.

Sadly, there is little glory in improving your marketing database — unless, of course, you want to improve your lead nurturing, scoring, segmentation, routing, and sales ready lead quality.

Quality is generally seen as a cost center but it can just as easily be viewed as a cost saver. Bad quality erodes your marketing effectiveness, hurts your brand, kicks the knees out from under your nascent big data experiments, and demoralizes sales reps. A bad company or contact record is like a virus propagated from system to system raising the cost over time.

Furthermore, how can you think about predictive analytics when your databases are rife with bad, incomplete, and out of date records?

Bad data isn’t simply a mistyped address. It’s also:

  • Missing lead firmographics making it difficult to nurture, score, route, and qualify leads.
  • Invalid emails that hurt your deliverability scores and decrease the likelihood your messages will be delivered to inboxes instead of spam folders.
  • Junk fields on web forms because the individual didn’t want to fill out a dozen fields to read your whitepaper.
  • Large gaps in your segmentation analysis labeled UNKNOWN.
  • Hosting costs for storing out of date and duplicate data.
  • Leads with missing linkage that were held for nurture because the marketing automation system didn’t know the location was a subsidiary of a Fortune 500 company.
  • Poor marketing messaging and targeting that tell the recipient that you know nothing about their business, job function, industry, or company size.

Finally, bad lead quality incentivizes sales reps to ignore leads because marketing never seems to send the “Glengarry” leads. Instead, they become demoralized as they call invalid phone numbers or find that the contact “doesn’t work here anymore”.

Henry Schuck, CEO of DiscoverOrg, describes the situation well:

Sifting through crappy leads as a sales person is incredibly demoralizing. Their commission – which often translates into their ability to save for their family’s future, have disposable income or cover their mortgage and car payments – depends on them being able to close business. Their ability to close business, in turn, depends solely on their ability to find, set appointments with, and CLOSE new opportunities. If the leads provided by your company will not help them do that – how does that feel? They just moved companies to come work for you and their future is uncertain, at best.

So look at data quality holistically. Address it at the front end in your call centers and web forms and then enrich and maintain your database over time. As contact records decay at a 25% rate per annum you need to view data quality as an ongoing process, not simply an annual refresh (which is more than many companies even do).

By flipping your perspective, it is easy to find myriad tangible benefits that justify the cost of data quality programs. It may not lead to glory, but by recognizing the distributed costs of bad data and then remediating them, you can generate significant ROI.

Photo Credit: Data Hygiene report from Dun & Bradstreet NetProspex Workbench

Data Science and Competitive Advantage

Glassdoor Tech Salaries

Social media job site Glassdoor recently published its second annual ranking of the top jobs in America and, of the top twenty-five jobs, ten were in technology.  The top ranked position was data scientist which jumped from ninth last year.  Other high ranked positions were Solutions Architect (#3), Mobile Developer (#5), and Product Manager (#8).  Glassdoor bases their rankings on three variables: the number of job openings, salary, and career opportunities rating.

The Median Base Salary for a data scientist is $116,840.  Other tech base salaries can be seen in the above graphic.

When Network World interviewed data scientists about their position, they noted the pleasure of discovery as a key benefit.  A common complaint amongst data scientists was the headache involved with data preparation.  “At times, munging [parsing] through data can get tedious,” said data scientist Jeff Baumes at Kitware. “The worst times are when I realize the quality, quantity, or other aspect of the data simply prevents me from gaining the level of insight that I hoped to gain from the data.”

The McKinsey Global Institute found there is a growing shortage of analytics talent in the United States.  By 2018, they projected a shortfall of 140,000 to 180,000 professionals with analytical expertise.  They also projected a deficit of 1.5 million analytics-trained managers and analysts.

Data scientist talent acquisition and retention are a significant problem for organizations, particularly amongst firms looking to establish initial data science capabilities.  In an article in the MIT Sloan Management Review, Ransbotham, Kiron, and Kirk Prentice found that 55% of analytically challenged firms had a problem recruiting and retaining analytical talent, while firms described as innovators had much less difficulty.  Only 29% of innovators reported difficulty recruiting, with 24% reporting difficulty retaining.  Innovators are also much more confident that they have the appropriate skill levels in house.  While 74% of innovators believe they have hired the appropriate analytics talent, only 17% of the analytically challenged felt the same.

One advantage of partnering with sales predictive analytics companies such as Lattice Engines or Leadspace is the ability to bypass hiring of in-house data scientists and instead work with their resources and tools.  While it is still important to understand the results and train staff in data interpretation, much of the complexity is removed.

Furthermore, the strategic advantage accruing to analytics capabilities is declining as more firms develop such capabilities.  In 2012, 67% of surveyed respondents believed analytics capabilities conveyed a strategic advantage.  By 2014, the percentage had dropped to 61%.  The authors posited two reasons for the decline: an increase in the number of firms investing in analytics and a difficulty in converting analytical insights into business action.  Half the respondents noted difficulty in translating insight to action.

“Technology is no longer the main barrier to creating business value from data: The bigger barrier is a shortage of appropriate skills,” said Ransbotham et al.  “Companies with appropriate analytical skills are far more likely to say that analytics is creating a competitive advantage in their organization than are other organizations.”

Owler: Jim Fowler on Crowdsourcing Content

Owler Profile of Lyft

Jim Fowler, who founded three crowdsourcing startups (Jigsaw, which was acquired by Salesforce.com and renamed Data.com; InfoArmy; and Owler), was asked how crowdsourcing has changed over the past decade.  His observation was broader than crowdsourcing and applied to any tech company looking to gain mindshare:

I think they change in the same way that we all have. We all are just overloaded with information.  Getting people’s time and getting them to pay attention is much more difficult now than it was back in the beginning of Jigsaw for sure. Getting journalists and analysts to talk and write about you is different because there’s so much going on. In fact a lot of the big publications don’t even exist or don’t write about it anymore.

It’s become much more flat, if you will. More players in it, so that’s interesting, but I just think the biggest thing is just people … There’s so much stuff flying around out there now that really making sure you have a crisp clear message so that they understand the value is even more important than it ever was and that’s just been the big change. People are more sophisticated, they’re more … They know how to use data and I see that trend continuing.

Fowler also noted that Owler combines crowdsourcing and semantic mining with human editors.  While machines can do much of the work of aggregating events and generating structured alerts for executive changes, M&A, and funding rounds, editors ensure that information is properly tagged and mapped.  This editorial review of news introduces a short delay in information delivery, but it reduces false positives and passing mentions of companies.  Furthermore, it allows Owler to de-dupe stories and accurately capture M&A and funding content.

In short, it solves the signal-to-noise problem by adding a brief editorial review step.

If you just used technology to try to do this, you would get a lot of noise in there because really it’s a lot harder than it looks to figure out that the article is actually about Apple. Apple gets mentioned in millions of articles. To know that it’s actually about Apple is … To just do it with technology is really hard. What technology can do is say, “We think this is an article about Apple and we think it’s an Apple acquisition and we think this is the company that they did and we think this is it,” but what you need to do is create a task that gets prioritized very highly that a human looks at really quick. Checks out all the data and goes, “Ah, that’s right. We’re good,” and then sends it on to the people.

Otherwise you get a lot of noise, what I’m getting at is that technology can get you way down the road, but you need humans to get you all the way down the road if you want high quality data.
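The human-in-the-loop flow Fowler describes (machine extraction proposes candidate events, then a quick, high-priority editorial check before publication) can be sketched as a small review queue.  This is a minimal illustration, not Owler’s actual system; the class and field names here are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional
import heapq

@dataclass(order=True)
class CandidateEvent:
    """A machine-extracted event awaiting editorial review (hypothetical schema)."""
    priority: int                                  # lower number = reviewed sooner
    company: str = field(compare=False)
    event_type: str = field(compare=False)         # e.g. "acquisition", "funding"
    machine_confidence: float = field(compare=False)

class ReviewQueue:
    """Priority queue of candidates; only editor-approved events are published."""
    def __init__(self) -> None:
        self._heap: list[CandidateEvent] = []
        self.published: list[CandidateEvent] = []

    def propose(self, event: CandidateEvent) -> None:
        heapq.heappush(self._heap, event)

    def review_next(self, approve: Callable[[CandidateEvent], bool]) -> Optional[CandidateEvent]:
        """Pop the highest-priority candidate; publish it only if the
        editor's check (the `approve` callback) confirms it."""
        if not self._heap:
            return None
        event = heapq.heappop(self._heap)
        if approve(event):
            self.published.append(event)
        return event

# Usage: the machine tags two candidates; the "editor" rejects a passing mention.
queue = ReviewQueue()
queue.propose(CandidateEvent(1, "Apple", "acquisition", 0.92))
queue.propose(CandidateEvent(5, "Apple", "passing_mention", 0.41))
queue.review_next(lambda e: e.machine_confidence > 0.8)   # approved, published
queue.review_next(lambda e: e.machine_confidence > 0.8)   # rejected, dropped
print([e.event_type for e in queue.published])            # → ['acquisition']
```

The point of the structure is the one Fowler makes: the machine does the cheap, broad extraction, while a fast human check gates what actually reaches subscribers.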

It is this multi-process approach that is likely to be the future of data collection and aggregation.  Traditional collection methods, such as phone interviews or analyzing regulatory filings, are quite expensive, while semantic mining can get tripped up on context (Is this about company X? Is this a relevant story? Is this a discussion of current events? Is this an actual event, a proposed event, or mere rumor?).  Likewise, crowdsourcing requires a very large audience to obtain the wisdom of the crowd and works best on easily defined fields such as address, phone, and email (i.e., Jigsaw contacts).  Crowdsourcing also works well for gauging sentiment; for example, Owler captures sentiment on whether the CEO is doing a good job and the projected fate of private companies.  But crowdsourcing does a poor job with complex information such as industry code tagging or corporate linkage.  It is through complementary methods that vendors will drive quality forward while keeping data costs in check.