Jim Fowler, who founded three crowdsourcing startups (Jigsaw, which was acquired by Salesforce.com and renamed Data.com; InfoArmy; and Owler), was asked how crowdsourcing has changed over the past decade. His observation was broader than crowdsourcing and applied to any tech company looking to gain mindshare:
I think they change in the same way that we all have. We all are just overloaded with information. Getting people’s time and getting them to pay attention is much more difficult now than it was back in the beginning of Jigsaw for sure. Getting journalists and analysts to talk and write about you is different because there’s so much going on. In fact a lot of the big publications don’t even exist or don’t write about it anymore.
It’s become much more flat, if you will. More players in it, so that’s interesting, but I just think the biggest thing is just people … There’s so much stuff flying around out there now that really making sure you have a crisp clear message so that they understand the value is even more important than it ever was and that’s just been the big change. People are more sophisticated, they’re more … They know how to use data and I see that trend continuing.
Fowler also noted that Owler combines crowdsourcing and semantic mining with editors. While machines can do much of the work around event aggregation and structured alerts for exec changes, M&A, and funding rounds, editors ensure that information is properly tagged and mapped. Although this editorial review of news introduces a short delay in information delivery, it reduces the number of false positives and passing mentions of companies. Furthermore, it allows editors to de-dupe stories and accurately capture M&A and funding content.
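The de-duplication step can be approximated in a few lines. The following is a minimal sketch, not Owler's actual implementation: it collapses syndicated copies of the same story by hashing a normalized headline together with the tagged company. The function names `story_fingerprint` and `dedupe` are illustrative.

```python
import hashlib
import re

def story_fingerprint(title: str, company: str) -> str:
    """Normalize a headline and hash it with the tagged company so that
    syndicated copies of the same story collapse to one key."""
    normalized = re.sub(r"[^a-z0-9 ]", "", title.lower())
    normalized = " ".join(normalized.split())
    return hashlib.sha256(f"{company}|{normalized}".encode()).hexdigest()

def dedupe(stories):
    """Keep only the first story seen for each fingerprint."""
    seen, unique = set(), []
    for s in stories:
        key = story_fingerprint(s["title"], s["company"])
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique
```

In this scheme, two wire-service copies of "Apple Acquires Foo!" and "apple acquires foo" hash to the same key and are reported once.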
Basically, it solves your signal-to-noise problem by adding a short editorial review step.
If you just used technology to try to do this, you would get a lot of noise in there because really it’s a lot harder than it looks to figure out that the article is actually about Apple. Apple gets mentioned in millions of articles. To know that it’s actually about Apple is … To just do it with technology is really hard. What technology can do is say, “We think this is an article about Apple and we think it’s an Apple acquisition and we think this is the company that they did and we think this is it,” but what you need to do is create a task that gets prioritized very highly that a human looks at really quick. Checks out all the data and goes, “Ah, that’s right. We’re good,” and then sends it on to the people.
Otherwise you get a lot of noise, what I’m getting at is that technology can get you way down the road, but you need humans to get you all the way down the road if you want high quality data.
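The workflow Fowler describes, where technology proposes a candidate event and a human confirms it before anything is published, maps naturally onto a prioritized review queue. The sketch below is a hypothetical illustration, not Owler's system; `CandidateEvent`, `ReviewQueue`, and the `approve` callback (standing in for the editor's quick check) are invented names.

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class CandidateEvent:
    priority: int                             # lower number = reviewed sooner
    company: str = field(compare=False)
    event_type: str = field(compare=False)    # e.g. "acquisition", "funding"
    confidence: float = field(compare=False)  # extractor's own estimate

class ReviewQueue:
    """Machine-extracted events wait here for a quick human check."""

    def __init__(self):
        self._heap = []

    def propose(self, event: CandidateEvent):
        """Called by the extraction pipeline when it thinks it found an event."""
        heapq.heappush(self._heap, event)

    def review_next(self, approve) -> list:
        """Pop the highest-priority candidate and publish it only if the
        human reviewer (the `approve` callback) confirms the data."""
        published = []
        if self._heap:
            event = heapq.heappop(self._heap)
            if approve(event):
                published.append(event)
        return published
```

Candidates the reviewer rejects are simply dropped, which is the "otherwise you get a lot of noise" half of the trade-off: the machine casts a wide net, and the human gate keeps false positives out of the published feed.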
It is this multi-method approach that is likely to be the future of data collection and aggregation. Traditional methods of data collection via phone interviews or analyzing filings are quite expensive, while semantic mining can get tripped up on context (is this about company X? Is this a relevant story? Is this a discussion of current events? Is this an actual event, a proposed event, or mere rumor?). Likewise, crowdsourcing requires a very large audience to obtain the wisdom of the crowd and works best on easily defined fields such as address, phone, and email (i.e., Jigsaw contacts). Crowdsourcing also works well at gauging sentiment. For example, Owler captures sentiment around whether the CEO is doing a good job and the projected fate of private companies. But crowdsourcing does a poor job with complex information such as industry code tagging or corporate linkage. It is through complementary methods that vendors will drive quality forward while keeping data costs in check.
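The sentiment use case lends itself to an equally simple aggregation sketch. Assuming approval is collected as binary up/down votes (an assumption; how Owler actually scores CEO approval is not documented here), a minimal aggregator would also enforce the "very large audience" requirement by refusing to report until the crowd reaches some minimum size; the threshold of 25 below is illustrative.

```python
def crowd_approval(votes, min_votes=25):
    """Aggregate binary votes (1 = approve, 0 = disapprove) into an
    approval score in [0, 1]. Returns None until the crowd is large
    enough for the average to be meaningful."""
    if len(votes) < min_votes:
        return None
    return sum(votes) / len(votes)
```

The same pattern (a trivial per-contributor signal, averaged only once enough contributors weigh in) is why crowdsourcing handles sentiment well but struggles with tasks like industry code tagging, where each individual answer is itself hard to get right.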