IoT: Staying Afloat in the Flood of Data

Apr 19, 2016 - by Art Shectman

This is part 2. Please see part 1 here.

Faced with a flood of data pouring forth from the Internet of Things, what’s a company to do?

If you approached the task of turning that data into something predictively useful in the normal way, you wouldn’t get far without great technical sophistication. You’d need Ph.D.-level data scientists just to create a predictive model, and they’d have to understand how the parameters they’re feeding into the model control the outputs. 

It’s a tall order, and not one that many organizations are equipped to fulfill, especially if they have to build everything from the ground up. That approach is realistically available only to big government and the biggest companies, as reflected in a recent survey by Tech Pro Research.

Some 49 percent of companies with 1,000 or more employees had big data implementations under way, but smaller companies were implementing big data at no more than half that rate.

Given the requirements of the traditional big data implementation in terms of technical staffing power, that’s no surprise. Strangely enough, though, the smallest companies surveyed, those with fewer than 50 employees, are implementing IoT devices and systems at a higher rate than companies with anywhere from 50 to 999 employees. 


The small fry are data enthusiasts, but they’re not equipped to deal with the data in the ways that the biggest companies can. It seems, then, the barrier to big data implementation is not one of data collection. It’s a problem of data analytics and the technical resources it takes to put all that data to productive and predictive use.

Smaller companies don’t have to be left behind, however, and what’s riding to their rescue is the development of services that obviate the need for ground-up development and in-house management.

And “service” is the operative word.

Qubole, for example, describes itself specifically as a company that provides “Big Data-as-a-Service,” and it has identified a marked increase in satisfaction from users opting for BDaaS, as it’s known, instead of an on-premises solution or a hosted service provider. Qubole points to a number of factors that inhibit the implementation of big data, chief among them the lack of qualified in-house staff and the administrative burden that comes along with an in-house approach. 

According to the company, its BDaaS model increases the number of analysts (Qubole’s term for the end users of the data) that a single administrator can support from 1.5 to 21, a fourteenfold increase. It also dramatically shortens the “normal” implementation time of up to 18 months, another obstacle to success. Treating big data as a service – or, in fact, a utility – drives significant increases in efficiency on several fronts.

At Elephant Ventures we’ve incorporated Qubole’s BDaaS into our rapid prototyping arsenal, and we’ve been able to create big data prototypes in days and weeks, instead of months. We’re huge fans of the platform and how it can help large enterprises and startups alike collect and analyze data at enormous scale. It serves as a key part of our Innovation framework. (Contact us if you’d like to learn how we can leverage our Rapid Big Data pattern for you.)

In a similar vein, DataRobot turns predictive modeling into something that’s almost an exercise in plug-and-play. Bring it your data, tell it what you want to predict, and, with a single click, DataRobot produces hundreds of models, with the best of them making it to the company’s leaderboard. From there, an analyst can quickly move to prediction and deployment.
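To make the leaderboard idea concrete, here is a minimal sketch in plain Python. It is not DataRobot’s API or method: the candidate models, the function names (`build_leaderboard`, `make_data`) and the synthetic data are all assumptions made up for illustration. It simply fits a few toy models on training data and ranks them by their error on held-out data, which is the general pattern behind an automated model leaderboard.

```python
import random
import statistics

# Three deliberately simple candidate "models" (hypothetical, for illustration).
# Each fit function takes training data and returns a predict(x) function.

def fit_mean(xs, ys):
    """Baseline: always predict the average training target."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Ordinary least squares line through one input feature."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """1-nearest-neighbor: copy the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

def mean_abs_error(model, xs, ys):
    """Average absolute prediction error on a dataset."""
    return statistics.fmean([abs(model(x) - y) for x, y in zip(xs, ys)])

def build_leaderboard(train, holdout):
    """Fit every candidate on train; rank by holdout error, best first."""
    train_x, train_y = train
    hold_x, hold_y = holdout
    board = []
    for name, fit in CANDIDATES.items():
        model = fit(train_x, train_y)
        board.append((name, mean_abs_error(model, hold_x, hold_y)))
    return sorted(board, key=lambda row: row[1])

def make_data(n, seed):
    """Synthetic data (an assumption): a noisy straight line."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

if __name__ == "__main__":
    for name, error in build_leaderboard(make_data(200, 1), make_data(50, 2)):
        print(f"{name:8s} holdout MAE = {error:.3f}")
```

A real service would try far more algorithms and pre-processing options, but the ranking step works the same way: the analyst reads the top of the board rather than hand-tuning each model.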

Interestingly enough, DataRobot isn’t working its magic through proprietary algorithms that will never see the light of day. It uses open source machine-learning algorithms from a great many sources, including H2O, Python, R and Spark, with the goal of enabling the user to select the best algorithmic and pre-processing options for a given purpose.

It’s a flexible approach to the most complex problems, one that opens big data analytics to a wider audience. Related, utility-like options are offered by a number of other companies in the field, including Iobeam and Databricks.

In a sense, what we’re seeing with big data and the IoT mirrors a progression that we’ve seen before, even in computing on a much simpler and more individual level. In the olden days, the PC wasn’t accessible without a working knowledge of the command line. You had to know something about FTP to accomplish much online. Times have changed, and users no longer need to look under the hood to make productive use of the machines they use.

It’s also the approach we’ve taken at Elephant with many of our clients. The right tools for the job can do wonders. One of our clients, a prominent digital advertising firm, brought us on board to revamp their approach to managing massive quantities of business-critical data. With our help, management and analysis that took days could be done in minutes. 

At the same time, our ability to increase analytic accuracy led directly to a doubling of marketing spend in the client’s favor. Data volumes ten times initial specifications – and well beyond client expectations – were managed seamlessly. Seven years later, the system continues to serve the client’s needs flawlessly.

While analytics involving big data and the IoT may be relevant to a different audience, the trend is similar. More and more, options to manage the rising tide of data are becoming available to organizations that aren’t staffed by people with advanced degrees in computer science and applied math. And that creates excellent opportunities.

Get ready for Part 3.