HPE and BlueData − A Game-Changing Combination for Big Data

Last week, at the Strata + Hadoop World conference, an estimated 7,000 people descended upon New York City to hear about the latest and greatest in Big Data, Artificial Intelligence, and the Internet of Things.  I enjoy this conference for two main reasons: it’s a great opportunity to meet with dozens (if not hundreds) of enterprise customers and partners all in one place; and it’s an opportunity to learn about new innovations and lessons learned from experts in the field.

Hands down, the highlight of my week was a roundtable and dinner (sponsored by BlueData and Intel) attended by Big Data executives from organizations including ADP, Bank of America, Cornell University, Fidelity, GlaxoSmithKline, JPMorgan Chase, Nasdaq, USAA, and other Fortune 500 enterprises.  Although this sample is completely unscientific and statistically irrelevant, I heard three common themes at the beginning of our roundtable discussion:

  • The majority of Big Data workloads are still on-premises;
  • Big Data deployments are too complex and the traditional bare-metal approach is too expensive; and
  • The complexity and cost of these on-premises deployments are increasingly forcing enterprises to seek out public cloud solutions for Big Data.

This really wasn’t a surprise.  I’ve been reading the prognostications from Gartner’s Merv Adrian, Forrester’s Mike Gualtieri, and ESG’s Nik Rouda echoing similar points.  And here at BlueData, I’ve written in the past about the complexity of Big Data deployments.

The graphic below illustrates the traditional bare-metal approach for on-premises Big Data deployments. It often takes several weeks to provision the systems and infrastructure required for each new cluster. There may be multiple clusters containing mostly the same data, leading to cluster sprawl and massive data duplication. And all this leads to management challenges on multiple fronts – including data management and governance, as well as the lack of skilled resources to manage each cluster.


However, the really good stuff started to come a little later in the roundtable discussion that evening in New York.  Three related themes became even louder and clearer:

  • The on-premises data center is under tremendous pressure in the enterprise;
  • Public cloud has become an increasingly viable option for all workloads, and the traditional Big Data deployment model is not a winning strategy; and
  • Enterprises need a game-changing alternative to the traditional bare-metal deployment for Big Data … or their Big Data workloads will move to the public cloud.

These themes are provocative, but they do indicate the sentiment within enterprises today.  At the end of the day, the Big Data experts and executives at these enterprises don’t want to see either on-premises or public cloud “win”.  What they want is faster time-to-value for their Big Data initiatives (and lower TCO in the process) – while ensuring enterprise-class security, data governance, performance, and scalability.

What do you get when you cross a VMware with an Amazon EMR?

So what does all of this have to do with HPE?  Well, about five months ago, we first started working with our new friends at HPE.  The conversations started in the field, as most good partnerships do: with a goal of providing a compelling joint solution for our enterprise customers. While the public cloud has become attractive for some Big Data workloads, many of these customers need to retain their data on-premises due to security, regulatory, and data gravity considerations. That’s where we come in.

The HPE field teams and executives we met with wanted to provide their customers with an on-premises alternative to public cloud for Big Data: offering the agility, flexibility, and elasticity of Big-Data-as-a-Service (à la Amazon EMR) together with the performance, scalability, and efficiency of HPE hardware.  They knew they had all the right physical infrastructure components for high-performance, secure, and highly scalable enterprise Big Data deployments; but they needed a software solution to help provide a cloud-like experience for these on-premises Big Data deployments.

About an hour into our first meeting, one of the HPE executives interrupted and exclaimed: “To me, BlueData seems to be what you get when you cross VMware and Amazon EMR.”  I just nodded in agreement, and the rest is history.


Today we announced the formal partnership between HPE and BlueData.  Now, with BlueData software and HPE infrastructure, enterprises can deliver value from Big Data within days instead of months and with significantly lower TCO compared to traditional approaches. They can deliver the self-service agility, elasticity, and flexibility of Big-Data-as-a-Service in an on-premises deployment model.

Leveraging BlueData software and the power of Docker containers, data scientists and analysts can create their own Hadoop or Spark clusters on-demand within minutes. They can spin up clusters for their Big Data analytics tools of choice, with the ability to access common pools of data stored in local or remote systems. They can easily try out new versions, new applications, and new Big Data frameworks – and they can eliminate data duplication and cluster sprawl.  As illustrated in the graphic below, they can deploy a multi-tenant environment for multiple different use cases and applications with secure data isolation across shared infrastructure.


The result is a more flexible, agile, and cost-effective approach to Big Data infrastructure. Enterprise customers can leverage all the benefits of the cloud operating model (self-service, agility, and elasticity) for Hadoop, Spark, and other Big Data workloads – while keeping their data within their own data centers. It’s a game-changing new model for on-premises Big Data deployments.

To learn more about the game-changing combination of BlueData and HPE, read the press release here, check out the new white paper below, or send an email to our HPE/BlueData alliance alias at