
The Proof is in the Pudding (I Mean, in the Benchmarking)

Back in the fall of 2012, we started BlueData with the firm belief that Big Data workloads would run in a virtual environment – and achieve the inherent virtualization benefits of flexibility, agility, and cost-efficiency – without paying a performance penalty. At the time, we were in a fairly small fringe group of believers in […] Read More

QoS for Hadoop Using Docker Containers

There is a lot of focus and attention on Big Data analytics today – and as I wrote in a recent blog post, it’s all about the applications.  But there are (of course) many infrastructure considerations to make the analytics and applications work seamlessly for your data scientists, analysts, and other users. One vital issue […] Read More

Hadoop Virtualization: It’s About Time (and Value)

“Whoever wishes to foresee the future must consult the past.” – Machiavelli

While there have been many amazing technologies that have impacted the data center, history books will certainly keep a warm spot open for “Virtualization”. Virtualization is certainly not a new idea (indeed, it dates all the way back to the 1960s) but it […] Read More

Docker, and Spark, and Hadoop. Oh My.

When we founded BlueData in the fall of 2012, we built our Big Data infrastructure software platform around the best open source hypervisor technology then available. We knew that virtualization was a key enabling technology to simplify Hadoop and other Big Data deployments. There were plenty of naysayers prophesying that Hadoop could never run in […] Read More

Data Lakes: Keep Your Big Data Projects Out of the Swamp

Businesses are spending millions of dollars on Big Data-related initiatives (up to $41.5 billion by 2018 according to IDC), but their return on investment is no sure thing. What’s holding back the ROI? The IT infrastructure used today in most organizations was not designed specifically to handle Big Data workloads, the systems requirements of tools in […] Read More

Severing the Link between Big Data Compute and Storage

Call it Big Data meets hyper-convergence, or the next-generation virtual storage network, or Hadoop on a hypervisor, but the time has come for a serious look at running honest-to-goodness real-world Big Data workloads in a virtual environment. Not just test and dev, mind you, but real production Hadoop workloads using virtualization. Ever since we started running […] Read More

How to Hadoop: Public Cloud or Private Cloud?

Just about every Hadoop-powered Big Data initiative has its own set of business objectives and thus requires a distinct environment. Fortunately, there are several infrastructure options to choose from. Hadoop clusters can run on physical or virtualized servers, and in public or private clouds. A virtual Hadoop cluster is when the Hadoop distribution is installed […] Read More

Simplifying the Complexity of Big Data

Listen to co-founders Kumar Sreekanti and Tom Phelan discuss the overwhelming complexity of Big Data – from purchasing the right hardware and software to installing it and sorting through the 100 different ways to configure it – and how enterprises like banks, pharmaceutical companies, and media companies can overcome this complexity.

What gaps in Big Data currently exist?

Co-founders Kumar Sreekanti and Tom Phelan reveal the limitations of Big Data today and how virtualization can help overcome them and unleash the true potential of Big Data.