
Operation: Stateful. Introducing BlueK8s and Kubernetes Director

The Kubernetes juggernaut has been gaining momentum for some time now. It provides an extensible container orchestration framework for automating the deployment, scaling, and management of any containerized application. It has a rich ecosystem of plugins for handling everything from storage to security. And while it was originally designed for running […]

Hadoop 3.0 and the Decoupling of Hadoop Compute from Storage

The traditional Hadoop architecture was founded upon the belief that the only way to get good performance with large-scale distributed data processing was to bring the compute to the data. And in the early part of this century, that was true. The network infrastructure in the typical enterprise data center of that time was not […]

Hadoop and Spark on Docker: Ten Things You Need to Know

For a while now, I’ve been struggling to understand why any enterprise would want to build their own solution for large-scale deployments of Big Data workloads like Hadoop and Spark on Docker containers. The arguments for “doing it yourself” (DIY) often play like a broken record: “If they <insert name of humongous tech giant here> […]

The Proof is in the Pudding (I Mean, in the Benchmarking)

Back in the fall of 2012, we started BlueData with the firm belief that Big Data workloads would run in a virtual environment – and achieve the inherent virtualization benefits of flexibility, agility, and cost-efficiency – without paying a performance penalty. At the time, we were in a fairly small fringe group of believers in […]

HDFS Upgrades Are Painful. But They Don’t Have to Be.

It’s hard enough to gather all the data that an enterprise needs for a Hadoop deployment; it shouldn’t be hard to manage it as well. But if you follow the traditional Hadoop “best practices”, it is. In particular, upgrades to the Hadoop Distributed File System (HDFS) are excruciatingly painful. By way of background, each version […]

QoS for Hadoop Using Docker Containers

There is a lot of focus on Big Data analytics today – and as I wrote in a recent blog post, it’s all about the applications. But there are (of course) many infrastructure considerations involved in making the analytics and applications work seamlessly for your data scientists, analysts, and other users. One vital issue […]

Hadoop: It’s All About the Apps

We’re starting a new year, so as technology professionals it’s time to take stock of our existing skill sets and prepare for the future by identifying the new skills that are likely to be most valuable. A recent ComputerWorld survey highlighted the “10 hottest tech skills for 2016”, and Big Data came in at […]

How to Implement a Secure, Multi-Tenant Hadoop Architecture

The content in this article originally appeared in TechSpective. Hadoop is an open source framework for storing and processing massive amounts of Big Data on large clusters of commodity hardware. This would seem to make it a natural hub for all of an enterprise’s data. In this scenario, Hadoop could serve […]

Docker, and Spark, and Hadoop. Oh My.

When we founded BlueData in the fall of 2012, we built our Big Data infrastructure software platform around the best open source hypervisor technology then available. We knew that virtualization was a key enabling technology to simplify Hadoop and other Big Data deployments. There were plenty of naysayers prophesying that Hadoop could never run in […]