Hadoop 3.0 and the Decoupling of Hadoop Compute from Storage

The traditional Hadoop architecture was founded upon the belief that the only way to get good performance with large-scale distributed data processing was to bring the compute to the data. And in the early part of this century, that was true. The network infrastructure in the typical enterprise data center of that time was not […] Read More

Hadoop Is Growing Up

This is a guest blog courtesy of Keith Manthey, CTO of Analytics at Dell EMC. The content originally appeared on the Dell EMC blog site here. As a part of my regular duties, my job is to pay attention to macro-level movements of various industries and technology sectors. One of those sectors that is facing […] Read More

Hadoop and Spark on Docker: Ten Things You Need to Know

For a while now, I’ve been struggling to understand why any enterprise would want to build their own solution for large-scale deployments of Big Data workloads like Hadoop and Spark on Docker containers. The arguments for “doing it yourself” (DIY) often play like a broken record: “If they <insert name of humongous tech giant here> […] Read More

Introducing BlueData EPIC 3.0

Today we announced version 3.0 of the leading software platform for Big-Data-as-a-Service in the enterprise: BlueData EPIC. This release incorporates powerful new innovations and functionality based on the feedback and input from our customers – delivering even greater scalability, security, performance, and flexibility for their Big Data deployments. But before I go into detail on what’s […] Read More

The Proof is in the Pudding (I Mean, in the Benchmarking)

Back in the fall of 2012, we started BlueData with the firm belief that Big Data workloads would run in a virtual environment – and achieve the inherent virtualization benefits of flexibility, agility, and cost-efficiency – without paying a performance penalty. At the time, we were in a fairly small fringe group of believers in […] Read More

Beyond Hadoop-as-a-Service: The Opportunity for Big-Data-as-a-Service

This is a guest blog courtesy of Raghunath Nambiar, distinguished engineer and chief architect of big data and analytics solution engineering at Cisco.  This post originally appeared on the Cisco blog site here. I’ve written in the past about the opportunity for Hadoop-as-a-Service (HaaS) – providing self-service provisioning, elastic scaling, and support for multi-tenancy. But in my […] Read More

HDFS Upgrades Are Painful. But They Don’t Have to Be.

It’s hard enough to gather all the data that an enterprise needs for a Hadoop deployment; it shouldn’t be hard to manage it as well. But if you follow the traditional Hadoop “best practices”, it is. In particular, upgrades to the Hadoop Distributed File System (HDFS) are excruciatingly painful. By way of background, each version […] Read More

QoS for Hadoop Using Docker Containers

There is a lot of focus and attention on Big Data analytics today – and as I wrote in a recent blog post, it’s all about the applications.  But there are (of course) many infrastructure considerations to make the analytics and applications work seamlessly for your data scientists, analysts, and other users. One vital issue […] Read More

Hadoop: It’s All About the Apps

We’re starting a new year and so as technology professionals it’s time to take stock of our existing skill sets and prepare for the future by identifying those new skills that are likely to be most valuable. A recent ComputerWorld survey highlighted the “10 hottest tech skills for 2016” and Big Data came in at […] Read More