BlueData Introduces EPIC 3.0 for Large-Scale Deployments of Big Data Analytics and Data Science with Docker Containers

New Release Delivers Enterprise-Grade Scalability, Security, and Performance for Distributed Data Science

Santa Clara, Calif. – BlueData®, provider of the leading Big-Data-as-a-Service (BDaaS) software platform, today announced BlueData EPIC™ version 3.0.  This major new release provides increased scalability, enhanced networking, significant security and performance optimizations, and new functionality for distributed data science operations.  With BlueData EPIC 3.0, enterprises can quickly and easily deploy large-scale production environments for Big Data analytics and data science running in Docker containers – either on-premises, in the public cloud, or in a hybrid architecture.

Enterprise deployments of the BlueData EPIC software platform have expanded significantly over the past year, driven by increased adoption of Docker containers and cloud computing for Big Data.  Many of these enterprises want their Big Data workloads running in a hybrid model that spans both on-premises and public cloud infrastructure, whether for initial dev/test and sandbox environments or large-scale production deployments. These organizations need a solution that can scale to thousands of containers, with agility and flexibility for distributed data science operations in a multi-tenant architecture.  And for Big Data workloads deployed on-premises, they want minimal to no changes to their existing network and security infrastructure.

BlueData EPIC 3.0 addresses these needs by introducing several significant upgrades. These include:

  • Flexible networking and monitoring for large-scale Big Data deployments using Docker containers: The new EPIC release can now scale to hundreds of virtual nodes and thousands of on-premises containers by supporting non-routable, private IP ranges that can be accessed via a set of BlueData EPIC gateway hosts.  This greatly simplifies the networking infrastructure for containerized Big Data applications in enterprise-wide production deployments.  And with EPIC 3.0, BlueData adds fine-grained monitoring for CPU, memory, and other key metrics with a pluggable framework based on Elasticsearch, Metricbeat, and Kibana.
  • Additional support for Big Data multi-tenancy on public cloud infrastructure and hybrid architectures: The new 3.0 version extends BlueData EPIC’s differentiation as the only unified Big-Data-as-a-Service solution for on-premises, public cloud, and hybrid deployments.  For example, this new release delivers enhanced support for highly secure and highly available multi-tenant environments on Amazon Web Services (AWS): tenants and instances can now be isolated across different Amazon subnets, security groups, regions, and virtual private cloud (VPC) networks.
  • Performance and security optimizations for Big Data workloads, with decoupled compute and storage: This release introduces several new optimizations that enable separation of compute and storage for containerized Big Data infrastructure, while ensuring comparable performance to bare-metal deployments.  BlueData EPIC 3.0 also provides the option to utilize the same Kerberos principal for end-to-end security – from a Hadoop compute cluster and its associated services to a remote Hadoop Distributed File System (HDFS) – while supporting “mix and match” of different Hadoop versions for compute and storage. 
  • Increased productivity and automation for distributed data science operations: BlueData EPIC 3.0 introduces a new user interface and streamlined user experience for data science teams – including one-click launch of containerized data science environments using pre-defined best practice templates. Building upon functionality released earlier this year, it helps automate the end-to-end lifecycle of data science operations.  It includes new pre-integrated application images for Spark 2.x with Python (e.g. with Jupyter Notebook and PySpark) and R (e.g. with RStudio and sparklyr).  And it provides the flexibility to deploy Spark either in standalone mode, with YARN, or now with Apache Mesos for resource management.

“Our customers now include some of the world’s largest companies, with some of the biggest Big Data deployments in the world.  I’m very pleased to see our product roadmap being shaped by increased usage and expansion of production environments within these large enterprises,” said Kumar Sreekanti, CEO and co-founder of BlueData. “EPIC 3.0 is one of our most significant software releases so far. It delivers the enterprise-class scalability, security, and performance that these customers need – and solidifies EPIC’s position as the leading software platform for Big-Data-as-a-Service.”

BlueData will be featuring the new BlueData EPIC 3.0 release at the DataWorks Summit in San Jose this week (at booth #1104).

Supporting Resources
Blog post: Introducing BlueData EPIC 3.0
Blog post: Large-Scale Data Science Operations

About BlueData Software, Inc.
BlueData is transforming how enterprises deploy their Big Data applications and infrastructure.  The BlueData EPIC™ software platform uses Docker container technology to make it easier, faster, and more cost-effective for enterprises of all sizes to leverage Big Data – enabling Big-Data-as-a-Service either on-premises or in the cloud.  With BlueData, they can spin up virtual Hadoop or Spark clusters within minutes, providing data scientists with on-demand access to the applications, data, and infrastructure they need.  Based in Santa Clara, California, BlueData was founded by VMware veterans, and its investors include Amplify Partners, Atlantic Bridge, Dell Technologies Capital, Ignition Partners, and Intel Capital.  To learn more about BlueData, visit or follow @bluedata.

Press Contact
Paul Doyle