The field of machine learning – and deep learning in particular – has made significant progress recently and use cases for deep learning are becoming more common in the enterprise. We’ve seen more of our customers adopt machine learning and deep learning frameworks for use cases like natural language processing with free-text data analysis, image […] Read More
For a while now, I’ve been struggling to understand why any enterprise would want to build their own solution for large-scale deployments of Big Data workloads like Hadoop and Spark on Docker containers. The arguments for “doing it yourself” (DIY) often play like a broken record: “If they <insert name of humongous tech giant here> […] Read More
Here at BlueData, I’ve worked with many of our customers (including large enterprises in financial services, telecommunications, and healthcare, as well as government agencies and universities) to help their data science teams with their Big Data initiatives. In this blog post, I want to share some of my recent experiences in working with the data […] Read More
Apache Spark is clearly one of the most popular compute frameworks in use by data scientists today. For the past couple years here at BlueData, we’ve been focused on providing our customers with a platform to simplify the consumption, operation, and infrastructure for their on-premises Spark deployments – with ready-to-run, instant Spark clusters. In previous […] Read More
In my experience as a Big Data architect and data scientist, I’ve worked with several different companies to build their data platforms. Over the past year, I’ve seen a significant increase in focus on real-time data and real-time insights. It’s clear that real-time analytics provide the opportunity to make faster (and better) decisions and gain […] Read More
“Whoever wishes to foresee the future must consult the past.” – Machiavelli While there have been many amazing technologies that have impacted the data center, history books will certainly keep a warm spot open for “Virtualization”. Virtualization is certainly not a new idea (indeed, it dates all the way back to the 1960s) but it […] Read More
Apache Spark has quickly become one of the most popular Big Data technologies on the planet. By now, you probably know that it offers a unified, in-memory compute engine that works with distributed data platforms such as HDFS. So what does that mean? It means that in a single program, you can acquire data, build a pipeline, and […] Read More
When we founded BlueData in the fall of 2012, we built our Big Data infrastructure software platform around the best open source hypervisor technology then available. We knew that virtualization was a key enabling technology to simplify Hadoop and other Big Data deployments. There were plenty of naysayers prophesying that Hadoop could never run in […] Read More
Organizations in every industry are faced with the need to analyze and solve Big Data challenges in a timely manner. Many of them are trying new data platforms that can help them adapt quickly and respond to these new data challenges. They need to find the right balance of innovation and sustainability; it’s a continuous […] Read More
Big Data analysis is having an impact on every industry. This is no longer a tactic taken by a few visionary leaders to capitalize on new business insights. It’s quickly moving into the mainstream. The early adopters of Big Data gained a competitive advantage. Today, it’s table stakes: Big Data is now a competitive imperative. If you aren’t […] Read More