Last summer, I wrote here about our BlueK8s initiative and a new open source project for deploying and managing complex stateful scale-out applications on Kubernetes: KubeDirector. KubeDirector enables data scientists familiar with data-intensive distributed applications such as Hadoop, Spark, Cassandra, TensorFlow, Caffe2, etc. to easily run these applications on Kubernetes.
In my blog post on the Kubernetes site in the fall, I introduced version 0.1 of KubeDirector and described how it works. Since then, we’ve seen a lot of interest in KubeDirector from the community, and we’re very excited about the progress so far. The BlueData team behind this effort is now part of HPE, and the KubeDirector project continues to move full steam ahead.
To that end, we just pushed out the next release and our first public update of KubeDirector: version 0.2. You can check out the full details on our GitHub site here: https://github.com/bluek8s/kubedirector/releases/tag/v0.2.0
Some of the highlights of what’s new in version 0.2 of KubeDirector include:
- A fully deployable Cloudera 5.14.2 image is now available in the catalog of example applications
- Cluster launch performance has been enhanced through additional work on launch parallelization
- The “configcli” tool used in application setup is now included in this repo, in the “nodeprep” directory
- We’ve made additional improvements to the Makefile support and functionality:
  - KubeDirector can now be built and deployed on Ubuntu systems
  - “make deploy” now waits for deployment to succeed before returning
  - “make teardown” now waits for teardown to finish before returning
- KubeDirector actions are now recorded as Kubernetes events and can be viewed by the standard “kubectl describe” command
- KubeDirector has been tested on the following Kubernetes platforms:
  - DigitalOcean Kubernetes (DOK)
  - Google Kubernetes Engine (GKE)
  - Amazon Elastic Container Service for Kubernetes (EKS)
  - Kubernetes version 1.13.2 on CentOS kernels
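To make the Makefile and event changes a bit more concrete, here is a rough sketch of what a v0.2 session might look like. It is not an official script. It assumes you run it from a checkout of the kubedirector repo, that the custom resource type is addressable as `kubedirectorcluster`, and that `CLUSTER_NAME` and `CLUSTER_YAML` point at a cluster definition of your own (both are hypothetical placeholders here). The `run` helper is also just an illustration: it only prints each command unless you set `RUN=1`.

```shell
#!/bin/sh
# Sketch of a KubeDirector v0.2 workflow (assumptions noted inline).
# Hypothetical placeholders: replace with your own cluster name and YAML file.
CLUSTER_NAME="${CLUSTER_NAME:-my-spark-cluster}"
CLUSTER_YAML="${CLUSTER_YAML:-my-spark-cluster.yaml}"

# Illustrative helper: print the command by default, execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Deploy KubeDirector; in v0.2 this returns once the deployment has succeeded.
run make deploy

# Create a virtual cluster from your own definition (file name is a placeholder).
run kubectl create -f "$CLUSTER_YAML"

# KubeDirector actions are recorded as Kubernetes events, so they should appear
# in the Events section here. The resource type name is an assumption.
run kubectl describe kubedirectorcluster "$CLUSTER_NAME"

# Tear everything down; in v0.2 this returns once teardown has finished.
run make teardown
```

Because “make deploy” and “make teardown” now block until they are done, a script like this should no longer need sleep or polling loops between the steps.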
See below for a screenshot of KubeDirector v0.2 running four pods of a Spark cluster:
One of those pods is a Jupyter notebook, as shown below:
Join the Community
We’re working towards the next version of KubeDirector (and the broader BlueK8s initiative) and we’d welcome your help as developers, contributors, and adopters. Follow @BlueK8s on Twitter and get involved through these channels: