Analysis of blockchain transactions, captured in a project that uses a Jupyter notebook with GraphFrames and NetworkX, and spark-notebook with GraphX. The notebooks attach to a Spark cluster deployed in standalone mode, with everything containerized and running in Kubernetes or OpenShift.
I will show how you can leverage containers to run a Spark cluster on a PaaS, namely OpenShift or Kubernetes. As a demonstration, I'll run the blockchain analysis in a Jupyter notebook against a Spark cluster running in OpenShift, everything dockerized. I am out of buzzwords.
Have you ever wondered how to implement the operator pattern for your own service X in Kubernetes? You can learn how in this session and see an example of an open-source project that spawns Apache Spark clusters on Kubernetes and OpenShift following this pattern. You will leave this talk with a better understanding of how the spark-on-k8s native scheduling mechanism can be leveraged and how you can wrap your own service in the operator pattern, not only in Go but also in Java. Let's make data science more scalable in a cloud-native fashion.
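To give a feel for the operator pattern before the session, here is a minimal sketch of its core idea, the reconcile loop, written in plain Python for brevity rather than in Go or Java. It is not the code of the project shown in the talk. The names `SparkClusterSpec`, `FakeCluster`, and `reconcile` are hypothetical, and the in-memory `FakeCluster` is only a stand-in for the Kubernetes API, which a real operator would watch and call instead.

```python
from dataclasses import dataclass, field


@dataclass
class SparkClusterSpec:
    """Desired state, as a user might declare it in a (hypothetical) custom resource."""
    name: str
    workers: int


@dataclass
class FakeCluster:
    """In-memory stand-in for the Kubernetes API: cluster name -> running worker pods."""
    pods: dict = field(default_factory=dict)


def reconcile(spec, cluster):
    """Move the observed state toward the desired state; return the actions taken."""
    running = cluster.pods.setdefault(spec.name, [])
    actions = []
    # Scale up: create workers until the observed count matches the spec.
    while len(running) < spec.workers:
        pod = f"{spec.name}-w-{len(running)}"
        running.append(pod)
        actions.append(("create", pod))
    # Scale down: delete surplus workers, newest first.
    while len(running) > spec.workers:
        actions.append(("delete", running.pop()))
    return actions


cluster = FakeCluster()
print(reconcile(SparkClusterSpec("my-spark", 2), cluster))  # creates two workers
print(reconcile(SparkClusterSpec("my-spark", 2), cluster))  # nothing left to do
print(reconcile(SparkClusterSpec("my-spark", 1), cluster))  # deletes one worker
```

The point of the sketch is that `reconcile` is idempotent: calling it again with an unchanged spec does nothing, which is what lets an operator simply re-run it on every change event.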