The Canonical Data Fabric team is pleased to announce the first beta release of Charmed Spark, our solution for Apache Spark.
Apache Spark is a free, open-source software framework for developing distributed, parallel processing jobs. It’s popular with data engineers and data scientists alike for building data pipelines for both batch and continuous data processing at scale. Engineers can write Python or Scala code to develop Spark jobs for ETL (extract, transform, load), analytics and machine learning.
Canonical is building a supported, packaged solution for running Spark jobs on Kubernetes. This beta release is the first milestone towards a comprehensive solution for Spark users.
The beta release includes:
- Job submission to the cluster
- Job configuration management
- Security-maintained container images
- A software operator to deploy and operate the Spark History Server
Charmed Spark is a part of Canonical Data Fabric, a…