Databricks Cost Optimization Best Practices

What is Databricks?

Databricks is a fully managed cloud-based unified analytics platform built on Apache Spark. It provides a collaborative environment for data engineers, data scientists, and analysts to process big data, conduct data science, and implement machine learning workflows.

Databricks simplifies the use of Spark by offering a managed environment where users can spin up clusters, collaborate using interactive notebooks, and access various built-in libraries for analytics, machine learning, and data engineering tasks.

This platform not only streamlines the development and deployment of big data applications but also fosters teamwork and innovation, enabling organizations to extract actionable insights from their data more efficiently.

Related reading: What is Databricks?

Understanding Databricks Pricing:

Databricks pricing comprises two main components:

Instance Cost: This is the cost of the underlying compute instances on which Databricks clusters run. These costs depend on the instance types and the duration for which the instances are running.

Databricks Unit (DBU) Cost: A Databricks Unit (DBU) is a unit of processing capability per hour, billed on a per-second usage basis. The cost depends on the type of cluster and its configuration. Each operation performed on Databricks consumes a certain number of DBUs.
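
To make the two components concrete, here is a minimal sketch of the cost arithmetic (every rate below is a hypothetical placeholder; actual instance prices and DBU rates vary by cloud provider, region, workload type, and pricing tier):

```python
# Rough Databricks cost model: instance cost + DBU cost.
# All rates are hypothetical placeholders; substitute your cloud
# provider's instance price and your workload's DBU rate.

INSTANCE_PRICE_PER_HOUR = 0.21  # $/hour per node (assumed)
DBU_PER_NODE_HOUR = 1.5         # DBUs one node consumes per hour (assumed)
DBU_RATE = 0.15                 # $ per DBU for this workload tier (assumed)

def cluster_cost_per_hour(node_count: int) -> float:
    """Total hourly cost = compute (instances) + Databricks (DBUs)."""
    instance_cost = node_count * INSTANCE_PRICE_PER_HOUR
    dbu_cost = node_count * DBU_PER_NODE_HOUR * DBU_RATE
    return instance_cost + dbu_cost

# Total for a 5-node cluster under these assumed rates:
print(f"${cluster_cost_per_hour(5):.2f}/hour")
```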

Monitor and Analyze Performance Metrics:

Databricks does not surface every performance metric out of the box, so it is essential to set up custom configurations to gather them.

Enable Custom Metrics: To monitor performance metrics like CPU and memory usage, you need to enable custom metrics on your EC2 instances. This involves using initialization (INIT) scripts to send these metrics to AWS CloudWatch. Custom metrics provide deeper insights into cluster performance and help in making informed decisions.

Create INIT Scripts: Use INIT scripts to create custom namespaces in CloudWatch/Log Analytics for each cluster. This allows you to track performance metrics like CPU and memory usage for individual clusters. For instance, you can create an INIT script that captures metrics and sends them to CloudWatch. This step ensures that all necessary performance data is collected systematically.
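
As an illustration, here is a minimal sketch of what such a metrics-reporting script might do. It assumes boto3 and psutil are installed on the node, an instance role with cloudwatch:PutMetricData permission, and the DB_CLUSTER_ID environment variable that Databricks sets on cluster nodes; the namespace layout is illustrative, not a Databricks convention.

```python
# Minimal sketch: report CPU and memory usage to a custom CloudWatch
# namespace. Assumes psutil and boto3 are installed, the instance role
# allows cloudwatch:PutMetricData, and DB_CLUSTER_ID is set by
# Databricks on cluster nodes. A real INIT script would launch this
# loop in the background (e.g., with nohup) so startup can complete.
import os
import time

import boto3
import psutil

cloudwatch = boto3.client("cloudwatch")
cluster_id = os.environ.get("DB_CLUSTER_ID", "unknown-cluster")

while True:
    cloudwatch.put_metric_data(
        Namespace=f"Databricks/{cluster_id}",  # one namespace per cluster
        MetricData=[
            {"MetricName": "CPUUtilization",
             "Value": psutil.cpu_percent(interval=1), "Unit": "Percent"},
            {"MetricName": "MemoryUtilization",
             "Value": psutil.virtual_memory().percent, "Unit": "Percent"},
        ],
    )
    time.sleep(60)  # report once a minute
```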

Attach INIT Scripts to Clusters: Attach the INIT scripts to the Databricks clusters. This ensures that the necessary performance metrics are collected and sent to CloudWatch/Log Analytics whenever the cluster is active. Regular monitoring of these metrics helps in identifying inefficiencies and optimizing resource usage.
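
One way to wire this up, sketched against the public Databricks Clusters REST API (the workspace URL, token handling, runtime version, node type, and script path are all placeholders):

```python
# Sketch: create a cluster with the INIT script attached, via the
# Databricks Clusters REST API. All identifiers are placeholders.
import requests

host = "https://<your-workspace>.cloud.databricks.com"  # placeholder
token = "<personal-access-token>"                       # placeholder

cluster_spec = {
    "cluster_name": "metrics-enabled-cluster",
    "spark_version": "14.3.x-scala2.12",  # example runtime
    "node_type_id": "m5.xlarge",          # example node type
    "num_workers": 2,
    # Runs on every node at startup and begins shipping metrics
    # to CloudWatch (see the script sketched above).
    "init_scripts": [
        {"workspace": {"destination": "/Shared/init/cloudwatch-metrics.sh"}}
    ],
}

resp = requests.post(
    f"{host}/api/2.1/clusters/create",
    headers={"Authorization": f"Bearer {token}"},
    json=cluster_spec,
)
resp.raise_for_status()
print("Created cluster:", resp.json()["cluster_id"])
```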

Challenges in Databricks Cost Optimization:

Lack of Direct Performance Metrics: Until recently, Databricks offered no direct performance metrics. They had to be gathered from the underlying compute instances, and memory metrics required custom configurations to be reported to AWS CloudWatch or Azure Log Analytics, adding another layer of complexity. This lack of direct visibility made it challenging to optimize and manage costs effectively. As of August, Databricks has made these metrics publicly available.

Limited Visibility into Resource Usage: Understanding which workloads or departments are driving up the costs can be challenging, especially in multi-tenant environments. This can make it difficult to allocate costs accurately and find optimization opportunities.

Databricks Cost Optimization Best Practices:

Enable Cluster Termination Option: During cluster configuration, enable the automatic termination option. Specify the period of inactivity after which the cluster should be terminated. Once this period is exceeded without any activity, the cluster will move to a terminated state, thus saving costs associated with running idle clusters.
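
Auto-termination is a single field in the cluster specification. A sketch using the same REST API as above (all identifiers are placeholders; note that clusters/edit expects the full cluster specification, not just the changed field):

```python
# Sketch: turn on auto-termination for an existing cluster via the
# Databricks Clusters REST API. Placeholders throughout.
import requests

host = "https://<your-workspace>.cloud.databricks.com"  # placeholder
token = "<personal-access-token>"                       # placeholder

resp = requests.post(
    f"{host}/api/2.1/clusters/edit",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "cluster_id": "<cluster-id>",
        "spark_version": "14.3.x-scala2.12",  # example runtime
        "node_type_id": "m5.xlarge",          # example node type
        "num_workers": 2,
        "autotermination_minutes": 30,  # terminate after 30 idle minutes
    },
)
resp.raise_for_status()
```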

Optimize Cluster Configurations: Choosing the right configuration for the Databricks clusters is essential for cost efficiency. Consider the following:

Select Appropriate Node Types: Match the node types to your workload requirements to avoid over-provisioning resources. By selecting the most suitable instance types, you can ensure that your clusters are cost-effective and performant.

Monitor DBU Consumption: Understanding DBU consumption patterns and optimizing workloads accordingly can lead to significant cost savings; see the sketch below.
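
If system tables are enabled in your workspace, DBU consumption can be inspected directly from a notebook. A minimal sketch (column names follow the documented system.billing.usage schema and may differ by release; spark is the session Databricks provides in notebooks):

```python
# Sketch: top DBU-consuming clusters over the last 30 days, from the
# system.billing.usage system table (requires system tables enabled).
usage = spark.sql("""
    SELECT
        usage_metadata.cluster_id AS cluster_id,
        sku_name,
        SUM(usage_quantity)       AS total_dbus
    FROM system.billing.usage
    WHERE usage_date >= date_sub(current_date(), 30)
    GROUP BY usage_metadata.cluster_id, sku_name
    ORDER BY total_dbus DESC
    LIMIT 20
""")
usage.show(truncate=False)
```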

Why CloudCADI for Databricks?

CloudCADI helps you optimize:

1. Autoscaling inefficiency

Though autoscaling in Databricks brings enormous benefits, it can easily inflate your cloud bills without adding any value. CloudCADI gives multiple actionable recommendations on node-resizing possibilities that can reduce your Databricks costs.

Example: Instead of five nodes of type Standard_D4ads_v5 at $0.21/h each, you can switch to two nodes of type Standard_D8as_v5 and realize roughly 20% savings.

2. Cluster-node resizing inefficiency

CloudCADI's intelligent engine analyzes anomalies (inefficient CPU and memory utilization) and gives recommendations on resizing.

Example: “Reduce the worker count from 8 to 5 for optimal usage” 

Conclusion:

Optimizing costs on Databricks involves a combination of strategic configuration, attentive monitoring, and best practices tailored to your specific workloads. By implementing cluster termination policies, monitoring performance metrics, and optimizing cluster configurations, you can ensure that your Databricks environment is both cost-effective and efficient.

Want to explore CloudCADI? Call us today: Book a Demo

Author

Nandhini Kumar, our Software Engineer L2, was part of the Databricks implementation team for CloudCADI.
