Red Hat OpenShift Service on AWS (ROSA) is a fully managed application platform that offers a seamless experience for building, deploying, and scaling applications. For machine learning (ML) workloads, ROSA now supports On-Demand Capacity Reservations (ODCR) and Capacity Blocks for ML, allowing cloud architects and platform administrators to strategically utilize their existing AWS purchases to help deliver uninterrupted access to essential compute infrastructure.

Today, ROSA is available in over 30 regions and supports over 600 instance types, allowing customers to run diverse workloads according to their business needs. However, maintaining guaranteed or uninterrupted access to a specific infrastructure type in a particular availability zone (AZ) is important for several critical scenarios:

- GPU-based accelerated computing workloads: Uninterrupted access to accelerated computing (GPU) instances is vital for AI/ML teams running training, fine-tuning, or inference workloads. Capacity reservations help eliminate the risk of compute unavailability for these time-sensitive, resource-intensive tasks.
- Planned scaling events: Scaling infrastructure confidently to support planned business events, such as peak traffic seasons, major product launches, or scheduled batch processing, without provisioning delays.
- High availability and disaster recovery: Enhancing resiliency by guaranteeing capacity when deploying workloads across multiple AZs or executing disaster recovery protocols across regions.

Amazon EC2 Capacity Reservations allow you to reserve compute capacity for your Amazon EC2 instances in a specific AZ for any duration. Capacity Blocks for ML allow you to reserve GPU-based accelerated computing instances for a future date to support your short-duration ML workloads. With support for Capacity Reservations in clusters with hosted control planes (HCP), platform administrators can now create ROSA machine pools in their cluster that directly consume the capacity already reserved with AWS.
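Before getting into the best practices below, it can help to see what reserving and verifying capacity looks like in practice. The following is a minimal sketch using Python and boto3 that creates a targeted ODCR and waits for it to become active before any machine pools are provisioned against it. The region, instance type, AZ, and count are placeholder values, and error handling is kept to a minimum; adjust them for your environment. Capacity Blocks for ML follow a separate purchase flow (reserving a block offering for a future start date), which is covered in the AWS documentation.

```python
import time
import boto3

# Placeholder values; substitute your own region, instance type, AZ, and count.
REGION = "us-east-1"
AZ = "us-east-1a"
INSTANCE_TYPE = "p4d.24xlarge"
INSTANCE_COUNT = 2

ec2 = boto3.client("ec2", region_name=REGION)

# Create an On-Demand Capacity Reservation with "targeted" matching criteria,
# so only workloads that explicitly target this reservation can consume it.
response = ec2.create_capacity_reservation(
    InstanceType=INSTANCE_TYPE,
    InstancePlatform="Linux/UNIX",
    AvailabilityZone=AZ,
    InstanceCount=INSTANCE_COUNT,
    InstanceMatchCriteria="targeted",
    EndDateType="unlimited",
)
reservation_id = response["CapacityReservation"]["CapacityReservationId"]
print(f"Created capacity reservation {reservation_id}")

# Wait until the reservation is active before creating ROSA machine pools
# that reference it.
while True:
    state = ec2.describe_capacity_reservations(
        CapacityReservationIds=[reservation_id]
    )["CapacityReservations"][0]["State"]
    print(f"Reservation state: {state}")
    if state == "active":
        break
    if state in ("cancelled", "expired", "failed"):
        raise RuntimeError(f"Reservation {reservation_id} entered state {state}")
    time.sleep(15)
```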
Key best practices for effectively leveraging Capacity Reservations with ROSA:

- Pre-planning of AZs, instance types, and capacity: Before creating a machine pool, ensure a precise match between the reserved capacity and the ROSA machine pool attributes. This includes VPC subnets, the number of node replicas, and the instance type. When reserving capacity for a future date, carefully balance the relative costs of purchasing capacity across different AZs with technical configurations like VPC subnet size, available IPs, and node replica requirements. You must wait until the AWS Capacity Reservation status is active, as in the sketch above, before attempting to provision ROSA machine pools that use it.
- Informed decision on instance matching criteria: AWS provides two types of instance matching criteria for ODCRs: "open" and "targeted." Choose a strategy based on your workload distribution. If you run multiple workloads across different services and intend to reserve capacity exclusively for your ROSA clusters, using the targeted matching criteria is strongly recommended. Remember that ODCRs operate on a "use it or lose it" principle: they are billed at on-demand rates regardless of utilization.
- Precise control over reserved capacity consumption: ROSA offers flexible controls to define how workloads should utilize EC2 instances across on-demand capacity and Capacity Reservations. For example, decide whether the machine pool should fall back to on-demand instances or fail when the configured Capacity Reservation is exhausted.
- Centralized management and allocation of purchases: For organizations managing multiple AWS accounts, the ability to centralize the purchase of ODCRs and allocate them across member accounts with AWS Resource Access Manager is a significant benefit. ROSA fully supports Capacity Reservations that are shared with the AWS account where the cluster is created, simplifying financial management and ensuring all teams benefit from reserved capacity.
- Proactive monitoring of Capacity Reservation utilization: Because multiple workloads or accounts may share a reservation, it's crucial to monitor Capacity Reservation utilization continuously; cluster-specific utilization can fluctuate widely over time. Proactively planning for conditions such as the exhaustion of reserved capacity can prevent ROSA cluster nodes from becoming unavailable for critical workloads. A minimal monitoring sketch appears at the end of this post.

To learn more about how to purchase Capacity Reservations and Capacity Blocks for ML, read the AWS documentation. To learn more about managing machine pools and setting capacity preferences in your ROSA cluster, read the Managing Nodes chapter in the ROSA documentation. To get started with ROSA, visit the ROSA product page.
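As a closing illustration of the monitoring practice above, here is a minimal sketch, again using Python and boto3, that reports how much of a capacity reservation is in use and flags when available capacity falls below a threshold. The reservation ID and threshold are placeholders; in practice you would likely wire this into your existing monitoring or alerting stack rather than run it ad hoc.

```python
import boto3

# Placeholder values; substitute your reservation ID and alert threshold.
REGION = "us-east-1"
RESERVATION_ID = "cr-0123456789abcdef0"
MIN_AVAILABLE = 1  # alert when fewer instances than this remain available

ec2 = boto3.client("ec2", region_name=REGION)

# Retrieve utilization for the reservation, including per-account usage
# when the reservation is shared across accounts via AWS Resource Access Manager.
usage = ec2.get_capacity_reservation_usage(CapacityReservationId=RESERVATION_ID)

total = usage["TotalInstanceCount"]
available = usage["AvailableInstanceCount"]
print(f"Reservation {RESERVATION_ID}: {total - available}/{total} instances in use")

for account in usage.get("InstanceUsages", []):
    print(f"  account {account['AccountId']}: {account['UsedInstanceCount']} in use")

# A crude early warning: reserved capacity is nearly exhausted, so new machine
# pool replicas may fail or fall back to on-demand instances, depending on how
# the machine pool is configured.
if available < MIN_AVAILABLE:
    print("WARNING: reserved capacity nearly exhausted")
```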