Many popular workloads are being enhanced with AI, and a new wave of AI applications is likely to emerge. This has increased the importance of AI accelerators, including graphics processing units (GPUs) and custom training and inference engines. From discrete GPUs to AI acceleration integrated on-die with the traditional CPU, it's clear that specialized, accelerated hardware is required to provide the performance needed to develop and deploy tomorrow's workloads.

That's why we're announcing a new, simplified AI accelerator driver experience on Red Hat Enterprise Linux (RHEL). Whether you're a developer building the next ground-breaking AI application, or an IT systems administrator provisioning servers to deploy AI workloads, RHEL provides a seamless experience to get accelerated systems up and running. You can now acquire AI accelerator drivers from NVIDIA and AMD from Red Hat repositories, built and signed by Red Hat using secure software supply chain practices and Secure Boot technologies. With a single command, you can install the latest available accelerator drivers.

The challenge of GPU driver management, and our solution

Historically, installing and maintaining GPU accelerator drivers on enterprise-grade Linux distributions has presented a unique set of challenges. Users often faced hurdles like:

- Driver compatibility: Ensuring the correct driver version for specific kernels and hardware.
- Security and trust: Verifying the authenticity and integrity of third-party drivers, including support for Secure Boot.
- Maintenance overhead: Manually updating drivers and managing potential conflicts with system updates.

This new offering from Red Hat addresses these challenges head-on. By providing AMD, Intel, and NVIDIA drivers through Red Hat repositories, we're simplifying the deployment and management of AI workloads on RHEL, giving you greater confidence and control.
Our new experience includes:

- NVIDIA and AMD AI accelerator kernel and user mode drivers, built and signed by Red Hat (when applicable), and packaged in Red Hat repositories.
- A script to seamlessly install the latest NVIDIA and AMD data center AI accelerator drivers.
- AMD and Intel kernel mode drivers integrated with the upstream Linux kernel.

Vendor    Kernel mode driver                    User mode driver
NVIDIA    RHEL Extensions Repository            CUDA Toolkit: Supplementary Repository
AMD       BaseOS, RHEL Extensions Repository    ROCm: RHEL Extensions Repository
Intel     BaseOS                                N/A

Why this matters for your AI initiatives

This new capability brings several key benefits to RHEL users running AI accelerators:

- Faster time to value: By reducing the friction of driver installation and management, your teams can spend more time building and deploying the mission-critical AI workloads that matter to your business, and less time getting things to work.
- Enhanced security and trust: All drivers are built and signed by Red Hat, strengthening supply chain security and integrating with confidential computing. You can deploy with more confidence, knowing the drivers are authentic and haven't been tampered with.
- Streamlined access: Get all of the drivers needed to operate your AI accelerator hardware, delivered through the Red Hat ecosystem (Extensions and Supplementary repositories) and integrated with your existing RHEL update workflows using dnf commands.
- Confidence in compatibility through partner validation: Drivers are tested and validated by our partners for stability and compatibility with RHEL kernels. This reduces the risk of system instability and improves the overall reliability of your AI infrastructure.

Easy installation with rhel-drivers

The new rhel-drivers command automatically detects the data center-class AI accelerator hardware present in your system, and then installs the latest available kernel mode driver based on your Linux kernel version.
This tool removes the need to sift through documentation or product compatibility pages, and delivers the latest accelerator support needed for the AI tooling you want to use.

Partner validation: Confidence in running AI accelerators on RHEL

Red Hat has a long history of collaborating with AMD, Intel, and NVIDIA to deliver enterprise solutions to our shared customers. Our partners have tested RHEL extensively for compatibility, performance, and stability.

RHEL Extensions Repository and Supplementary Repository

Today's software ecosystem has a wide mix of development models and licensing. We understand that the modern IT environment relies on a diverse set of software and tools to deliver the required business value. This is why we provide customer access to multiple repositories to address this diverse ecosystem.

The AI accelerator ecosystem similarly relies on a mixture of open source and proprietary content. With the RHEL Extensions and Supplementary repositories, you can get what you need to run your AI accelerators, all from within the Red Hat ecosystem.

RHEL Extensions Repository

The RHEL Extensions Repository was created to distribute third-party, open source content, built and signed by Red Hat to provide confidence in a secure supply chain.

Red Hat Supplementary Repository

The Red Hat Supplementary Repository is the location for third-party, proprietary content, built and signed by Red Hat.

Confidential computing

Drivers built and signed by Red Hat enable confidential computing, which is critically important for secure, multi-tenant cloud deployments.

Getting started

Here's a step-by-step guide to help you get started with these new drivers on RHEL.

Prerequisites

- Red Hat Enterprise Linux 10.1: Ensure that your system is running RHEL 10.1 or later.
- Active Red Hat subscription: You need an active subscription that provides access to the Red Hat Extensions and Supplementary repositories.
- Compatible NVIDIA or AMD AI accelerator: Make sure your system has a compatible GPU installed. For AMD, read the "System requirements (Linux)" page of the ROCm installation (Linux) documentation, and for Instinct GPUs follow the system optimization advice for BIOS settings and kernel arguments.

Single-command installation with rhel-drivers

rhel-drivers is a new command-line tool that provides a streamlined installation experience for NVIDIA and AMD AI accelerator drivers. The package is available in the Application Streams (AppStreams) repository on RHEL 10.1, which is enabled by default. All you need to do is install the rhel-drivers package, and you're ready to go.

rhel-drivers automates several steps that would otherwise need to be done manually:

- Automatically detects the AI accelerator present on the local system
- Enables the RHEL Extensions and Supplementary Repositories
- Installs the latest available drivers from the Red Hat repositories
  - For NVIDIA data center AI accelerators, it installs the latest OpenRM and cuda-toolkit drivers.
  - For AMD data center AI accelerators, it installs the latest AMDGPU driver from the RHEL Extensions Repository. You need to install the AMD ROCm packages separately from the Extensions Repository.
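As a rough illustration of the detection step in the list above, the following sketch classifies PCI devices by vendor ID. It is not how rhel-drivers is implemented: the function name detect_accel_vendor is hypothetical, and matching on the PCI display (03xx) and processing accelerator (12xx) device classes is an assumption that is coarser than real data center hardware detection, so it may also report consumer or integrated GPUs. The vendor IDs 10de (NVIDIA) and 1002 (AMD) are the standard PCI vendor IDs.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of rhel-drivers: reads `lspci -n` style lines
# on stdin and prints "nvidia" and/or "amdgpu" for matching devices.
detect_accel_vendor() {
    # `lspci -n` fields: <slot> <class>: <vendor>:<device> ...
    awk '
        # Assumption: only PCI base classes 03 (display controller) and
        # 12 (processing accelerator) are treated as candidate accelerators.
        $2 ~ /^(03|12)[0-9a-f][0-9a-f]:$/ {
            split($3, id, ":")
            if (id[1] == "10de") print "nvidia"       # NVIDIA PCI vendor ID
            else if (id[1] == "1002") print "amdgpu"  # AMD PCI vendor ID
        }
    ' | sort -u
}

# Run against the local system only when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -n | detect_accel_vendor
fi
```

The printed names match the arguments that the rhel-drivers install command takes in the sections below, so the output indicates which of the two install paths applies to a given system.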
Installing NVIDIA kernel and user mode drivers with rhel-drivers

    # Install the rhel-drivers package (not installed by default)
    sudo dnf install rhel-drivers

    # Install the NVIDIA kernel and user mode drivers
    sudo rhel-drivers install nvidia
    sudo reboot

To test that it installed correctly, run the following command:

    nvidia-smi

Installing AMD kernel and user mode drivers with rhel-drivers

    # Install the rhel-drivers package (not installed by default)
    sudo dnf install rhel-drivers

    # Install the AMD kernel mode drivers
    sudo rhel-drivers install amdgpu

    # Install AMD ROCm (user mode drivers) from the Extensions Repository
    sudo dnf install rocm rocm-devel
    sudo reboot

Test that it installed as expected:

    rocm-smi --showid --showtemp --showpower --showmeminfo vram

Manual driver installation

IT environments differ, and some require driver versions other than the latest. For those environments, you can install the AI accelerator drivers directly from the RHEL Extensions and Supplementary Repositories.

1. Enable the Extensions and Supplementary Repositories

First, enable the appropriate repositories for your RHEL version. For RHEL 10:

    sudo subscription-manager repos --enable=rhel-10-for-x86_64-supplementary-rpms
    sudo subscription-manager repos --enable=rhel-10-for-x86_64-extensions-rpms

Ensure that your RHEL system is up to date with the latest packages:

    sudo dnf update
    sudo reboot

Parallel use of Extensions and EPEL repositories

Enabling the Extensions and Extra Packages for Enterprise Linux (EPEL) repositories in parallel is not recommended. If you do so, adjust the repository priority so that packages available from both repositories are installed from Extensions by default:

    sudo subscription-manager repo-override --repo=rhel-10-for-x86_64-extensions-rpms --add=priority:98

Refer to the DNF Configuration Reference for the definition of the repository priority.

2. Identify and install the driver packages

The specific package names vary slightly between NVIDIA and AMD.

NVIDIA drivers

    sudo dnf install nvidia-driver cuda-toolkit

For a list of available meta packages, refer to NVIDIA's list of meta packages.

NVIDIA AI accelerator drivers

You'll typically install the kmod-nvidia package along with the nvidia-driver user-space components:

    sudo dnf install kmod-nvidia nvidia-driver

This command automatically pulls in the correct kernel module and user-space drivers for your system.

AMD AI accelerator drivers (ROCm)

For AMD, install the latest amdgpu kernel driver and the ROCm user-space stack:

    sudo dnf install kmod-amdgpu rocm rocm-devel

3. Reboot your system

After installation, reboot your system so that the new kernel modules are loaded correctly:

    sudo reboot

4. Verify the installation

Once your system has rebooted, you can verify that the driver loaded with a vendor-specific command. For example, to verify that the NVIDIA driver is loaded and the GPU is recognized:

    nvidia-smi

The output lists your NVIDIA GPU and driver version.

Intel NPU kernel mode driver: Validating in BaseOS

The Intel driver is included in the BaseOS repository because it's in the Linux kernel. Intel NPU support is validated on Core Ultra Meteor Lake, Arrow Lake, and Lunar Lake SoCs. To verify kernel driver support:

    sudo modprobe -v intel_vpu
    lsmod | grep intel_vpu

RHEL: The foundation for building tomorrow's AI applications

Here at Red Hat, we are working to make RHEL the enterprise Linux platform that enables the development and deployment of the most advanced AI applications and workloads. We'd love to hear from you about how we can continue to enhance the accelerator driver experience on RHEL. Try out these drivers and the new installation experience today on RHEL.