Mellanox Technologies
*********************

InfiniBand OFED Driver for VMware(R) Infrastructure 4.X
Release Notes

Version ESX 1.4.1-2.0.000

May 2011

===============================================================================
Table of Contents
===============================================================================
1. Overview
   1.1 Package Contents
   1.2 Supported Platforms
   1.3 Supported HCA Cards and Firmware Versions
   1.4 Supported Switches, Gateways and Storage
2. Changes and New Features
   2.1 Changes in Rev 1.4.1-2.0.000 From Rev 1.4.1-222
3. Known Issues
   3.1 Memory Considerations
   3.2 IPoIB
   3.3 SRP
   3.4 General

===============================================================================
1. Overview
===============================================================================
These are the release notes of "InfiniBand OFED Driver for VMware(R) vSphere
4.X". This document provides instructions on IB drivers for Mellanox
Technologies ConnectX(R) based adapter cards in a VMware ESX/ESXi Server
environment.

The sources of this package are based on Linux OFED 1.4.1.

This release notes file and the user's manual are available from
http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=36&menu_section=34

1.1 Package Contents
--------------------
VPI support for ESX4.X contains the following separate packages:
 o mlx4_en package - Contains the ConnectX device low level driver. This
   driver can also be used as a standalone 10GigE driver in VPI mode.
 o ib_basic package - Exports basic InfiniBand functionality.
 o ib_ulp package - Contains:
   - IPoIB, which exports Ethernet device functionality over IB
   - SRP initiator, which exports storage functionality over IB
 o mlx_tools package - Tools for updating device firmware. This package is
   not mandatory.

The driver packages are distributed as offline bundles (.zip files). The
mlx4_en package is also available as a driver CD.
1.2 Supported Platforms
-----------------------
 o This package version is intended for installation on VMware ESX and ESXi
   Server 4.X.
 o For the supported hardware compatibility list (HCL) and guest operating
   systems, please refer to the VMware support documentation at
   http://www.vmware.com/support/pubs and follow the "Compatibility Guides"
   link.

1.3 Supported Host Channel Adapters (HCAs)
------------------------------------------
This release supports the following Mellanox Technologies InfiniBand (IB)
HCAs:
 o ConnectX®   (firmware fw-25408 / fw-ConnectX Rev 2.6.000 and above)
 o ConnectX®-2 (firmware fw-ConnectX2 Rev 2.7.000 and above)

Older firmware versions were not tested with this release. If a ConnectX
device runs older firmware, the driver will not load on that HCA and an error
message is printed to /var/log/vmkernel.

For the latest firmware versions, visit:
http://www.mellanox.com/content/pages.php?pg=firmware_download

1.4 Supported Switches, Gateways and Storage
--------------------------------------------
 o All production InfiniBand switches and gateways are supported.
 o Tested platforms:
   - Switches: this release was tested with the Mellanox IS5025 and IS5035
   - Storage:
     o OFED 1.5 SRP target
     o EMC CLARiiON CX4-120 - FC storage paired with a QLogic QMH2562 8Gb
       FC HBA
     o SuperMicro X7DB8 - NFS over InfiniBand storage

===============================================================================
2. Changes and New Features
===============================================================================
2.1 Changes in Rev 1.4.1-2.0.000 From Rev 1.4.1-222
---------------------------------------------------
 o Added support for every ESX 4.X GA release
 o Dedicated bundle for every ESX 4.X version
 o Improved driver robustness
 o Bug fixes:
   - vMotion connectivity loss
   - Driver instability at OpenSM restart

===============================================================================
3. Known Issues
===============================================================================
3.1 Memory Considerations
-------------------------
 o When working with MSI-X and multiple RX rings, the packet memory required
   might exceed the PacketHeapSize. In this case the link (for IPoIB) will
   stay down.
   Workarounds:
   - Reduce the number of RX rings (first priority) and/or their size.
     To change any of these values from their default value, use
     esxcfg-module -s with the relevant module parameter.
   - Increase the PacketHeapSize.
     To read the packet heap size, run:
       COS#> esxcfg-advcfg -j netPktHeapMaxSize
     To update the heap size, run:
       COS#> esxcfg-advcfg -k 256 netPktHeapMaxSize
     This value can also be updated from the vClient / vCenter at:
       configuration->Advanced Settings->VMKernel->Boot->
       VMkernel.Boot.netPktMaxSize
     or by using the remote CLI vicfg-advcfg command.
   Note: The netPktHeapMaxSize value is only a recommendation to the
         vmkernel, which is not forced to accept the new value.
 o When OpenSM and the DV switch on the ESX are set with 4K MTU support,
   restarting OpenSM with a 2K MTU will not be registered by the DV switch.

3.2 IPoIB
---------
 o IPoIB does not support IPv6.
 o When using VLANs, at least one packet must be transmitted for each new
   interface created. For example:
   1. Create a new interface for a VM (e.g., with IP address ).
   2. To guarantee that a packet is transmitted from the new interface, run:
        arping -I -c 1
 o A DHCP server application that is meant to manage VMs over an IPoIB
   fabric must run from a virtual machine (and not from a native machine).
 o Only one VM at a time can communicate with a physical Windows machine.

3.3 SRP
-------
 o ib_srp should not be unloaded.
 o If all paths to the SRP target are dead (or the SRP target is offline)
   when ESX Server is shut down, the ESX Server may hang for up to
   "dead_state_time" minutes. Please see the section SRP Module Parameters
   (above) for details.
   Workaround: Power off the ESX Server machines or wait up to
   "dead_state_time" minutes.
 o After ESX Server reboot, SCSI targets connected to the SRP storage adapter
   may not be recognized.
   Workaround:
   1. Use VMware Virtual Infrastructure Client (VIC) to connect to ESX Server
   2. Go to the host "Configuration" tab
   3. Select "Storage Adapters"
   4. Press "Rescan"
 o SRP storage adapters (vmhbas) appear in the VIC under "Unknown" family
   description.
 o Under a heavy load of I/Os from multiple VMs running over a single ESX
   Server machine, the file system may move to a "read only" state.
   Workaround: Restart the VM.
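
The firmware minimums listed in section 1.3 can be checked mechanically before
the driver is installed. The following is a minimal Python sketch and is not
part of the driver package. It assumes the firmware version is available as a
dotted numeric string such as "2.7.000" (for example, copied by hand from the
output of a firmware query tool). The names MIN_FIRMWARE, parse_version and
firmware_supported are hypothetical and chosen for illustration only.

```python
# Minimum firmware revision per HCA family, as listed in section 1.3.
# The dictionary keys are illustrative labels, not identifiers used by the driver.
MIN_FIRMWARE = {
    "ConnectX": (2, 6, 0),     # Rev 2.6.000 and above
    "ConnectX-2": (2, 7, 0),   # Rev 2.7.000 and above
}


def parse_version(text):
    """Turn a dotted version string such as '2.7.000' into a tuple of ints.

    Assumption: every field is purely numeric.
    """
    return tuple(int(part) for part in text.strip().split("."))


def firmware_supported(family, version):
    """Return True if `version` meets the minimum listed for `family`."""
    if family not in MIN_FIRMWARE:
        raise ValueError("unknown HCA family: %s" % family)
    # Tuples compare field by field, so (2, 10, 0) is newer than (2, 7, 0).
    return parse_version(version) >= MIN_FIRMWARE[family]


if __name__ == "__main__":
    for family, version in [("ConnectX", "2.5.900"), ("ConnectX-2", "2.7.000")]:
        state = "ok" if firmware_supported(family, version) else "too old"
        print("%s firmware %s: %s" % (family, version, state))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such
as treating "2.10.000" as older than "2.7.000".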