Mellanox Technologies
---------------------
Mellanox WinOF for Windows Readme
February 2010

===============================================================================
Table of Contents
===============================================================================
1. Overview
2. Legal Notice
3. Mellanox WinOF Package Contents
4. HW and SW Requirements
5. Managing Firmware
   5.1 Downloading the Firmware Tools Package (MFT)
   5.2 Download the Firmware Image of the Adapter Card
   5.3 Updating Adapter Card Firmware
6. Mellanox WinOF Installation Process
   6.1 HCA HW & Driver Installation
   6.2 IPoIB Setup
7. OpenSM
8. IPoIB
   8.1 MAC Generation
   8.2 IGMP Configuration
9. SDP
   9.1 SDP Limitations
   9.2 SDP Installation
   9.3 Running Applications over SDP
   9.4 Running an Application over SDP and Ethernet
   9.5 Available Programs
   9.6 Troubleshooting
10. WSD
   10.1 Running Applications over WSD
   10.2 Performance
11. SRP
12. Starting and Verifying the IB Fabric
13. Low Level Performance Tests
14. Debug options
15. Driver Update and Uninstall Process
16. Documentation

===============================================================================
1. Overview
===============================================================================
This is the Readme for the Mellanox InfiniBand (IB) driver and tools package,
Mellanox WinOF for Windows XP, Windows Server 2003 and Windows Server 2008.

Mellanox WinOF is composed of several software modules intended for use on a
computer cluster configured as an InfiniBand network.

Please refer to the MLNX_WinOF_2_0_0_ReleaseNotes.txt file to check for known
issues and fixed bugs.

Note: If you plan to upgrade any previous driver or SW component on your
cluster, please uninstall the previous Mellanox WinOF version and install the
new one on all nodes.

===============================================================================
2. Legal Notice
===============================================================================
Copyright © 2005-2010, Mellanox Technologies LTD.
All rights reserved.

Except as specifically permitted herein, no portion of the information,
including but not limited to object code and source code, may be reproduced,
modified, distributed, republished or otherwise exploited in any form or by any
means for any purpose without the prior written permission of Mellanox
Technologies LTD. Use of software subject to the terms and conditions detailed
in the file "license.rtf".

===============================================================================
3. Mellanox WinOF Package Contents
===============================================================================
The Mellanox WinOF for Windows package contains the following components:
- Core and ULPs:
  o IB HCA low-level drivers (mthca, mlx4)
  o IB Access Layer (IBAL)
  o Upper Layer Protocols (ULPs):
    - IP over InfiniBand (IPoIB)
    - NetworkDirect (ND)
    - Winsock Direct (WSD)
    - Beta: Sockets Direct Protocol (SDP)
    - Beta: SCSI RDMA Protocol (SRP)
- Utilities:
  o OpenSM (OSM): InfiniBand Subnet Manager
  o Low level performance tests
  o vstat - get the card status
  o SdpConnect - SDP\WSD test
- IB Diagnostics tools
- SW Development Kit (SDK)
- Documentation

Note: Core drivers, IPoIB and WSD are at GA level. SDP and SRP are at Beta
stage.

===============================================================================
4. HW and SW Requirements
===============================================================================
- Administrator privileges on your machine(s)
- Disk Space for installation: 100MB
- Supported Mellanox Technologies HCAs:
  o InfiniHost (fw-23108 Rev 3.5.000)
  o InfiniHost III Ex SDR/DDR (MemFree: fw-25218 Rev 5.3.000 or later;
    with memory: fw-25208 Rev 4.8.200 or later)
  o InfiniHost III Lx SDR/DDR (fw-25204 Rev 1.2.000 or later)
  o ConnectX SDR/DDR/QDR (fw-25408 Rev 2.7.000 or later)

  For official firmware versions please see:
  http://www.mellanox.com/content/pages.php?pg=firmware_download

- Supported Operating Systems and Service Packs:
  o Windows XP SP2 (x86, x64)
  o Windows XP SP3 (x86)
  o Windows Server 2003 SP1 and SP2 (x86, x64)
  o Windows Server 2003 CCS (x64)
  o Windows Server 2008 / 2008-R2 (x86, x64)
  o Windows HPC Server 2008 (x64)
- Supported CPU architectures:
  o x86
  o x64 (EM64T and AMD64)

===============================================================================
5. Managing Firmware
===============================================================================
The adapter card was shipped with the most current firmware available. This
section is intended for future firmware upgrades, and provides instructions for
(1) installing firmware update tools, and (2) updating adapter card firmware.

5.1 Downloading the Firmware Tools Package
------------------------------------------

Step 1: Download Mellanox Firmware Tools
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please download the current firmware tools package (MFT) from
http://www.mellanox.com/products/management_tools.php.
The tools package to download is "MFT_SW for Windows" (WinMFT).

Step 2: Install and Run WinMFT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To install the WinMFT package, double click the MSI, or run it from the command
prompt.

Note: On Windows Server 2008, install the WinMFT package from the command line
with administrator privileges.
Enter:
    msiexec.exe /i WinMFT__.msi

Step 3: Check IB Device Status
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To start the mst service (required by the tools), run
    > sc start mst
To check device status run
    > mst status
If there are no card installation problems, the status command should produce
the following output:
    mt_pciconf0
    mt_pci_cr0
where device id will be one of the following: 23108, 25204, 25208, 25218,
25408, 25418, 26418, and 26428

5.2 Download the Firmware Image of the Adapter Card
---------------------------------------------------
To download the correct card firmware image, please visit
http://www.mellanox.com/support/firmware_download.php

For help in identifying your adapter card, please visit
http://www.mellanox.com/support/HCA_FW_identification.php

5.3 Updating Adapter Card Firmware
----------------------------------
Using a card specific binary firmware image file, enter the following command:
    > flint -d mt_pci_cr0 -i burn

Note: You may need to unzip the downloaded firmware image prior to burning.

For additional details, please check the MFT user's manual under
http://www.mellanox.com/products/management_tools.php.

===============================================================================
6. Mellanox WinOF Installation Process
===============================================================================

6.1 HCA HW & Driver Installation
--------------------------------
Please refer to the MLNX_WinOF Installation Guide document for installation
instructions.

6.2 IPoIB Setup
---------------
*** Important Note ***
You may skip this section if you have configured one of the machines as a DHCP
server for the IPoIB interface.
- In Control Panel, double-click Network Connections
- Select the desired adapter (from Mellanox IPoIB Adapters), then right click
  and select Properties
- Under the General tab, select Internet Protocol (TCP/IP) then click the
  Properties button
- In the Internet Protocol (TCP/IP) Properties dialog box, click
  "Use the following IP address"
- Enter the appropriate IP address and Subnet Mask. Use a different IP subnet
  for each IB port. This address must also be different from Ethernet subnet
  addresses. In most cases the first number is a constant. It is common to
  assign new IPoIB addresses by changing the first number. For example:
    Host Ethernet IP: 10.2.3.4
    IPoIB IP address: 11.2.3.4

*** Important Note ***
OpenSM must be active continuously on at least one machine in the cluster to
allow proper IPoIB functioning.

===============================================================================
7. OpenSM
===============================================================================
OpenSM is an InfiniBand Subnet Manager. For Mellanox WinOF to operate, OpenSM
must be running on at least one host machine in the InfiniBand cluster.

OpenSM can either run as a Windows service which starts automatically during
boot, or it can be started manually. OpenSM is installed under \tools.

Please configure at least one machine to start the service automatically:
- Right click on "My computer" and select Manage
- Go to "Services and Applications" and select Services
- Right click "OpenSM" and select Properties
- Change "Startup type" to Automatic
- Start the service

OpenSM as a service will use the first port which is not in "down" state.

To run OpenSM manually enter on the command line:
    opensm.exe
For additional run options, enter:
    opensm.exe -h

Notes:
o For long term running, please avoid using the '-v' (verbosity) option to
  avoid exceeding disk quota.
o Running OpenSM on multiple servers may lead to incorrect OpenSM behavior.
  Please do not run OpenSM on more than 2 machines in the subnet.
o IBDiagnet cannot run on the same IB port that OpenSM is running on.

===============================================================================
8. IPoIB
===============================================================================
IPoIB is a network driver implementation that enables transmitting IP and ARP
protocol packets over an InfiniBand UD channel. The implementation conforms to
the relevant IETF working group's RFCs (http://www.ietf.org).

8.1 MAC Generation
------------------
IPoIB generates MAC addresses based on the GUID of the port. These MAC
addresses are reported to Windows to enable normal communication. These
addresses are replaced by the IPoIB driver before messages are sent on the
wire, and are only for local usage.

Mellanox cards are usually shipped with GUIDs of the form:
00-02-C9-02-00-XX-YY-ZZ or 00-02-C9-03-00-XX-YY-ZZ.
Since a GUID contains 8 bytes, the appropriate truncation should be done as
illustrated in the following example:
    Mellanox Port GUID = "0002c90200XXYYZZ" => MAC = "0002c9XXYYZZ".
    Mellanox Port GUID = "0002c90300XXYYZZ" => MAC = "0002caXXYYZZ".

This release supports generic MAC address generation according to a
user-defined bitwise GUID mask. A GUID mask is an 8-bit field that indicates
which bytes of a GUID should be used in MAC address generation. Since a MAC
address has a fixed 6-byte length, the mask must contain exactly 6 non-zero
bits.

Examples of valid masks:
    0xfc (binary: 1111 1100); 0x3f (binary: 0011 1111)
Examples of invalid masks:
    0xfd - contains 7 non-zero binary digits;
    0x2d - contains only 4 non-zero binary digits
Example of MAC generation given a mask of 0xe7:
    Port GUID = "0002c90200112233" => (mask == 0xe7) => MAC = "0002c9112233".
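
The mask rule above can be sketched in C. This is a minimal illustration only
and not the IPoIB driver's code: the function names mask_bit_count and
guid_to_mac are hypothetical, and the bit order (the most significant mask bit
selects the first GUID byte) is an assumption inferred from the 0xe7 example.

```c
#include <stdint.h>

/* Count the bits set in an 8-bit GUID mask. */
static int mask_bit_count(uint8_t mask)
{
    int count = 0;

    for (; mask != 0; mask >>= 1)
        count += (mask & 1u) ? 1 : 0;
    return count;
}

/*
 * Hypothetical helper: copy the GUID bytes selected by `mask` into `mac`.
 * Assumption: mask bit 7 (0x80) selects guid[0], bit 6 selects guid[1],
 * and so on down to bit 0 selecting guid[7].
 * Returns 0 on success, or -1 if the mask does not have exactly 6 bits
 * set (a MAC address is always 6 bytes long).
 */
static int guid_to_mac(const uint8_t guid[8], uint8_t mask, uint8_t mac[6])
{
    int i;
    int j = 0;

    if (mask_bit_count(mask) != 6)
        return -1;

    for (i = 0; i < 8; i++) {
        if (mask & (0x80u >> i))
            mac[j++] = guid[i];
    }
    return 0;
}
```

With the GUID bytes 00 02 c9 02 00 11 22 33 and a mask of 0xe7, guid_to_mac
fills mac with 00 02 c9 11 22 33, which matches the example above. The masks
0xfd and 0x2d are rejected because they do not contain exactly 6 set bits.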

To specify the mask, the user should change the appropriate registry value
(GUIDMask) located under
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\
    {4D36E972-E325-11CE-BFC1-08002bE10318}\
This value is also accessible via the adapter's Properties user interface box.

IPoIB supports other companies' GUIDs such as Cisco, HP, SuperMicro,
SilverStorm and Voltaire.

If the port GUID is neither one of these companies' GUIDs nor in one of the
forms above, IPoIB will not be able to generate correct MAC addresses from the
HCA port GUID. In this case, the MAC address that is generated will be a
number starting from 02-00-00-00-00-00.
Please use "ipconfig /all" to obtain the MAC that was reported to Windows.

If the installation was successful yet the DHCP did not assign an IP to the
IPoIB interface, most likely the IPoIB driver did not recognize the port's
GUID. You can run the utility 'guid2mac_checker.exe' which is available via
www.mellanox.com > Products > InfiniBand SW/Drivers > Mellanox WinOF.
The utility checks whether the port's GUID is recognized by the driver, and
performs one of the following actions:
a. If the IPoIB driver recognizes the GUID, then it prints a confirmation
   message;
b. If the driver does not recognize the GUID but guid2mac_checker.exe
   recognizes it, then the utility writes an appropriate GUID mask to the
   registry;
c. If neither the driver nor guid2mac_checker.exe recognizes the GUID, then
   the utility instructs the user how to create an appropriate GUID mask.

*** Important Notes ***
o An invalid GUID mask will be rejected and IPoIB will return to its default
  flow.
o It is not possible to change MAC address generation for known vendors like
  Cisco, HP, DELL etc.

In order to change HCA GUIDs, add the following flags when burning firmware:
"-guid -mac " for ConnectX HCA devices, and "-guid " for the other (InfiniHost
family) HCA devices. See Section 5.3 above for details.
Four GUIDs will be assigned to the values listed below based on the parameter:
node GUID=, port1 GUID=+1, port2 GUID=+2, and system image GUID=+3.

For example, to burn firmware on a ConnectX HCA, enter:
    flint -d mt25418_pci_cr0 -i -guid 0002c90200123456 -mac 0002c9123457 -burn

8.2 IGMP Configuration
----------------------
Multicast traffic on IPoIB works only with IGMP v2 and not with IGMP v3, which
is the default. To configure your machine to use IGMP v2, please follow the
instructions below.

For Windows 2003 and Windows XP, run the following commands from the command
line:
o netsh routing ip igmp install
o netsh routing ip igmp install add interface
  "interface name of IPoIB adapter" igmpprototype=igmprtrv2

Note: If IGMP v3 remains in use after executing the commands above, please
follow the instructions on
http://support.microsoft.com/default.aspx/kb/815752

For Windows 2008, run the following commands from the command line:
o servermanagercmd.exe -install NPAS-RRAS-Services
o netsh routing ip igmp install
o netsh routing ip igmp install add interface
  "interface name of IPoIB adapter" igmpprototype=igmprtrv2

===============================================================================
9. SDP
===============================================================================
SDP is currently under development. This is a preliminary version of the ULP,
and it supports a limited set of API functions.

9.1 SDP Limitations
-------------------
A limited set of API functions (w/ major flags) is supported by this version.
These are: socket, connect, bind, listen, accept, send, WSASend, recv,
WSARecv, select, AcceptEx, WSPShutdown and closesocket.

WSASend and WSARecv currently support all types of completion methods,
including synchronous, completion routine, event and completion ports.
Non-blocking IO is also supported.
Additionally:
    getsockopt supports SO_PROTOCOL_INFOW and SO_CONNECT_TIME
    setsockopt supports SO_LINGER and SO_DONTLINGER
    WSPIoctl supports FIONBIO

9.2 SDP Installation
--------------------
SDP should be installed and activated at Mellanox WinOF install time.
If SDP is not installed, then please uninstall the Mellanox WinOF package and
reinstall it with SDP. See MLNX_WinOF_1_1_0_ReleaseNotes.txt for details.

9.3 Running Applications over SDP
---------------------------------
- Run 'sc start sdp' to make sure that the SDP service is running. This is
  needed after each reboot.
- Set the environment variable 'SdpApplications' with the name of the program
  to use SDP. In case of more than one program, separate the names using
  semicolons. Examples:
    SdpApplications=telnet.exe
    SdpApplications=telnet.exe;ftp.exe
  ** Note: If this variable is not set, then only programs named
  SdpConnect.exe can use SDP to connect.
- Run the application using the IPoIB interface IP address.

9.4 Running an Application over SDP and Ethernet
------------------------------------------------
In order to allow your program to run both SDP sockets and Ethernet sockets,
perform the following:
- Set the registry value MIXED_SDP_APPLICATIONS to 1. It is located under
  HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\sdp\Parameters
- Restart the SDP driver.
- Make sure that SdpApplications is *NOT* set to the name of your
  application.
- Your program will now use only TCP connections and not SDP. In the places
  where you do want to use SDP and not TCP, replace the call
    s = socket (AF_INET_FAMILY , SOCK_STREAM, IPPROTO_TCP);
  with the call
    s = WSASocket(AF_INET_FAMILY, SOCK_STREAM, IPPROTO_TCP, NULL, 0,
                  WSA_FLAG_OVERLAPPED | 0x40);
  Only that socket will use SDP.

9.5 Available Programs
----------------------
The following applications were verified to work over SDP:
- Iometer: To obtain the program please refer to http://www.iometer.org
- iperf-2.0.1, iperf-1.7.0: These are test programs for 32-bit and 64-bit
  systems. To download them visit http://sourceforge.net/projects/iperf.
  Instructions for usage are included in the download package.
- TTcp.exe: Testing was conducted using the TTcp.exe version shipped with
  Windows XP SP2. Both synchronous and overlapped operations can be used.
  ** Note: Other TTcp.exe versions may also work.
- Ntttcp.exe: This is a benchmark developed by Microsoft. Please contact
  Microsoft to obtain the program.
- NetPipe: Used to measure latency. To download visit http://na-inet.jp/na/
- Microsoft CCS MPI
- SdpConnect.exe: This is a simple test program located under the SDP example
  directory. The program has two modes: client and server. In the server mode
  the program listens for a connection; in the client mode the program
  connects to the server. The program can be used to test SDP with synchronous
  and overlapped operations.
  Example 1:
    At node 1: SdpConnect.exe server 2222
    At node 2: SdpConnect.exe client 11.4.8.63 2222 0 1 0 0 0 1 3000 16000
  Example 2:
    At node 1: SdpConnect.exe server 2222
    At node 2: SdpConnect.exe pingpong 11.4.8.63 2222 10000 10
  For more options, enter: SdpConnect.exe
  ** Note: SdpConnect source code is included in the SDK component of
  Mellanox WinOF.

9.6 Troubleshooting
-------------------
- How can I verify that SDP is being used?
  Currently, there is no simple way to indicate that SDP is being used.
  However, if you know that your program consumes a lot of bandwidth, then
  there is an indirect way to find out. Open the Task Manager and switch to
  the networking tab. If you see that network utilization is low, this means
  that SDP is being used.
  Alternatively, if the program is running (i.e., the two sides communicate),
  stop the SDP on one side (via "net stop sdp") then try to reconnect it.
  If it succeeds then SDP was NOT used; if it fails then SDP was used.

- My program does not seem to use SDP.
  Suggestions:
  a. Ping the remote node (ping ) to verify IPoIB is up.
  b. Verify that the SDP driver is loaded (net start sdp).
  c. Verify that the SdpApplications environment variable is correctly set
     (see Section 9.3 above).
  d. Verify that the SDP provider is installed by running
       \Program Files\Mellanox\MLNX_WinOF\SDP\InstallSdpProvider.exe -l
     The output of this command should include 'SDP provider'. Otherwise,
     install the SDP provider using
       <...>\InstallSdpProvider.exe -i

- My system is experiencing instability and/or no network connectivity.
  Suggestions:
  a. Remove the SDP provider using
       \Program Files\Mellanox\MLNX_WinOF\SDP\InstallSdpProvider.exe -r
     then restart your computer.

- Interoperability with Linux SDP is broken on OFED 1.2.5, 1.3.0, and 1.3.1.
  A complete fix for the problem is only expected with the next OFED release.
  Until then, please use the following workaround:
  Click Start->Run and enter regedit. Then go to
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\sdp\Parameters
  and change the value for MaximumRecvBufferSize and MaximumSendBufferSize to
  0x810. This will allow both stacks to work but with lower BW due to the
  small message size.

===============================================================================
10. WSD
===============================================================================

10.1 Running Applications over WSD
----------------------------------
- Install the WSD provider on both computers. Enter:
    \Program Files\Mellanox\MLNX_WinOF\IPoIB\installsp.exe -i
- Check which providers are installed. Enter:
    \Program Files\Mellanox\MLNX_WinOF\IPoIB\installsp.exe -l
- Run the application. Please note that WSD has a fall back option; thus, if
  the connection fails over WSD, the connection will be attempted over IPoIB.
- Remove the WSD provider:
    \Program Files\Mellanox\MLNX_WinOF\IPoIB\installsp.exe -r

10.2 Performance
----------------
WSD has its own performance counters. Open perfmon and select add counters.
Locate the performance object called "IB winsock direct" and select
"total sent bytes" or "total received bytes". This displays how much traffic
is going through WSD (if any).

===============================================================================
11. SRP
===============================================================================
The Mellanox WinOF stack does not install the SRP driver by default. If SRP is
selected in the custom installation window, it will only be copied during
installation. To complete the SRP driver installation, an SRP target must be
detected. This requires a Subnet Manager to be running somewhere in the
InfiniBand subnet.

Upon the detection of an SRP target, the "New Hardware Found" Wizard pops up.
Select Install Automatically and click Next. This installs the I/O unit device.
Once completed, the "New Hardware Found" Wizard pops up again. Select Install
Automatically and click Next. This installs the SRP driver.

===============================================================================
12. Starting and Verifying the IB Fabric
===============================================================================
- If you rebooted your machine after the installation process completed, then
  IB interfaces should be up.
- Check that the IB driver is running on all nodes by using 'vstat'. The vstat
  utility displays the status and capabilities of the HCA card(s). It is
  placed under \tools. On the command line, enter:
    vstat (use -h for options)
  vstat should return information about one HCA port for PCI Device ID 25204
  or two HCA ports for all other PCI Device IDs.
  The field port_state will be equal to
    PORT_DOWN - when there is no InfiniBand cable ("no link");
    PORT_INITIALIZED - when the port is connected to some other port
                       ("physical link");
    PORT_ACTIVE - when the port is connected and OpenSM is running
                  ("logical link").
- Run OpenSM - see OpenSM operation instructions in the OpenSM section above.
- Verify the status of ports by using vstat: All connected ports should report
  "PORT_ACTIVE" state.

===============================================================================
13. Low Level Performance Tests
===============================================================================
The following performance tests are provided with the Mellanox WinOF release
under \tools:
- Latency tests
  o ib_write_lat: RDMA write
  o ib_read_lat: RDMA read
  o ib_send_lat: UD, UC and RC (default) send
- Bandwidth tests
  o ib_write_bw: RDMA write
  o ib_read_bw: RDMA read
  o ib_send_bw: UD, UC and RC (default) send

For usage information, run: -h

** Note: Since the default MTU value is different per HCA type, use "-m MTU"
to set the MTU value on both the server and the client to the same value. This
should be done only on heterogeneous systems (different HCA on different
servers).

===============================================================================
14. Debug options
===============================================================================
- IBAL supports WPP tracing tools by using the following GUIDs:
  o "B199CE55-F8BF-4147-B119-DACD1E5987A6" for user debug
  o "99DC84E3-B106-431e-88A6-4DD20C9BBDE3" for kernel debug
- MTHCA supports WPP tracing tools by using the following GUIDs:
  o "2C718E52-0D36-4bda-9E58-0FC601818D8F" for user debug
  o "8BF1F640-63FE-4743-B9EF-FA38C695BFDE" for kernel debug
- MLX4_HCA supports WPP tracing tools by using the following GUIDs:
  o "1752F07C-7E5C-402c-9C5F-AD21E572F852" for user debug
  o "F8C96A49-AE22-41e9-8025-D7E416884D89" for kernel debug
- MLX4_BUS supports WPP tracing tools by using the following GUID:
  o "E51BB6E2-914A-4e21-93C0-192F4801BBFF" for kernel debug
- IPoIB supports WPP tracing tools by using the following GUID:
  o "3F9BC73D-EB03-453a-B27B-20F9A664211A"
- WSD supports WPP tracing tools by using the following GUID:
  o "156A98A5-8FDC-4d00-A673-0638123DF336"
- SDP supports WPP tracing tools by using the following GUIDs:
  o "D6FA8A24-9457-455d-9B49-3C1E5D195558" for user debug
  o "2D4C03CC-E071-48e2-BDBD-526A0D69D6C9" for kernel debug
- SRP supports WPP tracing tools by using the following GUID:
  o "5AF07B3C-D119-4233-9C81-C07EF481CBE6"

The flags and level of debug can be controlled at load time or runtime.

===============================================================================
15. Driver Update and Uninstall Process
===============================================================================
- Clean Uninstall
  Uninstall the package using the "Add or Remove Programs" utility.
- Driver Update
  Driver update is currently not supported. Uninstall the driver package and
  then install a new version.
- Reboot the server to complete the uninstall process

===============================================================================
16. Documentation
===============================================================================
- Under \documents:
  o Release Notes for: core (IBAL), IPoIB, WSD
  o README and user manuals for: opensm, SDP
- Under :
  o License file
  o This document
- Under \SDK:
  o core (IBAL) API HTML documentation (in SDK package)
  o hello_world code example (in SDK package):
    This is a 'two-sided' code example built by the DDK environment.
    Activation:
      Side A: hello_world.exe -d [daemon options]
      Side B: hello_world.exe --ip= [client options]
    For options, enter: hello_world.exe --help