~ardavis/dapl.git
2 years ago ucm, mcm: fix backlog parameter for socket (master)
Nicolas Morey-Chaisemartin [Wed, 30 May 2018 16:00:05 +0000 (09:00 -0700)]
ucm, mcm: fix backlog parameter for socket

Using listen(, 0) forces a synchronization barrier between connect and accept if net.ipv4.tcp_syncookies.
As this is done by a single thread, it causes connect to time out with a message similar to:
open_hca: failed to init cr pipe - Connection timed out

Replace with listen(, 1) so the kernel can accept the connection itself and remove the synchronisation
point.

Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
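
The effect of the backlog value can be seen in a small standalone sketch. It assumes the pipe is a loopback TCP connection that a single thread both connects and accepts; `loopback_pair` is a hypothetical helper written for illustration, not a function taken from dapl.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Build a connected TCP pair over loopback from a single thread.
 * sv[0] is the connecting end, sv[1] the accepted end.
 * Returns 0 on success, -1 on failure. */
int loopback_pair(int sv[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0)
        return -1;
    sv[0] = sv[1] = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||                /* backlog 1 instead of 0 */
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto fail;

    sv[0] = socket(AF_INET, SOCK_STREAM, 0);
    if (sv[0] < 0)
        goto fail;
    /* Blocking connect from the same thread that will call accept() next. */
    if (connect(sv[0], (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;
    sv[1] = accept(lfd, NULL, NULL);
    if (sv[1] < 0)
        goto fail;
    close(lfd);
    return 0;

fail:
    if (sv[0] >= 0)
        close(sv[0]);
    close(lfd);
    return -1;
}
```

Because connect() and accept() run one after the other in the same thread, the kernel has to be able to queue the pending connection on its own, which is what a backlog of 1 allows.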
3 years ago Allow for reproducible builds, add source-date-epoch variable
Bernhard M. Wiedemann [Tue, 18 Jul 2017 17:31:39 +0000 (10:31 -0700)]
Allow for reproducible builds, add source-date-epoch variable
Note: variant works with GNU date

Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
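
As an illustration of the reproducible-builds convention this commit refers to, the sketch below picks the timestamp a build would embed. It is not the change made in dapl (the note about GNU date suggests that change lives in the build scripts); `build_epoch` is a hypothetical helper, and the only assumption is the standard meaning of the SOURCE_DATE_EPOCH environment variable (seconds since the Unix epoch).

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Return the timestamp a build should embed: SOURCE_DATE_EPOCH when it is
 * set to a valid non-negative integer, otherwise the current time. */
time_t build_epoch(void)
{
    const char *s = getenv("SOURCE_DATE_EPOCH");
    char *end;
    long long v;

    if (s == NULL || *s == '\0')
        return time(NULL);
    errno = 0;
    v = strtoll(s, &end, 10);
    if (errno != 0 || *end != '\0' || v < 0)
        return time(NULL);          /* malformed value: fall back to "now" */
    return (time_t)v;
}
```

With the variable exported, two builds of the same source embed the same date regardless of when they run.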
3 years ago Release 2.1.10-1 (dapl-2.1.10-1)
Arlin Davis [Tue, 13 Dec 2016 22:52:46 +0000 (14:52 -0800)]
Release 2.1.10-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
3 years ago dtest_suite: add option to pause the test.
Amir Hanania [Tue, 13 Dec 2016 22:25:13 +0000 (14:25 -0800)]
dtest_suite: add option to pause the test.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
3 years ago dtestcm: add client retry, give server time to queue up all listens
Amir Hanania [Tue, 29 Nov 2016 23:48:58 +0000 (15:48 -0800)]
dtestcm: add client retry, give server time to queue up all listens

From: Amir Hanania <amir.hanania@intel.com>

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
3 years ago dtest: Add new man pages. (dtestx dtestcm dtestsrq)
Amir Hanania [Tue, 29 Nov 2016 23:15:49 +0000 (15:15 -0800)]
dtest: Add new man pages. (dtestx dtestcm dtestsrq)

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
3 years ago cma: fix open_query mode, initialize attributes
Arlin Davis [Tue, 29 Nov 2016 21:18:47 +0000 (13:18 -0800)]
cma: fix open_query mode, initialize attributes

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
3 years ago ucm: up level CM timer logging, increase drep time at scale
Arlin Davis [Tue, 29 Nov 2016 21:11:22 +0000 (13:11 -0800)]
ucm: up level CM timer logging, increase drep time at scale

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: fix return value check on do_rdma_write_with_msg
Amir Hanania [Wed, 28 Sep 2016 23:13:26 +0000 (16:13 -0700)]
dtest: fix return value check on do_rdma_write_with_msg

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtestx: check device capabilities and do atomic tests only if supported by HW
Amir Hanania [Wed, 28 Sep 2016 21:44:18 +0000 (14:44 -0700)]
dtestx: check device capabilities and do atomic tests only if supported by HW

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago common: set atomic attributes based on provider/device capabilities
Amir Hanania [Wed, 28 Sep 2016 21:41:56 +0000 (14:41 -0700)]
common: set atomic attributes based on provider/device capabilities

DAT_IB_FETCH_AND_ADD and DAT_IB_CMP_AND_SWAP values in provider_specific_attr are always set to TRUE.
Set their value according to the device atomic capability.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago build: dtest_suite.sh was moved to test/scripts
Arlin Davis [Tue, 20 Sep 2016 00:17:11 +0000 (17:17 -0700)]
build: dtest_suite.sh was moved to test/scripts

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mpxyd: let TX thread sleep if no open devices are referenced
Amir Hanania [Mon, 19 Sep 2016 23:44:25 +0000 (16:44 -0700)]
mpxyd: let TX thread sleep if no open devices are referenced

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago MCM MIX: When mmap req from MIC return with fail stat print WARN.
Amir Hanania [Mon, 19 Sep 2016 23:42:39 +0000 (16:42 -0700)]
MCM MIX: When mmap req from MIC return with fail stat print WARN.

When the MIC mmap req response returns with a fail status, print WARN rather than ERR:
it only means that the host is not in polling mode and does not support send ops via mmap.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago dtest_suite: remove duplicate dtest_suite.sh
Arlin Davis [Mon, 19 Sep 2016 23:41:07 +0000 (16:41 -0700)]
dtest_suite: remove duplicate dtest_suite.sh

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: Enable -D option (data check) to work with scif provider
Amir Hanania [Mon, 19 Sep 2016 23:37:11 +0000 (16:37 -0700)]
dtest: Enable -D option (data check) to work with scif provider

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago dtest_suite: fix typo in user_string var
Amir Hanania [Mon, 19 Sep 2016 23:27:01 +0000 (16:27 -0700)]
dtest_suite: fix typo in user_string var

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mcm: remove logs from post send speed path
Arlin Davis [Mon, 19 Sep 2016 23:22:57 +0000 (16:22 -0700)]
mcm: remove logs from post send speed path

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mcm proxy: push WR from MIC to host with scif mmap memory instead of scif_send.
Amir Hanania [Thu, 26 May 2016 21:28:32 +0000 (14:28 -0700)]
mcm proxy: push WR from MIC to host with scif mmap memory instead of scif_send.

Map host memory to the MIC and use it as a ring buffer
to send the post send work requests from MIC to host. This replaces
the scif_send to scif_recv path and the recv data FD event mechanism.
Since no FD is used to wake up the host proxy service,
the host needs to run in polling mode to use this option.

How to run the host in polling mode:

By default, the proxy is now running in polling mode.
You can verify that it is the case in the mpxyd.log file.
Or, edit the mpxyd.conf file: set mcm_affinity to 2.

This optimization improves small message latencies on MFO
devices by as much as 50%.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
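
The ring-buffer hand-off described in this commit can be sketched as a single-producer, single-consumer queue that the consumer polls. This is only an illustration in ordinary process memory: the slot layout, `RING_SLOTS`, `ring_post` and `ring_poll` are hypothetical names, and the real code works on SCIF-mapped host memory, which this sketch does not attempt to model.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SLOTS 8u                  /* hypothetical size; must be a power of two */

struct wr_slot {                       /* hypothetical work request layout */
    uint64_t wr_id;
    uint32_t len;
};

struct wr_ring {
    _Atomic uint32_t head;             /* next slot the producer writes */
    _Atomic uint32_t tail;             /* next slot the consumer reads */
    struct wr_slot slot[RING_SLOTS];
};

/* Producer side (the MIC in the commit's terms): 0 on success, -1 if full. */
int ring_post(struct wr_ring *r, const struct wr_slot *wr)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return -1;                     /* consumer has not caught up yet */
    r->slot[head & (RING_SLOTS - 1)] = *wr;
    /* Publish the slot only after its contents are written. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side (the polling host proxy): 1 if a WR was taken, 0 if empty. */
int ring_poll(struct wr_ring *r, struct wr_slot *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return 0;
    *out = r->slot[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}
```

Nothing wakes the consumer when a slot is posted, which is why the commit requires the host proxy to run in polling mode: it has to call the equivalent of `ring_poll` in a loop.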
4 years ago dtest: the default size in pingpong test is set to 1 byte regardless of user input.
Amir Hanania [Thu, 26 May 2016 21:14:11 +0000 (14:14 -0700)]
dtest: the default size in pingpong test is set to 1 byte regardless of user input.
Keep the user input if one is provided.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: Clean 4 printf from the middle of performance test. That may add time for...
Amir Hanania [Wed, 18 May 2016 17:36:31 +0000 (10:36 -0700)]
dtest: Clean 4 printf from the middle of performance test. That may add time for the test to be completed and reduce our performance.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago Release 2.1.9-2 (dapl-2.1.9-2)
Arlin Davis [Wed, 4 May 2016 01:10:20 +0000 (18:10 -0700)]
Release 2.1.9-2

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago In case of MCM get the GID from the tp struct which was initialized in open device.
Amir Hanania [Wed, 4 May 2016 00:43:44 +0000 (17:43 -0700)]
In case of MCM get the GID from the tp struct which was initialized in open device.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago ib_hca_handle in MFO mode is N/A and accessing it caused a seg fault.
Amir Hanania [Wed, 4 May 2016 00:43:25 +0000 (17:43 -0700)]
ib_hca_handle in MFO mode is N/A and accessing it caused a seg fault.
Set MFO mode as not iwarp device.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago mpxyd: modify_qp uses wrong md->dev_attr with MTU changes
Arlin Davis [Mon, 2 May 2016 22:13:27 +0000 (15:13 -0700)]
mpxyd: modify_qp uses wrong md->dev_attr with MTU changes

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago rpmbuild: fix specfile, don't overwrite build options
Arlin Davis [Fri, 29 Apr 2016 16:30:46 +0000 (09:30 -0700)]
rpmbuild: fix specfile, don't overwrite build options

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago openib: new provider specific attribute - port GID
Arlin Davis [Fri, 29 Apr 2016 16:22:01 +0000 (09:22 -0700)]
openib: new provider specific attribute - port GID

The IB GID is returned with IA query, provider specific
option for scm, ucm, cma, and mcm openib providers.
Includes subnet prefix and interface ID.

DAT_IB_GID = fe80:0000:0000:0000:0002:c903:0032:2f31

For example, dtest -v can be used for testing.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
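
The DAT_IB_GID string shown in this commit is a 16-byte GID printed as eight 16-bit hex groups. A minimal sketch of that formatting follows; `gid_to_str` is a hypothetical helper, not the provider's own routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 16-byte GID (8-byte subnet prefix followed by 8-byte interface ID)
 * as eight colon-separated groups of four hex digits.
 * buf should hold at least 40 bytes. Returns buf. */
char *gid_to_str(const uint8_t gid[16], char *buf, size_t buflen)
{
    size_t off = 0;

    if (buflen > 0)
        buf[0] = '\0';
    for (int i = 0; i < 16 && off < buflen; i += 2) {
        int n = snprintf(buf + off, buflen - off, "%s%02x%02x",
                         i ? ":" : "", gid[i], gid[i + 1]);
        if (n < 0)
            break;
        off += (size_t)n;
    }
    return buf;
}
```

Feeding it the bytes fe 80 00 00 00 00 00 00 00 02 c9 03 00 32 2f 31 yields the string from the commit message, fe80:0000:0000:0000:0002:c903:0032:2f31.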
4 years ago Release 2.1.9 (dapl-2.1.9-1)
Arlin Davis [Fri, 15 Apr 2016 01:46:12 +0000 (18:46 -0700)]
Release 2.1.9

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtestcm: exchange provider IA address info via sockets
Arlin Davis [Tue, 12 Apr 2016 20:58:34 +0000 (13:58 -0700)]
dtestcm: exchange provider IA address info via sockets

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago ucm: increase default REQ/RTU timers on scaling threshold
Arlin Davis [Tue, 12 Apr 2016 20:56:02 +0000 (13:56 -0700)]
ucm: increase default REQ/RTU timers on scaling threshold

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mpxyd: m_req_event assert during large io streams, HST to MIC
Arlin Davis [Wed, 6 Apr 2016 22:41:44 +0000 (15:41 -0700)]
mpxyd: m_req_event assert during large io streams, HST to MIC

When proxy-in (PI) WR queue is full and client is blocked on
new WR entries, the WR completion processing can
incorrectly reference a PI WR field after the client is
given remote access.

m_qp = (struct mcm_qp *)m_wr_rx->context;
assert(m_qp);

After data is forwarded to the appropriate MIC, the proxy
service will send a RW_imm WC message. This releases
the m_wr_rx entry for re-use by remote mcm provider client.
At the same time, the proxy can be processing the RW_imm
completion and incorrectly use the wr_rx->context field for
m_qp reference. Change the proxy_in event processing code
to avoid dependencies on any wr_rx content.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mcm: HST->MXS IO streams can overrun MPXYD proxy-in WR queue
Arlin Davis [Wed, 6 Apr 2016 21:02:59 +0000 (14:02 -0700)]
mcm: HST->MXS IO streams can overrun MPXYD proxy-in WR queue

MPXYD proxy-in service cannot consume HST->MIC WR's fast
enough on 100Gb/s fabrics and from server based clients. This
results in post_send failing with DAT_INSUFFICIENT_RESOURCES.
Add retry mechanism, with limited retries, for the
host side mcm provider via dat_ep_post_send.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
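
A bounded retry of the kind this commit describes might look like the sketch below. The status enum, the callback type and `post_with_retry` are stand-ins invented for illustration; the real code path is the mcm provider's post send, and its retry count and delay are not given in the log.

```c
#include <unistd.h>

/* Stand-ins for DAT return codes; POST_NO_RESOURCES plays the role of
 * DAT_INSUFFICIENT_RESOURCES. */
enum post_status { POST_OK = 0, POST_NO_RESOURCES = 1, POST_ERROR = 2 };

typedef enum post_status (*post_fn)(void *ctx);

/* Call post() until it stops reporting a full queue or the retry budget
 * is used up. Any other status is returned immediately. */
enum post_status post_with_retry(post_fn post, void *ctx,
                                 int max_retries, unsigned delay_us)
{
    enum post_status st = post(ctx);

    for (int i = 0; st == POST_NO_RESOURCES && i < max_retries; i++) {
        usleep(delay_us);              /* give the consumer time to drain */
        st = post(ctx);
    }
    return st;
}

/* Example callback: reports a full queue until *remaining reaches zero. */
enum post_status busy_then_ok(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return POST_NO_RESOURCES;
    }
    return POST_OK;
}
```

Limiting the retries keeps a stalled proxy from turning a resource shortage into an indefinite hang; the caller still sees the failure once the budget is spent.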
4 years ago mpxyd: cleanup warnings in MIC proxy code
Arlin Davis [Tue, 5 Apr 2016 21:04:12 +0000 (14:04 -0700)]
mpxyd: cleanup warnings in MIC proxy code

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago openib: cleanup warnings in openib providers
Arlin Davis [Tue, 5 Apr 2016 21:03:41 +0000 (14:03 -0700)]
openib: cleanup warnings in openib providers

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago common: cleanup warnings in common code base
Arlin Davis [Tue, 5 Apr 2016 21:02:27 +0000 (14:02 -0700)]
common: cleanup warnings in common code base

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dapltest: cleanup warnings, unused variables, etc
Arlin Davis [Tue, 5 Apr 2016 21:00:33 +0000 (14:00 -0700)]
dapltest: cleanup warnings, unused variables, etc

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago scm: backward compatibility issue with MTU negotiation
Arlin Davis [Thu, 17 Mar 2016 23:06:35 +0000 (16:06 -0700)]
scm: backward compatibility issue with MTU negotiation

The scm provider builds the CM reply message on stack
and doesn't memset to zero so resv fields are unknown.
The client cannot check mtu/resv field for MTU adjustments.
Bump provider CM message version to DCM_VER_MTU and add
check for appropriate version.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
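
The problem and the fix can be sketched as follows. The struct layout, the value of `EXAMPLE_VER_MTU` and both function names are hypothetical; the log names DCM_VER_MTU but does not give its value or the real message format.

```c
#include <stdint.h>
#include <string.h>

#define EXAMPLE_VER_MTU 9u     /* hypothetical stand-in for DCM_VER_MTU */

struct cm_msg {                /* hypothetical, much-reduced CM message */
    uint16_t ver;
    uint8_t  mtu;              /* carved out of a formerly reserved byte */
    uint8_t  resv[5];
};

/* Build a reply in caller-provided (typically stack) storage so that
 * every byte, including the reserved ones, has a defined value. */
void cm_reply_init(struct cm_msg *msg, uint16_t ver, uint8_t mtu)
{
    memset(msg, 0, sizeof(*msg));
    msg->ver = ver;
    msg->mtu = mtu;
}

/* Trust the mtu byte only when the sender's version says it is meaningful.
 * Returns 0 when the peer predates the MTU field. */
uint8_t cm_peer_mtu(const struct cm_msg *msg)
{
    return msg->ver >= EXAMPLE_VER_MTU ? msg->mtu : 0;
}
```

A peer that never zeroed its reply can carry arbitrary bytes where the mtu field now lives, so the version check is what makes the field safe to read.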
4 years ago mcm: fix mtu interop issues when MIC and HOST differ
Arlin Davis [Mon, 14 Mar 2016 16:46:14 +0000 (09:46 -0700)]
mcm: fix mtu interop issues when MIC and HOST differ

result of commit: ab67173b8024e14009c266d76ab9ec0bdd0c5d1f

New MCM provider on MIC side needs to open in compat mode
with MTU set to 2048. It needs to allow proxy, if new, to
adjust to active MTU. If old proxy is on host side, 2048
is returned as normal and new MCM provider remains in
compat mode with MTU at 2048.

New proxy on host side needs to support an old version of
MCM provider and adjust MTU only if MIC side changes
dev_attr.mtu settings. It will bump up to active_MTU
only if the MCM provider is new and sets the MIX_OP_SET
bit on the mic->host proxy device open call.

Proxy open device MUST set new dev attributes in client SMD
device object and not in the shared MD device object since
there can be multiple clients with different attribute
settings from MIC side.

MCM provider MUST query and setup MTU in open instead of query
so subsequent queries don't override negotiated setting.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: clean up warnings, keep variables and functions local
Arlin Davis [Fri, 4 Mar 2016 21:15:05 +0000 (13:15 -0800)]
dtest: clean up warnings, keep variables and functions local

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago Release 2.1.8
Arlin Davis [Wed, 17 Feb 2016 23:35:30 +0000 (15:35 -0800)]
Release 2.1.8

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mpxyd: fix segfault in proxy_out debug logging
Arlin Davis [Tue, 16 Feb 2016 21:12:16 +0000 (13:12 -0800)]
mpxyd: fix segfault in proxy_out debug logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mpxyd: fix debug memory buffer log function
Amir Hanania [Tue, 16 Feb 2016 21:04:56 +0000 (13:04 -0800)]
mpxyd: fix debug memory buffer log function

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: -D option is not valid with scif providers
Amir Hanania [Tue, 16 Feb 2016 20:53:53 +0000 (12:53 -0800)]
dtest: -D option is not valid with scif providers

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest/dapltest: add new automated test suite for HOST to MIC testing
Amir Hanania [Tue, 16 Feb 2016 20:47:04 +0000 (12:47 -0800)]
dtest/dapltest: add new automated test suite for HOST to MIC testing

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago openib: update attributes correctly for iWARP transports
Arlin Davis [Tue, 16 Feb 2016 20:15:08 +0000 (12:15 -0800)]
openib: update attributes correctly for iWARP transports

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago openib_common: set providers mtu to active_mtu instead of 2048
Arlin Davis [Wed, 10 Feb 2016 22:45:12 +0000 (14:45 -0800)]
openib_common: set providers mtu to active_mtu instead of 2048

Better out of the box performance when setting mtu to active_mtu
instead of default settings of 2K. The new mtu settings are applied
on a per QP basis and negotiated via CM mtu 8-bit field. One of the
reserved 8 bit CM message fields is used to ensure compatibility
with older versions.

If older endpoints are mixed with newer versions it will fall back to
the pre-existing 2K MTU settings, unless overridden by DAPL_IB_MTU.

The change has been made across all providers including ucm, scm, mcm,
and cma (rdma_cm). The mcm provider on a MIC will notify the CCL Proxy
service of a DAPL_IB_MTU override via a new MIX_OP_FLAGS bit
MIX_OP_MTU during the open call.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
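
The fallback rule in this commit can be sketched with IB MTU codes (1 = 256 bytes up to 5 = 4096 bytes). Two things here are assumptions rather than facts from the log: that an unset field from an older peer reads as 0, and that two new peers settle on the smaller of their MTUs. `negotiate_mtu` and `mtu_code_to_bytes` are hypothetical names, and the DAPL_IB_MTU override is left out.

```c
#include <stdint.h>

enum { MTU_CODE_2048 = 4 };    /* IB MTU code for 2048 bytes */

/* Convert an IB MTU code (1..5) to bytes: 256, 512, 1024, 2048, 4096. */
unsigned mtu_code_to_bytes(uint8_t code)
{
    return 128u << code;
}

/* local_code: this side's active MTU code.
 * peer_code:  value of the 8-bit CM mtu field, 0 if the peer never set it. */
uint8_t negotiate_mtu(uint8_t local_code, uint8_t peer_code)
{
    if (peer_code == 0)
        return MTU_CODE_2048;  /* older peer: keep the previous 2K default */
    return local_code < peer_code ? local_code : peer_code;
}
```

Reusing a reserved byte means old peers need no change: they ignore the field when receiving and, as long as they send it as zero, trigger the 2K path on the new side.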
4 years ago mpxyd: set affinity default to 2 for best performance
Arlin Davis [Wed, 10 Feb 2016 22:44:46 +0000 (14:44 -0800)]
mpxyd: set affinity default to 2 for best performance

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mcm: cleanup unused variable in dapls_ib_mr_register
Arlin Davis [Tue, 9 Feb 2016 17:37:47 +0000 (09:37 -0800)]
mcm: cleanup unused variable in dapls_ib_mr_register

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: enhancement to test, -D option for data check
Amir Hanania [Tue, 26 Jan 2016 22:03:16 +0000 (14:03 -0800)]
dtest: enhancement to test, -D option for data check

With -D option, dtest will run pingpong rdma write test
with data validation. Changes pattern during iterations.
Aborts and reports location/pattern with any miscompare.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
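
A data check of the kind the -D option describes can be sketched as a fill step and a verify step keyed on the iteration number. The pattern used here (offset plus iteration, truncated to a byte) and the function names are assumptions for illustration; the log does not say which pattern dtest uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pattern: the expected byte depends on offset and iteration,
 * so stale data from a previous iteration does not pass the check. */
static uint8_t pattern_byte(size_t off, unsigned iter)
{
    return (uint8_t)(off + iter);
}

/* Fill the send buffer for one iteration. */
void fill_pattern(uint8_t *buf, size_t len, unsigned iter)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = pattern_byte(i, iter);
}

/* Verify a received buffer. Returns -1 if it matches, otherwise the
 * offset of the first miscompare so the caller can report it and abort. */
long check_pattern(const uint8_t *buf, size_t len, unsigned iter)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != pattern_byte(i, iter))
            return (long)i;
    return -1;
}
```

Changing the pattern every iteration matters for RDMA writes in particular, because a write that never landed would otherwise leave the previous, still valid looking contents in place.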
4 years ago mcm: add support for Intel Omni-Path driver (hfi) via mic MFO mode
Amir Hanania [Mon, 25 Jan 2016 20:30:38 +0000 (12:30 -0800)]
mcm: add support for Intel Omni-Path driver (hfi) via mic MFO mode

Set MIC based consumer to MFO (full offload) mode for both qib and new hfi devices.
Add to dat.conf entries for hfi verbs support. This can be run from mic or host
endpoints.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago mpxyd: fix ordering issues with the CCL Proxy receive side forwarding mechanism
Arlin Davis [Mon, 25 Jan 2016 19:51:33 +0000 (11:51 -0800)]
mpxyd: fix ordering issues with the CCL Proxy receive side forwarding mechanism

scif_writeto doesn't guarantee ordering on DMA posting like IB rdma writes.
Since CCL Proxy is emulating IB semantics we must preserve order of
the rdma write request from MIC consumers via any proxy scif operations.

Changes made to proxy-in to defer forwarding RR completed segments
unless they are middle segments of a larger write operation. On FS or LS
the previous scif_writeto DMA operations must be completed and signaled
before posting a first or last segment. Last segment scif_writeto
operation is ordered to ensure last byte is the last byte of
complete rdma write proxied operation.

During scif_wt errors send WC error status for each pending segment
with rdma write operation for accurate proxy-out error processing.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago dtest: report results only if one of the pingpong tests is run
Amir Hanania [Thu, 10 Dec 2015 23:17:03 +0000 (15:17 -0800)]
dtest: report results only if one of the pingpong tests is run

There are two different ping pong test cases.
It was possible to run dtest with none of them.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
4 years ago mpxyd: with abnormal CM termination a CM object can be referenced after QP destroy
Arlin Davis [Thu, 10 Dec 2015 22:48:05 +0000 (14:48 -0800)]
mpxyd: with abnormal CM termination a CM object can be referenced after QP destroy

The proxy-in CQ is not flushed and processed properly during
mix_qp_destroy. Depending on the EP mode there can be 2 separate
connections with multiple CQs to process. Add new mix_cq_flush
function that will flush all pending work on TX and RX side of
proxy engine. CM object is destroyed and reset only after all
pending work is processed on ALL endpoint CQ associations.
Add error logging when WR resources are exhausted.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
4 years ago mpxyd: proxy out WR resources exhausted with MFO mode endpoints
Arlin Davis [Thu, 10 Dec 2015 22:36:22 +0000 (14:36 -0800)]
mpxyd: proxy out WR resources exhausted with MFO mode endpoints

WC status of IBV_WC_RETRY_EXC_ERR reported back to MIC client

Operation processing thread doesn't yield properly
to enable tx thread to process completions and replenish
WR resources. Retries occur too quickly.

Add some new error logs for resource issues.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago release note update for CCL Proxy and Platform BIOS recommendations
Arlin Davis [Wed, 21 Oct 2015 16:49:45 +0000 (09:49 -0700)]
release note update for CCL Proxy and Platform BIOS recommendations

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago dtestx: add dat_ib_open_query only option with -q
Arlin Davis [Fri, 16 Oct 2015 20:08:11 +0000 (13:08 -0700)]
dtestx: add dat_ib_open_query only option with -q

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago scm: CONN_PENDING: SOCKOPT ERR Connection refused ->
Arlin Davis [Fri, 16 Oct 2015 17:21:19 +0000 (10:21 -0700)]
scm: CONN_PENDING: SOCKOPT ERR Connection refused ->

Error caused by cm_msg size compatibility issue with new v8
protocol and older socket cm providers (2.1.4 and older).
The ucm, cma, and mcm providers are not affected.

Modify socket data sizes for SCM request/reply to interoperate
between new v8 with smaller private data and older protocols.

Adjust SCM reply/rtu based on remote CM version and retry a failed
request with pre-v8 adjusted size in case of server side failure.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago Release 2.1.7 (dapl-2.1.7-1)
Arlin Davis [Wed, 30 Sep 2015 03:23:58 +0000 (20:23 -0700)]
Release 2.1.7

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago dtest: add -a -i options, all data sizes, incremental size
Arlin Davis [Tue, 29 Sep 2015 16:05:27 +0000 (09:05 -0700)]
dtest: add -a -i options, all data sizes, incremental size

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago dapl: Fix segfault while freeing qp
Bharat Potnuri [Tue, 29 Sep 2015 15:49:10 +0000 (08:49 -0700)]
dapl: Fix segfault while freeing qp

In function dapls_ib_qp_free(), the pointers qp and cm_ptr->cm_id->qp point to the same qp
structure, initialized in function dapls_ib_qp_alloc(). The memory they point to is freed
twice in dapls_ib_qp_free(): first by rdma_destroy_qp() when _OPENIB_CMA is defined and
then again by ibv_destroy_qp(), causing a segmentation fault while freeing the qp. Therefore
assign NULL to qp to avoid freeing invalid memory.

Fixes: 7ff4f840bf11 ("common: add CM-EP linking to support mutiple CM's and proper protection during
destruction")

Signed-off-by: Bharat Potnuri <bharat@chelsio.com>
Acked-by: Arlin Davis <arlin.r.davis@intel.com>
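
The aliasing problem can be reproduced and fixed in a few lines. Everything below is a stand-in: `struct qp` and `struct cm_id` replace the verbs and rdma_cm types, and `qp_destroy` replaces both rdma_destroy_qp() and ibv_destroy_qp(), so the sketch shows only the shape of the fix, not the actual dapl code.

```c
#include <stdlib.h>

struct qp { int num; };               /* stand-in for struct ibv_qp */
struct cm_id { struct qp *qp; };      /* stand-in for struct rdma_cm_id */

int destroy_calls;                    /* counts destroys, for demonstration */

static void qp_destroy(struct qp *qp)
{
    destroy_calls++;
    free(qp);
}

/* *ep_qp and id->qp may point at the same object. Destroy it through the
 * cm_id path when one exists, and clear the alias so the second path does
 * not free it again. */
void qp_free(struct qp **ep_qp, struct cm_id *id)
{
    if (id != NULL && id->qp != NULL) {
        if (*ep_qp == id->qp)
            *ep_qp = NULL;            /* the fix: drop the alias first */
        qp_destroy(id->qp);
        id->qp = NULL;
    }
    if (*ep_qp != NULL) {
        qp_destroy(*ep_qp);
        *ep_qp = NULL;
    }
}
```

Without the alias check, the second branch would run on an already freed pointer, which matches the crash described in the commit message.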
5 years ago mpxyd: add P2P inline support for data size <= 96 bytes
Amir Hanania [Wed, 23 Sep 2015 21:43:38 +0000 (14:43 -0700)]
mpxyd: add P2P inline support for data size <= 96 bytes

Improve small message latency for proxy to proxy service
by including data with the proxy work request. Necessary
changes made to preserve order across WR's regardless
of size. Additional logging included. Improves single byte
one-way latency by about 27% on MFO configurations.

Changes made to avoid forwarding 0-byte rdma write to
scif_writeto, remove CPU hand copies, and order.

Changes for numa_node == -1 such that mic0 assumes MSS
and mic1 assumes MXS modes.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago dtest: change rdma_write_ping_pong so client is always last receiver
Arlin Davis [Mon, 21 Sep 2015 22:48:15 +0000 (15:48 -0700)]
dtest: change rdma_write_ping_pong so client is always last receiver

The server always waits after the test loops for a DREQ event, so in order
to shut down gracefully the client should always receive the last handshake
message and issue the DREQ. Remove logging in loop.

Always init data and increase min rdma buffer size to 4KB.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago ucm: add DAPL_NETWORK_PROCESS_NUM option for total ranks
Arlin Davis [Mon, 21 Sep 2015 15:24:01 +0000 (08:24 -0700)]
ucm: add DAPL_NETWORK_PROCESS_NUM option for total ranks

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago ucm: fca create group incorrectly using IB addr instead of socket address.
Amir Hanania [Thu, 17 Sep 2015 00:31:13 +0000 (17:31 -0700)]
ucm: fca create group incorrectly using IB addr instead of socket address.

need the socket address for socket based create group info exchange.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago ucm: fca_comm_destroy called with NULL
Amir Hanania [Thu, 17 Sep 2015 00:27:27 +0000 (17:27 -0700)]
ucm: fca_comm_destroy called with NULL

In some cases dapli_free_collective_group is called before the comm was initialized.
The fca_comm_destroy call in this function then segfaults.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago dtest: add -W option for rdma write pingpong, similar to ib_write_lat
Arlin Davis [Tue, 15 Sep 2015 15:45:03 +0000 (08:45 -0700)]
dtest: add -W option for rdma write pingpong, similar to ib_write_lat

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago docs: update release notes for collective build
Arlin Davis [Mon, 31 Aug 2015 22:14:46 +0000 (15:14 -0700)]
docs: update release notes for collective build

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: reduce log level for rcv message flush
Amir Hanania [Mon, 24 Aug 2015 20:22:53 +0000 (13:22 -0700)]
mpxyd: reduce log level for rcv message flush

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago dapltest: dapltest with no argument not working in ppc64 arch
Carol L Soto [Mon, 24 Aug 2015 19:58:58 +0000 (12:58 -0700)]
dapltest: dapltest with no argument not working in ppc64 arch

If dapltest is run with no args then the client was getting
Warning: conn_event_wait DAT_CONNECTION_EVENT_NON_PEER_REJECTED
Reference to RH1056487- dapltest Read and Write performance
tests are not working

Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
5 years ago Release 2.1.6 (dapl-2.1.6-1)
Arlin Davis [Thu, 13 Aug 2015 16:55:47 +0000 (09:55 -0700)]
Release 2.1.6

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago ucm: add cluster size environments to adjust CM timers
Arlin Davis [Thu, 13 Aug 2015 00:30:23 +0000 (17:30 -0700)]
ucm: add cluster size environments to adjust CM timers

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: proxy_in data transfers can improperly start before RTU received
Arlin Davis [Wed, 12 Aug 2015 16:46:30 +0000 (09:46 -0700)]
mpxyd: proxy_in data transfers can improperly start before RTU received

Proxy-in data transfers must be deferred until RTU is received
and QP is in CONN state. Otherwise, the remote PI WC address/rkey
information is still uninitialized.

Check for initial CONN state before processing RR or WT data phase
and set RR to pause state until RTU and remote PI WRC information
is processed. Update pi_req_event error logging.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: forward open/query for MFO devices in query only mode
Arlin Davis [Wed, 12 Aug 2015 16:19:07 +0000 (09:19 -0700)]
mcm: forward open/query for MFO devices in query only mode

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: byte swap incorrect on WRC wr_len
Arlin Davis [Wed, 12 Aug 2015 15:51:03 +0000 (08:51 -0700)]
mpxyd: byte swap incorrect on WRC wr_len

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago dtest: remove ERR message from flush QP function
Amir Hanania [Tue, 11 Aug 2015 00:24:15 +0000 (17:24 -0700)]
dtest: remove ERR message from flush QP function

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago dapltest: Quit command with "-n port" number will core dump
David Dai [Fri, 7 Aug 2015 20:05:56 +0000 (13:05 -0700)]
dapltest: Quit command with "-n port" number will core dump

The -n option was specified with "n" in the option string; it should be "n:" since it takes an argument.

Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago config: update dat.conf for MFO qib devices, 2 adapters/ports
Amir Hanania [Wed, 5 Aug 2015 22:01:49 +0000 (15:01 -0700)]
config: update dat.conf for MFO qib devices, 2 adapters/ports

ofa-v2-qib0-1m and libdaplomcm.so

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago mpxyd: add MFO support on proxy side
Amir Hanania [Wed, 5 Aug 2015 21:55:30 +0000 (14:55 -0700)]
mpxyd: add MFO support on proxy side

Add checking for MFO and MXS and provide proxy-in and proxy-out
services for each mode. MXS_EP check is now MXF_EP (MFO or MXS).
Add new MIX device open, query, port query, pz operations.
Add new pz list and object management via scif_dev structure.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago mcm: add MFO proxy commands, device, and CM support
Amir Hanania [Wed, 5 Aug 2015 21:46:20 +0000 (14:46 -0700)]
mcm: add MFO proxy commands, device, and CM support

CM will support Proxy-in services on both MFO and MXS modes.
CM thread will not process ibv channels when in MFO mode.

Device open/close will export all verbs calls in MFO mode.

Add MIX (MIC to Proxy) functions for pz, device query, port query.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago mcm: add MFO support to openib_common code base
Amir Hanania [Wed, 5 Aug 2015 20:41:32 +0000 (13:41 -0700)]
mcm: add MFO support to openib_common code base

Provide full proxy support of CQ, QP, PZ, MR and device.
Use the new MXF_EP macro to switch proxy service based
on MXS (cross socket) or MFO (full offload) modes.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago mcm: add full offload (MFO) mode to provider to support qib on MIC
Amir Hanania [Wed, 5 Aug 2015 20:35:28 +0000 (13:35 -0700)]
mcm: add full offload (MFO) mode to provider to support qib on MIC

Add new MIX proxy definitions and commands for query device, query port,
pz create, and pz free.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago dtest: pre-allocated buffer too small for RMR, DTO ops timeout
Amir Hanania [Wed, 5 Aug 2015 20:16:12 +0000 (13:16 -0700)]
dtest: pre-allocated buffer too small for RMR, DTO ops timeout

The buf_len settings (-b) for small IO may cause segfault.
Increase allocation and adjust DTO operations to infinite.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
5 years ago mpxyd: fix buffer initialization when no-inline support is active
Amir Hanania [Fri, 31 Jul 2015 22:35:12 +0000 (15:35 -0700)]
mpxyd: fix buffer initialization when no-inline support is active

wr_buf buffer was zeroed instead of wr_buf_rx

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: reduce log level on qp_flush to CM level
Arlin Davis [Thu, 30 Jul 2015 15:16:17 +0000 (08:16 -0700)]
mpxyd: reduce log level on qp_flush to CM level

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: intra-node proxy missing LID setup on rejects
Arlin Davis [Thu, 30 Jul 2015 15:15:22 +0000 (08:15 -0700)]
mcm: intra-node proxy missing LID setup on rejects

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: add intra-node support via ibscif device and mcm provider
Arlin Davis [Fri, 24 Jul 2015 23:01:29 +0000 (16:01 -0700)]
mcm: add intra-node support via ibscif device and mcm provider

- New device entry ofa-v2-scif0-m
- Support for different CM and EP locality (MIC vs proxy LID)
- MSS mode for all scif device opens via proxy
- logging changes for multi-lid options

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: provide MIC address info with proxy device open
Arlin Davis [Fri, 24 Jul 2015 19:48:52 +0000 (12:48 -0700)]
mcm: provide MIC address info with proxy device open

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: add device info to non-debug log
Arlin Davis [Fri, 24 Jul 2015 19:45:11 +0000 (12:45 -0700)]
mcm: add device info to non-debug log

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago common: add DAPL_DTO_TYPE_EXTENSION_IMM for rdma_write_imm DTO type checking
Arlin Davis [Tue, 14 Jul 2015 22:41:35 +0000 (15:41 -0700)]
common: add DAPL_DTO_TYPE_EXTENSION_IMM for rdma_write_imm DTO type checking

Add a new extended DTO type to the request cookie to identify RDMA write
operations with immediate data during completions.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: fix up some of the PI logging
Arlin Davis [Tue, 14 Jul 2015 22:39:52 +0000 (15:39 -0700)]
mpxyd: fix up some of the PI logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago dtest: modify rdma_write_with_msg to support uni-direction streaming
Arlin Davis [Tue, 14 Jul 2015 22:30:16 +0000 (15:30 -0700)]
dtest: modify rdma_write_with_msg to support uni-direction streaming

Add a proper client->server handshake at the end of the RDMA data stream
to ensure all data is delivered before disconnecting.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm,mpxyd: fix dreq processing to defer QP flush when proxy WRs still pending
Arlin Davis [Tue, 14 Jul 2015 21:58:32 +0000 (14:58 -0700)]
mcm,mpxyd: fix dreq processing to defer QP flush when proxy WRs still pending

The proxy will now defer DREQ flushing of proxy QPs if the PI and PO
data engines have outstanding requests. Add the mcm_qp_busy routine
for checking the PI and PO data engines. When the MIC calls disconnect,
always send the DREQ up to the proxy in order to handle the deferred
flush of proxy-side posted rcv messages.

Change QP free to modify both the local and proxy QPs and to check for
outstanding rcv messages before qp_destroy, to avoid an infinite wait
in dapls_ep_flush_cqs.
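
The deferral described above can be pictured with the following minimal
sketch. It is illustrative only and is not the code from this commit: the
struct, its fields, and every sketch_* function are hypothetical stand-ins,
and only the idea (check whether the PI and PO engines are busy, and flush
once they drain) is taken from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in; NOT the real mcm/mpxyd QP structure. */
struct sketch_qp {
    int  pi_pending;     /* outstanding proxy-in (PI) requests */
    int  po_pending;     /* outstanding proxy-out (PO) requests */
    bool dreq_deferred;  /* a DREQ arrived while an engine was busy */
    bool flushed;        /* the QP flush has been performed */
};

/* Busy if either data engine still has outstanding work
 * (assumed to be the role mcm_qp_busy plays in the commit). */
static bool sketch_qp_busy(const struct sketch_qp *qp)
{
    return qp->pi_pending > 0 || qp->po_pending > 0;
}

/* On DREQ: flush immediately if idle, otherwise defer the flush. */
static void sketch_on_dreq(struct sketch_qp *qp)
{
    if (sketch_qp_busy(qp))
        qp->dreq_deferred = true;
    else
        qp->flushed = true;
}

/* On completion of one PI or PO request: retire it, then run the
 * deferred flush once both engines have drained. */
static void sketch_on_completion(struct sketch_qp *qp, bool is_pi)
{
    int *cnt = is_pi ? &qp->pi_pending : &qp->po_pending;

    if (*cnt > 0)
        (*cnt)--;
    if (qp->dreq_deferred && !sketch_qp_busy(qp)) {
        qp->dreq_deferred = false;
        qp->flushed = true;
    }
}

/* Small walk-through: one PI and one PO request are outstanding
 * when the DREQ arrives, so the flush waits for both to complete. */
void sketch_demo(void)
{
    struct sketch_qp qp = { .pi_pending = 1, .po_pending = 1 };

    sketch_on_dreq(&qp);
    assert(qp.dreq_deferred && !qp.flushed);
    sketch_on_completion(&qp, true);   /* PI drains, PO still busy */
    assert(!qp.flushed);
    sketch_on_completion(&qp, false);  /* PO drains, flush runs */
    assert(qp.flushed);
}
```

The sketch keeps a single deferred flag per QP, so the flush runs exactly
once regardless of which engine drains last.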

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: update byte_len and comp_cnt for PO to remote HST communications
Arlin Davis [Tue, 14 Jul 2015 21:47:24 +0000 (14:47 -0700)]
mpxyd: update byte_len and comp_cnt for PO to remote HST communications

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: bug fixes for non-inline devices
Amir Hanania [Wed, 17 Jun 2015 17:12:24 +0000 (10:12 -0700)]
mcm: bug fixes for non-inline devices

The mcm proxy mi_send_pi set up the registered WR structure properly for
no inline data support, but incorrectly overwrote sg.addr with the
WR structure on the stack.

qp create didn't check for no-inline support and set up the create accordingly.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: return CM_rej with CM_req_in errors
Arlin Davis [Fri, 12 Jun 2015 20:56:38 +0000 (13:56 -0700)]
mcm: return CM_rej with CM_req_in errors

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd,mcm: RDMA write with immed data not signaled on request side
Arlin Davis [Fri, 5 Jun 2015 19:14:37 +0000 (12:14 -0700)]
mpxyd,mcm: RDMA write with immed data not signaled on request side

With eager completions set, wc_flags is not set properly on the event.
With eager completions not set, the proxy CQ reference is incorrect
and the event is forwarded to the MCM receive EVD instead of the transmit EVD.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mcm: add WC opcode and wc_flags in debug log message
Arlin Davis [Thu, 4 Jun 2015 23:53:59 +0000 (16:53 -0700)]
mcm: add WC opcode and wc_flags in debug log message

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago mpxyd: set options bug fix for mcm_ib_inline
Arlin Davis [Thu, 4 Jun 2015 23:52:11 +0000 (16:52 -0700)]
mpxyd: set options bug fix for mcm_ib_inline

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago Update release notes with latest CM times
Arlin Davis [Thu, 28 May 2015 15:22:24 +0000 (08:22 -0700)]
Update release notes with latest CM times

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
5 years ago Release 2.1.5 dapl-2.1.5-1
Arlin Davis [Tue, 26 May 2015 17:28:11 +0000 (10:28 -0700)]
Release 2.1.5

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>