From 061079ac0b9be7a578dcd09f7865c2c0d6ac894a Mon Sep 17 00:00:00 2001 From: zhuyj Date: Wed, 20 Aug 2014 17:31:43 +0800 Subject: sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe Since the transport has always been in state SCTP_UNCONFIRMED, it therefore wasn't active before and hasn't been used before, and it always has been, so it is unnecessary to bug the user with a notification. Reported-by: Deepak Khandelwal Suggested-by: Vlad Yasevich Suggested-by: Michael Tuexen Suggested-by: Daniel Borkmann Signed-off-by: Zhu Yanjun Acked-by: Vlad Yasevich Acked-by: Daniel Borkmann Signed-off-by: David S. Miller --- net/sctp/associola.c | 1 + 1 file changed, 1 insertion(+) (limited to 'net/sctp') diff --git a/net/sctp/associola.c b/net/sctp/associola.c index 06a9ee6b2d3a..aaafb3250c6a 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -813,6 +813,7 @@ void sctp_assoc_control_transport(struct sctp_association *asoc, else { dst_release(transport->dst); transport->dst = NULL; + ulp_notify = false; } spc_state = SCTP_ADDR_UNREACHABLE; -- cgit v1.2.3 From ea4f19c1f81d4bf709c74e3789ec785828bc6e51 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 22 Aug 2014 13:03:29 +0200 Subject: net: sctp: spare unnecessary comparison in sctp_trans_elect_best When both transports are the same, we don't have to go down that road only to realize that we will return the very same transport. We are guaranteed that curr is always non-NULL. Therefore, just short-circuit this special case. Signed-off-by: Daniel Borkmann Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller --- net/sctp/associola.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'net/sctp') diff --git a/net/sctp/associola.c b/net/sctp/associola.c index aaafb3250c6a..104fae489ad4 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -1245,7 +1245,7 @@ static struct sctp_transport *sctp_trans_elect_best(struct sctp_transport *curr, { u8 score_curr, score_best; - if (best == NULL) + if (best == NULL || curr == best) return curr; score_curr = sctp_trans_score(curr); -- cgit v1.2.3 From aa4a83ee8bbc08342c4acfd59ef234cac51a1eef Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 22 Aug 2014 13:03:30 +0200 Subject: net: sctp: fix suboptimal edge-case on non-active active/retrans path selection In SCTP, selection of active (T.ACT) and retransmission (T.RET) transports is being done whenever transport control operations (UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport(). Commits 4c47af4d5eb2 ("net: sctp: rework multihoming retransmission path selection to rfc4960") and a7288c4dd509 ("net: sctp: improve sctp_select_active_and_retran_path selection") have both improved it towards a more fine-grained and optimal path selection. Currently, the selection algorithm for T.ACT and T.RET is as follows: 1) Elect the two most recently used ACTIVE transports T1, T2 for T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used 2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set T.ACT<-T.PRI and T.RET<-T1 3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1 4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI Prior to above commits, 4) was simply a camp on T.ACT<-T.PRI and T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is still slightly suboptimal as it can lead to the following scenario: Setup: T1: p1p1 (10.0.10.10) <==> .'`) <==> p1p1 (10.0.10.12) <= T.PRI T2: p1p2 (10.0.10.20) <==> (_ . ) <==> p1p2 (10.0.10.22) net.sctp.rto_min = 1000 net.sctp.path_max_retrans = 2 net.sctp.pf_retrans = 0 net.sctp.hb_interval = 1000 T.PRI is permanently down, T2 is put briefly into PF state (e.g. due to link flapping). Here, the first time transmission is sent over PF path T2 as it's the only non-INACTIVE path, but the retransmitted data-chunks are sent over the INACTIVE path T1 (T.PRI), which is not good. After the patch, it's choosing better transports in both cases by modifying step 4): 4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is the most recently used (if avail) in PF, set T.RET<-T.ACT_new This will still select a best possible path in PF if available (which can also include T.PRI/T.RET), and set both T.ACT/T.RET to it. In case sctp_assoc_control_transport() *just* put T.ACT_old into INACTIVE as it transitioned from ACTIVE->PF->INACTIVE and stays in INACTIVE just for a very short while before going back ACTIVE, it will guarantee that this path will be reselected for T.ACT/T.RET since T3 (PF) is not available. Previously, this was not possible, as we would only select between T.PRI and T.RET, and a possible T3 would be NULL due to the fact that we have just transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE and would select a suboptimal path when T.PRI/T.RET have worse properties. In the case that T.ACT_old permanently went to INACTIVE during this transition and there's no PF path available, plus T.PRI and T.RET are INACTIVE as well, we would now camp on T.ACT_old, but if everything is being INACTIVE there's really not much we can do except hoping for a successful HB to bring one of the transports back up again and, thus cause a new selection through sctp_assoc_control_transport(). Now both tests work fine: Case 1: 1. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET 2. T1 S(ACTIVE) T.ACT, T.RET T2 S(PF) 3. T1 S(ACTIVE) T.ACT, T.RET T2 S(INACTIVE) 5. T1 S(PF) T.ACT, T.RET T2 S(INACTIVE) [ 5.1 T1 S(INACTIVE) T.ACT, T.RET T2 S(INACTIVE) ] 6. T1 S(ACTIVE) T.ACT, T.RET T2 S(INACTIVE) 7. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET Case 2: 1. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET 2. T1 S(PF) T2 S(ACTIVE) T.ACT, T.RET 3. T1 S(INACTIVE) T2 S(ACTIVE) T.ACT, T.RET 5. T1 S(INACTIVE) T2 S(PF) T.ACT, T.RET [ 5.1 T1 S(INACTIVE) T2 S(INACTIVE) T.ACT, T.RET ] 6. T1 S(INACTIVE) T2 S(ACTIVE) T.ACT, T.RET 7. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET Signed-off-by: Daniel Borkmann Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller --- net/sctp/associola.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) (limited to 'net/sctp') diff --git a/net/sctp/associola.c b/net/sctp/associola.c index 104fae489ad4..a88b8524846e 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -1356,14 +1356,11 @@ static void sctp_select_active_and_retran_path(struct sctp_association *asoc) trans_sec = trans_pri; /* If we failed to find a usable transport, just camp on the - * primary or retran, even if they are inactive, if possible - * pick a PF iff it's the better choice. + * active or pick a PF iff it's the better choice. */ if (trans_pri == NULL) { - trans_pri = sctp_trans_elect_best(asoc->peer.primary_path, - asoc->peer.retran_path); - trans_pri = sctp_trans_elect_best(trans_pri, trans_pf); - trans_sec = asoc->peer.primary_path; + trans_pri = sctp_trans_elect_best(asoc->peer.active_path, trans_pf); + trans_sec = trans_pri; } /* Set the active and retran transports. */ -- cgit v1.2.3 From 38ab1fa981d543e1b00f4ffbce4ddb480cd2effe Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Thu, 28 Aug 2014 15:28:26 +0200 Subject: net: sctp: fix ABI mismatch through sctp_assoc_to_state helper Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp tree, the official header carries a copy of enum sctp_sstat_state that looks like (compared to the current in-kernel enumeration): User definition: Kernel definition: enum sctp_sstat_state { typedef enum { SCTP_EMPTY = 0, SCTP_CLOSED = 1, SCTP_STATE_CLOSED = 0, SCTP_COOKIE_WAIT = 2, SCTP_STATE_COOKIE_WAIT = 1, SCTP_COOKIE_ECHOED = 3, SCTP_STATE_COOKIE_ECHOED = 2, SCTP_ESTABLISHED = 4, SCTP_STATE_ESTABLISHED = 3, SCTP_SHUTDOWN_PENDING = 5, SCTP_STATE_SHUTDOWN_PENDING = 4, SCTP_SHUTDOWN_SENT = 6, SCTP_STATE_SHUTDOWN_SENT = 5, SCTP_SHUTDOWN_RECEIVED = 7, SCTP_STATE_SHUTDOWN_RECEIVED = 6, SCTP_SHUTDOWN_ACK_SENT = 8, SCTP_STATE_SHUTDOWN_ACK_SENT = 7, }; } sctp_state_t; This header was later on also placed into the uapi, so that user space programs can compile without having , but the shipped with instead. While RFC6458 under 8.2.1.Association Status (SCTP_STATUS) says that sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we nevertheless have a what it appears to be dummy SCTP_EMPTY state from the very early days. While it seems to do just nothing, commit 0b8f9e25b0aa ("sctp: remove completely unsed EMPTY state") did the right thing and removed this dead code. That however, causes an off-by-one when the user asks the SCTP stack via SCTP_STATUS API and checks for the current socket state thus yielding possibly undefined behaviour in applications as they expect the kernel to tell the right thing. The enumeration had to be changed however as based on the current socket state, we access a function pointer lookup-table through this. Therefore, I think the best way to deal with this is just to add a helper function sctp_assoc_to_state() to encapsulate the off-by-one quirk. Reported-by: Tristan Su Fixes: 0b8f9e25b0aa ("sctp: remove completely unsed EMPTY state") Signed-off-by: Daniel Borkmann Acked-by: Vlad Yasevich Signed-off-by: David S. Miller --- include/net/sctp/sctp.h | 13 +++++++++++++ net/sctp/socket.c | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) (limited to 'net/sctp') diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h index f6e7397e799d..9fbd856e6713 100644 --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -320,6 +320,19 @@ static inline sctp_assoc_t sctp_assoc2id(const struct sctp_association *asoc) return asoc ? asoc->assoc_id : 0; } +static inline enum sctp_sstat_state +sctp_assoc_to_state(const struct sctp_association *asoc) +{ + /* SCTP's uapi always had SCTP_EMPTY(=0) as a dummy state, but we + * got rid of it in kernel space. Therefore SCTP_CLOSED et al + * start at =1 in user space, but actually as =0 in kernel space. + * Now that we can not break user space and SCTP_EMPTY is exposed + * there, we need to fix it up with an ugly offset not to break + * applications. :( + */ + return asoc->state + 1; +} + /* Look up the association by its id. */ struct sctp_association *sctp_id2assoc(struct sock *sk, sctp_assoc_t id); diff --git a/net/sctp/socket.c b/net/sctp/socket.c index eb71d49e7653..634a2abb5f3a 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -4243,7 +4243,7 @@ static int sctp_getsockopt_sctp_status(struct sock *sk, int len, transport = asoc->peer.primary_path; status.sstat_assoc_id = sctp_assoc2id(asoc); - status.sstat_state = asoc->state; + status.sstat_state = sctp_assoc_to_state(asoc); status.sstat_rwnd = asoc->peer.rwnd; status.sstat_unackdata = asoc->unack_data; -- cgit v1.2.3