Commit Graph

178 Commits

Author SHA1 Message Date
Jason A. Donenfeld 1a515a5df1 version: bump
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-03 09:57:39 +02:00
Jason A. Donenfeld c61c06a812 wg_noise: cleanup counter algorithm
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-03 09:06:19 +02:00
Jason A. Donenfeld de25b2aa73 wg_cookie: zero before init in selftest for witness
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-02 20:52:11 +02:00
Jason A. Donenfeld a41a7eb994 if_wg: don't double increment error counter
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-02 20:24:24 +02:00
Jason A. Donenfeld 24c418a7ad if_wg: ensure packet is not shared before writing
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-02 20:17:18 +02:00
Jason A. Donenfeld 9095d03409 if_wg: don't memcpy data for no reason
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-02 19:51:21 +02:00
Jason A. Donenfeld 350e95248f if_wg: pad packets properly
This takes into account mismatched MTUs. Borrows the
"calculate_skb_padding" function from Linux.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-02 13:39:55 +02:00
Jason A. Donenfeld cad7ead734 if_wg: return to m temporary variable style
The rest of the code uses this, so go with it for now. Maybe later ncon
will want to clean up everything to be this way, but for now keep it
consistent.

This partially reverts commit a1fdf6646b,
but doesn't reintroduce the bug that it had fixed.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-30 10:28:39 +02:00
Jason A. Donenfeld 9552bec02b if_wg: defragment mbufs early on
This makes the crypto a *lot* faster. We might later revert this if we
use opencrypto's fancy page table mapping scheme. But for now, it's
useful. We do it early, rather than before calling decrypt/encrypt, so
that the various other m_pullups that we have wind up being no-ops.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-30 10:05:24 +02:00
Jason A. Donenfeld ab1c95731e version: bump
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 21:49:46 -04:00
Jason A. Donenfeld 2d4075f006 if_wg: allocate entire mbuf all at once
Rather than making a tiny mbuf and then allocating another one in
m_copyback, just make one larger one. These packets are generally pretty
small anyway.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 21:40:25 -04:00
Jason A. Donenfeld a1fdf6646b if_wg: do not double-free after m_pullup
Either m_pullup reallocates, in which case wg_packet_free winds up
freeing something that's already been freed, or it fails and frees m,
and then wg_packet_free tries to free it again. In both cases, massive
memory corruption ensues.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 21:28:28 -04:00
Jason A. Donenfeld 753c36ef15 if_wg: enter net epoch for isr dispatch
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 20:51:45 -04:00
Jason A. Donenfeld a3e01095e2 if_wg: write data header directly
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 19:39:33 -04:00
Jason A. Donenfeld 57fcc8e52c if_wg: do not block for memory when sending buffer
This can be called when locks are held in upper parts of the stack,
resulting in witness splats.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 19:27:35 -04:00
Jason A. Donenfeld c3b368a6e7 if_wg: use proper bool for is_retry
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 19:18:37 -04:00
Jason A. Donenfeld 708220dfc9 if_wg: simplify state setting flow
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 18:38:53 -04:00
Jason A. Donenfeld f0d2d14e83 netns: enable debug logging
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 13:44:10 -04:00
Jason A. Donenfeld 7c89703091 if_wg: pull up packet before checking aip on input
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 13:34:03 -04:00
Jason A. Donenfeld b785f617dd if_wg: unify xmit error path
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-28 13:18:40 -04:00
Matt Dunwoodie 5810c2f54f wg_noise: fix remote refcount leak
In the occasion that noise_begin_session returns != 0, we could
accidentally leak the remote refcount, as the caller to
consume_response only expects *rp to be set when ret == 0.

The only situation we could leak this is if we cannot allocate memory
for the new keypair.

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-28 14:00:11 +10:00
Jason A. Donenfeld 307e552e62 if_wg: do not assume that IP header is pulled up
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-27 23:08:42 -04:00
Jason A. Donenfeld e2ea594774 if_wg: handle if_transmit and if_output properly
The netmap code goes directly to if_transmit, which means it'll bypass
if_output, in which case, there's no packet allocated. Also, we're
relying on if_output's sockaddr structure to be legit, but who knows
what types of userspace hijynxes can forge this. Rather than relying on
that kind of black magic, determine the AF from the actual packet
contents. But still insist that it agrees with the sockaddr.

The extraction of the type from AF_UNSPEC follows the same pattern as
if_gif and if_gre.

We also use this as an opportunity to send icmp error messages in the
right place.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-27 22:49:04 -04:00
Jason A. Donenfeld cb7cd32a7c if_wg: do not increment error counter when sc is null
If sc is null, jumping to increment the counter means crash.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-25 19:31:23 -04:00
Jason A. Donenfeld dd04bc5aa4 wg_noise: compile on 32-bit
The lack of 64bit atomic helpers on 32bit is an annoyance.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-24 16:38:24 -04:00
Jason A. Donenfeld 2e5e8bd58f version: bump
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-24 13:09:11 -04:00
Jason A. Donenfeld f5fa70e36b crypto: optimize out `if (encrypt)`
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-24 13:00:32 -04:00
Jason A. Donenfeld 2298409740 if_wg: count on peers always having a remote
We do a pretty nasty hack in the allowedips selftest to avoid having to
allocate more memory. Seems to work.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-24 12:58:12 -04:00
Matt Dunwoodie 49c70643ce if_wg: ensure peer lifetime
The peer (and keypair and local) lifecycle are managed through EPOCH and
refcounts. Primarily this is used in wg_noise to keep track of active
keypairs, however we can also use it to be sure no more peer references
exist. The structures are linked as such, so noise_remote cannot be
freed until all noise_keypairs are freed, and noise_local cannot be
freed until all noise_remotes are freed.

noise_keypair -> noise_remote -> noise_local

Therefore, if you hold a keypair reference you can be sure that remote
and local will still be around.

There are three main ways peers are referenced:

1) Incoming packets
1.a) Incoming handshake packets are passed to noise_consume_*,
     which will (on success) return a refcounted remote which is dropped
     at the end of wg_handshake.
1.b) Incoming cookie packets will have their index looked which will (on
     success) return a refcounted remote, which is also dropped at the
     end of wg_handshake.
1.c) Incoming data packets will have their index looked up which will
     (on success) return a refcounted keypair. This keypair will be
     dropped after the packet has been passed up the network stack, or
     otherwise freed.

2) Outgoing data packets
2.a) Outgoing data packets are first looked up by wg_aip_lookup, which
     returns a peer pointer, with an incremented remote refcount. This
     is then dropped in wg_transmit after adding the packet to the
     staged queue and sending the staged queue.
2.b) Packets in the staged queue do not hold any refcount for the remote
     or keypair, because they do not reference the peer in any way, they
     are just in the queue.
2.c) Packets finally get a refcoutned keypair in wg_peer_send_staged,
     which is dropped after the packet is sent out the UDP socket, or
     otherwise freed.

3) wg_timers system
3.a) The wg_timers system holds a reference to the peer whenever a
     callout is scheduled. Instead of holding a refcount, we instead
     disable the peer's timers, such that no callouts can be scheduled.

Some rationale for changes here:

We move the p_{send,recv} taskqgroup_detach into peer_free_deferred as
they will NULL fields in p_{send,recv}. If there are packets being
processed in wg_{en,de}crypt, then a call tou GROUPTASK_ENQUEUE will
dereference a NULL pointer. In general, we remove all references to the
peer in wg_peer_destroy, and free/deinit all the peer members once no
more references to the remote exist, in wg_peer_free_deferred.

Currently we take a refcount in wg_aip_lookup, which is to be sure that
the peer reference is valid for the entirety of wg_transmit. We do not
care about the refcount in wg_decrypt. It might be worth considering
storing the remote pointer in the allowedip entry, but it could be
argued both ways. For the time being, this is still correct. We don't
have a refcount for the peer stored in the allowedip table, as it is
protected by the table lock. One note here is the NULL p_remote check is
necessary to support selftest/allowedips.c, which does not specify a
p_remote. If we update the tests, then we may remove this check.

There are two added p_enabled checks, in run_retry_handshake and
run_send_keepalive. This is to align them with the other callout_reset
calls. In the case of p_zero_key_material, if we have set p_enabled =
false, then we subsequently clear keypairs and handshakes (on wg_down),
or we free the peer which will clear the keypairs for us.

We want to hold a refcount of remote in wg_{en,de}crypt to ensure that
the peer is still valid in the call to GROUPTASK_ENQUEUE. If we don't
then peer may become invalid after setting p_state. Another thread may
take the packet, put the keypair refcount and free the peer prior to the
call to GROUPTASK_ENQUEUE.

We no longer need to hold (haven't for a while) the EPOCH in
wg_send_initiation and wg_send_response, as we hold valid references for
the duration. This could be either a refcount of a remote or through the
wg_timers system as described above.

We also fix some refcount leaks in wgc_set.

Notes:

We may want to pull NET_EPOCH_WAIT out of wg_timers_disable, to improve
performance. However, we can destroy 20000 peers in less than 20ms so
the performance is not critical for this snapshot and can be addressed
later.

Finally, there is the special case of noise_remote_arg, which stores the
corresponding peer pointer. The peer is not refcounted however it will
have the same scope as the remote. In otherwords it is valid until we
call noise_remote_put on the remote.

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-25 01:35:26 +10:00
Matt Dunwoodie 9e98ee86f8 selftests: capitalise fail messages for readability
Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 15:35:19 +10:00
Jason A. Donenfeld 5a6c97af1e if_wg: zero out remaining mallocs
We might add locks and things later. Mainly it doesn't cost much and
makes things easier/safer to reason about.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 23:07:51 -06:00
Jason A. Donenfeld 0c91bf5f0b wg_noise: zero out new structures
Good for hygiene, but also, lock hardening traps on double initialization
if we don't do this.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 22:52:56 -06:00
Jason A. Donenfeld a520a799ba compat: backport m_snd_tag_rele to 12
This doesn't add any reference counting, opting instead to go right to
the free. This could cause problems, but hopefully not.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 22:41:07 -06:00
Matt Dunwoodie 78dc3f1bd0 if_wg: remove M_WAITOK, check return codes on init
Here we remove all M_WAITOK checks, because we don't want to hang while
trying to allocate memory. It is better to return an error so the user
can try again later.

We also make sure to check all the return codes in peer and interface
allocation. The structure of those functions is:

1) Allocate all memory
2) Initialise fields in order of the struct
3) Cleanup gotos

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 14:40:32 +10:00
Matt Dunwoodie 8f31763ff3 if_wg: check wg_module_init succeeded
Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 14:33:30 +10:00
Jason A. Donenfeld 96f33dfffd if_wg: set snd_tag to NULL after releasing
The rest of the stack does this.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 22:32:14 -06:00
Jason A. Donenfeld 04d0ba2839 if_wg: destroy interfaces on module unload
This is already done anyway by if_clone_detach, so let that happen.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 22:25:05 -06:00
Jason A. Donenfeld 62f21e8273 wg_cookie: import optional inet6 headers
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 22:00:14 -06:00
Jason A. Donenfeld 0c227d384b wg_cookie: hash vnet into ratelimiter entry
IPs mean different things per-vnet.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 21:56:52 -06:00
Jason A. Donenfeld da78e26891 if_wg: properly use rn_inithead and rn_detachhead
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 21:22:39 -06:00
Matt Dunwoodie 7a7eb567d7 wg_cookie: allocate ratelimit table statically
We can simplify the ratelimit init/deinit calls by allocating the table
statically, that is by not using hashinit_flags. That function ended up
doing some unnecessary calculation and meant that the mask couldn't be
constant.

By increasing the size of struct ratelimit, this also caught a nasty
(but benign) bug, where ratelimit_pool was initialised to allocate
sizeof(struct ratelimit) and not sizeof(struct ratelimit_entry). It has
been this way since FreeBSD tree and I didn't pick up on it while moving
the uma_zcreate call to wg_cookie.

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 13:15:49 +10:00
Matt Dunwoodie 3959d12083 wg_cookie: cleanup internal code
The two main changes here are:

* Remove cookie_ prefix from static functions. This is a leftover from
  OpenBSD where they don't want static functions.
* Rename cm to macs, and cp to cm. Not sure where this came from but it
  didn't really make much sense to leave it as is.

The reset are whitespace changes. Overall there is no modification to
functionality here, just appearances.

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 12:38:26 +10:00
Matt Dunwoodie 69d65f583c wg_cookie: add cookie_valid bool
Primarily this commit adds a cookie_valid state, to prevent a recently
booted machine from sending a mac2. We also do a little bit of reworking
on locking and a fixup for int to bool.

There is one slight difference to cookie_valid (latest_cookie.is_valid)
on Linux and that is to set cookie_valid to false when the
cookie_birthdate has expired. The purpose of this is to prevent the
expensive timer check after it has expired.

For the locking, we want to hold a write lock in cookie_maker_mac
because we write to mac1_last, mac1_valid and cookie_valid. This
wouldn't cause too much contention as this is a per peer lock and we
only do so when sending handshake packets. This is different from Linux
as Linux writes all it's variables at the start, then downgrades to a
read lock.

We also match cookie_maker_consume_payload locking to Linux, that is to
read lock while checking mac1_valid and decrypting the cookie then take
a write lock to set the cookie.

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 12:17:04 +10:00
Matt Dunwoodie 7ea3c638c7 wg_cookie: make ratelimiter global
Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-23 12:17:04 +10:00
Jason A. Donenfeld 47a6d9c35e TODO: more nits
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 00:39:13 -06:00
Jason A. Donenfeld 2621207a3c if_wg: add more usual string concat spacing
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 00:38:40 -06:00
Jason A. Donenfeld ed944d02de if_wg: correct logic in tag clearing
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 00:38:40 -06:00
Jason A. Donenfeld 2e200c8cf3 global: add missing brackets
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-22 00:38:40 -06:00
Matt Dunwoodie 1ec4ad8b7a if_wg: more thorough wg_mbuf_reset
Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-22 00:02:45 -06:00
Matt Dunwoodie 4947482e22 if_wg: better loop detection
While it was nice to have per peer loop detection, it was not meant to
be. The loop tag has a tag type == 0, which conflicts with other tags.
Therefore we want to at least be a little bit more sure that the tag
cookie is unique to the loop tag. I guess the peer address was also
quite hacky so on the other side, I'm glad to be rid of that.

Now we have a loop of 8 (to any peer) which should be good enough for an
edge case operation.

Signed-off-by: Matt Dunwoodie <ncon@noconroy.net>
2021-04-22 00:02:45 -06:00