A missing piece for AF_VSOCK in the Linux kernel has been network namespace support. We discussed it as a future challenge during the KVM Forum 2019 talk, and it has come up in several conference discussions since then.
I started working on namespace support back in 2019,
but never had the chance to complete it. Last year, Bobby Eshleman (Meta)
restarted the effort and drove it through 16 revisions of the patch series.
Daniel Berrangé, Michael S. Tsirkin, Paolo Abeni, and I contributed
reviews and suggestions that shaped the current user API.
The result has been merged into net-next and will be available in
Linux 7.0.
Updated on 2026-04-17 to cover the write-once behavior of child_ns_mode,
merged after the initial publication of this post.
Background
Network namespaces are a fundamental building block for containers in Linux. They provide isolation of the network stack, so each namespace has its own interfaces, routing tables, and sockets.
Before Linux 7.0, AF_VSOCK was not namespace-aware. All vsock sockets lived in the same global space, regardless of the network namespace they were created in. This caused two problems:
- No isolation: a VM started inside a network namespace (or container) was reachable via vsock from any other namespace on the host, breaking the isolation that containers expect.
- No CID reuse: since CIDs were global, two VMs in different namespaces could not use the same CID, even if they were completely isolated from each other at the network level.
Design
The new implementation introduces two modes, configured per network namespace:
- global: CIDs are shared across namespaces. This is the original behavior and the default, so existing setups continue to work without any change.
- local: namespaces are completely isolated. Sockets in a local-mode namespace can only communicate with other sockets in the same namespace.
Two sysctl knobs are available since Linux 7.0:
- /proc/sys/net/vsock/child_ns_mode: the parent namespace uses this to set the mode that new child namespaces will inherit. Accepts global or local. This sysctl is write-once: after the first write, it becomes immutable.
- /proc/sys/net/vsock/ns_mode: read-only, shows the mode of the current namespace. The mode is immutable after namespace creation.
This design ensures backward compatibility: the default is global, matching
the previous behavior. Namespace isolation is opt-in.
Each namespace gets its mode from the parent’s child_ns_mode at
creation time. Once set, the namespace’s ns_mode is immutable: every
socket and VM in that namespace follows it.
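To check which mode a namespace ended up in, both knobs can be read from inside it. The helper below is only a minimal sketch: the function name and the VSOCK_SYSCTL_DIR override are assumptions made for illustration (the override exists solely so the function can be tried against a scratch directory), while the default path is the one described above.

```shell
# Print the vsock mode knobs as seen from the current network namespace.
# VSOCK_SYSCTL_DIR is a hypothetical override, not a kernel or tool feature.
vsock_show_modes() {
    dir="${VSOCK_SYSCTL_DIR:-/proc/sys/net/vsock}"
    for knob in ns_mode child_ns_mode; do
        if [ -r "$dir/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$dir/$knob")"
        else
            # Assumption: an unreadable knob means the running kernel has no
            # vsock namespace support, or the vsock module is not loaded.
            printf '%s=unavailable\n' "$knob"
        fi
    done
}
```

Called in the init_netns on a kernel with namespace support, this should report ns_mode=global, since the init_netns is always in global mode.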
The child_ns_mode sysctl is also write-once: the first write locks
the value, and any subsequent write of a different value fails with -EBUSY.
A namespace manager can therefore set the value once and be certain it
won’t change before it creates its namespaces. This prevents races where
two administrator processes set conflicting modes and one of them ends up
with a namespace in the wrong mode.
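A namespace manager that relies on the write-once behavior will probably want to confirm that the value it asked for is the one in effect. The following is a sketch of that check, not the approach of any existing tool: the function name and the VSOCK_SYSCTL_DIR override are hypothetical, and the helper reads the value back instead of trusting the exit status of the write.

```shell
# Request a child_ns_mode and verify that it is the one in effect.
# Usage: vsock_set_child_mode global|local
# VSOCK_SYSCTL_DIR is a hypothetical override used only for testing.
vsock_set_child_mode() {
    want="$1"
    knob="${VSOCK_SYSCTL_DIR:-/proc/sys/net/vsock}/child_ns_mode"
    case "$want" in
        global|local) ;;
        *) echo "invalid mode: $want" >&2; return 2 ;;
    esac
    # Attempt the write. If a different value is already locked in, the
    # kernel is expected to reject it (EBUSY, as described above).
    { echo "$want" > "$knob"; } 2>/dev/null || true
    # Read back: this is what child namespaces will actually inherit.
    have="$(cat "$knob" 2>/dev/null)"
    if [ "$have" != "$want" ]; then
        echo "child_ns_mode is '$have', wanted '$want'" >&2
        return 1
    fi
}
```

Keep in mind that a successful call locks the value for the namespace it runs in, so it is best used inside a dedicated parent namespace rather than in the init_netns.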
Supported vsock transports
This series adds namespace support to two transports:
- vhost-vsock: host-to-guest (H2G) transport, emulates the virtio-vsock device for KVM guests
- vsock-loopback: local transport, useful for testing and debugging without running VMs
The missing transports are the guest-to-host (G2H) ones (virtio, hyperv, vmci).
These run in the guest as device drivers, and we currently don’t have a way to
assign a vsock device to a specific namespace, since vsock devices are not
standard network devices. For now, G2H transports operate in global mode, so
they are reachable from any global namespace, but not from local namespaces.
This means that sockets in a local namespace cannot communicate with the host
through these transports. We plan to work on that in the future.
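As a quick way to see which of these transports are present on a given machine, the sketch below walks the list from this section and looks for each module under /sys/module. The module names are the upstream Linux ones as far as I know, but treat them as an assumption about your kernel build; the function name and the SYS_MODULE_DIR override are hypothetical. A transport built into the kernel may not show up under /sys/module, so "absent" is only a hint.

```shell
# List the vsock transports named in this post and whether each looks loaded.
# SYS_MODULE_DIR is a hypothetical override used only for testing.
vsock_transport_report() {
    sysmod="${SYS_MODULE_DIR:-/sys/module}"
    for entry in \
        vhost_vsock:H2G:namespace-aware \
        vsock_loopback:local:namespace-aware \
        vmw_vsock_virtio_transport:G2H:global-only \
        hv_sock:G2H:global-only \
        vmw_vsock_vmci_transport:G2H:global-only
    do
        mod="${entry%%:*}"     # text before the first colon
        rest="${entry#*:}"     # text after the first colon
        kind="${rest%%:*}"     # transport direction, as classified in this post
        ns="${rest#*:}"        # namespace support, as described in this post
        if [ -d "$sysmod/$mod" ]; then state=present; else state=absent; fi
        printf '%s %s %s %s\n' "$mod" "$kind" "$ns" "$state"
    done
}
```

On a host, I would expect vhost_vsock and vsock_loopback to be the interesting lines; in a guest, one of the G2H entries.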
Examples
Loopback
In the following examples, the commands without a namespace prefix run in the
initial network namespace (init_netns), which is the default namespace where
all processes start. The init_netns is always in global mode.
These examples use the vsock loopback device for local communication, without any VM involved.
Make sure the vsock_loopback kernel module is loaded:
$ sudo modprobe vsock_loopback
Namespace isolation with unshare
Global mode (default)
By default, child_ns_mode is set to global. This is the same behavior
as before Linux 7.0: vsock sockets are shared across namespaces.
A listener started in a new namespace is reachable from the init_netns
using the loopback CID (VMADDR_CID_LOCAL = 1):
$ unshare --user --net nc --vsock -l 1234 &
$ nc --vsock 1 1234
# reachable - global mode, no isolation
Local mode
Setting child_ns_mode to local enables isolation. Since this sysctl
is write-once, we use a nested unshare to avoid locking the
init_netns mode:
$ unshare --user --map-root-user --net bash -c \
'echo local > /proc/sys/net/vsock/child_ns_mode && \
unshare --net nc --vsock -l 1234' &
$ nc --vsock 1 1234
Ncat: Connection reset by peer.
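If you create local-mode namespaces often, the nested unshare can be wrapped in a small function. This is just a convenience sketch built from the exact commands above: the function name is made up, and it uses sh instead of bash for the intermediate shell.

```shell
# Run a command inside a fresh local-mode network namespace.
# Usage: vsock_run_local nc --vsock -l 1234
vsock_run_local() {
    # The outer unshare creates a throwaway parent namespace, so that the
    # write-once child_ns_mode of the init_netns is left untouched.
    # The inner unshare creates the local-mode child and runs the command.
    unshare --user --map-root-user --net sh -c '
        echo local > /proc/sys/net/vsock/child_ns_mode &&
        exec unshare --net "$@"
    ' sh "$@"
}
```

With this in place, the listener above could be started as vsock_run_local nc --vsock -l 1234 &.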
Namespace isolation with ip netns
The same can be done with ip netns, which requires root (or CAP_SYS_ADMIN).
First, create a global namespace and check its mode:
$ sudo ip netns add vsock_ns_global
$ sudo ip netns exec vsock_ns_global cat /proc/sys/net/vsock/ns_mode
global
A listener in the global namespace is reachable from the init_netns:
$ sudo ip netns exec vsock_ns_global nc --vsock -l 1234 &
$ nc --vsock 1 1234
# reachable - global mode, no isolation
Now create a local namespace. Since child_ns_mode is write-once, we
use unshare to create a parent namespace and set its mode to local
before creating the child:
$ sudo unshare --net bash -c \
'echo local > /proc/sys/net/vsock/child_ns_mode && \
ip netns add vsock_ns_local'
$ sudo ip netns exec vsock_ns_local cat /proc/sys/net/vsock/ns_mode
local
A listener in the local namespace is not reachable from the init_netns:
$ sudo ip netns exec vsock_ns_local nc --vsock -l 1234 &
$ nc --vsock 1 1234
Ncat: Connection reset by peer.
But communication within the same namespace still works:
$ sudo ip netns exec vsock_ns_local nc --vsock 1 1234
# reachable - same namespace
VMs with QEMU
The vhost-vsock H2G transport exposes the /dev/vhost-vsock device, which
QEMU opens at VM startup to emulate the virtio-vsock device for the guest.
Since namespace support applies to this transport, VMs inherit the namespace
mode as well.
In the following examples, we reuse the vsock_ns_global and vsock_ns_local
namespaces created in the previous section.
Global mode
With global mode, the VM started in a global namespace is reachable
from any other global namespace, including the init_netns:
$ sudo ip netns exec vsock_ns_global \
qemu-system-x86_64 -m 1G -M q35,accel=kvm \
-drive file=guest.qcow2,if=virtio,snapshot=on \
-device vhost-vsock-pci,guest-cid=42
# start a listener in the guest (global namespace)
guest_global$ nc --vsock -l 1234
# from the init_netns (global) - reachable
$ nc --vsock 42 1234
Local mode
With local mode, the VM is only reachable from within the same namespace:
$ sudo ip netns exec vsock_ns_local \
qemu-system-x86_64 -m 1G -M q35,accel=kvm \
-drive file=guest.qcow2,if=virtio,snapshot=on \
-device vhost-vsock-pci,guest-cid=42
# start a listener in the guest (local namespace)
guest_local$ nc --vsock -l 1234
# from the init_netns (global) - isolated
$ nc --vsock 42 1234
Ncat: Connection reset by peer.
# from the same namespace - reachable
$ sudo ip netns exec vsock_ns_local nc --vsock 42 1234
Guest-to-host (G2H) behavior
As mentioned in the Supported vsock transports
section, the G2H virtio transport does not support namespaces yet. The
virtio-vsock device in the guest always operates in global mode, so
only sockets in global namespaces can communicate with the host.
Using the VM started in vsock_ns_global, a listener in the guest’s
init_netns is reachable from the host:
# start a listener in the guest
guest_global$ nc --vsock -l 1234
# from the host - reachable
$ nc --vsock 42 1234
But a listener started in a local namespace inside the guest is not
reachable from the host:
# create a local namespace in the guest and start a listener
guest_global$ unshare --user --map-root-user --net bash -c \
'echo local > /proc/sys/net/vsock/child_ns_mode && \
unshare --net nc --vsock -l 1234' &
# from the host - isolated
$ nc --vsock 42 1234
Ncat: Connection reset by peer.
CID reuse
Note that we used the same CID (42) in both examples without shutting down
the first VM. This is possible because the second VM is in a local
namespace, so its CID space is isolated. If both VMs were in global
namespaces, QEMU would fail to start the second VM because the CID would
already be in use.
Patches
- [PATCH net-next v16 00/12] vsock: add namespace support to vhost-vsock and loopback
- [PATCH net v3 0/3] vsock: add write-once semantics to child_ns_mode
- [PATCH net] vsock: initialize child_ns_mode_locked in vsock_net_init()