=== Structure of this repository ===

The driver/ directory contains the bpfhv guest driver for Linux kernels (>= 4.18):
- bpfhv.c: driver source code

The proxy/ directory contains the implementation of an external backend process associated with the QEMU bpfhv-proxy network backend (-netdev bpfhv-proxy). The external backend provides the queue processing functionality for a single bpfhv device that belongs to a QEMU VM. In other words, QEMU implements only the control functionality of the device, while the RX and TX queues are processed by the external process. Similarly to vhost-user, QEMU and the external backend process use a dedicated control channel to exchange the information needed for the packet processing tasks (e.g. the guest memory map, the addresses of each TX and RX queue, file descriptors for notifications, etc.).

Files:
- backend.c: main file; implements the control protocol and two packet processing loops (the first is a poll() event loop, whereas the second uses busy-waiting);
- sring.[ch]: hypervisor-side implementation of a device that uses a minimal descriptor format, with no support for offloads (and reduced per-packet overhead);
- sring_progs.c: eBPF programs for the sring device;
- sring_gso.[ch]: hypervisor-side implementation of a device that uses an extended descriptor format, supporting checksum offloads and TCP/UDP segmentation offloads;
- sring_gso_progs.c: eBPF programs for the sring_gso device;
- vring_packed.[ch]: hypervisor-side implementation of the packed virtqueue defined by the VirtIO 1.1 specification;
- vring_packed_progs.c: eBPF programs for the vring_packed device;
- start-qemu.sh: an example script to start a QEMU VM with a bpfhv device peered with a bpfhv-proxy network backend;
- start-proxy.sh: an example script to start the external backend process and configure the backend network device (e.g. a TAP interface or a netmap port).

=== Some advantages of bpfhv ===

- Doorbells live on separate pages (with a configurable stride), so each queue's doorbell can be mapped independently.
- The provider can evolve the metadata header (e.g. the virtio-net header) to balance the needs of FreeBSD and Linux (the virtio-net header is good for Linux, but not for FreeBSD).
- VirtIO 1.1 vs 1.0 (while 0.95 is still around): a sign that the interface needs to evolve, and that compatibility problems follow.
- You can define a metadata format (e.g. a virtio-net-like header) that fits the specific hardware NIC features used by the cloud provider.
- The provider can inject code to encrypt/decrypt the payload, together with a hardcoded key. The encrypt/decrypt routines can be helper functions that take the OS packet pointer and the key as arguments.
- Simplification of device paravirtualization. A fixed datapath ABI means that you need to stay backward compatible: the virtio implementation in Linux 4.20 needs to support both the split and the packed ring --> complex, error prone, less efficient.
- The virtual switch and backend can be changed under the hood (tap, netmap, other).
- The datapath can adapt to changing workloads.

=== TODOs (driver) ===

- Let BPFHV_MAX_TX_BUFS and BPFHV_MAX_RX_BUFS be variable. This of course requires reshaping the layout of the context data structures.
- Try to replace dma_map_single() with dma_map_page() on the RX datapath? Not sure this is relevant.
- What if the eBPF program needs to modify the SG layout, e.g. for encapsulation or encryption? This would require changing the paddr/vaddr/len fields in the buffer descriptors, and redoing the DMA mapping and unmapping... So maybe we should ask the eBPF program to DMA map/unmap, so that it can do so after encapsulation or encryption (i.e. once the SG layout is stable); see the sketch below.
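Below is a minimal, hypothetical sketch of that last idea, written in the restricted-C style used for the eBPF programs in this repository (sring_progs.c and friends). The struct layouts, the sketch_* names and the bpfhv_dma_map() helper are placeholders introduced here for illustration; they are not the actual bpfhv ABI.

    /*
     * Hypothetical sketch: the TX publish program DMA-maps each buffer
     * itself, after the SG layout has been finalized (e.g. after an
     * encapsulation header has been prepended or the payload has been
     * encrypted in place).
     */
    #define SKETCH_MAX_TX_BUFS  16

    struct sketch_buf {
        unsigned long long vaddr;   /* guest address of the fragment   */
        unsigned long long paddr;   /* DMA address, filled when mapped */
        unsigned int len;
    };

    struct sketch_tx_context {
        unsigned int num_bufs;
        struct sketch_buf bufs[SKETCH_MAX_TX_BUFS];
    };

    /* Placeholder for a driver-provided helper that wraps dma_map_single(). */
    static unsigned long long (*bpfhv_dma_map)(void *vaddr, unsigned int len) =
        (void *)64;

    static int sketch_txp(struct sketch_tx_context *ctx)
    {
        if (ctx->num_bufs > SKETCH_MAX_TX_BUFS)
            return -1;

        /* The SG layout is stable at this point, so the mappings match
         * what the backend will actually read. */
        for (unsigned int i = 0; i < ctx->num_bufs; i++) {
            struct sketch_buf *b = &ctx->bufs[i];

            b->paddr = bpfhv_dma_map((void *)(unsigned long)b->vaddr, b->len);
            if (!b->paddr)
                return -1;  /* mapping failed, drop the packet */
        }

        /* ...descriptors are then written into the ring as usual... */
        return 0;
    }

With this split the driver would no longer DMA-map buffers on the publish path; it would only expose map/unmap helpers to the programs, and the mappings would always reflect the final SG layout.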
=== TODOs (qemu) ===

- Replace cpu_physical_memory_[un]map() with dma_memory_[un]map() and the MemoryRegionCache library. This should only be necessary if the guest platform has an IOMMU. See the code in virtqueue_pop() and virtqueue_push().
- Let backend.ops.init fail (packed vring size must be < 2^15); see the sketch below.
- Move the generic vring_packed code to the top of the files.
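As an illustration of the second item, here is a minimal, hypothetical sketch of an init hook that rejects oversized packed rings and reports the failure to its caller; the sketch_* names are placeholders, not the actual backend API. The 2^15 limit mirrors the TODO above (descriptor offsets in the packed ring format are carried in 15-bit fields).

    /*
     * Hypothetical sketch: fail initialization when the requested packed
     * ring size is not strictly below 2^15, instead of accepting it.
     */
    #include <errno.h>
    #include <stdio.h>

    #define SKETCH_VRING_PACKED_MAX_DESCS  (1u << 15)

    struct sketch_vring_packed {
        unsigned int num_descs;
        /* ...descriptor array, wrap counters, event suppression areas... */
    };

    static int sketch_vring_packed_init(struct sketch_vring_packed *vq,
                                        unsigned int num_descs)
    {
        if (num_descs == 0 || num_descs >= SKETCH_VRING_PACKED_MAX_DESCS) {
            fprintf(stderr, "unsupported packed ring size %u "
                    "(must be > 0 and < %u)\n",
                    num_descs, SKETCH_VRING_PACKED_MAX_DESCS);
            return -EINVAL;
        }
        vq->num_descs = num_descs;
        return 0;
    }

The caller (e.g. the code that handles queue setup on the control channel) would then abort the setup instead of silently clamping or truncating the ring.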
=== About ===

Implementation and evaluation of hypervisor-to-guest offloading for high-throughput Virtual Machine traffic classification.