Merge pull request #757 from iqiyi/devel
Release v1.9.0
ywc689 authored Sep 1, 2021
2 parents 8c4ea56 + 471d8f1 commit e094c9d
Showing 113 changed files with 3,685 additions and 4,353 deletions.
33 changes: 4 additions & 29 deletions .github/workflows/build.yaml
@@ -16,44 +16,19 @@ jobs:
build-basic:
runs-on: self-hosted
env:
RTE_SDK: /data/dpdk/intel/dpdk-stable-18.11.2
RTE_TARGET: x86_64-native-linuxapp-gcc
PKG_CONFIG_PATH: /data/dpdk/dpdklib/lib64/pkgconfig
steps:
- uses: actions/checkout@v2
- name: make
run: make -j32
run: make -j

build-mlnx:
runs-on: self-hosted
env:
RTE_SDK: /data/dpdk/mlnx/dpdk-stable-18.11.2
RTE_TARGET: x86_64-native-linuxapp-gcc
steps:
- uses: actions/checkout@v2
- name: config
run: sed -i 's/^CONFIG_MLX5=./CONFIG_MLX5=y/' src/config.mk
- name: make
run: make -j32

build-debug:
runs-on: self-hosted
env:
RTE_SDK: /data/dpdk/intel/dpdk-stable-18.11.2
RTE_TARGET: x86_64-native-linuxapp-gcc
PKG_CONFIG_PATH: /data/dpdk/dpdklib/lib64/pkgconfig
steps:
- uses: actions/checkout@v2
- name: config
run: sed -i 's/#CFLAGS +=/CFLAGS +=/' src/config.mk && sed -i 's/^#DEBUG := 1/DEBUG := 1/' src/Makefile
- name: make
run: make -j32

build-olddpdk:
runs-on: self-hosted
env:
RTE_SDK: /data/dpdk/intel/dpdk-stable-17.11.6
RTE_TARGET: x86_64-native-linuxapp-gcc
steps:
- uses: actions/checkout@v2
- name: make
run: make -j32

run: make -j
5 changes: 2 additions & 3 deletions .github/workflows/run.yaml
@@ -16,12 +16,11 @@ jobs:
run-dpvs:
runs-on: self-hosted
env:
RTE_SDK: /data/dpdk/intel/dpdk-stable-18.11.2
RTE_TARGET: x86_64-native-linuxapp-gcc
PKG_CONFIG_PATH: /data/dpdk/dpdklib/lib64/pkgconfig
steps:
- uses: actions/checkout@v2
- name: make
run: make -j32
run: make -j
- name: install
run: make install
- name: run-dpvs
76 changes: 40 additions & 36 deletions README.md
@@ -6,7 +6,7 @@

`DPVS` is a high performance **Layer-4 load balancer** based on [DPDK](http://dpdk.org). It's derived from Linux Virtual Server [LVS](http://www.linuxvirtualserver.org/) and its modification [alibaba/LVS](https://github.com/alibaba/LVS).

> The name `DPVS` comes from "DPDK-LVS".
> Notes: The name `DPVS` comes from "DPDK-LVS".
![dpvs.png](./pic/dpvs.png)

@@ -52,7 +52,9 @@ This *quick start* is tested with the environment below.
Other environments should also be OK if DPDK works, please check [dpdk.org](http://www.dpdk.org) for more info.

* Please check this link for NICs supported by DPDK: http://dpdk.org/doc/nics.
* Note `flow-director` ([fdir](http://dpdk.org/doc/guides/nics/overview.html#id1)) is needed for `FNAT` and `SNAT` mode with multi-cores.
* Note `flow control` ([rte_flow](http://dpdk.org/doc/guides/nics/overview.html#id1)) is needed for `FNAT` and `SNAT` mode with multi-cores.

> Notes: To let dpvs work properly with multi-cores, rte_flow items must support "ipv4, ipv6, tcp, udp" four items, and rte_flow actions must support "drop, queue" at least.
## Clone DPVS

@@ -65,48 +67,49 @@ Well, let's start from DPDK then.

## DPDK setup.

Currently, `dpdk-stable-18.11.2` is recommended for `DPVS`. `dpdk-stable-17.11.2` and `dpdk-stable-17.11.6` are supported until the lifecycle end of DPVS v1.8.
Currently, `dpdk-stable-20.11.1` is recommended for `DPVS`, and we will not support dpdk version earlier than dpdk-20.11 any more. If you are still using earlier dpdk versions, such as `dpdk-stable-17.11.2`, `dpdk-stable-17.11.6` and `dpdk-stable-18.11.2`, please use earlier dpvs releases, such as [v1.8.10](https://github.com/iqiyi/dpvs/releases/tag/v1.8.10).

> You can skip this section if experienced with DPDK, and refer the [link](http://dpdk.org/doc/guides/linux_gsg/index.html) for details.
> Notes: You can skip this section if experienced with DPDK, and refer the [link](http://dpdk.org/doc/guides/linux_gsg/index.html) for details.
```bash
$ wget https://fast.dpdk.org/rel/dpdk-18.11.2.tar.xz # download from dpdk.org if link failed.
$ tar xf dpdk-18.11.2.tar.xz
$ wget https://fast.dpdk.org/rel/dpdk-20.11.1.tar.xz # download from dpdk.org if link failed.
$ tar xf dpdk-20.11.1.tar.xz
```

### DPDK patchs

There are some patches for DPDK to support extra features needed by DPVS. Apply them if needed. For example, there's a patch for DPDK `kni` driver for hardware multicast, apply it if you are to launch `ospfd` on `kni` device.

> Assuming we are in DPVS root directory and dpdk-stable-18.11.2 is under it, please note it's not mandatory, just for convenience.
> Notes: Assuming we are in DPVS root directory and dpdk-stable-20.11.1 is under it, please note it's not mandatory, just for convenience.
```
$ cd <path-of-dpvs>
$ cp patch/dpdk-stable-18.11.2/*.patch dpdk-stable-18.11.2/
$ cd dpdk-stable-18.11.2/
$ cp patch/dpdk-stable-20.11.1/*.patch dpdk-stable-20.11.1/
$ cd dpdk-stable-20.11.1/
$ patch -p1 < 0001-kni-use-netlink-event-for-multicast-driver-part.patch
$ patch -p1 < 0002-net-support-variable-IP-header-len-for-checksum-API.patch
$ patch -p1 < 0002-pdump-change-dpdk-pdump-tool-for-dpvs.patch
$ ...
```

> It's advised to patch all if your are not sure about what they are meant for.
> Tips: It's advised to patch all if your are not sure about what they are meant for.
### DPDK build and install

Now build DPDK and export `RTE_SDK` env variable for DPDK app (DPVS).
Use meson-ninja to build DPDK libraries, and export environment variable `PKG_CONFIG_PATH` for DPDK app (DPVS). The `dpdk.mk` in DPVS checks the presence of libdpdk.

```bash
$ cd dpdk-stable-18.11.2/
$ make config T=x86_64-native-linuxapp-gcc
Configuration done
$ make # or make -j40 to save time, where 40 is the cpu core number.
$ export RTE_SDK=$PWD
$ export RTE_TARGET=build
$ cd dpdk-stable-20.11.1
$ mkdir dpdklib # user desired install folder
$ mkdir dpdkbuild # user desired build folder
$ meson -Denable_kmods=true -Dprefix=dpdklib dpdkbuild
$ ninja -C dpdkbuild
$ cd dpdkbuild; ninja install
$ export PKG_CONFIG_PATH=$(pwd)/../dpdklib/lib64/pkgconfig/libdpdk.pc
```
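
As a quick sanity check after `ninja install`, the sketch below tests whether a directory actually contains `libdpdk.pc`. The helper name `check_libdpdk_pc` is hypothetical and is not part of DPVS or DPDK. The `lib64/pkgconfig` location follows the path used above and may differ on distributions that install to `lib/` instead.

```shell
# Hypothetical helper (not part of DPVS or DPDK): report whether the
# given directory contains libdpdk.pc.
# Prints "found" and returns 0, or prints "missing" and returns 1.
check_libdpdk_pc() {
    pc_dir="$1"
    if [ -f "$pc_dir/libdpdk.pc" ]; then
        echo "found"
        return 0
    fi
    echo "missing"
    return 1
}
```

For example, from inside `dpdkbuild` one might run `check_libdpdk_pc "$(pwd)/../dpdklib/lib64/pkgconfig"`. If it prints `found` and `pkg-config` is able to search that directory, `pkg-config --modversion libdpdk` should report the installed DPDK version.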

In our tutorial, `RTE_TARGET` is set to the default "build", thus DPDK libs and header files can be found in `dpdk-stable-18.11.2/build`.
> Tips: You can use script [dpdk-build.sh](./scripts/dpdk-build.sh) to facilitate dpdk build. Run `dpdk-build.sh -h` for the usage of the script.
Now to set up DPDK hugepage, our test environment is NUMA system. For single-node system please refer to the [link](http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html).
Next is to set up DPDK hugepage. Our test environment is NUMA system. For single-node system please refer to the [link](http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html).

```bash
$ # for NUMA machine
@@ -117,40 +120,41 @@ $ mkdir /mnt/huge
$ mount -t hugetlbfs nodev /mnt/huge
```
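
To confirm that the hugepage reservation took effect, the kernel's counters can be read from `/proc/meminfo`. This is a minimal sketch: the function name `hugepages_total` is an illustrative assumption, and it reads meminfo-style text from standard input so that it can also be tried on a saved copy of the file.

```shell
# Hypothetical helper: print the HugePages_Total value from
# /proc/meminfo-style text supplied on standard input.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }'
}
```

Typical use would be `hugepages_total < /proc/meminfo`; a value of 0 suggests the reservation did not succeed.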

Install kernel modules and bind NIC with `igb_uio` driver. Quick start uses only one NIC, normally we use 2 for FNAT cluster, even 4 for bonding mode. For example, suppose the NIC we would use to run DPVS is eth0, in the meantime, we still keep another standalone NIC eth1 for debugging.
Install kernel modules and bind NIC with `uio_pci_generic` driver. Quick start uses only one NIC, normally we use two for FNAT cluster, even four for bonding mode. For example, suppose the NIC we would use to run DPVS is eth0, in the meantime, we still keep another standalone NIC eth1 for debugging.

```bash
$ modprobe uio
$ cd dpdk-stable-18.11.2
$ modprobe uio_pci_generic

$ insmod build/kmod/igb_uio.ko
$ insmod build/kmod/rte_kni.ko carrier=on
$ cd dpdk-stable-20.11.1
$ insmod dpdkbuild/kernel/linux/kni/rte_kni.ko carrier=on

$ ./usertools/dpdk-devbind.py --status
$ ifconfig eth0 down # assuming eth0 is 0000:06:00.0
$ ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
$ ifconfig eth0 down # assuming eth0 is 0000:06:00.0
$ ./usertools/dpdk-devbind.py -b uio_pci_generic 0000:06:00.0
```

> Note that a kernel parameter `carrier` is added to `rte_kni.ko` since [DPDK v18.11](https://elixir.bootlin.com/dpdk/v18.11/source/kernel/linux/kni/kni_misc.c), and the default value for it is "off". We need to load `rte_kni.ko` with the extra parameter `carrier=on` to make KNI devices work properly.
> Notes:
> 1. An alternative to the `uio_pci_generic` is `igb_uio`, which is moved to a separated repository [dpdk-kmods](http://git.dpdk.org/dpdk-kmods).
> 2. A kernel module parameter `carrier` is added to `rte_kni.ko` since [DPDK v18.11](https://elixir.bootlin.com/dpdk/v18.11/source/kernel/linux/kni/kni_misc.c), and the default value for it is "off". We need to load `rte_kni.ko` with the extra parameter `carrier=on` to make KNI devices work properly.
`dpdk-devbind.py -u` can be used to unbind driver and switch it back to Linux driver like `ixgbe`. You can also use `lspci` or `ethtool -i eth0` to check the NIC PCI bus-id. Please refer to [DPDK site](http://www.dpdk.org) for more details.
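
The PCI bus-id lookup mentioned above can be scripted. The sketch below extracts the `bus-info` field from `ethtool -i` output. The function name `pci_bus_id` is hypothetical, and the lookup presumably has to happen while the NIC is still bound to its Linux driver, since the kernel interface name is no longer available once the device is handed to a UIO driver.

```shell
# Hypothetical helper: read `ethtool -i <ifname>` output on standard
# input and print the PCI bus-id from its "bus-info:" line.
pci_bus_id() {
    awk '$1 == "bus-info:" { print $2 }'
}
```

For example, `ethtool -i eth0 | pci_bus_id` yields the address to pass to `dpdk-devbind.py -b`.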

> Note: PMD of Mellanox NIC is built on top of libibverbs using the Raw Ethernet Accelerated Verbs AP. It doesn't rely on UIO/VFIO driver. Thus, Mellanox NICs should not bind the `igb_uio` driver. Refer to [Mellanox DPDK](https://community.mellanox.com/s/article/mellanox-dpdk) for details.
> Notes: PMD of Mellanox NIC is built on top of libibverbs using the Raw Ethernet Accelerated Verbs AP. It doesn't rely on UIO/VFIO driver. Thus, Mellanox NICs should not bind the `igb_uio` driver. Refer to [Mellanox DPDK](https://community.mellanox.com/s/article/mellanox-dpdk) for details.
## Build DPVS

It's simple, just set `RTE_SDK` and build it.
It's simple, just set `PKG_CONFIG_PATH` and build it.

```bash
$ cd dpdk-stable-18.11.2/
$ export RTE_SDK=$PWD
$ export PKG_CONFIG_PATH=<path-of-libdpdk.pc> # normally located at dpdklib/lib64/pkgconfig/libdpdk.pc
$ cd <path-of-dpvs>

$ make # or "make -j40" to speed up.
$ make # or "make -j" to speed up
$ make install
```

> Build dependencies may be needed, such as `automake`, `libnl3`, `libnl-genl-3.0`, `openssl`, `popt` and `numactl`. You can install the missing dependencies by using the package manager of the system, e.g., `yum install popt-devel` (CentOS).
> Notes:
> 1. Build dependencies may be needed, such as `pkg-config`(version 0.29.2+),`automake`, `libnl3`, `libnl-genl-3.0`, `openssl`, `popt` and `numactl`. You can install the missing dependencies by using the package manager of the system, e.g., `yum install popt-devel` (CentOS).
> 2. Early `pkg-config` versions (v0.29.2 before) may cause dpvs build failure. If so, please upgrade this tool.
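
The `pkg-config` version requirement can be checked before building. This sketch assumes a `sort` that supports version ordering via `-V` (GNU coreutils does); the function name `version_ge` is an illustrative choice.

```shell
# Hypothetical helper: succeed when version $1 is greater than or
# equal to version $2. Under `sort -V` the smaller version sorts
# first, so $1 >= $2 exactly when the first sorted line equals $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge "$(pkg-config --version)" 0.29.2 || echo "pkg-config is older than 0.29.2"`.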
Output files are installed to `dpvs/bin`.

@@ -196,7 +200,7 @@ EAL: Error - exiting with code: 1
```
>It means the NIC count of DPVS does not match `/etc/dpvs.conf`. Please use `dpdk-devbind` to adjust the NIC number or modify `dpvs.conf`. We'll improve this part to make DPVS more "clever" to avoid modify config file when NIC count does not match.
What config items does `dpvs.conf` support and how to configure them? Well, `DPVS` maintains a config item file `conf/dpvs.conf.items` which lists all supported config entries and corresponding feasible values.
What config items does `dpvs.conf` support? How to configure them? Well, `DPVS` maintains a config item file `conf/dpvs.conf.items` which lists all supported config entries and corresponding feasible values. Besides, some config sample files maintained as `./conf/dpvs.*.sample` show the configurations of dpvs in some specified cases.

## Test Full-NAT (FNAT) Load Balancer

28 changes: 6 additions & 22 deletions conf/dpvs.bond.conf.sample
@@ -22,6 +22,7 @@ global_defs {
netif_defs {
<init> pktpool_size 1048575
<init> pktpool_cache 256
<init> fdir_mode perfect

<init> device dpdk0 {
rx {
@@ -33,11 +34,6 @@ netif_defs {
queue_number 8
descriptor_number 1024
}
fdir {
mode perfect
pballoc 64k
status matched
}
! mtu 1500
! promisc_mode
! kni_name dpdk0.kni
@@ -53,11 +49,6 @@ netif_defs {
queue_number 8
descriptor_number 1024
}
fdir {
mode perfect
pballoc 64k
status matched
}
! mtu 1500
! promisc_mode
! kni_name dpdk1.kni
@@ -74,11 +65,6 @@ netif_defs {
queue_number 8
descriptor_number 1024
}
fdir {
mode perfect
pballoc 64k
status matched
}
! mtu 1500
! promisc_mode
! kni_name dpdk2.kni
@@ -94,11 +80,6 @@ netif_defs {
queue_number 8
descriptor_number 1024
}
fdir {
mode perfect
pballoc 64k
status matched
}
! mtu 1500
! promisc_mode
! kni_name dpdk3.kni
@@ -109,6 +90,7 @@ netif_defs {
slave dpdk0
slave dpdk1
primary dpdk0
! numa_node 1 ! /sys/bus/pci/devices/[slaves' pci]/numa_node
kni_name bond0.kni
}

@@ -117,6 +99,7 @@ netif_defs {
slave dpdk2
slave dpdk3
primary dpdk2
! numa_node 1 ! /sys/bus/pci/devices/[slaves' pci]/numa_node
kni_name bond1.kni
}
}
@@ -250,7 +233,7 @@ worker_defs {
<init> worker cpu8 {
type slave
cpu_id 8
icmp_redirect_core
! icmp_redirect_core
port bond0 {
rx_queue_ids 7
tx_queue_ids 7
@@ -386,5 +369,6 @@ ipvs_defs {

! sa_pool config
sa_pool {
pool_hash_size 16
pool_hash_size 16
flow_enable on
}
15 changes: 8 additions & 7 deletions conf/dpvs.conf.items
@@ -22,6 +22,7 @@ global_defs {
netif_defs {
<init> pktpool_size 2097151 <65535, 1023-134217728>
<init> pktpool_cache 256 <256, 32-8192>
<init> fdir_mode perfect <perfect, perfect|signature> # only for ixgbe

<init> device dpdk0 {
rx {
@@ -34,12 +35,6 @@ netif_defs {
queue_number 6 <16, 0-16>
descriptor_number 512 <512, 16-8192>
}
fdir {
<init> filter on <on, on/off>
mode perfect <perfect, none|signature|perfect|perfect_mac_vlan|perfect_tunnel>
pballoc 64k <64k, 64k|128k|256k>
status matched <matched, close|matched|always>
}
! mtu 1500 <1500,0-9000>
! promisc_mode <disable>
! kni_name dpdk0.kni <char[32]>
@@ -61,12 +56,17 @@ netif_defs {
! kni_name dpdk1.kni
}

<init> device bond0 {
<init> bonding bond0 {
mode 4 <0-6>
slave dpdk0 <device name>
slave dpdk1 <device name>
primary dpdk0 <device name, use primary slave queue conf for bond>
numa_node 0 <0, int value from /sys/bus/pci/devices/[pci_bus]/numa_node>
kni_name bond0.kni <char[32]>

! supported options:
! dedicated_queues=on|enable|off|disable, default on
options OPT1=VAL1;OPT2=VAL2;...
}
}

@@ -262,4 +262,5 @@ ipvs_defs {

sa_pool {
<init> pool_hash_size 16 <16, 1-128>
<init> flow_enable on <on, on|off>
}
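
The config hunks above drop the per-device `fdir { ... }` blocks from the sample files and add a single `fdir_mode` entry under `netif_defs`. When carrying an older `dpvs.conf` forward, one way to spot leftover blocks is to count them. This is only a sketch: the function name `count_fdir_blocks` is hypothetical, and whether v1.9.0 rejects or ignores such blocks is not stated in this diff.

```shell
# Hypothetical helper: print how many lines in the given config file
# open a per-device "fdir {" block. grep -c prints 0 when none match.
count_fdir_blocks() {
    grep -c '^[[:space:]]*fdir[[:space:]]*{' "$1"
}
```

For example, `count_fdir_blocks /etc/dpvs.conf` printing a non-zero number indicates that the file still uses the old per-device layout.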