IPDK Kubernetes Infrastructure Offload Release Notes
IPDK 24.01
What’s new in this Release
Service Load Balancing: Support for Kubernetes Services of type ClusterIP. The kube-proxy functionality is now offloaded to hardware. Services can be created and their traffic is dynamically distributed across endpoints. For TCP, only the SYN packet goes through the load-balancing logic; an entry is then added to the hardware connection tracking (CT) table to handle subsequent packets. Dynamic scale-up of endpoints is supported.
Support for Go version 1.21.4
Support for log level configuration from config files for Infraagent
SRIOV support for E2100
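The TCP behavior described under Service Load Balancing can be pictured with a small software model. This is only an illustrative sketch, not the P4 pipeline or the Inframanager code: the `Balancer` type, its methods, the round-robin selection policy, and all addresses are assumptions made for the example.

```go
package main

import "fmt"

// FlowKey is the 5-tuple that identifies one connection.
type FlowKey struct {
	SrcIP, DstIP     string
	SrcPort, DstPort uint16
	Proto            uint8
}

// Balancer is a hypothetical software model of the offloaded behavior.
type Balancer struct {
	endpoints map[string][]string // service "ip:port" -> endpoint addresses
	next      map[string]int      // round-robin cursor per service (assumed policy)
	ct        map[FlowKey]string  // connection-tracking table: flow -> endpoint
}

func NewBalancer() *Balancer {
	return &Balancer{
		endpoints: map[string][]string{},
		next:      map[string]int{},
		ct:        map[FlowKey]string{},
	}
}

// SetEndpoints replaces the endpoint list of a service, for example on scale-up.
func (b *Balancer) SetEndpoints(svc string, eps []string) {
	b.endpoints[svc] = append([]string(nil), eps...)
}

// Forward returns the endpoint for a TCP packet. Packets of a tracked flow
// reuse the stored endpoint; only a SYN of an untracked flow runs selection.
func (b *Balancer) Forward(k FlowKey, syn bool) (string, bool) {
	if ep, ok := b.ct[k]; ok {
		return ep, true
	}
	if !syn {
		return "", false // no CT entry and not a SYN: nothing to forward to
	}
	svc := fmt.Sprintf("%s:%d", k.DstIP, k.DstPort)
	eps := b.endpoints[svc]
	if len(eps) == 0 {
		return "", false
	}
	ep := eps[b.next[svc]%len(eps)]
	b.next[svc]++
	b.ct[k] = ep // subsequent packets of this flow bypass selection
	return ep, true
}

func main() {
	b := NewBalancer()
	b.SetEndpoints("10.96.0.10:80", []string{"10.244.0.5:8080", "10.244.0.6:8080"})
	flow := FlowKey{SrcIP: "10.244.0.2", DstIP: "10.96.0.10", SrcPort: 40000, DstPort: 80, Proto: 6}

	ep, _ := b.Forward(flow, true) // SYN: endpoint selected, CT entry added
	fmt.Println("SYN  ->", ep)
	ep, _ = b.Forward(flow, false) // later packet: served from the CT table
	fmt.Println("data ->", ep)
}
```

Because established flows are looked up in the CT table first, adding endpoints with `SetEndpoints` in this model affects only new connections.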
Component Feature Support
| Feature | Description | Status |
| --- | --- | --- |
| Addition of a P4 Dataplane for offload to IPU | Addition of InfraAgent, InfraManager components in a split gRPC mode for P4-based offloads to IPU pipeline using SDE | Production ready |
| Enabling Secure gRPC | mTLS between Inframanager, Infraagent and Infrap4d gRPC components | Production ready |
| Proxy ARP Support | Proxy ARP implementation using a dummy virtual router gateway | Production ready |
| Pod to pod connectivity using vrouter | Each pod is connected to another pod using L3 virtual routing. Proxy ARP implementation for ARP resolution. | Production ready |
| Support for DPDK P4 pipeline for Pod to Pod connectivity | A P4 DPDK pipeline for pod to pod connectivity and service load-balancing using TAP interfaces | Production ready |
| Flow connection tracking | Connection tracking and NAT implementation using flow 5-tuple for Service Load Balancing. | Production ready |
| Support for Service type “ClusterIP” for UDP and TCP traffic | Support for UDP and TCP services. Fully functional DNS and Kubernetes API services. | Production ready |
| Support for Device interfaces like subfunctions and native interfaces like ipvlan and Tap | Subfunctions for E2100; Tap for DPDK | Production ready |
| Execution support in Split Mode with Inframanager running on ACC | Inframanager running on ACC but Infraagent on host | Production ready |
| Automation scripts for cluster deployment | Example scripts for cluster deployment of Load balancing and pod scale up | Production ready |
| SRIOV interface support for E2100 | SRIOV support for E2100 in host mode in addition to CDQ | Engineering preview |
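For the "Enabling Secure gRPC" row, the following sketch shows what a mutual TLS server configuration looks like when built with Go's standard `crypto/tls` package. It is not taken from the Inframanager sources: `mutualTLSConfig` and `demoCertPEM` are hypothetical helper names, and the throwaway self-signed certificate exists only so the example runs without files on disk.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// mutualTLSConfig builds a server-side TLS configuration that presents its own
// certificate and rejects clients without a certificate signed by caPEM.
func mutualTLSConfig(certPEM, keyPEM, caPEM []byte) (*tls.Config, error) {
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load key pair: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM data")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientCAs:    pool,
		ClientAuth:   tls.RequireAndVerifyClientCert, // this makes the TLS mutual
		MinVersion:   tls.VersionTLS12,
	}, nil
}

// demoCertPEM creates a throwaway self-signed certificate for the example.
// Real deployments use certificates issued by their own CA.
func demoCertPEM() (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "demo-ca"},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	certPEM, keyPEM, err := demoCertPEM()
	if err != nil {
		panic(err)
	}
	// The demo certificate doubles as the CA that client certificates must chain to.
	cfg, err := mutualTLSConfig(certPEM, keyPEM, certPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("client certificate required:", cfg.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

A configuration of this shape would typically be wrapped in gRPC transport credentials on the server side; how the recipe's components actually wire it up is described in their own security documentation.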
Resolved Issues
After deleting and creating multiple test pods several times, some pods are not created, and Infraagent reports the error “failed to get a CDQ interface for pod: no free resources left”.
No Readme for TLS certificates and security guide
Test pods do not reach the Running state because of policy-related errors on Inframanager.
Infraagent does not reach the Running state and reports the error “Error while parsing Json file”.
Inframanager restarts/crashes with “Panic occured, runtime error: invalid memory address or nil pointer dereference” when invalid gRPC messages are sent from Defensics during fuzz testing.
“/opt/p4/k8s/inframanager: No such file or directory” error when running setup_infra.sh in split mode.
Readme: config-manage for Docker should point to right repo supporting Rocky Linux
infra-manager pod does not come up because kustomization.yaml is not cleaned up after running split mode.
scripts/setup_infra.sh doesn’t contain the right changes for split mode in the Infraagent, Inframanager and openssl conf files.
Internal state wasn’t being retained earlier for recovery purposes.
Sanity checks were missing for wrong configuration in case of missing node IP.
“One or more write operations failed” due to duplicate rules present when inframanager was restarted.
setup_infra.sh has infrap4d start twice in split mode.
Known Issues and Limitations
The setup_infra.sh automation script works only with the default configuration for certificate paths and artifact paths. Changing these paths renders the script unusable; users then need to manually configure and execute the instructions mentioned in the script.
SRIOV is an experimental feature. The setup_infra_sriov.sh script doesn’t support the -r option for remote IP for host IP on ACC. Host mode is supported for this release as an engineering preview.
A maximum of 254 CDQ interfaces is supported, which is the maximum vport count for the host. The cp_init file for the CDQ use case sets the default max vport to 50; this value is configurable.
Service Load Balancing for TCP shows a few random session resets. This is a known issue; a bug fix will be available in a future minor release.
IPDK 23.07
This is the first release of the K8s-Infra-Offload recipe that supports E2100 and DPDK targets.
Highlights
E2100 Target
Support for Kubernetes Container Network Interface (CNI) to deploy pods and enable pod-to-pod connectivity on a P4 target using hardware device interfaces.
Use of internal gateway with dummy MAC to enable layer-3 connectivity on the same node.
Support for dynamic Subfunctions on E2100. A subfunction is a lightweight function that has a parent PCI function on which it is deployed. Subfunctions are created and deployed individually. Unlike SRIOV VFs, a subfunction doesn’t require its own PCI virtual function; it communicates with the hardware through the parent PCI function.
Infra Manager build support on ARM cores.
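The internal gateway with a dummy MAC listed above can be pictured as follows: every ARP request from a pod is answered with the same gateway MAC, and forwarding is then decided by a layer-3 lookup on the destination IP. This is a conceptual sketch only; the `VRouter` type, its methods, the MAC address, and the interface names are hypothetical and are not taken from the recipe.

```go
package main

import "fmt"

// VRouter is a hypothetical model of the internal gateway.
type VRouter struct {
	GatewayMAC string            // dummy MAC returned for every ARP request
	routes     map[string]string // pod IP -> interface that reaches the pod
}

func NewVRouter(gatewayMAC string) *VRouter {
	return &VRouter{GatewayMAC: gatewayMAC, routes: map[string]string{}}
}

// AddPod installs a host route for a pod.
func (v *VRouter) AddPod(ip, iface string) {
	v.routes[ip] = iface
}

// ResolveARP answers any ARP request with the gateway MAC (proxy ARP), so a
// pod sends all of its traffic to the gateway regardless of the target IP.
func (v *VRouter) ResolveARP(targetIP string) string {
	return v.GatewayMAC
}

// Route performs the layer-3 lookup: the destination IP, not the destination
// MAC, decides which interface the packet leaves on.
func (v *VRouter) Route(dstIP string) (string, bool) {
	iface, ok := v.routes[dstIP]
	return iface, ok
}

func main() {
	vr := NewVRouter("00:0f:0f:0f:0f:0f") // made-up dummy MAC
	vr.AddPod("10.244.0.5", "pod-a")
	vr.AddPod("10.244.0.6", "pod-b")

	// pod-a wants to reach pod-b: ARP resolves to the gateway, routing finds pod-b.
	fmt.Println("ARP reply:", vr.ResolveARP("10.244.0.6"))
	iface, _ := vr.Route("10.244.0.6")
	fmt.Println("egress interface:", iface)
}
```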
DPDK Target
Support for internal gateway with dummy MAC to enable layer-3 connectivity on the same node.
Support for Services of type ClusterIP. Service load balancing within the node allows multiple pods on the same node to act as endpoints for an application service.
Bi-directional Auto Learning and Flow Pinning (a.k.a. Connection Tracking), used with load balancing, to keep endpoint pod selection consistent once an endpoint has been selected for the first packet.
DNS service provided by CoreDNS pods to other pods.
Common Changes
Makefile target to support tls-secrets and certificate generation
Automated build & integration test on each commit
Felix integration and communication with Infrastructure Offload Components.
Addition of DB to store state information.
Support for building K8s Offload Recipe for Rocky Linux 9.1
Support for Go version 1.20.5
Support for logging per feature in components
Configurable MTU using config file
Bug Fixes
“make undeploy” fails as a non-root user
Unable to deploy services after deploy/undeploy a few times
Infra manager restarts on sending “Empty CNI Add request”
Infra manager restarts when running anomaly test cases during fuzz testing with Defensics
Persistent /var/log/inframanager.log is not deleted after “make undeploy”
conf and few other params in “inframanager/config.yaml” are not used, should be removed from input file
Unable to create pods after add/delete a few times
Inframanager restarts/crashes with “panic: runtime error”
inframanager coming to running state after corrupting inframanager-server-ca.crt
dump flow-entries is not decrementing after deleting the test pods
Setup infra fixes for vfio driver bind
Known Issues
This release does not support multi-tenant or multi-node deployments. At present, the underlying IPDK networking recipe needs to be run on bare metal on host CPU cores. The entire node, used for deployment, is assumed to be a trusted zone. However, gRPC/gNMI channels for communications are still secured using TLS.
E2100 feature set is limited to pod-to-pod connectivity.
Incomplete integration for Network Policies.
Infra agent fails to come up if interface name is not correct
Less than expected number of PODs are in Running state
Infrap4d is not started by create_interfaces.sh script due to incorrect BDF in es2k_skip_p4.conf
Inframanager crashes with an error when invalid gRPC messages are sent from Defensics during fuzz testing
Script create_interfaces.sh should report the status of the actions performed
Inframanager log level setting and some cleanup
Need support to set log level for all modules under Inframanager from the config setting
Split mode feature where manager runs on es2k is experimental
Coming Attractions
[E2100] Support for Service and Load balancing.
Support for Kubernetes Network Policy feature on both targets.
Support for Calico BGP and basic control plane API interfaces.
Support for natOutgoing for services with backends outside of the cluster.
[E2100] Support for Device creation and queue allocation on ARM
[E2100] Infra Manager on ARM support
Installation and Build Instructions
See the following for more information:
- [Kubernetes*, Docker*, and containerd* Installation](k8s-docker-containerd-install.md)
- [Kubernetes* Infrastructure Offload Readme](IPDK_K8s_Recipe_Readme.md)
License, Notices, and Disclaimers
Licensing
For licensing information, see the file “LICENSE” in the root folder of the repository.