Note: some readers may want to skip straight to the Quick Start.
If you are using Kubebuilder v1, see the Kubebuilder v1 documentation instead.
Original English edition: book.kubebuilder.io
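Who is this for
Users of Kubernetes
Users of Kubernetes will develop a deeper understanding of Kubernetes by learning how APIs are designed and implemented. This book teaches readers how to develop their own Kubernetes APIs and the core principles on which Kubernetes APIs are built.
Including:
- The structure of Kubernetes APIs and Resources
- API versioning semantics
- Self-healing
- Garbage Collection and Finalizers
- Declarative vs Imperative APIs
- Level-Based vs Edge-Based APIs
- Resources vs Subresources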
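Kubernetes API extension developers
API extension developers will learn the principles and concepts behind implementing canonical Kubernetes APIs, as well as the tools and libraries for building them quickly. This book covers the pitfalls and misconceptions that extension developers commonly encounter.
Including:
- How to batch multiple events into a single reconciliation call
- How to configure periodic reconciliation
- Forthcoming:
  - When to use the lister cache vs live lookups
  - Garbage Collection vs Finalizers
  - How to use Declarative vs Webhook Validation
  - How to implement API versioning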
Resources
- Repository: sigs.k8s.io/kubebuilder
- Slack channel: #kubebuilder
- Google Group: kubebuilder@googlegroups.com
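Quick Start
This Quick Start guide covers installing Kubebuilder, creating a project, creating an API, and running the operator both locally and in a cluster.
Prerequisites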
- go version v1.13+.
- docker version 17.03+.
- kubectl version v1.11.3+.
- kustomize v3.1.0+
- Access to a Kubernetes v1.11.3+ cluster.
Installation
Install kubebuilder:
os=$(go env GOOS)
arch=$(go env GOARCH)
# download kubebuilder and extract it to tmp
curl -L https://go.kubebuilder.io/dl/2.2.0/${os}/${arch} | tar -xz -C /tmp/
# move to a long-term location and put it on your path
# (you'll need to set the KUBEBUILDER_ASSETS env var if you put it somewhere else)
sudo mv /tmp/kubebuilder_2.2.0_${os}_${arch} /usr/local/kubebuilder
export PATH=$PATH:/usr/local/kubebuilder/bin
Create a Project
Create a directory, then run the init command inside it to initialize a new project:
mkdir $GOPATH/src/example
cd $GOPATH/src/example
kubebuilder init --domain my.domain
Create an API
Run the following command to create a new API (group/version) webapp/v1, and a new Kind (CRD) Guestbook on it:
kubebuilder create api --group webapp --version v1 --kind Guestbook
Note: for more on how to design APIs and how to implement the business logic, see Designing an API and What's in a Controller.
Example `(api/v1/guestbook_types.go)`:
// GuestbookSpec defines the desired state of Guestbook
type GuestbookSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make" to regenerate code after modifying this file
// Quantity of instances
// +kubebuilder:validation:Minimum=1
// +kubebuilder:validation:Maximum=10
Size int32 `json:"size"`
// Name of the ConfigMap for GuestbookSpec's configuration
// +kubebuilder:validation:MaxLength=15
// +kubebuilder:validation:MinLength=1
ConfigMapName string `json:"configMapName"`
// +kubebuilder:validation:Enum=Phone;Address;Name
Type string `json:"alias,omitempty"`
}
// GuestbookStatus defines the observed state of Guestbook
type GuestbookStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make" to regenerate code after modifying this file
// PodName of the active Guestbook node.
Active string `json:"active"`
// PodNames of the standby Guestbook nodes.
Standby []string `json:"standby"`
}
type Guestbook struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec GuestbookSpec `json:"spec,omitempty"`
Status GuestbookStatus `json:"status,omitempty"`
}
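Run the operator locally
You will need a Kubernetes cluster to test against. You can use KIND to start a local cluster for testing, or run against a remote cluster.
Install the CRDs into the cluster: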
make install
Run the controller (this command runs in the foreground, so open a new terminal if you want to leave it running):
make run
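Install Instances of Custom Resources
If you answered y to the Create Resource [y/n] prompt when creating the API, a sample CR file for your CRD was created under config/samples/ (if you have changed the API definition in api/v1/guestbook_types.go, make sure to update this file first):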
kubectl apply -f config/samples/
Run the operator on the cluster
Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/<project-name>:tag
Deploy the operator to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/<project-name>:tag
Uninstall CRDs
To delete your CRDs from the cluster:
make uninstall
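Next Steps
Now, follow up with the CronJob tutorial to better understand how it all works by developing a demo example project.
Tutorial: Building CronJob
Too many tutorials start out with some contrived setup, or some toy application that gets the basics across, and then stall out on the more complicated material. Instead, this tutorial takes you through almost the full gamut of complexity with Kubebuilder, starting off simple and building up to something fairly full-featured.
Let's pretend that the CronJob controller implemented in Kubernetes does not meet our needs, and that we want to rewrite it using Kubebuilder.
The CronJob controller runs one-off tasks on the Kubernetes cluster at regular intervals. It does this by building on top of the Job controller, whose task is to run a one-off task once and see it through to completion.
Rather than rewriting the Job controller as well, we will use this as an opportunity to see how to interact with external types, that is, types that are not defined in our own project.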
Scaffolding Out Our Project
As covered in the Quick Start, we need to scaffold out a new project. Make sure Kubebuilder is installed, then run:
# we'll use a domain of tutorial.kubebuilder.io,
# so all API groups will be <group>.tutorial.kubebuilder.io.
kubebuilder init --domain tutorial.kubebuilder.io
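Now that we have a project in place, let's take a look at what Kubebuilder has scaffolded for us so far...
What's in a basic project?
When scaffolding out a new project, Kubebuilder provides us with a few basic pieces of boilerplate.
Build Infrastructure
First up, basic infrastructure for building the project:
go.mod: a new Go module matching our project, with basic dependencies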
module tutorial.kubebuilder.io/project
go 1.13
require (
github.com/go-logr/logr v0.1.0
github.com/robfig/cron v1.2.0
k8s.io/api v0.17.2
k8s.io/apimachinery v0.17.2
k8s.io/client-go v0.17.2
sigs.k8s.io/controller-runtime v0.5.0
)
Makefile: make targets for building and deploying the controller
# Image URL to use all building/pushing image targets
IMG ?= controller:latest
# Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
CRD_OPTIONS ?= "crd:trivialVersions=true"
all: manager
# Run tests
test: generate fmt vet manifests
go test ./api/... ./controllers/... -coverprofile cover.out
# Build manager binary
manager: generate fmt vet
go build -o bin/manager main.go
# Run against the configured Kubernetes cluster in ~/.kube/config
run: generate fmt vet
go run ./main.go
# Install CRDs into a cluster
install: manifests
kubectl apply -f config/crd/bases
# Deploy controller in the configured Kubernetes cluster in ~/.kube/config
deploy: manifests
kubectl apply -f config/crd/bases
kustomize build config/default | kubectl apply -f -
# Generate manifests e.g. CRD, RBAC etc.
manifests: controller-gen
$(CONTROLLER_GEN) $(CRD_OPTIONS) rbac:roleName=manager-role webhook paths="./api/...;./controllers/..." output:crd:artifacts:config=config/crd/bases
# Run go fmt against code
fmt:
go fmt ./...
# Run go vet against code
vet:
go vet ./...
# Generate code
generate: controller-gen
$(CONTROLLER_GEN) object:headerFile=./hack/boilerplate.go.txt paths=./api/...
# Build the docker image
docker-build: test
docker build . -t ${IMG}
@echo "updating kustomize image patch file for manager resource"
sed -i'' -e 's@image: .*@image: '"${IMG}"'@' ./config/default/manager_image_patch.yaml
# Push the docker image
docker-push:
docker push ${IMG}
# find or download controller-gen
# download controller-gen if necessary
controller-gen:
ifeq (, $(shell which controller-gen))
go get sigs.k8s.io/controller-tools/cmd/controller-gen@v0.2.0-rc.0
CONTROLLER_GEN=$(shell go env GOPATH)/bin/controller-gen
else
CONTROLLER_GEN=$(shell which controller-gen)
endif
PROJECT: Kubebuilder metadata for scaffolding new components
version: "2"
domain: tutorial.kubebuilder.io
repo: tutorial.kubebuilder.io/project
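Launch Configuration
Under the config/ directory we find all the configuration needed to run the operator. Right now it only contains the Kustomize YAML definitions required to launch the controller on a cluster; once we start writing the operator, it will also hold our CustomResourceDefinitions (CRDs), RBAC configuration, and Webhook configuration.
config/default contains a Kustomize base for launching the controller in a standard configuration.
Each of the other directories under config/ contains a different piece of configuration:
- config/manager: the YAML for launching the controller as a pod in the cluster
- config/rbac: the minimal permissions required to run the controller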
The Entrypoint
Last, but certainly not least, Kubebuilder scaffolds out the entrypoint of our project: main.go. Let's take a look at that next...
Every program needs a main
Apache License
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Our main.go starts out by importing some basic packages. In particular:
- The core controller-runtime library
- The default controller-runtime logging library, Zap (more on that in a bit)
package main
import (
"flag"
"fmt"
"os"
"k8s.io/apimachinery/pkg/runtime"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/cache"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
// +kubebuilder:scaffold:imports
)
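Every set of controllers needs a Scheme, which provides mappings between Kinds and their corresponding Go types (for now, this is all you need to remember). We will talk more about Kinds when we write our API definition.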
var (
scheme = runtime.NewScheme()
setupLog = ctrl.Log.WithName("setup")
)
func init() {
// +kubebuilder:scaffold:scheme
}
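At this point, our main function is fairly simple:
- We set up some basic flags for metrics.
- We instantiate a manager, which keeps track of running all of our controllers, sets up shared caches and clients to the API server, and takes our Scheme as an argument.
- We run our manager, which in turn runs all of our controllers and webhooks. The manager keeps running until it receives a graceful shutdown signal. This way, when our operator runs on Kubernetes, the Pod can be terminated gracefully.
While we do not have anything to run just yet, remember where the +kubebuilder:scaffold:builder comment is: things will get interesting there soon.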
func main() {
var metricsAddr string
flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
flag.Parse()
ctrl.SetLogger(zap.New(zap.UseDevMode(true)))
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Scheme: scheme, MetricsBindAddress: metricsAddr})
if err != nil {
setupLog.Error(err, "unable to start manager")
os.Exit(1)
}
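Note: if the Namespace field is specified as in the code below, controllers can still watch cluster-scoped resources (for example, Node), but for namespaced resources the cache will only hold objects from the specified namespace.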
mgr, err = ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
Namespace: namespace,
MetricsBindAddress: metricsAddr,
})
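The example above restricts the operator's view of resources to a single namespace. In that case, it is also advisable to restrict the provided authorization to this namespace by replacing the default ClusterRole and ClusterRoleBinding with a Role and RoleBinding respectively. For further information, see the Kubernetes documentation on Using RBAC Authorization.
It is also possible to use MultiNamespacedCacheBuilder to watch a specific set of namespaces: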
var namespaces []string // List of Namespaces
mgr, err = ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
NewCache: cache.MultiNamespacedCacheBuilder(namespaces),
MetricsBindAddress: fmt.Sprintf("%s:%d", metricsHost, metricsPort),
})
For further information, see MultiNamespacedCacheBuilder.
// +kubebuilder:scaffold:builder
setupLog.Info("starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}
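To make the manager's role more concrete, here is a minimal, standard-library-only sketch of the lifecycle described above: start every registered component, block until a shutdown signal arrives, then wait for all of them to exit. It is not the controller-runtime implementation; miniManager, runnable, and runAndStop are hypothetical names, and a context timeout stands in for the shutdown signal.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// runnable is a hypothetical stand-in for a controller or webhook server.
type runnable func(ctx context.Context) error

// miniManager is an illustrative toy, not the controller-runtime Manager.
type miniManager struct {
	runnables []runnable
}

// Add registers a component to be started by Start.
func (m *miniManager) Add(r runnable) { m.runnables = append(m.runnables, r) }

// Start launches every runnable and blocks until all of them have returned,
// which in this sketch happens once ctx is cancelled. It reports the first
// error collected, or nil if every runnable exited cleanly.
func (m *miniManager) Start(ctx context.Context) error {
	var wg sync.WaitGroup
	errs := make(chan error, len(m.runnables))
	for _, r := range m.runnables {
		wg.Add(1)
		go func(r runnable) {
			defer wg.Done()
			if err := r(ctx); err != nil {
				errs <- err
			}
		}(r)
	}
	wg.Wait()
	close(errs)
	return <-errs // receiving from a closed, empty channel yields nil
}

// runAndStop starts n idle runnables, lets a timeout play the role of the
// shutdown signal, and returns how many runnables exited cleanly.
func runAndStop(n int) int {
	var mu sync.Mutex
	stopped := 0
	m := &miniManager{}
	for i := 0; i < n; i++ {
		m.Add(func(ctx context.Context) error {
			<-ctx.Done() // a real controller would process work until this point
			mu.Lock()
			stopped++
			mu.Unlock()
			return nil
		})
	}
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	if err := m.Start(ctx); err != nil {
		return -1
	}
	return stopped
}

func main() {
	fmt.Println("runnables stopped cleanly:", runAndStop(3))
}
```

The point of the design is that components never decide on their own when to stop: they all watch one shared cancellation signal, which is what makes a graceful Pod termination possible.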
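With that out of the way, we can get on to scaffolding our API!
Groups, Versions, and Kinds
Before we get started with our API, we should talk about terminology a bit.
When we talk about APIs in Kubernetes, we often use four terms: groups, versions, kinds, and resources.
Groups and Versions
An API Group in Kubernetes is simply a collection of related functionality. Each group has one or more versions, which allow us to change how an API works over time.
Kinds and Resources
Kinds
Each API group-version contains one or more API types, which we call Kinds.
While a Kind may change form between versions, each form must be able to store all the data of the other forms (we can store the data in fields or in annotations). This means that using an older API version will not cause newer data to be lost or corrupted. For more information, see the Kubernetes API guidelines.
Resources
You will also hear mention of Resources. A Resource is simply a use of a Kind in the API. Often there is a one-to-one mapping between Kinds and Resources, but sometimes the same Kind is returned by several Resources. For instance, the Scale Kind is returned by all scale subresources, such as deployments/scale or replicasets/scale. With CRDs, however, each Kind corresponds to a single Resource.
Note that Resources are always lowercase and, by convention, are the lowercase plural form of the Kind (for example, cronjobs for the CronJob Kind).
For the exact correspondence, see the Kubernetes documentation on resource types.
So, how does that correspond to Go?
When we refer to a Kind in a particular group-version, we call it a GroupVersionKind, or GVK for short. The same applies to resources, which we call GVR.
As we will see shortly, each GVK corresponds to a given root Go type in a package (for example, Deployment is tied to the Deployment struct in the k8s.io/api/apps/v1 package of the Kubernetes source).
Now that we have the terminology straight, we can actually create our API!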
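So what is that Scheme thing?
The Scheme we saw earlier is simply a way to keep track of which Go type (struct) corresponds to a given GVK (do not be overwhelmed by its godocs).
In other words, given a Go type we can look up its GVK, and given a GVK we can look up its Go type.
For instance, suppose we mark the CronJob struct in tutorial.kubebuilder.io/api/v1/cronjob_types.go as belonging to the batch.tutorial.kubebuilder.io/v1 API group (implicitly saying that this group has a Kind named CronJob, registered with the API server).
Then, given JSON from the API server of the following form, we can construct a new &CronJob{} and deserialize into it: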
{
"kind": "CronJob",
"apiVersion": "batch.tutorial.kubebuilder.io/v1",
...
}
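The two-way lookup a Scheme provides can be sketched with the standard library alone. The following is an illustration of the idea, not the real runtime.Scheme API: miniScheme, gvk, register, newObject, and kindFor are all hypothetical names, and the CronJob struct here is a bare placeholder.

```go
package main

import (
	"fmt"
	"reflect"
)

// gvk is a hypothetical, simplified GroupVersionKind.
type gvk struct{ Group, Version, Kind string }

// CronJob is a placeholder type standing in for a real API type.
type CronJob struct {
	Kind       string
	APIVersion string
}

// miniScheme keeps both directions of the mapping: GVK to Go type, and Go type to GVK.
type miniScheme struct {
	toType map[gvk]reflect.Type
	toGVK  map[reflect.Type]gvk
}

func newMiniScheme() *miniScheme {
	return &miniScheme{toType: map[gvk]reflect.Type{}, toGVK: map[reflect.Type]gvk{}}
}

// register records the struct type behind obj (which must be a pointer to a
// struct) under the given group-version, using the Go type name as the Kind.
func (s *miniScheme) register(group, version string, obj interface{}) {
	t := reflect.TypeOf(obj).Elem()
	k := gvk{Group: group, Version: version, Kind: t.Name()}
	s.toType[k] = t
	s.toGVK[t] = k
}

// newObject returns a fresh, zero-valued pointer of the Go type registered for k.
func (s *miniScheme) newObject(k gvk) (interface{}, bool) {
	t, ok := s.toType[k]
	if !ok {
		return nil, false
	}
	return reflect.New(t).Interface(), true
}

// kindFor returns the GVK registered for the Go type behind obj (a pointer to a struct).
func (s *miniScheme) kindFor(obj interface{}) (gvk, bool) {
	k, ok := s.toGVK[reflect.TypeOf(obj).Elem()]
	return k, ok
}

// demoScheme registers CronJob, looks up its GVK from the Go type, then goes
// back from the GVK to a new object and reports whether it is a *CronJob.
func demoScheme() (string, bool) {
	s := newMiniScheme()
	s.register("batch.tutorial.kubebuilder.io", "v1", &CronJob{})
	k, _ := s.kindFor(&CronJob{})
	obj, _ := s.newObject(k)
	_, isCronJob := obj.(*CronJob)
	return k.Kind, isCronJob
}

func main() {
	kind, ok := demoScheme()
	fmt.Println("kind:", kind, "round-trips to *CronJob:", ok)
}
```

A decoder built on such a mapping can read the kind and apiVersion fields from the JSON first, ask for a new object of the matching Go type, and only then unmarshal the rest of the payload into it.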
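Adding a new API
To scaffold out a new Kind (you were paying attention to the last chapter, right?) and its corresponding controller, we can use the kubebuilder create api command:
kubebuilder create api --group batch --version v1 --kind CronJob
The first time we run this command for a group-version, it creates a directory for that group-version.
In this case, the api/v1/ directory is created, corresponding to batch.tutorial.kubebuilder.io/v1 (remember the --domain setting from the beginning?).
It also adds a file for our CronJob Kind, api/v1/cronjob_types.go. Each time we run the command with a different Kind, it adds a corresponding xxx_types.go file.
Let's take a look at what we have been given out of the box, then we can move on to filling it out.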
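We start out simply enough: we import the meta/v1 API group, which is not normally exposed by itself, but instead contains the metadata types common to all Kubernetes Kinds.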
package v1
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
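Next, we define types for the Spec (CronJobSpec) and Status (CronJobStatus) of our Kind.
Kubernetes functions by reconciling desired state (Spec) with the actual cluster state (other objects' Status) and external state, and then recording what it observed (Status).
Thus, every functional object includes a Spec and a Status.
A few types, such as ConfigMap, do not follow this pattern because they do not encode desired state, but most types do.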
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make" to regenerate code after modifying this file
}
// CronJobStatus defines the observed state of CronJob
type CronJobStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make" to regenerate code after modifying this file
}
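Next, we define the types corresponding to the actual Kinds, CronJob and CronJobList.
CronJob is our root type and describes the CronJob Kind. Like all Kubernetes objects, it contains TypeMeta (which describes the API version and Kind) and ObjectMeta (which holds things like name, namespace, and labels).
CronJobList is simply a container for multiple CronJobs. It is the Kind used in bulk operations, such as LIST.
In general, we never modify either of these: all modifications go in either Spec or Status.
The +kubebuilder:object:root comment is called a marker. We will see more of them in a bit; they act as extra metadata, telling controller-tools (our code and YAML generator) extra information.
This particular marker tells the object generator that this type represents a Kind. The object generator then generates an implementation of the runtime.Object interface for us, which is the standard interface that all types representing Kinds must implement.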
// +kubebuilder:object:root=true
// CronJob is the Schema for the cronjobs API
type CronJob struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec CronJobSpec `json:"spec,omitempty"`
Status CronJobStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// CronJobList contains a list of CronJob
type CronJobList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []CronJob `json:"items"`
}
Finally, we add the Go types to the API group. This allows us to add the types in this API group to any Scheme.
func init() {
SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}
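Now that we have seen the basic structure, let's fill it out!
Designing an API
In Kubernetes, we have a few rules for how we design APIs.
Namely, all serialized fields must be camelCase, so we use JSON struct tags to specify this. We can also use the omitempty struct tag to mark that a field should be omitted from serialization when empty.
Fields may use most of the primitive types. Numbers are the exception: for API compatibility purposes, only three forms of numbers are accepted: int32 and int64 for integers, and resource.Quantity for decimals.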
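Hold up, what's a Quantity?
Quantities are a special notation for decimal numbers that have an explicitly fixed representation, which makes them more portable across machines. You have probably noticed them when specifying resource requests and limits on Pods in Kubernetes.
They conceptually work similarly to floating point numbers: they have a significand, a base, and an exponent. Their serializable and human-readable format uses whole numbers and suffixes to specify values, much like the way we describe computer storage.
For instance, the value 2m means 0.002 in decimal notation. 2Ki means 2048 in decimal, while 2k means 2000 in decimal. If we want to specify fractions, we switch to a suffix that lets us use a whole number: 2.5 is 2500m.
There are two supported bases: 10 and 2 (called decimal and binary, respectively). Decimal base is indicated with "normal" SI suffixes (for example, M and k), while binary base is specified in "mebi" notation (for example, Mi and Ki). Think megabytes vs mebibytes.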
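The suffix arithmetic above can be checked with a small sketch. This is not the resource.Quantity implementation: parseMilli is a hypothetical helper that handles only whole numbers followed by a handful of suffixes, and it returns the value in thousandths so that the result stays an integer.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// suffixes lists a small subset of quantity suffixes with their multiplier
// expressed in milli-units (thousandths). Binary suffixes come first so that
// "Mi" and "Ki" are matched before the shorter decimal suffixes.
var suffixes = []struct {
	suffix string
	milli  int64
}{
	{"Ki", 1024 * 1000},
	{"Mi", 1024 * 1024 * 1000},
	{"m", 1},
	{"k", 1000 * 1000},
	{"M", 1000 * 1000 * 1000},
	{"", 1000}, // no suffix: plain units
}

// parseMilli converts a whole-number quantity string such as "2Ki" or "2500m"
// into thousandths of a unit. Fractional inputs are rejected by this sketch.
func parseMilli(s string) (int64, error) {
	for _, e := range suffixes {
		if strings.HasSuffix(s, e.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, e.suffix), 10, 64)
			if err != nil {
				return 0, fmt.Errorf("parse %q: %w", s, err)
			}
			return n * e.milli, nil
		}
	}
	return 0, fmt.Errorf("parse %q: no suffix matched", s)
}

func main() {
	for _, s := range []string{"2m", "2Ki", "2k", "2500m", "2"} {
		v, err := parseMilli(s)
		fmt.Println(s, "=", v, "milli-units, err:", err)
	}
}
```

Working in milli-units is only a convenience for the sketch: it shows why 2500m and 2.5 describe the same value without ever touching floating point.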
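There is one other special type that we use: metav1.Time. It functions identically to time.Time, except that it has a fixed, portable serialization format.
With that out of the way, let's take a look at what our CronJob object looks like!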
Imports
package v1
import (
batchv1beta1 "k8s.io/api/batch/v1beta1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
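First, let's take a look at our spec. As we discussed before, spec holds desired state, so any "inputs" to our controller go here.
Fundamentally a CronJob needs the following pieces:
- A schedule (the "cron" in CronJob)
- A template for the Job to run (the "job" in CronJob)
We will also want a few extras, which will make our users' lives easier:
- An optional deadline for starting jobs (StartingDeadlineSeconds): if we miss this deadline, we just wait until the next scheduled time
- What to do if multiple jobs would run at once (ConcurrencyPolicy): do we wait? stop the old one? run both?
- A way to pause the running of a CronJob (Suspend), in case something is wrong with it
- Limits on old job history (SuccessfulJobsHistoryLimit and FailedJobsHistoryLimit)
Remember, since we never read our own status, we need some other way to keep track of whether a job has run. We can use at least one old job to do this.
We will use several markers (// +comment) to specify additional metadata. These are used by controller-tools when generating our CRD manifest. As we will see in a bit, controller-tools also uses GoDoc to form descriptions for the fields.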
// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
// +kubebuilder:validation:MinLength=0
// The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
Schedule string `json:"schedule"`
// +kubebuilder:validation:Minimum=0
// Optional deadline in seconds for starting the job if it misses scheduled
// time for any reason. Missed jobs executions will be counted as failed ones.
// +optional
StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`
// Specifies how to treat concurrent executions of a Job.
// Valid values are:
// - "Allow" (default): allows CronJobs to run concurrently;
// - "Forbid": forbids concurrent runs, skipping next run if previous run hasn't finished yet;
// - "Replace": cancels currently running job and replaces it with a new one
// +optional
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// This flag tells the controller to suspend subsequent executions, it does
// not apply to already started executions. Defaults to false.
// +optional
Suspend *bool `json:"suspend,omitempty"`
// Specifies the job that will be created when executing a CronJob.
JobTemplate batchv1beta1.JobTemplateSpec `json:"jobTemplate"`
// +kubebuilder:validation:Minimum=0
// The number of successful finished jobs to retain.
// This is a pointer to distinguish between explicit zero and not specified.
// +optional
SuccessfulJobsHistoryLimit *int32 `json:"successfulJobsHistoryLimit,omitempty"`
// +kubebuilder:validation:Minimum=0
// The number of failed finished jobs to retain.
// This is a pointer to distinguish between explicit zero and not specified.
// +optional
FailedJobsHistoryLimit *int32 `json:"failedJobsHistoryLimit,omitempty"`
}
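We define a custom type (ConcurrencyPolicy) to hold our concurrency policy. It is actually just a string under the hood, but the type gives extra documentation and lets us attach validation to the type instead of the field, making the validation more easily reusable.
The marker // +kubebuilder:validation:Enum=Allow;Forbid;Replace means that a ConcurrencyPolicy accepts only the three values Allow, Forbid, and Replace.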
// ConcurrencyPolicy describes how the job will be handled.
// Only one of the following concurrent policies may be specified.
// If none of the following policies is specified, the default one
// is AllowConcurrent.
// +kubebuilder:validation:Enum=Allow;Forbid;Replace
type ConcurrencyPolicy string
const (
// AllowConcurrent allows CronJobs to run concurrently.
AllowConcurrent ConcurrencyPolicy = "Allow"
// ForbidConcurrent forbids concurrent runs, skipping next run if previous
// hasn't finished yet.
ForbidConcurrent ConcurrencyPolicy = "Forbid"
// ReplaceConcurrent cancels currently running job and replaces it with a new one.
ReplaceConcurrent ConcurrencyPolicy = "Replace"
)
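The Enum marker is turned into validation in the generated CRD, so the check itself is not written in Go. Purely as an illustration of the rule it encodes, the same constraint could be expressed by hand as follows; validate and effectivePolicy are hypothetical helpers, and treating an empty value as Allow follows the default described in the field comment.

```go
package main

import "fmt"

// ConcurrencyPolicy mirrors the string-backed type from the tutorial.
type ConcurrencyPolicy string

const (
	AllowConcurrent   ConcurrencyPolicy = "Allow"
	ForbidConcurrent  ConcurrencyPolicy = "Forbid"
	ReplaceConcurrent ConcurrencyPolicy = "Replace"
)

// validate is a hand-written equivalent of the Enum rule: only the three
// listed values are accepted.
func (p ConcurrencyPolicy) validate() error {
	switch p {
	case AllowConcurrent, ForbidConcurrent, ReplaceConcurrent:
		return nil
	default:
		return fmt.Errorf("unsupported concurrency policy %q", string(p))
	}
}

// effectivePolicy applies the documented default (Allow) when the optional
// field was left empty, and validates any explicit value.
func effectivePolicy(p ConcurrencyPolicy) (ConcurrencyPolicy, error) {
	if p == "" {
		return AllowConcurrent, nil
	}
	if err := p.validate(); err != nil {
		return "", err
	}
	return p, nil
}

func main() {
	for _, p := range []ConcurrencyPolicy{"", "Forbid", "Sometimes"} {
		got, err := effectivePolicy(p)
		fmt.Printf("input=%q effective=%q err=%v\n", string(p), string(got), err)
	}
}
```

Attaching the rule to the type rather than to one field means every field of type ConcurrencyPolicy gets the same check, which is the reuse benefit the text describes.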
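Next, let's design our status, which holds observed state. It contains any information we want users or other controllers to be able to easily obtain.
We will keep a list of actively running jobs, as well as the last time that we successfully scheduled our job. Notice that we use metav1.Time instead of time.Time to get the stable serialization, as mentioned above.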
// CronJobStatus defines the observed state of CronJob
type CronJobStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make" to regenerate code after modifying this file
// A list of pointers to currently running jobs.
// +optional
Active []corev1.ObjectReference `json:"active,omitempty"`
// Information when was the last time the job was successfully scheduled.
// +optional
LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"`
}
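Finally, we are left with the CronJob struct. As mentioned before, we do not need to change anything in it, except that we want this Kind to behave like the built-in Kubernetes types, so we add the marker +kubebuilder:subresource:status to declare a status subresource. For more on subresources, see the Kubernetes documentation.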
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// CronJob is the Schema for the cronjobs API
type CronJob struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec CronJobSpec `json:"spec,omitempty"`
Status CronJobStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// CronJobList contains a list of CronJob
type CronJobList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []CronJob `json:"items"`
}
func init() {
SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}
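Now that we have an API, we need to write a controller to actually implement the functionality.
A Brief Aside: What's the rest of this stuff?
Besides cronjob_types.go, the api/v1/ directory contains two more files: groupversion_info.go and zz_generated.deepcopy.go.
Neither of these files ever needs to be edited (the former stays the same and the latter is autogenerated), but it is useful to know what is in them.
groupversion_info.go
groupversion_info.go contains common metadata about the group-version: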
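First, we have some package-level markers. +kubebuilder:object:generate=true denotes that there are Kubernetes objects in this package, and +groupName=batch.tutorial.kubebuilder.io denotes that this package represents the API group batch.tutorial.kubebuilder.io.
The object generator makes use of the former, while the CRD generator uses the latter to generate the right metadata for the CRDs it creates from this package.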
// Package v1 contains API Schema definitions for the batch v1 API group
// +kubebuilder:object:generate=true
// +groupName=batch.tutorial.kubebuilder.io
package v1
import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/scheme"
)
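Then, we have the commonly useful variables that help us set up our Scheme. Since we need to use all the types in this package in our controller, it is helpful (and the convention) to have a convenient method to add all the types to some other Scheme. SchemeBuilder makes this easy for us.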
var (
// GroupVersion is group version used to register these objects
GroupVersion = schema.GroupVersion{Group: "batch.tutorial.kubebuilder.io", Version: "v1"}
// SchemeBuilder is used to add go types to the GroupVersionKind scheme
SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}
// AddToScheme adds the types in this group-version to the given scheme.
AddToScheme = SchemeBuilder.AddToScheme
)
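zz_generated.deepcopy.go
zz_generated.deepcopy.go contains the autogenerated implementation of the aforementioned runtime.Object interface, which marks all of our root types as representing Kinds.
The core of the runtime.Object interface is a deep-copy method, DeepCopyObject.
The object generator in controller-tools also generates two other handy methods for each root type (CronJob, CronJobList) and all of its sub-types (CronJobSpec, CronJobStatus): DeepCopy and DeepCopyInto.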
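To show what such generated methods do, here is a hand-written sketch of the DeepCopy and DeepCopyInto pair for a simplified status type. The real file is produced by controller-tools and should not be written by hand; statusSketch and its fields are hypothetical simplifications (an int64 timestamp replaces metav1.Time).

```go
package main

import "fmt"

// statusSketch is a simplified, hypothetical stand-in for CronJobStatus.
type statusSketch struct {
	Active           []string // names of active jobs
	LastScheduleTime *int64   // Unix seconds; nil when never scheduled
}

// DeepCopyInto copies the receiver into out, allocating fresh storage for the
// slice and the pointer so that the two values share no memory afterwards.
func (in *statusSketch) DeepCopyInto(out *statusSketch) {
	*out = *in
	if in.Active != nil {
		out.Active = make([]string, len(in.Active))
		copy(out.Active, in.Active)
	}
	if in.LastScheduleTime != nil {
		t := *in.LastScheduleTime
		out.LastScheduleTime = &t
	}
}

// DeepCopy returns a newly allocated deep copy, or nil for a nil receiver.
func (in *statusSketch) DeepCopy() *statusSketch {
	if in == nil {
		return nil
	}
	out := new(statusSketch)
	in.DeepCopyInto(out)
	return out
}

// copyIsIndependent mutates a copy and reports whether the original is untouched.
func copyIsIndependent() bool {
	ts := int64(1583066000)
	orig := &statusSketch{Active: []string{"job-a"}, LastScheduleTime: &ts}
	cp := orig.DeepCopy()
	cp.Active[0] = "job-b"
	*cp.LastScheduleTime = 0
	return orig.Active[0] == "job-a" && *orig.LastScheduleTime == 1583066000
}

func main() {
	fmt.Println("original unaffected by changes to the copy:", copyIsIndependent())
}
```

Deep copies matter because objects read through a shared cache must not be mutated in place; copying first keeps a controller's edits from leaking into other readers.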
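What's in a controller?
Controllers are the core of Kubernetes, and of any operator.
It is a controller's job to ensure that, for any given object, the actual state of the world (both the cluster state and potentially external state, such as running containers for Kubelet or load balancers for a cloud provider) matches the desired state in the object.
Each controller focuses on one root Kind, but may interact with other Kinds.
We call this process reconciling.
In controller-runtime, the logic that implements the reconciling for a specific Kind is called a Reconciler.
A reconciler takes the name of an object and returns whether or not we need to try again (for example, in case of errors or for periodic controllers such as the HorizontalPodAutoscaler).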
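The reconciling idea can be illustrated without any Kubernetes libraries. In the sketch below, a toy reconciler is given only a name, compares desired and observed state for that name, takes one corrective step, and asks to be called again until the two match. The types world and result and the function converge are hypothetical; real reconcilers use ctrl.Request and ctrl.Result.

```go
package main

import "fmt"

// result is a hypothetical stand-in for a reconcile result.
type result struct{ Requeue bool }

// world holds desired state (the spec) and observed state, both keyed by object name.
type world struct {
	desired map[string]int
	actual  map[string]int
}

// reconcile moves the observed value for name one step toward the desired
// value. It only receives a name, and re-reads all state on every call.
func (w *world) reconcile(name string) (result, error) {
	want, ok := w.desired[name]
	if !ok {
		delete(w.actual, name) // the object is gone: clean up what we created
		return result{}, nil
	}
	have := w.actual[name]
	switch {
	case have < want:
		w.actual[name] = have + 1
	case have > want:
		w.actual[name] = have - 1
	default:
		return result{}, nil // already converged, nothing to do
	}
	return result{Requeue: w.actual[name] != want}, nil
}

// converge plays the role of the work queue: it keeps calling reconcile
// while a requeue is requested, and returns the number of calls made.
func converge(w *world, name string) int {
	calls := 0
	for {
		calls++
		res, err := w.reconcile(name)
		if err != nil || !res.Requeue {
			return calls
		}
	}
}

func main() {
	w := &world{desired: map[string]int{"web": 3}, actual: map[string]int{}}
	fmt.Println("calls until converged:", converge(w, "web"))
	fmt.Println("observed state:", w.actual["web"])
}
```

Because the function looks only at current state rather than at the event that triggered it, calling it again after convergence is harmless, which is what makes retries and periodic resyncs safe.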
First, we start out with some standard imports. As before, we need the core controller-runtime library, as well as the client package and the package for our API types.
package controllers
import (
"context"
"github.com/go-logr/logr"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
batchv1 "tutorial.kubebuilder.io/project/api/v1"
)
Next, kubebuilder has scaffolded a basic reconciler struct for us. Pretty much every reconciler needs to log and needs to be able to fetch objects, so these are added out of the box.
// CronJobReconciler reconciles a CronJob object
type CronJobReconciler struct {
client.Client
Log logr.Logger
Scheme *runtime.Scheme
}
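Most controllers eventually end up running on the cluster, so they need RBAC permissions, which we specify using controller-tools RBAC markers. These are the bare minimum permissions needed to run. As we add more functionality, we will need to revisit these.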
// +kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs/status,verbs=get;update;patch
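Reconcile actually performs the reconciling for a single named object. Our Request just has a name, but we can use the client to fetch that object from the cache.
We return an empty result and no error, which indicates to controller-runtime that we have successfully reconciled this object and do not need to try again until there are some changes.
Most controllers need a logging handle and a context, so we set them up here.
The context is used to allow cancellation of requests, and potentially things like tracing. It is the first argument to all client methods. The Background context is just a basic context without any extra data or timing restrictions.
The logging handle lets us log. controller-runtime uses structured logging through a library called logr. As we will see shortly, logging works by attaching key-value pairs to a static message. We can pre-assign some pairs at the top of our reconcile method to have them attached to all log lines in this reconciler.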
func (r *CronJobReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
_ = context.Background()
_ = r.Log.WithValues("cronjob", req.NamespacedName)
// your logic here
return ctrl.Result{}, nil
}
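Finally, we add this reconciler to the manager, so that it gets started when the manager is started.
For now, we just note that this reconciler operates on CronJobs. Later, we will use this to mark that we care about related objects as well.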
func (r *CronJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&batchv1.CronJob{}).
Complete(r)
}
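Now that we have seen the basic structure of a reconciler, let's fill out the logic for CronJobs.
Implementing a controller
The basic logic of our CronJob controller is this:
1. Load the named CronJob
2. List all active jobs, and update the status
3. Clean up old jobs according to the history limits
4. Check if we are suspended (and do not do anything else if we are)
5. Get the next scheduled run
6. Run a new job if it is on schedule, not past the deadline, and not blocked by our concurrency policy
7. Requeue when we either see a running job or it is time for the next scheduled run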
We start out with some imports. As you can see, we need a few more than what was scaffolded for us. We will talk about each one when we use it.
package controllers
import (
"context"
"fmt"
"sort"
"time"
"github.com/go-logr/logr"
"github.com/robfig/cron"
kbatch "k8s.io/api/batch/v1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
ref "k8s.io/client-go/tools/reference"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
batch "tutorial.kubebuilder.io/project/api/v1"
)
Next, we need a Clock field, which allows us to fake timing in our tests.
// CronJobReconciler reconciles a CronJob object
type CronJobReconciler struct {
client.Client
Log logr.Logger
Scheme *runtime.Scheme
Clock
}
Clock
We mock out the clock to make it easier to jump around in time while testing; the real clock just calls time.Now.
type realClock struct{}
func (_ realClock) Now() time.Time { return time.Now() }
// Clock knows how to get the current time.
// It can be used to fake out timing for testing.
type Clock interface {
Now() time.Time
}
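As a sketch of how this interface pays off in tests, the following shows a fake clock that can be moved forward by hand. The fakeClock type, the Advance method, and the missedDeadline helper are hypothetical and are not part of the scaffolded code.

```go
package main

import (
	"fmt"
	"time"
)

// Clock matches the interface from the tutorial.
type Clock interface {
	Now() time.Time
}

// fakeClock is a hypothetical test double whose time only moves when told to.
type fakeClock struct{ now time.Time }

func (f *fakeClock) Now() time.Time { return f.now }

// Advance moves the fake clock forward by d.
func (f *fakeClock) Advance(d time.Duration) { f.now = f.now.Add(d) }

// missedDeadline is an illustrative helper: it reports whether the clock has
// moved past the scheduled time plus the allowed starting deadline.
func missedDeadline(c Clock, scheduled time.Time, deadline time.Duration) bool {
	return c.Now().After(scheduled.Add(deadline))
}

// demoDeadline checks the deadline at the scheduled time and again after the
// fake clock has been advanced beyond the deadline.
func demoDeadline() (beforeDeadline, afterDeadline bool) {
	scheduled := time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)
	clock := &fakeClock{now: scheduled}
	beforeDeadline = missedDeadline(clock, scheduled, 10*time.Second)
	clock.Advance(11 * time.Second)
	afterDeadline = missedDeadline(clock, scheduled, 10*time.Second)
	return
}

func main() {
	before, after := demoDeadline()
	fmt.Println("missed at schedule time:", before)
	fmt.Println("missed after 11s:", after)
}
```

With time injected through the interface, a test can cover hours of scheduling behavior instantly and deterministically, instead of sleeping or depending on the wall clock.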
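Notice that we need a few more RBAC permissions: since we are now creating and managing Jobs, we need permissions for those, which means adding a couple more markers (the last two lines below).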
// +kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=batch,resources=jobs,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=batch,resources=jobs/status,verbs=get
Now we get to the heart of the controller: the reconciler logic.
var (
scheduledTimeAnnotation = "batch.tutorial.kubebuilder.io/scheduled-at"
)
func (r *CronJobReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
ctx := context.Background()
log := r.Log.WithValues("cronjob", req.NamespacedName)
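1: Load the CronJob by name
We fetch the CronJob using our client. All client methods take a context (to allow for cancellation) as their first argument, and the object in question as their last. Get is a bit special in that it takes a NamespacedName as the middle argument (most do not have a middle argument, as we will see below).
Many client methods also take variadic options at the end.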
var cronJob batch.CronJob
if err := r.Get(ctx, req.NamespacedName, &cronJob); err != nil {
log.Error(err, "unable to fetch CronJob")
// we'll ignore not-found errors, since they can't be fixed by an immediate
// requeue (we'll need to wait for a new notification), and we can get them
// on deleted requests.
return ctrl.Result{}, client.IgnoreNotFound(err)
}
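2: List all active jobs, and update the status
To fully update our status, we need to list all child Jobs in this namespace that belong to this CronJob. Similarly to Get, we can use the List method to list the child Jobs. Notice that we use variadic options to set the namespace (client.InNamespace) and the field match (client.MatchingFields).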
var childJobs kbatch.JobList
if err := r.List(ctx, &childJobs, client.InNamespace(req.Namespace), client.MatchingFields{jobOwnerKey: req.Name}); err != nil {
log.Error(err, "unable to list child Jobs")
return ctrl.Result{}, err
}
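Once we have all the Jobs we own, we split them into active, successful, and failed Jobs, keeping track of the most recent run so that we can record it in status. Remember, status should be reconstructible from the state of the world, so it is generally not a good idea to read from the status of the root object. Instead, you should reconstruct it on every run. That is what we do here.
We can check whether a Job is "finished", and whether it succeeded or failed, using status conditions. We put that logic in a helper to keep our code cleaner.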
// find the active list of jobs
var activeJobs []*kbatch.Job
var successfulJobs []*kbatch.Job
var failedJobs []*kbatch.Job
var mostRecentTime *time.Time // find the last run so we can update the status
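isJobFinished
We consider a Job "finished" if it has a "Complete" (succeeded) or "Failed" condition marked as true. Status conditions allow us to add extensible status information to our objects that other humans and controllers can examine to check things like completion and health.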
isJobFinished := func(job *kbatch.Job) (bool, kbatch.JobConditionType) {
for _, c := range job.Status.Conditions {
if (c.Type == kbatch.JobComplete || c.Type == kbatch.JobFailed) && c.Status == corev1.ConditionTrue {
return true, c.Type
}
}
return false, ""
}
getScheduledTimeForJob
We use a helper to extract the scheduled time from the annotation that we added during Job creation.
getScheduledTimeForJob := func(job *kbatch.Job) (*time.Time, error) {
timeRaw := job.Annotations[scheduledTimeAnnotation]
if len(timeRaw) == 0 {
return nil, nil
}
timeParsed, err := time.Parse(time.RFC3339, timeRaw)
if err != nil {
return nil, err
}
return &timeParsed, nil
}
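The helper above only reads the annotation. As a sketch of the full round trip, the following pairs it with a writer, assuming the annotation is written with the same RFC 3339 layout that the parser expects. stampScheduledTime and scheduledTimeFrom are hypothetical names, and a plain map stands in for the Job's annotations.

```go
package main

import (
	"fmt"
	"time"
)

const scheduledTimeAnnotation = "batch.tutorial.kubebuilder.io/scheduled-at"

// stampScheduledTime is a hypothetical writer: it records t in RFC 3339 form,
// the layout that the reader below parses.
func stampScheduledTime(annotations map[string]string, t time.Time) {
	annotations[scheduledTimeAnnotation] = t.Format(time.RFC3339)
}

// scheduledTimeFrom mirrors getScheduledTimeForJob but takes the annotations
// map directly. A missing annotation yields (nil, nil).
func scheduledTimeFrom(annotations map[string]string) (*time.Time, error) {
	raw := annotations[scheduledTimeAnnotation]
	if len(raw) == 0 {
		return nil, nil
	}
	parsed, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return nil, err
	}
	return &parsed, nil
}

// roundTrip reports whether a stamped time is read back unchanged, whether a
// missing annotation yields nil, and whether a malformed one yields an error.
func roundTrip() (same, missingIsNil, malformedIsErr bool) {
	want := time.Date(2020, 3, 1, 12, 30, 0, 0, time.UTC)
	ann := map[string]string{}
	stampScheduledTime(ann, want)
	got, err := scheduledTimeFrom(ann)
	same = err == nil && got != nil && got.Equal(want)

	none, err := scheduledTimeFrom(map[string]string{})
	missingIsNil = none == nil && err == nil

	_, err = scheduledTimeFrom(map[string]string{scheduledTimeAnnotation: "yesterday"})
	malformedIsErr = err != nil
	return
}

func main() {
	same, missingIsNil, malformedIsErr := roundTrip()
	fmt.Println(same, missingIsNil, malformedIsErr)
}
```

Note that RFC 3339 as formatted here keeps whole seconds only, so a scheduled time with a sub-second component would not survive the round trip exactly.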
// sort each Job into a slice by its state, and track the most recent scheduled time
for i, job := range childJobs.Items {
_, finishedType := isJobFinished(&job)
switch finishedType {
case "": // ongoing
activeJobs = append(activeJobs, &childJobs.Items[i])
case kbatch.JobFailed:
failedJobs = append(failedJobs, &childJobs.Items[i])
case kbatch.JobComplete:
successfulJobs = append(successfulJobs, &childJobs.Items[i])
}
// We store the launch time in an annotation, so we reconstitute it from the jobs themselves.
scheduledTimeForJob, err := getScheduledTimeForJob(&job)
if err != nil {
log.Error(err, "unable to parse schedule time for child job", "job", &job)
continue
}
// keep track of the most recent scheduled time
if scheduledTimeForJob != nil {
if mostRecentTime == nil {
mostRecentTime = scheduledTimeForJob
} else if mostRecentTime.Before(*scheduledTimeForJob) {
mostRecentTime = scheduledTimeForJob
}
}
}
if mostRecentTime != nil {
cronJob.Status.LastScheduleTime = &metav1.Time{Time: *mostRecentTime}
} else {
cronJob.Status.LastScheduleTime = nil
}
cronJob.Status.Active = nil
for _, activeJob := range activeJobs {
jobRef, err := ref.GetReference(r.Scheme, activeJob)
if err != nil {
log.Error(err, "unable to make reference to active job", "job", activeJob)
continue
}
cronJob.Status.Active = append(cronJob.Status.Active, *jobRef)
}
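Here, we log how many Jobs we observed at a slightly higher logging level, for debugging. Notice how, instead of using a format string, we use a fixed message and attach key-value pairs with the extra information. This makes it easier to filter and query log lines.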
log.V(1).Info("job count", "active jobs", len(activeJobs), "successful jobs", len(successfulJobs), "failed jobs", len(failedJobs))
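Using the data we've gathered, we'll update the status of our CRD. Just like before, we use our client. To update the status subresource specifically, we'll use the client's Status().Update method.
The status subresource ignores changes to spec, so it's less likely to conflict with any other updates, and it can have separate permissions.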
if err := r.Status().Update(ctx, &cronJob); err != nil {
log.Error(err, "unable to update CronJob status")
return ctrl.Result{}, err
}
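Once we've updated our status, we can move on to ensuring that the state of the world matches what we want in our spec.
3: Clean up old jobs according to the history limit
First, we'll try to clean up old jobs, so that we don't leave too many lying around.
// NB: deleting these is "best effort" -- if we fail on a particular one, we won't requeue just to finish the deletion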
if cronJob.Spec.FailedJobsHistoryLimit != nil {
// sort the failed Jobs by start time, oldest first
sort.Slice(failedJobs, func(i, j int) bool {
if failedJobs[i].Status.StartTime == nil {
return failedJobs[j].Status.StartTime != nil
}
return failedJobs[i].Status.StartTime.Before(failedJobs[j].Status.StartTime)
})
// delete the failed Jobs that exceed FailedJobsHistoryLimit
for i, job := range failedJobs {
if int32(i) >= int32(len(failedJobs))-*cronJob.Spec.FailedJobsHistoryLimit {
break
}
if err := r.Delete(ctx, job, client.PropagationPolicy(metav1.DeletePropagationBackground)); client.IgnoreNotFound(err) != nil {
log.Error(err, "unable to delete old failed job", "job", job)
} else {
log.V(0).Info("deleted old failed job", "job", job)
}
}
}
// delete the successful Jobs that exceed SuccessfulJobsHistoryLimit
if cronJob.Spec.SuccessfulJobsHistoryLimit != nil {
sort.Slice(successfulJobs, func(i, j int) bool {
if successfulJobs[i].Status.StartTime == nil {
return successfulJobs[j].Status.StartTime != nil
}
return successfulJobs[i].Status.StartTime.Before(successfulJobs[j].Status.StartTime)
})
for i, job := range successfulJobs {
if int32(i) >= int32(len(successfulJobs))-*cronJob.Spec.SuccessfulJobsHistoryLimit {
break
}
if err := r.Delete(ctx, job, client.PropagationPolicy(metav1.DeletePropagationBackground)); client.IgnoreNotFound(err) != nil {
log.Error(err, "unable to delete old successful job", "job", job)
} else {
log.V(0).Info("deleted old successful job", "job", job)
}
}
}
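The sort-then-trim logic is easy to get wrong by one, so here is a minimal standalone sketch of it. It assumes a simplified finishedJob type (a name plus a start time in Unix seconds, with 0 standing in for a nil StartTime); these names are hypothetical and only illustrate which Jobs the loops above would delete.

```go
package main

import (
	"fmt"
	"sort"
)

// finishedJob is a hypothetical stand-in for a Job: a name plus a start time
// in Unix seconds (0 means the start time is unknown).
type finishedJob struct {
	Name  string
	Start int64
}

// jobsToDelete sorts jobs oldest-first (unknown start times first) and returns
// the names that fall outside the newest `limit` entries. limit is assumed to
// be non-negative, as enforced by the Minimum=0 validation marker.
func jobsToDelete(jobs []finishedJob, limit int) []string {
	sorted := append([]finishedJob(nil), jobs...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Start == 0 {
			return sorted[j].Start != 0
		}
		return sorted[j].Start != 0 && sorted[i].Start < sorted[j].Start
	})
	var doomed []string
	for i, job := range sorted {
		if i >= len(sorted)-limit {
			break // everything from here on is within the history limit
		}
		doomed = append(doomed, job.Name)
	}
	return doomed
}

func main() {
	jobs := []finishedJob{{"a", 300}, {"b", 100}, {"c", 0}, {"d", 200}}
	fmt.Println(jobsToDelete(jobs, 2)) // keeps the two newest: d and a
}
```

When the limit is larger than the number of finished Jobs, the loop breaks on the first iteration and nothing is deleted.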
4: Check if we're suspended
If this object is suspended, we don't want to run any more Jobs, so we return immediately. This is useful when debugging a problem with the Jobs.
if cronJob.Spec.Suspend != nil && *cronJob.Spec.Suspend {
log.V(1).Info("cronjob suspended, skipping")
return ctrl.Result{}, nil
}
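5: Get the next scheduled run
If we're not suspended, we need to calculate the next scheduled run, and whether or not we've got a run that we haven't processed yet.
getNextSchedule
We'll calculate the next scheduled time using our helpful cron library. We'll start calculating appropriate times from our last run, or from the creation of the CronJob if we can't find a last run.
If there are too many missed runs and we don't have any deadlines set, we'll bail out so that we don't cause issues on controller restarts or wedges.
Otherwise, we'll just return the missed runs (of which we'll only use the latest) and the next run, so that we know when it's time to reconcile again.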
getNextSchedule := func(cronJob *batch.CronJob, now time.Time) (lastMissed time.Time, next time.Time, err error) {
sched, err := cron.ParseStandard(cronJob.Spec.Schedule)
if err != nil {
return time.Time{}, time.Time{}, fmt.Errorf("Unparseable schedule %q: %v", cronJob.Spec.Schedule, err)
}
// for optimization purposes, cheat a bit and start from our last observed run time
// we could reconstitute this here, but there's not much point, since we've
// just updated it.
var earliestTime time.Time
if cronJob.Status.LastScheduleTime != nil {
earliestTime = cronJob.Status.LastScheduleTime.Time
} else {
earliestTime = cronJob.ObjectMeta.CreationTimestamp.Time
}
if cronJob.Spec.StartingDeadlineSeconds != nil {
// controller is not going to schedule anything below this point
schedulingDeadline := now.Add(-time.Second * time.Duration(*cronJob.Spec.StartingDeadlineSeconds))
if schedulingDeadline.After(earliestTime) {
earliestTime = schedulingDeadline
}
}
if earliestTime.After(now) {
return time.Time{}, sched.Next(now), nil
}
starts := 0
for t := sched.Next(earliestTime); !t.After(now); t = sched.Next(t) {
lastMissed = t
// An object might miss several starts. For example, if
// controller gets wedged on Friday at 5:01pm when everyone has
// gone home, and someone comes in on Tuesday AM and discovers
// the problem and restarts the controller, then all the hourly
// jobs, more than 80 of them for one hourly scheduledJob, should
// all start running with no further intervention (if the scheduledJob
// allows concurrency and late starts).
//
// However, if there is a bug somewhere, or incorrect clock
// on controller's server or apiservers (for setting creationTimestamp)
// then there could be so many missed start times (it could be off
// by decades or more), that it would eat up all the CPU and memory
// of this controller. In that case, we want to not try to list
// all the missed start times.
starts++
if starts > 100 {
// We can't get the most recent times, so give up and return an error
return time.Time{}, time.Time{}, fmt.Errorf("Too many missed start times (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew.")
}
}
return lastMissed, sched.Next(now), nil
}
// figure out the next times that we need to create
// jobs at (or anything we missed).
missedRun, nextRun, err := getNextSchedule(&cronJob, r.Now())
if err != nil {
log.Error(err, "unable to figure out CronJob schedule")
// we don't really care about requeuing until we get an update that
// fixes the schedule, so don't return an error
return ctrl.Result{}, nil
}
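To see how the missed-run loop and its cap of 100 behave, here is a minimal sketch that replaces the parsed cron schedule with a fixed interval. That substitution is an assumption made purely for illustration (a real cron schedule is anchored to wall-clock fields, not to the earliest time), and lastMissedRun and maxMissedStarts are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// maxMissedStarts mirrors the cap of 100 used in the loop above.
const maxMissedStarts = 100

// lastMissedRun walks a fixed-interval schedule from earliest to now.
// All values are in Unix seconds. It returns the latest run time that is
// not after now (0 if no run was missed), or an error if more than
// maxMissedStarts runs were missed.
func lastMissedRun(earliest, now, interval int64) (int64, error) {
	if interval <= 0 {
		return 0, errors.New("interval must be positive")
	}
	var lastMissed int64
	starts := 0
	for t := earliest + interval; t <= now; t += interval {
		lastMissed = t
		starts++
		if starts > maxMissedStarts {
			return 0, errors.New("too many missed start times")
		}
	}
	return lastMissed, nil
}

func main() {
	missed, err := lastMissedRun(0, 250, 60)
	fmt.Println(missed, err)

	_, err = lastMissedRun(0, 60*101, 60)
	fmt.Println(err)
}
```

Only the most recent missed run is returned, which matches the way the controller above schedules at most one catch-up Job per reconcile.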
We'll prepare the eventual request to requeue at the next run, and then figure out whether we actually need to run a Job now.
scheduledResult := ctrl.Result{RequeueAfter: nextRun.Sub(r.Now())} // save this so we can re-use it elsewhere
log = log.WithValues("now", r.Now(), "next run", nextRun)
6: Run a new Job if it's on schedule, not past the deadline, and not blocked by our concurrency policy
If we've missed a run and we're still within the deadline to start it, we'll need to run a Job.
if missedRun.IsZero() {
log.V(1).Info("no upcoming scheduled times, sleeping until next")
return scheduledResult, nil
}
// make sure we're not too late to start the run
log = log.WithValues("current run", missedRun)
tooLate := false
if cronJob.Spec.StartingDeadlineSeconds != nil {
tooLate = missedRun.Add(time.Duration(*cronJob.Spec.StartingDeadlineSeconds) * time.Second).Before(r.Now())
}
if tooLate {
log.V(1).Info("missed starting deadline for last run, sleeping till next")
// TODO(directxman12): events
return scheduledResult, nil
}
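If we actually have to run a Job, we'll need to either wait until the existing ones finish, replace the existing ones, or just add a new one. If our information is out of date due to cache delay, we'll get requeued when we receive up-to-date information.
// figure out how to run this Job -- the concurrency policy might forbid us from running multiple Jobs at the same time...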
if cronJob.Spec.ConcurrencyPolicy == batch.ForbidConcurrent && len(activeJobs) > 0 {
log.V(1).Info("concurrency policy blocks concurrent runs, skipping", "num active", len(activeJobs))
return scheduledResult, nil
}
// ...or instruct us to replace the existing ones...
if cronJob.Spec.ConcurrencyPolicy == batch.ReplaceConcurrent {
for _, activeJob := range activeJobs {
// we don't care if the Job was already deleted
if err := r.Delete(ctx, activeJob, client.PropagationPolicy(metav1.DeletePropagationBackground)); client.IgnoreNotFound(err) != nil {
log.Error(err, "unable to delete active job", "job", activeJob)
return ctrl.Result{}, err
}
}
}
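The checks in step 6 form a small decision table. The sketch below condenses them into one pure function so the order of the checks is easy to follow; it is an illustration under simplifying assumptions (times as Unix seconds, the policy as a plain string), and decide and the action constants are hypothetical names rather than part of the controller.

```go
package main

import "fmt"

type action string

const (
	actionSleep   action = "sleep until next run"
	actionSkip    action = "skip: blocked by Forbid"
	actionReplace action = "delete active jobs, then create"
	actionCreate  action = "create"
)

// decide condenses the checks of step 6. missedRun == 0 means no run was
// missed; deadlineSeconds == nil means no starting deadline is set.
func decide(missedRun, now int64, deadlineSeconds *int64, policy string, active int) action {
	if missedRun == 0 {
		return actionSleep
	}
	if deadlineSeconds != nil && missedRun+*deadlineSeconds < now {
		return actionSleep // the starting deadline has already passed
	}
	if policy == "Forbid" && active > 0 {
		return actionSkip
	}
	if policy == "Replace" && active > 0 {
		return actionReplace
	}
	return actionCreate
}

func main() {
	deadline := int64(60)
	fmt.Println(decide(100, 200, &deadline, "Allow", 0))
	fmt.Println(decide(100, 150, &deadline, "Forbid", 1))
	fmt.Println(decide(100, 150, &deadline, "Replace", 2))
	fmt.Println(decide(100, 150, nil, "Allow", 3))
}
```

Note that the deadline check comes before the concurrency check, so a run that is too late is dropped regardless of the policy.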
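Once we've figured out what to do with the existing Jobs, we'll actually create our desired Job.
constructJobForCronJob
We need to construct a Job based on our CronJob's template. We'll copy over the spec from the template and copy some basic object metadata.
Then, we'll set the "scheduled time" annotation so that we can reconstitute our LastScheduleTime field on each reconcile.
Finally, we'll need to set an owner reference. This allows the Kubernetes garbage collector to clean up Jobs when we delete the CronJob, and allows controller-runtime to figure out which CronJob needs to be reconciled when a given Job changes (is added, deleted, completes, etc).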
constructJobForCronJob := func(cronJob *batch.CronJob, scheduledTime time.Time) (*kbatch.Job, error) {
// We want job names for a given nominal start time to have a deterministic name to avoid the same job being created twice
// (a deterministic name also prevents Job name collisions)
name := fmt.Sprintf("%s-%d", cronJob.Name, scheduledTime.Unix())
job := &kbatch.Job{
ObjectMeta: metav1.ObjectMeta{
Labels: make(map[string]string),
Annotations: make(map[string]string),
Name: name,
Namespace: cronJob.Namespace,
},
Spec: *cronJob.Spec.JobTemplate.Spec.DeepCopy(),
}
for k, v := range cronJob.Spec.JobTemplate.Annotations {
job.Annotations[k] = v
}
job.Annotations[scheduledTimeAnnotation] = scheduledTime.Format(time.RFC3339)
for k, v := range cronJob.Spec.JobTemplate.Labels {
job.Labels[k] = v
}
if err := ctrl.SetControllerReference(cronJob, job, r.Scheme); err != nil {
return nil, err
}
return job, nil
}
// actually make the job...
job, err := constructJobForCronJob(&cronJob, missedRun)
if err != nil {
log.Error(err, "unable to construct job from template")
// don't bother requeuing until we get a change to the spec
return scheduledResult, nil
}
// ...and create it on the cluster
if err := r.Create(ctx, job); err != nil {
log.Error(err, "unable to create Job for CronJob", "job", job)
return ctrl.Result{}, err
}
log.V(1).Info("created Job for CronJob run", "job", job)
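7: Requeue when we either see a running Job or it's time for the next scheduled run
Finally, we'll return the result that we prepared above, which says we want to requeue when our next run would need to occur. This is taken as a maximum deadline -- if something else changes in between, like our Job starting or finishing, or our CronJob being modified, we might reconcile again sooner.
// we'll requeue once we see the running Job, and update our status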
return scheduledResult, nil
}
Setup
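Finally, we'll update our setup. In order to allow our reconciler to quickly look up Jobs by their owner, we'll need an index. We declare an index key that we can later use with the client as a pseudo-field name, and then describe how to extract the indexed value from the Job object. The indexer will automatically take care of namespaces for us, so we just have to extract the owner name if the Job has a CronJob owner.
Additionally, we'll inform the manager that this controller owns some Jobs, so that it will automatically call Reconcile on the underlying CronJob when a Job changes, is deleted, etc.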
var (
jobOwnerKey = ".metadata.controller"
apiGVStr = batch.GroupVersion.String()
)
func (r *CronJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
// set up a real clock, since we're not in a test
if r.Clock == nil {
r.Clock = realClock{}
}
if err := mgr.GetFieldIndexer().IndexField(&kbatch.Job{}, jobOwnerKey, func(rawObj runtime.Object) []string {
// grab the job object, extract the owner...
job := rawObj.(*kbatch.Job)
owner := metav1.GetControllerOf(job)
if owner == nil {
return nil
}
// ...make sure it's a CronJob...
if owner.APIVersion != apiGVStr || owner.Kind != "CronJob" {
return nil
}
// ...and if so, return it
return []string{owner.Name}
}); err != nil {
return err
}
return ctrl.NewControllerManagedBy(mgr).
For(&batch.CronJob{}).
Owns(&kbatch.Job{}).
Complete(r)
}
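Conceptually, the field index is a lookup table from owner name to Jobs, scoped by namespace. The following is a minimal in-memory sketch of that idea, assuming a simplified job type; it does not use the controller-runtime indexer, and job, ownerIndex, extractOwner and buildIndex are hypothetical names.

```go
package main

import "fmt"

// job is a hypothetical, simplified Job: namespace, name, and the name of its
// controlling CronJob ("" if it has none).
type job struct {
	Namespace, Name, Owner string
}

// ownerIndex maps "namespace/ownerName" to the names of the Jobs it owns.
type ownerIndex map[string][]string

// extractOwner plays the role of the indexer function: it returns the index
// values for one object, or nil if the object should not be indexed.
func extractOwner(j job) []string {
	if j.Owner == "" {
		return nil
	}
	return []string{j.Owner}
}

// buildIndex applies extractOwner to every job and scopes each value by
// namespace, which the real indexer does for us automatically.
func buildIndex(jobs []job) ownerIndex {
	idx := ownerIndex{}
	for _, j := range jobs {
		for _, owner := range extractOwner(j) {
			key := j.Namespace + "/" + owner
			idx[key] = append(idx[key], j.Name)
		}
	}
	return idx
}

func main() {
	idx := buildIndex([]job{
		{"default", "nightly-1", "nightly"},
		{"default", "nightly-2", "nightly"},
		{"default", "adhoc", ""},
		{"other", "nightly-1", "nightly"},
	})
	fmt.Println(idx["default/nightly"])
}
```

Jobs without an owner return nil from the extractor and therefore never appear in the index, just as in the SetupWithManager code above.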
We now have a working controller. Let's test it against the cluster and then, if we don't have any issues, deploy it!
What changed in main.go?
Remember how we said we'd come back to main.go? Let's take a look at what's been added to it.
Apache License
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Imports
package main
import (
"flag"
"os"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
batchv1 "tutorial.kubebuilder.io/project/api/v1"
"tutorial.kubebuilder.io/project/controllers"
// +kubebuilder:scaffold:imports
)
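The first difference to notice is that kubebuilder has added the new API group's package (batchv1) to our scheme. This means that we can use those objects in our controller.
If we were using any other CRD, we would have to add its scheme the same way. Builtin types such as Job have their scheme added by clientgoscheme.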
var (
scheme = runtime.NewScheme()
setupLog = ctrl.Log.WithName("setup")
)
func init() {
_ = clientgoscheme.AddToScheme(scheme)
_ = batchv1.AddToScheme(scheme)
// +kubebuilder:scaffold:scheme
}
The other thing that's changed is that kubebuilder has added a block calling our CronJob controller's SetupWithManager method.
func main() {
old stuff
var metricsAddr string
var enableLeaderElection bool
flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
"Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.")
flag.Parse()
ctrl.SetLogger(zap.New(zap.UseDevMode(true)))
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
MetricsBindAddress: metricsAddr,
LeaderElection: enableLeaderElection,
Port: 9443,
})
if err != nil {
setupLog.Error(err, "unable to start manager")
os.Exit(1)
}
if err = (&controllers.CronJobReconciler{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("CronJob"),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "CronJob")
os.Exit(1)
}
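We'll also set up webhooks for our type, which we just need to add to the manager. Since we might want to run the webhooks separately, or not run them when testing our controller locally, we put them behind an environment variable.
To skip the webhooks, just set ENABLE_WEBHOOKS=false.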
if os.Getenv("ENABLE_WEBHOOKS") != "false" {
if err = (&batchv1.CronJob{}).SetupWebhookWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create webhook", "webhook", "CronJob")
os.Exit(1)
}
}
// +kubebuilder:scaffold:builder
old stuff
setupLog.Info("starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}
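The ENABLE_WEBHOOKS check in main is an opt-out: webhooks stay enabled unless the variable is exactly the string false. A minimal sketch of that rule, with the environment lookup injected so it can be exercised without changing the real environment (webhooksEnabled is a hypothetical helper, not part of the scaffold):

```go
package main

import (
	"fmt"
	"os"
)

// webhooksEnabled applies the same rule as main.go: webhooks stay on unless
// the variable is exactly "false". lookup is injected so the rule can be
// tested without touching the process environment.
func webhooksEnabled(lookup func(string) string) bool {
	return lookup("ENABLE_WEBHOOKS") != "false"
}

func main() {
	fmt.Println(webhooksEnabled(os.Getenv))
}
```

Because the comparison is exact, values such as FALSE or 0 do not disable the webhooks.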
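Now we can implement our controller.
Implementing defaulting/validating webhooks
If you want to implement admission webhooks for your CRD, the only thing you need to do is implement the Defaulter and (or) the Validator interface.
Kubebuilder takes care of the rest for you, such as:
- Creating the webhook server.
- Ensuring the server has been added to the manager.
- Creating handlers for your webhooks.
- Registering each handler with a path in your server.
First, let's scaffold the webhooks for our CRD (CronJob). We'll need to run the following command with the --defaulting and --programmatic-validation flags (since our test project will use defaulting and validating webhooks):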
kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation
This will scaffold the webhook functions and register your webhook with the manager in your main.go for you.
Go imports
package v1
import (
"github.com/robfig/cron"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
validationutils "k8s.io/apimachinery/pkg/util/validation"
"k8s.io/apimachinery/pkg/util/validation/field"
ctrl "sigs.k8s.io/controller-runtime"
logf "sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/controller-runtime/pkg/webhook"
)
Next, we set up a logger for the webhooks.
var cronjoblog = logf.Log.WithName("cronjob-resource")
Then, we set up the webhook with the manager.
func (r *CronJob) SetupWebhookWithManager(mgr ctrl.Manager) error {
return ctrl.NewWebhookManagedBy(mgr).
For(r).
Complete()
}
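Notice that we use kubebuilder markers to generate webhook manifests (the YAML files that configure the webhooks). This marker is responsible for generating a mutating webhook manifest.
The meaning of each parameter is explained in the kubebuilder marker reference.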
// +kubebuilder:webhook:path=/mutate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=true,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=mcronjob.kb.io
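We use the webhook.Defaulter interface to set defaults on our CRD. A webhook (a defaulting webhook) will automatically be served that calls the Default method.
The Default method is expected to mutate the receiver, setting the defaults.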
var _ webhook.Defaulter = &CronJob{}
// Default implements webhook.Defaulter so a webhook will be registered for the type
func (r *CronJob) Default() {
cronjoblog.Info("default", "name", r.Name)
if r.Spec.ConcurrencyPolicy == "" {
r.Spec.ConcurrencyPolicy = AllowConcurrent
}
if r.Spec.Suspend == nil {
r.Spec.Suspend = new(bool)
}
if r.Spec.SuccessfulJobsHistoryLimit == nil {
r.Spec.SuccessfulJobsHistoryLimit = new(int32)
*r.Spec.SuccessfulJobsHistoryLimit = 3
}
if r.Spec.FailedJobsHistoryLimit == nil {
r.Spec.FailedJobsHistoryLimit = new(int32)
*r.Spec.FailedJobsHistoryLimit = 1
}
}
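The pointer fields matter here: a nil pointer means "not specified" and gets a default, while a pointer to zero is an explicit choice that must be preserved. The sketch below shows this with a stripped-down spec type; spec, int32Ptr and applyDefaults are hypothetical names used for illustration only.

```go
package main

import "fmt"

// spec is a hypothetical, stripped-down version of CronJobSpec.
type spec struct {
	ConcurrencyPolicy          string
	Suspend                    *bool
	SuccessfulJobsHistoryLimit *int32
	FailedJobsHistoryLimit     *int32
}

func int32Ptr(v int32) *int32 { return &v }

// applyDefaults fills in only the fields the user left unset, using the same
// default values as the Default method above.
func applyDefaults(s *spec) {
	if s.ConcurrencyPolicy == "" {
		s.ConcurrencyPolicy = "Allow"
	}
	if s.Suspend == nil {
		s.Suspend = new(bool) // defaults to false
	}
	if s.SuccessfulJobsHistoryLimit == nil {
		s.SuccessfulJobsHistoryLimit = int32Ptr(3)
	}
	if s.FailedJobsHistoryLimit == nil {
		s.FailedJobsHistoryLimit = int32Ptr(1)
	}
}

func main() {
	s := spec{FailedJobsHistoryLimit: int32Ptr(0)} // explicit zero: keep no failed Jobs
	applyDefaults(&s)
	fmt.Println(s.ConcurrencyPolicy, *s.Suspend, *s.SuccessfulJobsHistoryLimit, *s.FailedJobsHistoryLimit)
}
```

Running the defaulting a second time changes nothing, which is a useful property for a mutating webhook that may see the same object on every update.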
This marker is responsible for generating a validating webhook manifest.
// TODO(user): change verbs to "verbs=create;update;delete" if you want to enable deletion validation.
// +kubebuilder:webhook:verbs=create;update,path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,versions=v1,name=vcronjob.kb.io
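We can validate our CRD beyond what declarative validation offers. Generally, declarative validation should be sufficient, but sometimes more advanced use cases call for complex validation.
For instance, we'll see below that we use this to validate a well-formed cron schedule (such as * * * * *) without making up a long regular expression.
If the webhook.Validator interface is implemented, a webhook (a validating webhook) will automatically be served that calls the validation.
The ValidateCreate, ValidateUpdate and ValidateDelete methods are expected to validate their receiver upon creation, update and deletion respectively. We separate ValidateCreate from ValidateUpdate to allow behavior like making certain fields immutable, so that they can only be set on creation. ValidateDelete is also separated from ValidateUpdate to allow different validation behavior on deletion. Here, however, we just use the same shared validation for ValidateCreate and ValidateUpdate, and we do nothing in ValidateDelete, since we don't need to validate anything on deletion.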
var _ webhook.Validator = &CronJob{}
// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateCreate() error {
cronjoblog.Info("validate create", "name", r.Name)
return r.validateCronJob()
}
// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateUpdate(old runtime.Object) error {
cronjoblog.Info("validate update", "name", r.Name)
return r.validateCronJob()
}
// ValidateDelete implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateDelete() error {
cronjoblog.Info("validate delete", "name", r.Name)
// TODO(user): fill in your validation logic upon object deletion.
return nil
}
We validate the name and the spec of the CronJob.
func (r *CronJob) validateCronJob() error {
var allErrs field.ErrorList
if err := r.validateCronJobName(); err != nil {
allErrs = append(allErrs, err)
}
if err := r.validateCronJobSpec(); err != nil {
allErrs = append(allErrs, err)
}
if len(allErrs) == 0 {
return nil
}
return apierrors.NewInvalid(
schema.GroupKind{Group: "batch.tutorial.kubebuilder.io", Kind: "CronJob"},
r.Name, allErrs)
}
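The pattern in validateCronJob is to run every check and report all failures together, rather than stopping at the first one. A minimal standard-library sketch of that pattern follows; fieldError is a hypothetical stand-in for field.Error, and the five-field count is only a crude placeholder for the real schedule parsing done by the cron library.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldError is a hypothetical, stripped-down analogue of field.Error.
type fieldError struct {
	Path, Detail string
}

// validate runs every check and collects all failures instead of returning
// on the first one, so the user sees every problem in a single response.
func validate(name, schedule string) []fieldError {
	var errs []fieldError
	if len(name) > 52 {
		errs = append(errs, fieldError{"metadata.name", "must be no more than 52 characters"})
	}
	// Crude placeholder check; the tutorial parses the schedule with the cron library.
	if len(strings.Fields(schedule)) != 5 {
		errs = append(errs, fieldError{"spec.schedule", "expected 5 space-separated fields"})
	}
	return errs
}

func main() {
	for _, e := range validate(strings.Repeat("x", 53), "every minute") {
		fmt.Printf("%s: %s\n", e.Path, e.Detail)
	}
}
```

Returning a list keyed by field path is what lets the API server point the user at each offending field.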
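Some fields are declaratively validated by the OpenAPI schema. You can find the kubebuilder validation markers (prefixed with // +kubebuilder:validation) in the Designing an API section. You can find all of the kubebuilder-supported markers for declaring validation by running controller-gen crd -w, or in the kubebuilder marker reference.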
func (r *CronJob) validateCronJobSpec() *field.Error {
// The field helpers from the kubernetes API machinery help us return nicely
// structured validation errors.
return validateScheduleFormat(
r.Spec.Schedule,
field.NewPath("spec").Child("schedule"))
}
Here we validate that the cron schedule is well-formed.
func validateScheduleFormat(schedule string, fldPath *field.Path) *field.Error {
if _, err := cron.ParseStandard(schedule); err != nil {
return field.Invalid(fldPath, schedule, err.Error())
}
return nil
}
Validate object name
Validating the length of a string field can be done declaratively. But the ObjectMeta.Name field is defined in a shared package under the apimachinery repo, so we can't validate it declaratively.
func (r *CronJob) validateCronJobName() *field.Error {
if len(r.ObjectMeta.Name) > validationutils.DNS1035LabelMaxLength-11 {
// The job name length is 63 characters like all Kubernetes objects
// (which must fit in a DNS subdomain). The cronjob controller appends
// an 11-character suffix to the cronjob (`-$TIMESTAMP`) when creating
// a job. The job name length limit is 63 characters. Therefore cronjob
// names must have length <= 63-11=52. If we don't validate this here,
// then job creation will fail later.
return field.Invalid(field.NewPath("metadata").Child("name"), r.Name, "must be no more than 52 characters")
}
return nil
}
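The 52-character limit follows from the naming scheme used when constructing Jobs: the CronJob name, a dash, and a Unix timestamp. The sketch below reproduces that arithmetic; maxNameLength is assumed to equal the 63-character limit referenced above, and jobName, fits and nameOfLength are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// maxNameLength is assumed to match the 63-character limit on Job names.
const maxNameLength = 63

// jobName mirrors the "%s-%d" naming scheme from constructJobForCronJob.
func jobName(cronJobName string, scheduledUnix int64) string {
	return fmt.Sprintf("%s-%d", cronJobName, scheduledUnix)
}

// fits reports whether the generated Job name stays within the limit.
func fits(cronJobName string, scheduledUnix int64) bool {
	return len(jobName(cronJobName, scheduledUnix)) <= maxNameLength
}

// nameOfLength builds a placeholder name with exactly n characters.
func nameOfLength(n int) string {
	return strings.Repeat("a", n)
}

func main() {
	fmt.Println(jobName("cronjob-sample", 1700000000))
	fmt.Println(fits(nameOfLength(52), 1700000000))
	fmt.Println(fits(nameOfLength(53), 1700000000))
}
```

The suffix is 11 characters as long as the Unix timestamp has ten digits, so a 52-character CronJob name yields a Job name of exactly 63 characters.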
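Running and deploying the controller
To test out the controller, we can run it locally against the cluster. Before we do so, though, we'll need to install our CRDs, as described in the quick start.
The following command updates the YAML manifests using controller-tools and installs the CRDs: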
make install
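Now that we've installed our CRDs, we can run the controller against our cluster. This will use whatever credentials we use to connect to the cluster, so we don't need to worry about RBAC just yet.
In a separate terminal, run: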
make run ENABLE_WEBHOOKS=false
You should see logs from the controller about starting up, but it won't do anything just yet.
At this point, we need a CronJob to test with. Let's write a sample to config/samples/batch_v1_cronjob.yaml, and use that:
apiVersion: batch.tutorial.kubebuilder.io/v1
kind: CronJob
metadata:
name: cronjob-sample
spec:
schedule: "*/1 * * * *"
startingDeadlineSeconds: 60
concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
jobTemplate:
spec:
template:
spec:
containers:
- name: hello
image: busybox
args:
- /bin/sh
- -c
- date; echo Hello from the Kubernetes cluster
restartPolicy: OnFailure
kubectl create -f config/samples/batch_v1_cronjob.yaml
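Run the commands below and you should see a flurry of activity. If you watch the changes with the -w flag, you should see your cronjob running and updating its status: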
kubectl get cronjob.batch.tutorial.kubebuilder.io -o yaml
kubectl get job
Now that we know it's working, we can run it in the cluster. Stop the make run invocation, and run:
make docker-build docker-push IMG=<some-registry>/<project-name>:tag
make deploy IMG=<some-registry>/<project-name>:tag
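At this point the webhooks will not work correctly. To get them working, see Deploying Admission Webhooks.
Notes
- If the images cannot be pulled, change FROM gcr.io/distroless/static:nonroot in the Dockerfile to FROM golang:1.13 or any other image that you can pull, and change gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1 in /config/default/manager_auth_proxy_patch.yaml to xuejipeng/learn:kube-rbac-proxy-v0.4.1.
- To speed up the build, you can remove the test dependency after docker-build in the Makefile.
If we list cronjobs again like we did before, we should see the controller functioning again!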
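Deploying the cert manager
We suggest using cert manager for provisioning the certificates for the webhook server. Other solutions should also work as long as they put the certificates in the desired location.
You can follow the cert manager documentation to install it.
Cert manager also has a component called CA injector, which is responsible for injecting the CA bundle into the Mutating|ValidatingWebhookConfiguration.
To accomplish that, you need to use an annotation with key certmanager.k8s.io/inject-ca-from in the Mutating|ValidatingWebhookConfiguration objects. The value of the annotation should point to an existing certificate CR instance in the format of <certificate-namespace>/<certificate-name>.
This is the kustomize patch used for annotating the Mutating|ValidatingWebhookConfiguration objects.
# This patch adds an annotation to the admission webhook config and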
# the variables $(CERTIFICATE_NAMESPACE) and $(CERTIFICATE_NAME) will be substituted by kustomize.
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
name: mutating-webhook-configuration
annotations:
certmanager.k8s.io/inject-ca-from: $(CERTIFICATE_NAMESPACE)/$(CERTIFICATE_NAME)
---
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
name: validating-webhook-configuration
annotations:
certmanager.k8s.io/inject-ca-from: $(CERTIFICATE_NAMESPACE)/$(CERTIFICATE_NAME)
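Deploying Admission Webhooks
Kind Cluster
It is recommended to develop your webhook with a kind cluster for faster iteration. Why?
- You can bring up a multi-node cluster locally within 1 minute.
- You can tear it down in seconds.
- You don't need to push your images to remote registries.
Cert Manager
You need to install the cert manager bundle by following its installation guide. Installing it is all you need to do; kubebuilder takes care of requesting the certificates for you.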
Build your image
Run the following command to build your image locally.
make docker-build
If you are using a kind cluster, you don't need to push your image to a remote container registry. You can load your local image directly into your kind cluster:
kind load docker-image your-image-name:your-tag
Deploying Webhooks
You need to enable the webhook and cert manager configuration through kustomize. config/default/kustomization.yaml should now look like the following:
# Adds namespace to all resources.
namespace: project-system
# Value of this field is prepended to the
# names of all resources, e.g. a deployment named
# "wordpress" becomes "alices-wordpress".
# Note that it should also match with the prefix (text before '-') of the namespace
# field above.
namePrefix: project-
# Labels to add to all resources and selectors.
#commonLabels:
# someName: someValue
bases:
- ../crd
- ../rbac
- ../manager
# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in
# crd/kustomization.yaml
- ../webhook
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'. 'WEBHOOK' components are required.
- ../certmanager
# [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.
#- ../prometheus
patchesStrategicMerge:
# Protect the /metrics endpoint by putting it behind auth.
# If you want your controller-manager to expose the /metrics
# endpoint w/o any authn/z, please comment the following line.
- manager_auth_proxy_patch.yaml
# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in
# crd/kustomization.yaml
- manager_webhook_patch.yaml
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'.
# Uncomment 'CERTMANAGER' sections in crd/kustomization.yaml to enable the CA injection in the admission webhooks.
# 'CERTMANAGER' needs to be enabled to use ca injection
- webhookcainjection_patch.yaml
# the following config is for teaching kustomize how to do var substitution
vars:
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER' prefix.
- name: CERTIFICATE_NAMESPACE # namespace of the certificate CR
objref:
kind: Certificate
group: cert-manager.io
version: v1alpha2
name: serving-cert # this name should match the one in certificate.yaml
fieldref:
fieldpath: metadata.namespace
- name: CERTIFICATE_NAME
objref:
kind: Certificate
group: cert-manager.io
version: v1alpha2
name: serving-cert # this name should match the one in certificate.yaml
- name: SERVICE_NAMESPACE # namespace of the service
objref:
kind: Service
version: v1
name: webhook-service
fieldref:
fieldpath: metadata.namespace
- name: SERVICE_NAME
objref:
kind: Service
version: v1
name: webhook-service
Now you can deploy them to your cluster by running:
make docker-build docker-push IMG=<some-registry>/<project-name>:tag
make deploy IMG=<some-registry>/<project-name>:tag
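Wait a while until the webhook pod comes up and the certificates are provisioned. This usually completes within 1 minute.
Now you can create a valid CronJob to test your webhooks. The creation should go through successfully.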
kubectl create -f config/samples/batch_v1_cronjob.yaml
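You can also try to create an invalid CronJob (for example, one with an ill-formatted cron schedule field). You should see the creation fail with a validation error.
Epilogue
By this point, we've got a pretty full-featured implementation of the CronJob controller, and have made use of most of the features of KubeBuilder.
If you want more, head over to the Multi-Version Tutorial to learn how to add new API versions to a project.
Additionally, you can try the following steps on your own -- we'll have a tutorial section on them soon:
- writing unit/integration tests (check out envtest: https://godoc.org/sigs.k8s.io/controller-runtime/pkg/envtest)
- adding additional printer columns to the kubectl get command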
Tutorial: Multi-Version API
Most projects start out with an alpha API that changes release to release. However, eventually, most projects will need to move to a more stable API. Once your API is stable though, you can’t make breaking changes to it. That’s where API versions come into play.
Let’s make some changes to the CronJob API spec and make sure all the different versions are supported by our CronJob project.
If you haven’t already, make sure you’ve gone through the base CronJob Tutorial.
Next, let’s figure out what changes we want to make...
Changing things up
A fairly common change in a Kubernetes API is to take some data that used to be unstructured or stored in some special string format, and change it to structured data. Our schedule field fits the bill quite nicely for this -- right now, in v1, our schedules look like
schedule: "*/1 * * * *"
That’s a pretty textbook example of a special string format (it’s also pretty unreadable unless you’re a Unix sysadmin).
Let’s make it a bit more structured. According to our CronJob code, we support “standard” Cron format.
In Kubernetes, all versions must be safely round-trippable through each other. This means that if we convert from version 1 to version 2, and then back to version 1, we must not lose information. Thus, any change we make to our API must be compatible with whatever we supported in v1, and we also need to make sure that anything we add in v2 is supported in v1. In some cases, this means we need to add new fields to v1, but in our case, we won’t have to, since we’re not adding new functionality.
Keeping all that in mind, let’s convert our example above to be slightly more structured:
schedule:
minute: "*/1"
Now, at least, we’ve got labels for each of our fields, but we can still easily support all the different syntax for each field.
We’ll need a new API version for this change. Let’s call it v2:
kubebuilder create api --group batch --version v2 --kind CronJob
Now, let’s copy over our existing types, and make the change:
Since we’re in a v2 package, controller-gen will assume this is for the v2 version automatically. We could override that with the +versionName marker.
package v2
Imports
import (
batchv1beta1 "k8s.io/api/batch/v1beta1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
We’ll leave our spec largely unchanged, except to change the schedule field to a new type.
// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
// The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
Schedule CronSchedule `json:"schedule"`
The rest of Spec
// +kubebuilder:validation:Minimum=0
// Optional deadline in seconds for starting the job if it misses scheduled
// time for any reason. Missed jobs executions will be counted as failed ones.
// +optional
StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`
// Specifies how to treat concurrent executions of a Job.
// Valid values are:
// - "Allow" (default): allows CronJobs to run concurrently;
// - "Forbid": forbids concurrent runs, skipping next run if previous run hasn't finished yet;
// - "Replace": cancels currently running job and replaces it with a new one
// +optional
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// This flag tells the controller to suspend subsequent executions, it does
// not apply to already started executions. Defaults to false.
// +optional
Suspend *bool `json:"suspend,omitempty"`
// Specifies the job that will be created when executing a CronJob.
JobTemplate batchv1beta1.JobTemplateSpec `json:"jobTemplate"`
// +kubebuilder:validation:Minimum=0
// The number of successful finished jobs to retain.
// This is a pointer to distinguish between explicit zero and not specified.
// +optional
SuccessfulJobsHistoryLimit *int32 `json:"successfulJobsHistoryLimit,omitempty"`
// +kubebuilder:validation:Minimum=0
// The number of failed finished jobs to retain.
// This is a pointer to distinguish between explicit zero and not specified.
// +optional
FailedJobsHistoryLimit *int32 `json:"failedJobsHistoryLimit,omitempty"`
}
Next, we’ll need to define a type to hold our schedule. Based on our proposed YAML above, it’ll have a field for each corresponding Cron “field”.
// describes a Cron schedule.
type CronSchedule struct {
// specifies the minute during which the job executes.
// +optional
Minute *CronField `json:"minute,omitempty"`
// specifies the hour during which the job executes.
// +optional
Hour *CronField `json:"hour,omitempty"`
// specifies the day of the month during which the job executes.
// +optional
DayOfMonth *CronField `json:"dayOfMonth,omitempty"`
// specifies the month during which the job executes.
// +optional
Month *CronField `json:"month,omitempty"`
// specifies the day of the week during which the job executes.
// +optional
DayOfWeek *CronField `json:"dayOfWeek,omitempty"`
}
Finally, we’ll define a wrapper type to represent a field. We could attach additional validation to this field, but for now we’ll just use it for documentation purposes.
// represents a Cron field specifier.
type CronField string
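To make the round-trip requirement concrete, here is a minimal sketch of converting a v1 schedule string into the structured form and back. It is an illustration under stated assumptions, not the tutorial's conversion code: it handles only five-field schedules, treats * as an unset field, and the helpers toStructured and toString are hypothetical. The types are redeclared locally so the sketch stands alone.

```go
package main

import (
	"fmt"
	"strings"
)

// CronField and CronSchedule are redeclared here, without JSON tags, so that
// the sketch is self-contained.
type CronField string

type CronSchedule struct {
	Minute, Hour, DayOfMonth, Month, DayOfWeek *CronField
}

// toStructured splits a five-field v1 schedule string into a CronSchedule.
func toStructured(v1 string) (CronSchedule, error) {
	parts := strings.Fields(v1)
	if len(parts) != 5 {
		return CronSchedule{}, fmt.Errorf("expected 5 fields, got %d", len(parts))
	}
	fields := make([]*CronField, 5)
	for i, p := range parts {
		if p != "*" { // leave "*" as an unset (nil) field
			f := CronField(p)
			fields[i] = &f
		}
	}
	return CronSchedule{fields[0], fields[1], fields[2], fields[3], fields[4]}, nil
}

// toString renders a CronSchedule back into the v1 string form, writing "*"
// for every unset field.
func toString(s CronSchedule) string {
	parts := make([]string, 0, 5)
	for _, f := range []*CronField{s.Minute, s.Hour, s.DayOfMonth, s.Month, s.DayOfWeek} {
		if f == nil {
			parts = append(parts, "*")
		} else {
			parts = append(parts, string(*f))
		}
	}
	return strings.Join(parts, " ")
}

func main() {
	s, err := toStructured("*/1 * * * *")
	if err != nil {
		panic(err)
	}
	fmt.Println(toString(s))
}
```

Mapping an unset field to * in both directions is what keeps the conversion lossless for single-space-separated schedules; extra whitespace in the v1 string would be normalized away.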
Other Types
All the other types will stay the same as before.
// ConcurrencyPolicy describes how the job will be handled.
// Only one of the following concurrent policies may be specified.
// If none of the following policies is specified, the default one
// is AllowConcurrent.
// +kubebuilder:validation:Enum=Allow;Forbid;Replace
type ConcurrencyPolicy string
const (
// AllowConcurrent allows CronJobs to run concurrently.
AllowConcurrent ConcurrencyPolicy = "Allow"
// ForbidConcurrent forbids concurrent runs, skipping next run if previous
// hasn't finished yet.
ForbidConcurrent ConcurrencyPolicy = "Forbid"
// ReplaceConcurrent cancels currently running job and replaces it with a new one.
ReplaceConcurrent ConcurrencyPolicy = "Replace"
)
// CronJobStatus defines the observed state of CronJob
type CronJobStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make" to regenerate code after modifying this file
// A list of pointers to currently running jobs.
// +optional
Active []corev1.ObjectReference `json:"active,omitempty"`
// Information when was the last time the job was successfully scheduled.
// +optional
LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"`
}
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// CronJob is the Schema for the cronjobs API
type CronJob struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec CronJobSpec `json:"spec,omitempty"`
Status CronJobStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// CronJobList contains a list of CronJob
type CronJobList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []CronJob `json:"items"`
}
func init() {
SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}
Storage Versions
Apache License
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
package v1
Imports
import (
batchv1beta1 "k8s.io/api/batch/v1beta1"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
old stuff
// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
// +kubebuilder:validation:MinLength=0
// The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
Schedule string `json:"schedule"`
// +kubebuilder:validation:Minimum=0
// Optional deadline in seconds for starting the job if it misses scheduled
// time for any reason. Missed jobs executions will be counted as failed ones.
// +optional
StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`
// Specifies how to treat concurrent executions of a Job.
// Valid values are:
// - "Allow" (default): allows CronJobs to run concurrently;
// - "Forbid": forbids concurrent runs, skipping next run if previous run hasn't finished yet;
// - "Replace": cancels currently running job and replaces it with a new one
// +optional
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// This flag tells the controller to suspend subsequent executions, it does
// not apply to already started executions. Defaults to false.
// +optional
Suspend *bool `json:"suspend,omitempty"`
// Specifies the job that will be created when executing a CronJob.
JobTemplate batchv1beta1.JobTemplateSpec `json:"jobTemplate"`
// +kubebuilder:validation:Minimum=0
// The number of successful finished jobs to retain.
// This is a pointer to distinguish between explicit zero and not specified.
// +optional
SuccessfulJobsHistoryLimit *int32 `json:"successfulJobsHistoryLimit,omitempty"`
// +kubebuilder:validation:Minimum=0
// The number of failed finished jobs to retain.
// This is a pointer to distinguish between explicit zero and not specified.
// +optional
FailedJobsHistoryLimit *int32 `json:"failedJobsHistoryLimit,omitempty"`
}
// ConcurrencyPolicy describes how the job will be handled.
// Only one of the following concurrent policies may be specified.
// If none of the following policies is specified, the default one
// is AllowConcurrent.
// +kubebuilder:validation:Enum=Allow;Forbid;Replace
type ConcurrencyPolicy string
const (
// AllowConcurrent allows CronJobs to run concurrently.
AllowConcurrent ConcurrencyPolicy = "Allow"
// ForbidConcurrent forbids concurrent runs, skipping next run if previous
// hasn't finished yet.
ForbidConcurrent ConcurrencyPolicy = "Forbid"
// ReplaceConcurrent cancels currently running job and replaces it with a new one.
ReplaceConcurrent ConcurrencyPolicy = "Replace"
)
// CronJobStatus defines the observed state of CronJob
type CronJobStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make" to regenerate code after modifying this file
// A list of pointers to currently running jobs.
// +optional
Active []corev1.ObjectReference `json:"active,omitempty"`
// Information when was the last time the job was successfully scheduled.
// +optional
LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"`
}
Since we’ll have more than one version, we’ll need to mark a storage version. This is the version that the Kubernetes API server uses to store our data. We’ll choose the v1 version for our project.
We’ll use the +kubebuilder:storageversion marker to do this.
Note that multiple versions may exist in storage if they were written before the storage version changes -- changing the storage version only affects how objects are created/updated after the change.
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:storageversion
// CronJob is the Schema for the cronjobs API
type CronJob struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec CronJobSpec `json:"spec,omitempty"`
Status CronJobStatus `json:"status,omitempty"`
}
old stuff
// +kubebuilder:object:root=true
// CronJobList contains a list of CronJob
type CronJobList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []CronJob `json:"items"`
}
func init() {
SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}
Now that we’ve got our types in place, we’ll need to set up conversion...
Hubs, spokes, and other wheel metaphors
Since we now have two different versions, and users can request either version, we’ll have to define a way to convert between our versions. For CRDs, this is done using a webhook, similar to the defaulting and validating webhooks we defined in the base tutorial. Like before, controller-runtime will help us wire together the nitty-gritty bits; we just have to implement the actual conversion.
Before we do that, though, we’ll need to understand how controller-runtime thinks about versions. Namely:
Complete graphs are insufficiently nautical
A simple approach to defining conversion might be to define conversion functions to convert between each of our versions. Then, whenever we need to convert, we’d look up the appropriate function, and call it to run the conversion.
This works fine when we just have two versions, but what if we had 4 types? 8 types? That’d be a lot of conversion functions.
Instead, controller-runtime models conversion in terms of a “hub and spoke” model -- we mark one version as the “hub”, and all other versions just define conversion to and from the hub.
Then, if we have to convert between two non-hub versions, we first convert to the hub version, and then to our desired version.
This cuts down on the number of conversion functions that we have to define, and is modeled off of what Kubernetes does internally.
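To make the saving concrete, here is a small back-of-the-envelope sketch. It assumes one conversion function per direction, so n versions need n(n-1) functions when every version converts directly to every other one, but only 2(n-1) when each non-hub version converts to and from the hub. The function names are made up for this example.

```go
package main

import "fmt"

// pairwiseConversions returns how many one-way conversion functions are needed
// when every version converts directly to every other version.
func pairwiseConversions(versions int) int {
	return versions * (versions - 1)
}

// hubSpokeConversions returns how many one-way conversion functions are needed
// when each non-hub version only converts to and from the hub.
func hubSpokeConversions(versions int) int {
	if versions < 1 {
		return 0
	}
	return 2 * (versions - 1)
}

func main() {
	for _, n := range []int{2, 4, 8} {
		fmt.Printf("%d versions: %d pairwise, %d hub-and-spoke\n",
			n, pairwiseConversions(n), hubSpokeConversions(n))
	}
}
```

With two versions the two approaches cost the same, which is why the difference only starts to matter as versions accumulate.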
What does that have to do with Webhooks?
When API clients, like kubectl or your controller, request a particular version of your resource, the Kubernetes API server needs to return a result that’s of that version. However, that version might not match the version stored by the API server.
In that case, the API server needs to know how to convert between the desired version and the stored version. Since the conversions aren’t built in for CRDs, the Kubernetes API server calls out to a webhook to do the conversion instead. For KubeBuilder, this webhook is implemented by controller-runtime, and performs the hub-and-spoke conversions that we discussed above.
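The routing through the hub can be sketched without any Kubernetes dependencies. In the sketch below, Hub and Convertible are simplified stand-ins for the controller-runtime interfaces (the real ones also embed runtime.Object), and hubVersion, millisVersion, stringVersion and convertViaHub are hypothetical names invented for this example. It is not how the webhook is actually implemented, only an illustration of spoke-to-hub-to-spoke conversion.

```go
package main

import (
	"fmt"
	"time"
)

// Hub and Convertible are simplified stand-ins for the controller-runtime
// conversion interfaces; the real ones also embed runtime.Object.
type Hub interface {
	Hub()
}

type Convertible interface {
	ConvertTo(dst Hub) error
	ConvertFrom(src Hub) error
}

// hubVersion is a hypothetical hub type storing a timeout in whole seconds.
type hubVersion struct{ TimeoutSeconds int64 }

func (*hubVersion) Hub() {}

// millisVersion is a hypothetical spoke storing the timeout in milliseconds.
type millisVersion struct{ TimeoutMillis int64 }

func (src *millisVersion) ConvertTo(dstRaw Hub) error {
	dst, ok := dstRaw.(*hubVersion)
	if !ok {
		return fmt.Errorf("unexpected hub type %T", dstRaw)
	}
	dst.TimeoutSeconds = src.TimeoutMillis / 1000
	return nil
}

func (dst *millisVersion) ConvertFrom(srcRaw Hub) error {
	src, ok := srcRaw.(*hubVersion)
	if !ok {
		return fmt.Errorf("unexpected hub type %T", srcRaw)
	}
	dst.TimeoutMillis = src.TimeoutSeconds * 1000
	return nil
}

// stringVersion is a hypothetical spoke storing the timeout as a duration string.
type stringVersion struct{ Timeout string }

func (src *stringVersion) ConvertTo(dstRaw Hub) error {
	dst, ok := dstRaw.(*hubVersion)
	if !ok {
		return fmt.Errorf("unexpected hub type %T", dstRaw)
	}
	d, err := time.ParseDuration(src.Timeout)
	if err != nil {
		return err
	}
	dst.TimeoutSeconds = int64(d / time.Second)
	return nil
}

func (dst *stringVersion) ConvertFrom(srcRaw Hub) error {
	src, ok := srcRaw.(*hubVersion)
	if !ok {
		return fmt.Errorf("unexpected hub type %T", srcRaw)
	}
	dst.Timeout = (time.Duration(src.TimeoutSeconds) * time.Second).String()
	return nil
}

// Compile-time checks that both spokes satisfy Convertible.
var (
	_ Convertible = &millisVersion{}
	_ Convertible = &stringVersion{}
)

// convertViaHub converts src into dst by first filling in the hub object.
func convertViaHub(src Convertible, hub Hub, dst Convertible) error {
	if err := src.ConvertTo(hub); err != nil {
		return err
	}
	return dst.ConvertFrom(hub)
}

func main() {
	src := &millisVersion{TimeoutMillis: 90000}
	dst := &stringVersion{}
	if err := convertViaHub(src, &hubVersion{}, dst); err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(dst.Timeout)
}
```

Neither spoke knows about the other; each only knows how to talk to the hub, which is the property that keeps the number of conversion functions small.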
Now that we have the model for conversion down pat, we can actually implement our conversions.
Implementing conversion
With our model for conversion in place, it’s time to actually implement the conversion functions. We’ll put them in a file called cronjob_conversion.go next to our cronjob_types.go file, to avoid cluttering up our main types file with extra functions.
Hub...
First, we’ll implement the hub. We’ll choose the v1 version as the hub:
package v1
Implementing the hub method is pretty easy -- we just have to add an empty method called Hub() to serve as a marker. We could also just put this inline in our cronjob_types.go file.
// Hub marks this type as a conversion hub.
func (*CronJob) Hub() {}
... and Spokes
Then, we’ll implement our spoke, the v2 version:
package v2
Imports
For imports, we’ll need the controller-runtime conversion package, plus the API version for our hub type (v1), and finally some of the standard packages.
import (
"fmt"
"strings"
"sigs.k8s.io/controller-runtime/pkg/conversion"
"tutorial.kubebuilder.io/project/api/v1"
)
Our “spoke” versions need to implement the Convertible interface. Namely, they’ll need ConvertTo and ConvertFrom methods to convert to/from the hub version.
ConvertTo is expected to modify its argument to contain the converted object. Most of the conversion is straightforward copying, except for converting our changed field.
// ConvertTo converts this CronJob to the Hub version (v1).
func (src *CronJob) ConvertTo(dstRaw conversion.Hub) error {
dst := dstRaw.(*v1.CronJob)
sched := src.Spec.Schedule
scheduleParts := []string{"*", "*", "*", "*", "*"}
if sched.Minute != nil {
scheduleParts[0] = string(*sched.Minute)
}
if sched.Hour != nil {
scheduleParts[1] = string(*sched.Hour)
}
if sched.DayOfMonth != nil {
scheduleParts[2] = string(*sched.DayOfMonth)
}
if sched.Month != nil {
scheduleParts[3] = string(*sched.Month)
}
if sched.DayOfWeek != nil {
scheduleParts[4] = string(*sched.DayOfWeek)
}
dst.Spec.Schedule = strings.Join(scheduleParts, " ")
The rest of the conversion is pretty rote.
// ObjectMeta
dst.ObjectMeta = src.ObjectMeta
// Spec
dst.Spec.StartingDeadlineSeconds = src.Spec.StartingDeadlineSeconds
dst.Spec.ConcurrencyPolicy = v1.ConcurrencyPolicy(src.Spec.ConcurrencyPolicy)
dst.Spec.Suspend = src.Spec.Suspend
dst.Spec.JobTemplate = src.Spec.JobTemplate
dst.Spec.SuccessfulJobsHistoryLimit = src.Spec.SuccessfulJobsHistoryLimit
dst.Spec.FailedJobsHistoryLimit = src.Spec.FailedJobsHistoryLimit
// Status
dst.Status.Active = src.Status.Active
dst.Status.LastScheduleTime = src.Status.LastScheduleTime
return nil
}
ConvertFrom is expected to modify its receiver to contain the converted object. Most of the conversion is straightforward copying, except for converting our changed field.
// ConvertFrom converts from the Hub version (v1) to this version.
func (dst *CronJob) ConvertFrom(srcRaw conversion.Hub) error {
src := srcRaw.(*v1.CronJob)
schedParts := strings.Split(src.Spec.Schedule, " ")
if len(schedParts) != 5 {
return fmt.Errorf("invalid schedule: not a standard 5-field schedule")
}
partIfNeeded := func(raw string) *CronField {
if raw == "*" {
return nil
}
part := CronField(raw)
return &part
}
dst.Spec.Schedule.Minute = partIfNeeded(schedParts[0])
dst.Spec.Schedule.Hour = partIfNeeded(schedParts[1])
dst.Spec.Schedule.DayOfMonth = partIfNeeded(schedParts[2])
dst.Spec.Schedule.Month = partIfNeeded(schedParts[3])
dst.Spec.Schedule.DayOfWeek = partIfNeeded(schedParts[4])
The rest of the conversion is pretty rote.
// ObjectMeta
dst.ObjectMeta = src.ObjectMeta
// Spec
dst.Spec.StartingDeadlineSeconds = src.Spec.StartingDeadlineSeconds
dst.Spec.ConcurrencyPolicy = ConcurrencyPolicy(src.Spec.ConcurrencyPolicy)
dst.Spec.Suspend = src.Spec.Suspend
dst.Spec.JobTemplate = src.Spec.JobTemplate
dst.Spec.SuccessfulJobsHistoryLimit = src.Spec.SuccessfulJobsHistoryLimit
dst.Spec.FailedJobsHistoryLimit = src.Spec.FailedJobsHistoryLimit
// Status
dst.Status.Active = src.Status.Active
dst.Status.LastScheduleTime = src.Status.LastScheduleTime
return nil
}
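The schedule handling in ConvertTo and ConvertFrom can be exercised on its own. The following is a minimal sketch using local copies of the CronField and CronSchedule types, with hypothetical helpers toString and fromString standing in for the relevant parts of the two methods. One deliberate difference, which is an assumption of this sketch and not what the tutorial code does: it splits with strings.Fields instead of strings.Split, so repeated spaces between fields are tolerated.

```go
package main

import (
	"fmt"
	"strings"
)

// CronField and CronSchedule mirror the v2 types from the tutorial.
type CronField string

type CronSchedule struct {
	Minute, Hour, DayOfMonth, Month, DayOfWeek *CronField
}

// toString joins the structured schedule into a 5-field string, using "*"
// for every field that is unset (the v2 -> v1 direction).
func toString(s CronSchedule) string {
	parts := []string{"*", "*", "*", "*", "*"}
	for i, f := range []*CronField{s.Minute, s.Hour, s.DayOfMonth, s.Month, s.DayOfWeek} {
		if f != nil {
			parts[i] = string(*f)
		}
	}
	return strings.Join(parts, " ")
}

// fromString splits a 5-field schedule into the structured form, leaving
// fields that are "*" unset (the v1 -> v2 direction).
func fromString(raw string) (CronSchedule, error) {
	parts := strings.Fields(raw)
	if len(parts) != 5 {
		return CronSchedule{}, fmt.Errorf("invalid schedule %q: not a standard 5-field schedule", raw)
	}
	field := func(p string) *CronField {
		if p == "*" {
			return nil
		}
		f := CronField(p)
		return &f
	}
	return CronSchedule{
		Minute:     field(parts[0]),
		Hour:       field(parts[1]),
		DayOfMonth: field(parts[2]),
		Month:      field(parts[3]),
		DayOfWeek:  field(parts[4]),
	}, nil
}

func main() {
	for _, in := range []string{"*/1 * * * *", "0 3 * * 1-5", "@hourly"} {
		s, err := fromString(in)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%q -> %q\n", in, toString(s))
	}
}
```

The "@hourly" case is worth noticing: as far as we know, cron.ParseStandard (used by the v1 validation) accepts such descriptors, but they are not 5-field schedules, so a conversion written this way would return an error for them. Whether that matters depends on which schedules your users actually write.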
Now that we’ve got our conversions in place, all that we need to do is wire up our main to serve the webhook!
Setting up the webhooks
Our conversion is in place, so all that’s left is to tell controller-runtime about our conversion.
Normally, we’d run
kubebuilder create webhook --group batch --version v1 --kind CronJob --conversion
to scaffold out the webhook setup. However, we’ve already got webhook setup, from when we built our defaulting and validating webhooks!
Webhook setup...
Go imports
package v1
import (
"github.com/robfig/cron"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
validationutils "k8s.io/apimachinery/pkg/util/validation"
"k8s.io/apimachinery/pkg/util/validation/field"
ctrl "sigs.k8s.io/controller-runtime"
logf "sigs.k8s.io/controller-runtime/pkg/runtime/log"
"sigs.k8s.io/controller-runtime/pkg/webhook"
)
var cronjoblog = logf.Log.WithName("cronjob-resource")
This setup doubles as setup for our conversion webhooks: as long as our types implement the Hub and Convertible interfaces, a conversion webhook will be registered.
func (r *CronJob) SetupWebhookWithManager(mgr ctrl.Manager) error {
return ctrl.NewWebhookManagedBy(mgr).
For(r).
Complete()
}
Existing Defaulting and Validation
var _ webhook.Defaulter = &CronJob{}
// Default implements webhook.Defaulter so a webhook will be registered for the type
func (r *CronJob) Default() {
cronjoblog.Info("default", "name", r.Name)
if r.Spec.ConcurrencyPolicy == "" {
r.Spec.ConcurrencyPolicy = AllowConcurrent
}
if r.Spec.Suspend == nil {
r.Spec.Suspend = new(bool)
}
if r.Spec.SuccessfulJobsHistoryLimit == nil {
r.Spec.SuccessfulJobsHistoryLimit = new(int32)
*r.Spec.SuccessfulJobsHistoryLimit = 3
}
if r.Spec.FailedJobsHistoryLimit == nil {
r.Spec.FailedJobsHistoryLimit = new(int32)
*r.Spec.FailedJobsHistoryLimit = 1
}
}
// TODO(user): change verbs to "verbs=create;update;delete" if you want to enable deletion validation.
// +kubebuilder:webhook:verbs=create;update,path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,versions=v1,name=vcronjob.kb.io
var _ webhook.Validator = &CronJob{}
// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateCreate() error {
cronjoblog.Info("validate create", "name", r.Name)
return r.validateCronJob()
}
// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateUpdate(old runtime.Object) error {
cronjoblog.Info("validate update", "name", r.Name)
return r.validateCronJob()
}
// ValidateDelete implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateDelete() error {
cronjoblog.Info("validate delete", "name", r.Name)
// TODO(user): fill in your validation logic upon object deletion.
return nil
}
func (r *CronJob) validateCronJob() error {
var allErrs field.ErrorList
if err := r.validateCronJobName(); err != nil {
allErrs = append(allErrs, err)
}
if err := r.validateCronJobSpec(); err != nil {
allErrs = append(allErrs, err)
}
if len(allErrs) == 0 {
return nil
}
return apierrors.NewInvalid(
schema.GroupKind{Group: "batch.tutorial.kubebuilder.io", Kind: "CronJob"},
r.Name, allErrs)
}
func (r *CronJob) validateCronJobSpec() *field.Error {
// The field helpers from the kubernetes API machinery help us return nicely
// structured validation errors.
return validateScheduleFormat(
r.Spec.Schedule,
field.NewPath("spec").Child("schedule"))
}
func validateScheduleFormat(schedule string, fldPath *field.Path) *field.Error {
if _, err := cron.ParseStandard(schedule); err != nil {
return field.Invalid(fldPath, schedule, err.Error())
}
return nil
}
func (r *CronJob) validateCronJobName() *field.Error {
if len(r.ObjectMeta.Name) > validationutils.DNS1035LabelMaxLength-11 {
// A job name may be at most 63 characters (the DNS-1035 label limit
// used above). The cronjob controller appends an 11-character suffix
// (`-$TIMESTAMP`) to the cronjob name when creating a job. Therefore
// cronjob names must have length <= 63-11=52. If we don't validate
// this here, then job creation will fail later.
return field.Invalid(field.NewPath("metadata").Child("name"), r.Name, "must be no more than 52 characters")
}
return nil
}
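The length arithmetic in validateCronJobName can be checked in isolation. This is a minimal sketch with no Kubernetes imports: dns1035LabelMaxLength is a local stand-in for validationutils.DNS1035LabelMaxLength (which is 63), and validateName is a hypothetical standalone version of the check that returns a plain error instead of a *field.Error.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// dns1035LabelMaxLength stands in for validationutils.DNS1035LabelMaxLength.
	dns1035LabelMaxLength = 63
	// jobSuffixLength is the length of the "-$TIMESTAMP" suffix appended to job names.
	jobSuffixLength = 11
	// maxCronJobNameLength is the longest cronjob name that still leaves room for the suffix.
	maxCronJobNameLength = dns1035LabelMaxLength - jobSuffixLength
)

// validateName is a hypothetical standalone version of the name length check.
func validateName(name string) error {
	if len(name) > maxCronJobNameLength {
		return fmt.Errorf("name has %d characters, must be no more than %d",
			len(name), maxCronJobNameLength)
	}
	return nil
}

func main() {
	fmt.Println(validateName(strings.Repeat("a", 52)))
	fmt.Println(validateName(strings.Repeat("a", 53)))
}
```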
...and main.go
Similarly, our existing main file is sufficient:
Imports
package main
import (
"flag"
"os"
kbatchv1 "k8s.io/api/batch/v1"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
batchv1 "tutorial.kubebuilder.io/project/api/v1"
batchv2 "tutorial.kubebuilder.io/project/api/v2"
"tutorial.kubebuilder.io/project/controllers"
// +kubebuilder:scaffold:imports
)
existing setup
var (
scheme = runtime.NewScheme()
setupLog = ctrl.Log.WithName("setup")
)
func init() {
_ = clientgoscheme.AddToScheme(scheme)
_ = kbatchv1.AddToScheme(scheme) // we've added this ourselves
_ = batchv1.AddToScheme(scheme)
_ = batchv2.AddToScheme(scheme)
// +kubebuilder:scaffold:scheme
}
func main() {
existing setup
var metricsAddr string
var enableLeaderElection bool
flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
"Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.")
flag.Parse()
ctrl.SetLogger(zap.New(zap.UseDevMode(true)))
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
MetricsBindAddress: metricsAddr,
LeaderElection: enableLeaderElection,
})
if err != nil {
setupLog.Error(err, "unable to start manager")
os.Exit(1)
}
if err = (&controllers.CronJobReconciler{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("Captain"),
Scheme: mgr.GetScheme(), // we've added this ourselves
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "Captain")
os.Exit(1)
}
Our existing call to SetupWebhookWithManager registers our conversion webhooks with the manager, too.
if err = (&batchv1.CronJob{}).SetupWebhookWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create webhook", "webhook", "Captain")
os.Exit(1)
}
// +kubebuilder:scaffold:builder
existing setup
setupLog.Info("starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}
Everything’s set up and ready to go! All that’s left now is to test out our webhooks.
Deployment and Testing
Before we can test out our conversion, we’ll need to enable conversion in our CRD:
Kubebuilder generates Kubernetes manifests under the config directory with webhook bits disabled. To enable them, we need to:
- Enable patches/webhook_in_<kind>.yaml and patches/cainjection_in_<kind>.yaml in the config/crd/kustomization.yaml file.
- Enable the ../certmanager and ../webhook directories under the bases section in the config/default/kustomization.yaml file.
- Enable manager_webhook_patch.yaml under the patches section in the config/default/kustomization.yaml file.
- Enable all the vars under the CERTMANAGER section in the config/default/kustomization.yaml file.
Additionally, we’ll need to set the CRD_OPTIONS variable to just "crd", removing the trivialVersions option (this ensures that we actually generate validation for each version, instead of telling Kubernetes that they’re the same):
CRD_OPTIONS ?= "crd"
Now we have all our code changes and manifests in place, so let’s deploy it to the cluster and test it out.
You’ll need cert-manager installed (version 0.9.0+) unless you’ve got some other certificate management solution. The Kubebuilder team has tested the instructions in this tutorial with the 0.9.0-alpha.0 release.
Once all our ducks are in a row with certificates, we can run make install deploy (as normal) to deploy all the bits (CRD, controller-manager deployment) onto the cluster.
Testing
Once all of the bits are up and running on the cluster with conversion enabled, we can test out our conversion by requesting different versions.
We’ll make a v2 version based on our v1 version (put it under config/samples):
apiVersion: batch.tutorial.kubebuilder.io/v2
kind: CronJob
metadata:
  name: cronjob-sample
spec:
  schedule:
    minute: "*/1"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Then, we can create it on the cluster:
kubectl apply -f config/samples/batch_v2_cronjob.yaml
If we’ve done everything correctly, it should create successfully, and we should be able to fetch it using both the v2 resource
kubectl get cronjobs.v2.batch.tutorial.kubebuilder.io -o yaml
apiVersion: batch.tutorial.kubebuilder.io/v2
kind: CronJob
metadata:
  name: cronjob-sample
spec:
  schedule:
    minute: "*/1"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
and the v1 resource
kubectl get cronjobs.v1.batch.tutorial.kubebuilder.io -o yaml
apiVersion: batch.tutorial.kubebuilder.io/v1
kind: CronJob
metadata:
  name: cronjob-sample
spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Both should be filled out, and look equivalent to our v2 and v1 samples, respectively. Notice that each has a different API version.
Finally, if we wait a bit, we should notice that our CronJob continues to reconcile, even though our controller is written against our v1 API version.
Troubleshooting
Migrations
Migrating between project structures in KubeBuilder generally involves a bit of manual work.
This section details what’s required to migrate between different versions of KubeBuilder scaffolding, as well as to more complex project layout structures.
Kubebuilder v1 vs v2
This document covers all breaking changes when migrating from v1 to v2.
The details of all changes (breaking or otherwise) can be found in controller-runtime, controller-tools and kubebuilder release notes.
Common changes
V2 projects use Go modules, but Kubebuilder will continue to support dep until Go 1.13 is out.
controller-runtime
- Client.List now uses functional options (List(ctx, list, ...option)) instead of List(ctx, ListOptions, list).
- Client.DeleteAllOf was added to the Client interface.
- Metrics are on by default now.
- A number of packages under pkg/runtime have been moved, with their old locations deprecated. The old locations will be removed before controller-runtime v1.0.0. See the godocs for more information.
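For readers unfamiliar with the functional-options style mentioned for Client.List above, here is a minimal, dependency-free sketch of the pattern. Everything in it (ListOptions, ListOption, InNamespace, Limit, fakeClient, listNames) is a simplified stand-in written for this example and is not the controller-runtime API; it only illustrates the List(ctx, list, ...option) call shape.

```go
package main

import (
	"context"
	"fmt"
)

// ListOptions is a simplified stand-in for the options a List call understands.
type ListOptions struct {
	Namespace string
	Limit     int
}

// ListOption is a hypothetical functional option that mutates ListOptions.
type ListOption func(*ListOptions)

// InNamespace restricts the listing to one namespace.
func InNamespace(ns string) ListOption {
	return func(o *ListOptions) { o.Namespace = ns }
}

// Limit caps the number of returned objects.
func Limit(n int) ListOption {
	return func(o *ListOptions) { o.Limit = n }
}

type object struct{ Namespace, Name string }

// fakeClient is an in-memory stand-in for a real client.
type fakeClient struct{ objects []object }

// List has the v2 call shape: context, destination list, then any number of options.
func (c *fakeClient) List(ctx context.Context, list *[]object, opts ...ListOption) error {
	options := ListOptions{}
	for _, opt := range opts {
		opt(&options)
	}
	for _, obj := range c.objects {
		if options.Namespace != "" && obj.Namespace != options.Namespace {
			continue
		}
		if options.Limit > 0 && len(*list) >= options.Limit {
			break
		}
		*list = append(*list, obj)
	}
	return nil
}

// listNames is a convenience wrapper that returns only the names found.
func listNames(c *fakeClient, opts ...ListOption) []string {
	var found []object
	_ = c.List(context.Background(), &found, opts...)
	names := make([]string, 0, len(found))
	for _, obj := range found {
		names = append(names, obj.Name)
	}
	return names
}

func main() {
	c := &fakeClient{objects: []object{{"default", "a"}, {"default", "b"}, {"other", "c"}}}
	fmt.Println(listNames(c))
	fmt.Println(listNames(c, InNamespace("default")))
	fmt.Println(listNames(c, InNamespace("default"), Limit(1)))
}
```

The appeal of this style is that callers pass only the options they care about, and new options can be added later without changing the List signature.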
Webhook-related
- Automatic certificate generation for webhooks has been removed, and webhooks will no longer self-register. Use controller-tools to generate a webhook configuration. If you need certificate generation, we recommend using cert-manager. Kubebuilder v2 will scaffold out cert manager configs for you to use -- see the Webhook Tutorial for more details.
- The builder package now has separate builders for controllers and webhooks, which facilitates choosing which to run.
controller-tools
The generator framework has been rewritten in v2. It still works the same as before in many cases, but be aware that there are some breaking changes. Please check marker documentation for more details.
Kubebuilder
- Kubebuilder v2 introduces a simplified project layout. You can find the design doc here.
- In v1, the manager is deployed as a StatefulSet, while it’s deployed as a Deployment in v2.
- The kubebuilder create webhook command was added to scaffold mutating/validating/conversion webhooks. It replaces the kubebuilder alpha webhook command.
- v2 uses distroless/static instead of Ubuntu as base image. This reduces image size and attack surface.
- v2 requires kustomize v3.1.0+.
Migration from v1 to v2
Make sure you understand the differences between Kubebuilder v1 and v2 before continuing.
Please ensure you have followed the installation guide to install the required components.
The recommended way to migrate a v1 project is to create a new v2 project and copy over the API and the reconciliation code. The conversion will end up with a project that looks like a native v2 project. However, in some cases, it’s possible to do an in-place upgrade (i.e. reuse the v1 project layout, upgrading controller-runtime and controller-tools).
Let’s take the example v1 project and migrate it to Kubebuilder v2. At the end, we should have something that looks like the example v2 project.
Preparation
We’ll need to figure out what the group, version, kind and domain are.
Let’s take a look at our current v1 project structure:
pkg/
├── apis
│   ├── addtoscheme_batch_v1.go
│   ├── apis.go
│   └── batch
│       ├── group.go
│       └── v1
│           ├── cronjob_types.go
│           ├── cronjob_types_test.go
│           ├── doc.go
│           ├── register.go
│           ├── v1_suite_test.go
│           └── zz_generated.deepcopy.go
├── controller
└── webhook
All of our API information is stored in pkg/apis/batch, so we can look there to find what we need to know.
In cronjob_types.go, we can find
type CronJob struct {...}
In register.go, we can find
SchemeGroupVersion = schema.GroupVersion{Group: "batch.tutorial.kubebuilder.io", Version: "v1"}
Putting that together, we get CronJob as the kind, and batch.tutorial.kubebuilder.io/v1 as the group-version.
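The group-version still has to be split into the group and domain that the kubebuilder commands expect. A minimal sketch of that split follows; splitGroupVersion is a hypothetical helper written for this example, and it assumes the group is the first dot-separated label and the domain is everything after it, which holds for this project but is not something the tooling guarantees for every v1 layout.

```go
package main

import (
	"fmt"
	"strings"
)

// splitGroupVersion is a hypothetical helper that splits a string such as
// "batch.tutorial.kubebuilder.io/v1" into group, domain and version. It
// assumes the group is the first dot-separated label of the full group name.
func splitGroupVersion(gv string) (group, domain, version string, err error) {
	parts := strings.SplitN(gv, "/", 2)
	if len(parts) != 2 || parts[1] == "" {
		return "", "", "", fmt.Errorf("%q has no version", gv)
	}
	names := strings.SplitN(parts[0], ".", 2)
	if len(names) != 2 || names[0] == "" || names[1] == "" {
		return "", "", "", fmt.Errorf("%q has no domain", parts[0])
	}
	return names[0], names[1], parts[1], nil
}

func main() {
	group, domain, version, err := splitGroupVersion("batch.tutorial.kubebuilder.io/v1")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Print the flags we would pass when re-scaffolding the project.
	fmt.Printf("kubebuilder init --domain %s\n", domain)
	fmt.Printf("kubebuilder create api --group %s --version %s --kind CronJob\n", group, version)
}
```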
Initialize a v2 Project
Now, we need to initialize a v2 project. Before we do that, though, we’ll need to initialize a new go module if we’re not on the gopath:
go mod init tutorial.kubebuilder.io/project
Then, we can finish initializing the project with kubebuilder:
kubebuilder init --domain tutorial.kubebuilder.io
Migrate APIs and Controllers
Next, we’ll re-scaffold out the API types and controllers. Since we want both, we’ll say yes to both the API and controller prompts when asked what parts we want to scaffold:
kubebuilder create api --group batch --version v1 --kind CronJob
If you’re using multiple groups, some manual work is required to migrate. Please follow this for more details.
Migrate the APIs
Now, let’s copy the API definition from pkg/apis/batch/v1/cronjob_types.go to api/v1/cronjob_types.go. We only need to copy the implementation of the Spec and Status fields.
We can replace the +k8s:deepcopy-gen:interfaces=... marker (which is deprecated in kubebuilder) with +kubebuilder:object:root=true.
We don’t need the following markers any more (they’re not used anymore, and are relics from much older versions of KubeBuilder):
// +genclient
// +k8s:openapi-gen=true
Our API types should look like the following:
// +kubebuilder:object:root=true
// CronJob is the Schema for the cronjobs API
type CronJob struct {...}
// +kubebuilder:object:root=true
// CronJobList contains a list of CronJob
type CronJobList struct {...}
Migrate the Controllers
Now, let’s migrate the controller reconciler code from pkg/controller/cronjob/cronjob_controller.go to controllers/cronjob_controller.go.
We’ll need to copy
- the fields from the ReconcileCronJob struct to CronJobReconciler
- the contents of the Reconcile function
- the rbac related markers to the new file
- the code under func add(mgr manager.Manager, r reconcile.Reconciler) error to func SetupWithManager
Migrate the Webhooks
If you don’t have a webhook, you can skip this section.
Webhooks for Core Types and External CRDs
If you are using webhooks for Kubernetes core types (e.g. Pods), or for an external CRD that is not owned by you, you can refer to the controller-runtime example for builtin types and do something similar. Kubebuilder doesn’t scaffold much for these cases, but you can use the library in controller-runtime.
Scaffold Webhooks for our CRDs
Now let’s scaffold the webhooks for our CRD (CronJob). We’ll need to run the following command with the --defaulting and --programmatic-validation flags (since our test project uses defaulting and validating webhooks):
kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation
Depending on how many CRDs need webhooks, we may need to run the above command multiple times with different Group-Version-Kinds.
Now, we’ll need to copy the logic for each webhook. For validating webhooks, we can copy the contents from func validatingCronJobFn in pkg/default_server/cronjob/validating/cronjob_create_handler.go to func ValidateCreate in api/v1/cronjob_webhook.go and then the same for update.
Similarly, we’ll copy from func mutatingCronJobFn to func Default.
Webhook Markers
When scaffolding webhooks, Kubebuilder v2 adds the following markers:
// These are v2 markers
// This is for the mutating webhook
// +kubebuilder:webhook:path=/mutate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=true,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=mcronjob.kb.io
...
// This is for the validating webhook
// +kubebuilder:webhook:path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=vcronjob.kb.io
The default verbs are verbs=create;update. We need to ensure verbs matches what we need. For example, if we only want to validate creation, then we would change it to verbs=create.
We also need to ensure failure-policy is still the same.
Markers like the following are no longer needed (since they deal with self-deploying certificate configuration, which was removed in v2):
// v1 markers
// +kubebuilder:webhook:port=9876,cert-dir=/tmp/cert
// +kubebuilder:webhook:service=test-system:webhook-service,selector=app:webhook-server
// +kubebuilder:webhook:secret=test-system:webhook-server-secret
// +kubebuilder:webhook:mutating-webhook-config-name=test-mutating-webhook-cfg
// +kubebuilder:webhook:validating-webhook-config-name=test-validating-webhook-cfg
In v1, a single webhook marker could be split into multiple markers in the same paragraph (comment block). In v2, each webhook must be represented by a single marker.
Others
If there are any manual updates in main.go in v1, we need to port the changes to the new main.go. We’ll also need to ensure all of the needed schemes have been registered.
If there are additional manifests added under the config directory, port them as well.
Change the image name in the Makefile if needed.
Verification
Finally, we can run make and make docker-build to ensure things are working fine.
Single Group to Multi-Group
While KubeBuilder v2 will not scaffold out a project structure compatible with multiple API groups in the same repository by default, it’s possible to modify the default project structure to support it.
Let’s migrate the CronJob example.
Generally, we use the prefix for the API group as the directory name. We can check api/v1/groupversion_info.go to find that out:
// +groupName=batch.tutorial.kubebuilder.io
package v1
Then, we’ll rename api to apis to be more clear, and we’ll move our existing APIs into a new subdirectory, “batch”:
mkdir -p apis/batch
mv api/* apis/batch
# After ensuring that everything was moved successfully, remove the old directory `api/`
rm -rf api/
After moving the APIs to a new directory, the same needs to be applied to the controllers:
mkdir controllers/batch
mv controllers/* controllers/batch/
Next, we’ll need to update all the references to the old package name.
For CronJob, that’ll be main.go and controllers/batch/cronjob_controller.go.
If you’ve added additional files to your project, you’ll need to track down imports there as well.
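To make the reference update concrete, here is a small sketch of the import-path change. The module path tutorial.kubebuilder.io/project comes from the CronJob example; the helper name rewriteImports is hypothetical, and the new controllers path assumes the package was moved to controllers/batch as described above.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteImports maps the single-group import paths to their multi-group
// equivalents. The closing quote is part of each pattern, so an already
// rewritten path is not rewritten a second time.
func rewriteImports(src string) string {
	r := strings.NewReplacer(
		`"tutorial.kubebuilder.io/project/api/v1"`, `"tutorial.kubebuilder.io/project/apis/batch/v1"`,
		`"tutorial.kubebuilder.io/project/controllers"`, `"tutorial.kubebuilder.io/project/controllers/batch"`,
	)
	return r.Replace(src)
}

func main() {
	before := `batchv1 "tutorial.kubebuilder.io/project/api/v1"`
	fmt.Println(rewriteImports(before))
}
```

In practice an editor's refactoring support or a search-and-replace does the same job; the sketch only shows which strings change.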
Finally, we’ll run the command which enables the multi-group layout in the project:
kubebuilder edit --multigroup=true
When the command kubebuilder edit --multigroup=true is executed, it adds a new line to PROJECT that marks this as a multi-group project:
version: "2"
domain: tutorial.kubebuilder.io
repo: tutorial.kubebuilder.io/project
multigroup: true
Note that this option only tells KubeBuilder that this is a multi-group project. If the project is not new, APIs that were already implemented stay in the previous (single-group) structure until they are moved.
Notice that with a multi-group project the Kind API’s files are created under apis/<group>/<version> instead of api/<version>. Also, note that the controllers will be created under controllers/<group> instead of controllers.
That is the reason why we moved the previously generated APIs with the provided scripts in the previous steps. Remember to update the references afterwards.
The CronJob tutorial explains each of these changes in more detail (in the context of how they’re generated by KubeBuilder for single-group projects).
Reference
- Using Finalizers: Finalizers are a mechanism to execute any custom logic related to a resource before it gets deleted from a Kubernetes cluster.
- What’s a webhook?: Webhooks are HTTP callbacks. There are 3 types of webhooks in k8s: 1) admission webhook 2) CRD conversion webhook 3) authorization webhook.
  - Admission webhook: Admission webhooks are HTTP callbacks for mutating or validating resources before the API server admits them.
Generating CRDs
KubeBuilder uses a tool called controller-gen to generate utility code and Kubernetes object YAML, like CustomResourceDefinitions.
To do this, it makes use of special “marker comments” (comments that start with // +) to indicate additional information about fields, types, and packages. In the case of CRDs, these are generally pulled from your _types.go files. For more information on markers, see the marker reference docs.
KubeBuilder provides a make target to run controller-gen and generate CRDs: make manifests.
When you run make manifests, you should see CRDs generated under the config/crd/bases directory. make manifests can generate a number of other artifacts as well -- see the marker reference docs for more details.
Validation
CRDs support declarative validation using an OpenAPI v3 schema in the validation section.
In general, validation markers may be attached to fields or to types. If you’re defining complex validation, if you need to re-use validation, or if you need to validate slice elements, it’s often best to define a new type to describe your validation.
For example:
type ToySpec struct {
// +kubebuilder:validation:MaxLength=15
// +kubebuilder:validation:MinLength=1
Name string `json:"name,omitempty"`
// +kubebuilder:validation:MaxItems=500
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:UniqueItems=true
Knights []string `json:"knights,omitempty"`
Alias Alias `json:"alias,omitempty"`
Rank Rank `json:"rank"`
}
// +kubebuilder:validation:Enum=Lion;Wolf;Dragon
type Alias string
// +kubebuilder:validation:Minimum=1
// +kubebuilder:validation:Maximum=3
// +kubebuilder:validation:ExclusiveMaximum=false
type Rank int32
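To illustrate what the markers above ask the API server to enforce, the following sketch re-implements the same constraints by hand. It is only an illustration: the real checks are derived from the generated OpenAPI schema, the validate helper is hypothetical, and the sketch assumes every field is present (optional fields that are omitted are not handled).

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// ToySpec mirrors the fields of the example above, without markers.
type ToySpec struct {
	Name    string
	Knights []string
	Alias   string
	Rank    int32
}

// validate applies the declared constraints manually and returns one
// message per violated rule.
func validate(s ToySpec) []string {
	var errs []string
	// MinLength=1, MaxLength=15
	if n := utf8.RuneCountInString(s.Name); n < 1 || n > 15 {
		errs = append(errs, "name: length must be between 1 and 15")
	}
	// MinItems=1, MaxItems=500
	if n := len(s.Knights); n < 1 || n > 500 {
		errs = append(errs, "knights: must have between 1 and 500 items")
	}
	// UniqueItems=true
	seen := map[string]bool{}
	for _, k := range s.Knights {
		if seen[k] {
			errs = append(errs, "knights: items must be unique")
			break
		}
		seen[k] = true
	}
	// Enum=Lion;Wolf;Dragon
	switch s.Alias {
	case "Lion", "Wolf", "Dragon":
	default:
		errs = append(errs, "alias: must be one of Lion, Wolf, Dragon")
	}
	// Minimum=1, Maximum=3, ExclusiveMaximum=false (3 itself is allowed)
	if s.Rank < 1 || s.Rank > 3 {
		errs = append(errs, "rank: must be between 1 and 3 inclusive")
	}
	return errs
}

func main() {
	spec := ToySpec{Name: "toy", Knights: []string{"Sir Robin", "Sir Robin"}, Alias: "Lion", Rank: 4}
	for _, e := range validate(spec) {
		fmt.Println(e)
	}
}
```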
Additional Printer Columns
Starting with Kubernetes 1.11, kubectl get can ask the server what columns to display. For CRDs, this can be used to provide useful, type-specific information with kubectl get, similar to the information provided for built-in types.
The information that gets displayed can be controlled with the additionalPrinterColumns field on your CRD, which is controlled by the +kubebuilder:printcolumn marker on the Go type for your CRD.
For instance, in the following example, we add fields to display information about the knights, rank, and alias fields from the validation example:
// +kubebuilder:printcolumn:name="Alias",type=string,JSONPath=`.spec.alias`
// +kubebuilder:printcolumn:name="Rank",type=integer,JSONPath=`.spec.rank`
// +kubebuilder:printcolumn:name="Bravely Run Away",type=boolean,JSONPath=`.spec.knights[?(@ == "Sir Robin")]`,description="when danger rears its ugly head, he bravely turned his tail and fled",priority=10
type Toy struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ToySpec `json:"spec,omitempty"`
Status ToyStatus `json:"status,omitempty"`
}
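The priority argument in the third marker decides when a column is shown. As a minimal sketch of that rule (columns with priority 0 appear in the standard view, higher numbers only in the wide view), the hypothetical helper below filters the three columns from the example; the type and function names are made up for illustration.

```go
package main

import "fmt"

// column is a simplified stand-in for one additionalPrinterColumns entry.
type column struct {
	Name     string
	Priority int32
}

// visibleColumns returns the column names shown in the standard view
// (priority 0 only) or in the wide view (all columns).
func visibleColumns(cols []column, wide bool) []string {
	var names []string
	for _, c := range cols {
		if c.Priority == 0 || wide {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	cols := []column{
		{Name: "Alias"},
		{Name: "Rank"},
		{Name: "Bravely Run Away", Priority: 10},
	}
	fmt.Println(visibleColumns(cols, false))
	fmt.Println(visibleColumns(cols, true))
}
```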
Subresources
CRDs can choose to implement the /status and /scale subresources as of Kubernetes 1.13.
It’s generally recommended that you make use of the /status subresource on all resources that have a status field.
Both subresources have a corresponding marker.
Status
The status subresource is enabled via +kubebuilder:subresource:status. When enabled, updates to the main resource will not change status. Similarly, updates to the status subresource cannot change anything but the status field.
For example:
// +kubebuilder:subresource:status
type Toy struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ToySpec `json:"spec,omitempty"`
Status ToyStatus `json:"status,omitempty"`
}
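The split between the two endpoints can be pictured with a small simulation. This is a sketch under simplifying assumptions: plain strings stand in for the real spec and status structs, and the two functions are hypothetical models of the API server behaviour, not real client calls.

```go
package main

import "fmt"

// Toy is a simplified stand-in: plain strings replace the spec and status structs.
type Toy struct {
	Spec   string
	Status string
}

// updateMain models a write to the main resource endpoint:
// the incoming status is ignored.
func updateMain(stored, incoming Toy) Toy {
	stored.Spec = incoming.Spec
	return stored
}

// updateStatus models a write to the /status endpoint:
// only the incoming status is kept.
func updateStatus(stored, incoming Toy) Toy {
	stored.Status = incoming.Status
	return stored
}

func main() {
	stored := Toy{Spec: "rank=1", Status: "Pending"}
	stored = updateMain(stored, Toy{Spec: "rank=2", Status: "Ready"})
	fmt.Printf("%+v\n", stored)
	stored = updateStatus(stored, Toy{Spec: "rank=3", Status: "Ready"})
	fmt.Printf("%+v\n", stored)
}
```

With controller-runtime, the second kind of write is typically issued through the client's status writer (r.Status().Update) rather than r.Update.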
Scale
The scale subresource is enabled via +kubebuilder:subresource:scale. When enabled, users will be able to use kubectl scale with your resource. If the selectorpath argument points to the string form of a label selector, the HorizontalPodAutoscaler will be able to autoscale your resource.
For example:
type CustomSetSpec struct {
Replicas *int32 `json:"replicas"`
}
type CustomSetStatus struct {
Replicas int32 `json:"replicas"`
Selector string `json:"selector"` // this must be the string form of the selector
}
// +kubebuilder:subresource:status
// +kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.replicas,selectorpath=.status.selector
type CustomSet struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec CustomSetSpec `json:"spec,omitempty"`
Status CustomSetStatus `json:"status,omitempty"`
}
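The comment on Selector says the field must hold the string form of the selector. A minimal sketch of producing that form from a set of match labels is shown below; selectorString is a hypothetical helper written with the standard library only, and it covers equality-based labels, not set-based expressions. Real controllers would more likely rely on the selector helpers in apimachinery.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// selectorString serializes match labels as "key=value" pairs joined by
// commas. Keys are sorted so the output is stable between reconciles.
func selectorString(matchLabels map[string]string) string {
	keys := make([]string, 0, len(matchLabels))
	for k := range matchLabels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+matchLabels[k])
	}
	return strings.Join(pairs, ",")
}

func main() {
	fmt.Println(selectorString(map[string]string{"tier": "web", "app": "customset"}))
}
```

The controller would write this string to .status.selector so that the path named in selectorpath resolves to a usable selector.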
Multiple Versions
As of Kubernetes 1.13, you can have multiple versions of your Kind defined in your CRD, and use a webhook to convert between them.
For more details on this process, see the multiversion tutorial.
By default, KubeBuilder disables generating different validation for different versions of the Kind in your CRD, to be compatible with older Kubernetes versions.
You’ll need to enable this by switching the line in your makefile that says CRD_OPTIONS ?= "crd:trivialVersions=true" to CRD_OPTIONS ?= crd
Then, you can use the +kubebuilder:storageversion marker to indicate the GVK that should be used to store data by the API server.
Under the hood
KubeBuilder scaffolds out make rules to run controller-gen. The rules will automatically install controller-gen if it’s not on your path using go get with Go modules.
You can also run controller-gen directly, if you want to see what it’s doing.
Each controller-gen “generator” is controlled by an option to controller-gen, using the same syntax as markers. For instance, to generate CRDs with “trivial versions” (no version conversion webhooks), we call: controller-gen crd:trivialVersions=true paths=./api/...
controller-gen also supports different output “rules” to control how and where output goes. Notice the manifests make rule (condensed slightly to only generate CRDs):
# Generate manifests for CRDs
manifests: controller-gen
$(CONTROLLER_GEN) crd:trivialVersions=true paths="./..." output:crd:artifacts:config=config/crd/bases
It uses the output:crd:artifacts output rule to indicate that CRD-related config (non-code) artifacts should end up in config/crd/bases instead of config/crd.
To see all the options for controller-gen, run
$ controller-gen -h
or, for more details:
$ controller-gen -hhh
Using Finalizers
Finalizers allow controllers to implement asynchronous pre-delete hooks. Let’s say you create an external resource (such as a storage bucket) for each object of your API type, and you want to delete the associated external resource when the object is deleted from Kubernetes. You can use a finalizer to do that.
You can read more about finalizers in the Kubernetes reference docs. The section below demonstrates how to register and trigger pre-delete hooks in the Reconcile method of a controller.
The key point to note is that a finalizer causes “delete” on the object to become an “update” to set deletion timestamp. Presence of deletion timestamp on the object indicates that it is being deleted. Otherwise, without finalizers, a delete shows up as a reconcile where the object is missing from the cache.
Highlights:
- If the object is not being deleted and does not have the finalizer registered, then add the finalizer and update the object in Kubernetes.
- If object is being deleted and the finalizer is still present in finalizers list, then execute the pre-delete logic and remove the finalizer and update the object.
- Ensure that the pre-delete logic is idempotent.
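The way a finalizer turns a delete into an update can be modelled in a few lines. The sketch below is a toy in-memory model, not the API server: the cluster type and its methods are hypothetical, and only the two fields that matter here are represented.

```go
package main

import (
	"fmt"
	"time"
)

// object carries only the metadata relevant to finalizers.
type object struct {
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// cluster is a toy stand-in for the objects stored by the API server.
type cluster map[string]*object

// requestDelete models a delete request: without finalizers the object is
// removed at once, otherwise it is only marked with a deletion timestamp.
func (c cluster) requestDelete(name string) {
	obj, ok := c[name]
	if !ok {
		return
	}
	if len(obj.Finalizers) == 0 {
		delete(c, name)
		return
	}
	if obj.DeletionTimestamp == nil {
		now := time.Now()
		obj.DeletionTimestamp = &now
	}
}

// removeFinalizer models the controller's last update: once the list is
// empty on an object marked for deletion, the object goes away.
func (c cluster) removeFinalizer(name, finalizer string) {
	obj, ok := c[name]
	if !ok {
		return
	}
	var kept []string
	for _, f := range obj.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	obj.Finalizers = kept
	if obj.DeletionTimestamp != nil && len(kept) == 0 {
		delete(c, name)
	}
}

func main() {
	const fin = "storage.finalizers.tutorial.kubebuilder.io"
	c := cluster{"sample": {Finalizers: []string{fin}}}
	c.requestDelete("sample")
	fmt.Println("marked for deletion:", c["sample"].DeletionTimestamp != nil)
	c.removeFinalizer("sample", fin)
	_, exists := c["sample"]
	fmt.Println("still exists:", exists)
}
```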
Imports
First, we start out with some standard imports. As before, we need the core controller-runtime library, as well as the client package, and the package for our API types.
package controllers
import (
"context"
"github.com/go-logr/logr"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
batchv1 "tutorial.kubebuilder.io/project/api/v1"
)
The code snippet below shows skeleton code for implementing a finalizer.
func (r *CronJobReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
ctx := context.Background()
log := r.Log.WithValues("cronjob", req.NamespacedName)
// allocate the object so that Get has something to decode into
cronJob := &batchv1.CronJob{}
if err := r.Get(ctx, req.NamespacedName, cronJob); err != nil {
log.Error(err, "unable to fetch CronJob")
// we'll ignore not-found errors, since they can't be fixed by an immediate
// requeue (we'll need to wait for a new notification), and we can get them
// on deleted requests.
return ctrl.Result{}, client.IgnoreNotFound(err)
}
// name of our custom finalizer
myFinalizerName := "storage.finalizers.tutorial.kubebuilder.io"
// examine DeletionTimestamp to determine if object is under deletion
if cronJob.ObjectMeta.DeletionTimestamp.IsZero() {
// The object is not being deleted, so if it does not have our finalizer,
// then let's add the finalizer and update the object. This is equivalent
// to registering our finalizer.
if !containsString(cronJob.ObjectMeta.Finalizers, myFinalizerName) {
cronJob.ObjectMeta.Finalizers = append(cronJob.ObjectMeta.Finalizers, myFinalizerName)
if err := r.Update(context.Background(), cronJob); err != nil {
return ctrl.Result{}, err
}
}
} else {
// The object is being deleted
if containsString(cronJob.ObjectMeta.Finalizers, myFinalizerName) {
// our finalizer is present, so lets handle any external dependency
if err := r.deleteExternalResources(cronJob); err != nil {
// if fail to delete the external dependency here, return with error
// so that it can be retried
return ctrl.Result{}, err
}
// remove our finalizer from the list and update it.
cronJob.ObjectMeta.Finalizers = removeString(cronJob.ObjectMeta.Finalizers, myFinalizerName)
if err := r.Update(context.Background(), cronJob); err != nil {
return ctrl.Result{}, err
}
}
// Stop reconciliation as the item is being deleted
return ctrl.Result{}, nil
}
// Your reconcile logic
return ctrl.Result{}, nil
}
func (r *CronJobReconciler) deleteExternalResources(cronJob *batchv1.CronJob) error {
//
// delete any external resources associated with the cronJob
//
// Ensure that delete implementation is idempotent and safe to invoke
// multiple times for same object.
return nil
}
// Helper functions to check and remove string from a slice of strings.
func containsString(slice []string, s string) bool {
for _, item := range slice {
if item == s {
return true
}
}
return false
}
func removeString(slice []string, s string) (result []string) {
for _, item := range slice {
if item == s {
continue
}
result = append(result, item)
}
return
}
Kind Cluster
This only covers the basics of using a kind cluster. You can find more details in the kind documentation.
Installation
You can follow this to install kind.
Create a Cluster
You can create a kind cluster by running:
kind create cluster
To customize your cluster, you can provide additional configuration.
For example, the following is a sample kind configuration.
kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
Using the configuration above, running the following command will give you a k8s 1.14.2 cluster with 1 master and 3 workers.
kind create cluster --config hack/kind-config.yaml --image=kindest/node:v1.14.2
You can use the --image flag to specify the cluster version you want, e.g. --image=kindest/node:v1.13.6. The supported versions are listed here.
Cheatsheet
- Load a local image into a kind cluster
kind load docker-image your-image-name:your-tag
- Point kubectl to the kind cluster
kind export kubeconfig
- Delete a kind cluster
kind delete cluster
Webhook
Webhooks are requests for information sent in a blocking fashion. A web application implementing webhooks will send an HTTP request to another application when a certain event happens.
In the Kubernetes world, there are 3 kinds of webhooks: admission webhook, authorization webhook and CRD conversion webhook.
In controller-runtime libraries, we support admission webhooks and CRD conversion webhooks.
Kubernetes supports these dynamic admission webhooks as of version 1.9 (when the feature entered beta).
Kubernetes supports the conversion webhooks as of version 1.15 (when the feature entered beta).
Admission Webhooks
Admission webhooks are HTTP callbacks that receive admission requests, process them and return admission responses.
Kubernetes provides the following types of admission webhooks:
- Mutating Admission Webhook: These can mutate the object while it’s being created or updated, before it gets stored. They can be used to default fields in a resource request, e.g. fields in a Deployment that are not specified by the user, or to inject sidecar containers.
- Validating Admission Webhook: These can validate the object while it’s being created or updated, before it gets stored. They allow more complex validation than pure schema-based validation, e.g. cross-field validation and pod image whitelisting.
The apiserver by default doesn’t authenticate itself to the webhooks. However, if you want to authenticate the clients, you can configure the apiserver to use basic auth, bearer token, or a cert to authenticate itself to the webhooks. You can find detailed steps here.
Admission Webhook for Core Types
It is very easy to build admission webhooks for CRDs, which has been covered in the CronJob tutorial. Given that kubebuilder doesn’t support webhook scaffolding for core types, you have to use the library from controller-runtime to handle it. There is an example in controller-runtime.
It is suggested to use kubebuilder to initialize a project, and then you can follow the steps below to add admission webhooks for core types.
Implement Your Handler
Your handler needs to implement the admission.Handler interface.
type podAnnotator struct {
Client client.Client
decoder *admission.Decoder
}
func (a *podAnnotator) Handle(ctx context.Context, req admission.Request) admission.Response {
pod := &corev1.Pod{}
err := a.decoder.Decode(req, pod)
if err != nil {
return admission.Errored(http.StatusBadRequest, err)
}
// mutate the fields in pod
marshaledPod, err := json.Marshal(pod)
if err != nil {
return admission.Errored(http.StatusInternalServerError, err)
}
return admission.PatchResponseFromRaw(req.Object.Raw, marshaledPod)
}
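The last line of Handle returns a response built from the original and the mutated object; conceptually it carries a JSON Patch describing the difference. The sketch below shows what such a patch could look like for the narrow case of added or changed annotations. It is an illustration only: annotationPatch is a hypothetical helper, it is not how controller-runtime computes the patch, and it assumes the annotations map already exists on the object.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// patchOp is one JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value"`
}

// escape applies JSON Pointer (RFC 6901) escaping to a single key.
func escape(key string) string {
	key = strings.ReplaceAll(key, "~", "~0")
	return strings.ReplaceAll(key, "/", "~1")
}

// annotationPatch lists the operations that turn the original annotations
// into the mutated ones. Removed keys are not handled in this sketch.
func annotationPatch(original, mutated map[string]string) []patchOp {
	keys := make([]string, 0, len(mutated))
	for k := range mutated {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var ops []patchOp
	for _, k := range keys {
		path := "/metadata/annotations/" + escape(k)
		old, existed := original[k]
		switch {
		case !existed:
			ops = append(ops, patchOp{Op: "add", Path: path, Value: mutated[k]})
		case old != mutated[k]:
			ops = append(ops, patchOp{Op: "replace", Path: path, Value: mutated[k]})
		}
	}
	return ops
}

func main() {
	ops := annotationPatch(
		map[string]string{"owner": "team-a"},
		map[string]string{"owner": "team-b", "example.com/annotated": "true"},
	)
	out, _ := json.Marshal(ops)
	fmt.Println(string(out))
}
```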
If you need a client, just pass in the client at struct construction time.
If you add the InjectDecoder method for your handler, a decoder will be injected for you.
func (a *podAnnotator) InjectDecoder(d *admission.Decoder) error {
a.decoder = d
return nil
}
Note: in order to have controller-gen generate the webhook configuration for
you, you need to add markers. For example,
// +kubebuilder:webhook:path=/mutate-v1-pod,mutating=true,failurePolicy=fail,groups="",resources=pods,verbs=create;update,versions=v1,name=mpod.kb.io
Update main.go
Now you need to register your handler in the webhook server.
mgr.GetWebhookServer().Register("/mutate-v1-pod", &webhook.Admission{Handler: &podAnnotator{Client: mgr.GetClient()}})
You need to ensure the path here matches the path in the marker.
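One way to keep the two in sync is to derive the path from the group, version, and kind. The helper below follows the naming pattern visible in the marker examples in this document (dots in the group become dashes, the kind is lower-cased, and an empty core group is skipped). webhookPath is a hypothetical function inferred from those examples, not part of controller-runtime.

```go
package main

import (
	"fmt"
	"strings"
)

// webhookPath builds a path such as /mutate-v1-pod from its parts.
// prefix is "mutate" or "validate".
func webhookPath(prefix, group, version, kind string) string {
	parts := []string{prefix}
	if group != "" {
		parts = append(parts, strings.ReplaceAll(group, ".", "-"))
	}
	parts = append(parts, version, strings.ToLower(kind))
	return "/" + strings.Join(parts, "-")
}

func main() {
	fmt.Println(webhookPath("mutate", "", "v1", "Pod"))
	fmt.Println(webhookPath("validate", "batch.tutorial.kubebuilder.io", "v1", "CronJob"))
}
```

Using one constant or helper for both the marker text and the Register call avoids the mismatch described above.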
Deploy
Deploying it is just like deploying a webhook server for a CRD. You need to 1) provision the serving certificate and 2) deploy the server.
You can follow the tutorial.
Markers for Config/Code Generation
KubeBuilder makes use of a tool called controller-gen for generating utility code and Kubernetes YAML. This code and config generation is controlled by the presence of special “marker comments” in Go code.
Markers are single-line comments that start with a plus, followed by a marker name, optionally followed by some marker specific configuration:
// +kubebuilder:validation:Optional
// +kubebuilder:validation:MaxItems=2
// +kubebuilder:printcolumn:JSONPath=".status.replicas",name=Replicas,type=string
See each subsection for information about different types of code and YAML generation.
Generating Code & Artifacts in KubeBuilder
KubeBuilder projects have two make targets that make use of controller-gen:
- make manifests generates Kubernetes object YAML, like CustomResourceDefinitions, WebhookConfigurations, and RBAC roles.
- make generate generates code, like runtime.Object/DeepCopy implementations.
See Generating CRDs for a comprehensive overview.
Marker Syntax
Exact syntax is described in the godocs for controller-tools.
In general, markers may either be:
- Empty (+kubebuilder:validation:Optional): empty markers are like boolean flags on the command line -- just specifying them enables some behavior.
- Anonymous (+kubebuilder:validation:MaxItems=2): anonymous markers take a single value as their argument.
- Multi-option (+kubebuilder:printcolumn:JSONPath=".status.replicas",name=Replicas,type=string): multi-option markers take one or more named arguments. The first argument is separated from the name by a colon, and latter arguments are comma-separated. Order of arguments doesn’t matter. Some arguments may be optional.
Marker arguments may be strings, ints, bools, slices, or maps thereof. Strings, ints, and bools follow their Go syntax:
// +kubebuilder:validation:ExclusiveMaximum=false
// +kubebuilder:validation:Format="date-time"
// +kubebuilder:validation:Maximum=42
For convenience, in simple cases the quotes may be omitted from strings, although this is not encouraged for anything other than single-word strings:
// +kubebuilder:validation:Type=string
Slices may be specified either by surrounding them with curly braces and separating with commas:
// +kubebuilder:validation:Enum={"crackers, Gromit, we forgot the crackers!","not even wensleydale?"}
or, in simple cases, by separating with semicolons:
// +kubebuilder:validation:Enum=Wallace;Gromit;Chicken
Maps are specified with string keys and values of any type (effectively map[string]interface{}). A map is surrounded by curly braces ({}), each key and value is separated by a colon (:), and each key-value pair is separated by a comma:
// +kubebuilder:default={magic: {numero: 42, stringified: forty-two}}
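To make the syntax rules concrete, here is a minimal sketch of splitting a multi-option marker into its named arguments. It is not the controller-tools parser: parseMarker and splitArgs are hypothetical helpers, the marker name must be supplied by the caller, and backtick-quoted strings and escaped quotes are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// splitArgs splits on commas that are outside double quotes and braces,
// so quoted strings and slice/map values stay in one piece.
func splitArgs(s string) []string {
	var parts []string
	var cur strings.Builder
	depth, inQuotes := 0, false
	for _, r := range s {
		switch {
		case r == '"':
			inQuotes = !inQuotes
		case !inQuotes && r == '{':
			depth++
		case !inQuotes && r == '}':
			depth--
		}
		if r == ',' && !inQuotes && depth == 0 {
			parts = append(parts, cur.String())
			cur.Reset()
			continue
		}
		cur.WriteRune(r)
	}
	return append(parts, cur.String())
}

// parseMarker returns the named arguments of a marker comment whose name
// is known in advance, e.g. "kubebuilder:printcolumn".
func parseMarker(comment, name string) (map[string]string, bool) {
	prefix := "// +" + name + ":"
	if !strings.HasPrefix(comment, prefix) {
		return nil, false
	}
	args := map[string]string{}
	for _, part := range splitArgs(strings.TrimPrefix(comment, prefix)) {
		kv := strings.SplitN(part, "=", 2)
		if len(kv) != 2 {
			return nil, false
		}
		args[strings.TrimSpace(kv[0])] = strings.Trim(kv[1], `"`)
	}
	return args, true
}

func main() {
	args, ok := parseMarker(
		`// +kubebuilder:printcolumn:JSONPath=".status.replicas",name=Replicas,type=string`,
		"kubebuilder:printcolumn",
	)
	fmt.Println(ok, args["JSONPath"], args["name"], args["type"])
}
```

A semicolon-separated slice value such as verbs=create;update comes back as one string, which can then be split with strings.Split(value, ";").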
CRD Generation
These markers describe how to construct a custom resource definition from a series of Go types and packages. Generation of the actual validation schema is described by the validation markers.
See Generating CRDs for examples.
- groupName (string): specifies the API group name for this package.
- kubebuilder:printcolumn: adds a column to "kubectl get" output for this CRD.
  - JSONPath (string): specifies the jsonpath expression used to extract the value of the column.
  - description (string): specifies the help/description for this column.
  - format (string): specifies the format of the column. It may be any OpenAPI data format corresponding to the type, listed at https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#data-types.
  - name (string): specifies the name of the column.
  - priority (int): indicates how important it is that this column be displayed. Lower priority (higher numbered) columns will be hidden if the terminal width is too small.
  - type (string): indicates the type of the column. It may be any OpenAPI data type listed at https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#data-types.
- kubebuilder:resource: configures naming and scope for a CRD.
  - categories (string): specifies which group aliases this resource is part of. Group aliases are used to work with groups of resources at once. The most common one is "all", which covers about a third of the base resources in Kubernetes, and is generally used for "user-facing" resources.
  - path (string): specifies the plural "resource" for this CRD. It generally corresponds to a plural, lower-cased version of the Kind. See https://book.kubebuilder.io/cronjob-tutorial/gvks.html.
  - scope (string): overrides the scope of the CRD (cluster vs namespaced). Scope defaults to "namespaced". Cluster-scoped ("cluster") resources don't exist in namespaces.
  - shortName (string): specifies aliases for this CRD. Short names are often used when people have to work with your resource over and over again. For instance, "rs" for "replicaset" or "crd" for customresourcedefinition.
  - singular (string): overrides the singular form of your resource. The singular form is otherwise defaulted off the plural (path).
- kubebuilder:skip: don't consider this package as an API version.
- kubebuilder:skipversion: removes the particular version of the CRD from the CRDs spec. This is useful if you need to skip generating and listing version entries for 'internal' resource versions, which typically exist if using the Kubernetes upstream conversion-gen tool.
- kubebuilder:storageversion: marks this version as the "storage version" for the CRD for conversion. When conversion is enabled for a CRD (i.e. it's not a trivial-versions/single-version CRD), one version is set as the "storage version" to be stored in etcd. Attempting to store any other version will result in conversion to the storage version via a conversion webhook.
- kubebuilder:subresource:scale: enables the "/scale" subresource on a CRD.
  - selectorpath (string): specifies the jsonpath to the pod label selector field for the scale's status. The selector field must be the string form (serialized form) of a selector. Setting a pod label selector is necessary for your type to work with the HorizontalPodAutoscaler.
  - specpath (string): specifies the jsonpath to the replicas field for the scale's spec.
  - statuspath (string): specifies the jsonpath to the replicas field for the scale's status.
- kubebuilder:subresource:status: enables the "/status" subresource on a CRD.
- versionName (string): overrides the API group version for this package (defaults to the package name).
CRD Validation
These markers modify how the CRD validation schema is produced for the types and fields they modify. Each corresponds roughly to an OpenAPI/JSON schema option.
See Generating CRDs for examples.
- kubebuilder:default
- any
sets the default value for this field.
A default value will be accepted as any value valid for the field. Formatting for common types include: boolean:
true
, string:Cluster
, numerical:1.24
, array:{1,2}
, object:{policy: "delete"}
). Defaults should be defined in pruned form, and only best-effort validation will be performed. Full validation of a default requires submission of the containing CRD to an apiserver.- any
- kubebuilder:validation:EmbeddedResource
EmbeddedResource marks a fields as an embedded resource with apiVersion, kind and metadata fields.
An embedded resource is a value that has apiVersion, kind and metadata fields. They are validated implicitly according to the semantics of the currently running apiserver. It is not necessary to add any additional schema for these field, yet it is possible. This can be combined with PreserveUnknownFields.
- kubebuilder:validation:Enum
- any
specifies that this (scalar) field is restricted to the *exact* values specified here.
- any
- kubebuilder:validation:Enum
- any
specifies that this (scalar) field is restricted to the *exact* values specified here.
- any
- kubebuilder:validation:ExclusiveMaximum
- bool
indicates that the maximum is "up to" but not including that value.
- bool
- kubebuilder:validation:ExclusiveMaximum
- bool
indicates that the maximum is "up to" but not including that value.
- bool
- kubebuilder:validation:ExclusiveMinimum
- bool
indicates that the minimum is "up to" but not including that value.
- bool
- kubebuilder:validation:ExclusiveMinimum
- bool
indicates that the minimum is "up to" but not including that value.
- bool
- kubebuilder:validation:Format
- string
specifies additional "complex" formatting for this field.
For example, a date-time field would be marked as “type: string“ and “format: date-time“.
- string
- kubebuilder:validation:Format
- string
specifies additional "complex" formatting for this field.
For example, a date-time field would be marked as “type: string“ and “format: date-time“.
- string
- kubebuilder:validation:MaxItems
- int
specifies the maximum length for this list.
- int
- kubebuilder:validation:MaxItems
- int
specifies the maximum length for this list.
- int
- kubebuilder:validation:MaxLength
- int
specifies the maximum length for this string.
- int
- kubebuilder:validation:MaxLength
- int
specifies the maximum length for this string.
- int
- kubebuilder:validation:Maximum
- int
specifies the maximum numeric value that this field can have.
- int
- kubebuilder:validation:Maximum
- int
specifies the maximum numeric value that this field can have.
- int
- kubebuilder:validation:MinItems
- int
specifies the minimun length for this list.
- int
- kubebuilder:validation:MinItems
- int
specifies the minimun length for this list.
- int
- kubebuilder:validation:MinLength
- int
specifies the minimum length for this string.
- int
- kubebuilder:validation:MinLength
- int
specifies the minimum length for this string.
- int
- kubebuilder:validation:Minimum
- int
specifies the minimum numeric value that this field can have.
- int
- kubebuilder:validation:Minimum
- int
specifies the minimum numeric value that this field can have.
- int
- kubebuilder:validation:MultipleOf
- int
specifies that this field must have a numeric value that's a multiple of this one.
- int
- kubebuilder:validation:MultipleOf
- int
specifies that this field must have a numeric value that's a multiple of this one.
- int
- kubebuilder:validation:Optional
specifies that this field is optional, if fields are required by default.
- kubebuilder:validation:Optional
specifies that all fields in this package are optional by default.
- kubebuilder:validation:Pattern
- string
specifies that this string must match the given regular expression.
- string
- kubebuilder:validation:Pattern
- string
specifies that this string must match the given regular expression.
- string
- kubebuilder:validation:Required
specifies that this field is required, if fields are optional by default.
- kubebuilder:validation:Required
specifies that all fields in this package are required by default.
- kubebuilder:validation:Type
- string
overrides the type for this field (which defaults to the equivalent of the Go type).
This generally must be paired with custom serialization. For example, the metav1.Time field would be marked as “type: string“ and “format: date-time“.
- string
- kubebuilder:validation:Type
- string
overrides the type for this field (which defaults to the equivalent of the Go type).
This generally must be paired with custom serialization. For example, the metav1.Time field would be marked as “type: string“ and “format: date-time“.
- string
- kubebuilder:validation:UniqueItems
- bool
specifies that all items in this list must be unique.
- bool
- kubebuilder:validation:UniqueItems
- bool
specifies that all items in this list must be unique.
- bool
- kubebuilder:validation:XEmbeddedResource
EmbeddedResource marks a fields as an embedded resource with apiVersion, kind and metadata fields.
An embedded resource is a value that has apiVersion, kind and metadata fields. They are validated implicitly according to the semantics of the currently running apiserver. It is not necessary to add any additional schema for these field, yet it is possible. This can be combined with PreserveUnknownFields.
- kubebuilder:validation:XEmbeddedResource
EmbeddedResource marks a fields as an embedded resource with apiVersion, kind and metadata fields.
An embedded resource is a value that has apiVersion, kind and metadata fields. They are validated implicitly according to the semantics of the currently running apiserver. It is not necessary to add any additional schema for these field, yet it is possible. This can be combined with PreserveUnknownFields.
- nullable
marks this field as allowing the "null" value.
This is often not necessary, but may be helpful with custom serialization.
- optional
specifies that this field is optional, if fields are required by default.
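As a rough illustration of how the validation markers above attach to Go fields, here is a minimal sketch. The `WidgetSpec` type, its fields, and the `encode` helper are hypothetical and exist only for this example; the markers are ordinary comments that the Go compiler ignores and that only take effect when controller-gen generates the CRD schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WidgetSpec is a hypothetical type used only to show marker placement.
type WidgetSpec struct {
	// Name must be present and match the regular expression.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Pattern=`^[a-z]+$`
	Name string `json:"name"`

	// Tags may be omitted; when present, its items must be unique.
	// +optional
	// +kubebuilder:validation:UniqueItems=true
	Tags []string `json:"tags,omitempty"`

	// Note may be omitted or explicitly set to null.
	// +optional
	// +nullable
	Note *string `json:"note,omitempty"`
}

// encode serializes a spec to JSON (hypothetical helper for this sketch).
func encode(spec WidgetSpec) string {
	out, err := json.Marshal(spec)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	// Optional fields that are unset are dropped from the encoded form.
	fmt.Println(encode(WidgetSpec{Name: "demo"}))
}
```

Pairing `+optional` with an `omitempty` JSON tag is a common convention rather than a requirement; the marker drives the generated schema, while the tag drives serialization.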
CRD Processing
These markers help control how the Kubernetes API server processes API requests involving your custom resources.
See Generating CRDs for examples.
- kubebuilder:pruning:PreserveUnknownFields
PreserveUnknownFields stops the apiserver from pruning fields which are not specified.
By default the apiserver drops unknown fields from the request payload during the decoding step. This marker stops the API server from doing so. It affects fields recursively, but switches back to normal pruning behaviour if nested properties or additionalProperties are specified in the schema. This can either be true or undefined. False is forbidden.
- kubebuilder:validation:XPreserveUnknownFields
PreserveUnknownFields stops the apiserver from pruning fields which are not specified.
By default the apiserver drops unknown fields from the request payload during the decoding step. This marker stops the API server from doing so. It affects fields recursively, but switches back to normal pruning behaviour if nested properties or additionalProperties are specified in the schema. This can either be true or undefined. False is forbidden.
Webhook
These markers describe how webhook configuration is generated. Use these to keep the description of your webhooks close to the code that implements them.
- kubebuilder:webhook
- failurePolicy
- string
- groups
- string
- mutating
- bool
- name
- string
- path
- string
- resources
- string
- verbs
- string
- versions
- string
specifies how a webhook should be served.
It specifies only the details that are intrinsic to the application serving it (e.g. the resources it can handle, or the path it serves on).
- failurePolicy
- string
specifies what should happen if the API server cannot reach the webhook.
It may be either “ignore“ (to skip the webhook and continue on) or “fail“ (to reject the object in question).
- groups
- string
specifies the API groups that this webhook receives requests for.
- mutating
- bool
marks this as a mutating webhook (it's validating only if false)
Mutating webhooks are allowed to change the object in their response, and are called before all validating webhooks. Mutating webhooks may choose to reject an object, similarly to a validating webhook.
- name
- string
indicates the name of this webhook configuration.
- path
- string
specifies the path on which the API server should connect to this webhook.
- resources
- string
specifies the API resources that this webhook receives requests for.
- verbs
- string
specifies the Kubernetes API verbs that this webhook receives requests for.
Only modification-like verbs may be specified. May be “create“, “update“, “delete“, “connect“, or “*“ (for all).
- versions
- string
specifies the API versions that this webhook receives requests for.
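To make the fields above concrete, the following sketch shows one way a webhook marker might be written. The group, resource, path, and webhook name are assumptions modeled on the Guestbook quick start, and the `Guestbook` type with its `Default` method is a dependency-free stand-in, not the controller-runtime interface itself.

```go
package main

import "fmt"

// The marker below is a plain comment; controller-gen reads it to build a
// (partial) MutatingWebhookConfiguration. All values are illustrative.
// +kubebuilder:webhook:path=/mutate-webapp-my-domain-v1-guestbook,mutating=true,failurePolicy=fail,groups=webapp.my.domain,resources=guestbooks,verbs=create;update,versions=v1,name=mguestbook.kb.io

// Guestbook is a hypothetical stand-in for an API type.
type Guestbook struct {
	Replicas int
}

// Default fills in unset fields, which is the kind of change a mutating
// webhook is allowed to make before validating webhooks run.
func (g *Guestbook) Default() {
	if g.Replicas == 0 {
		g.Replicas = 1
	}
}

func main() {
	g := &Guestbook{}
	g.Default()
	fmt.Println(g.Replicas)
}
```

Multi-valued fields such as `verbs` are separated with semicolons, while the fields themselves are separated with commas.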
Object/DeepCopy
These markers control when DeepCopy and runtime.Object implementation methods are generated.
- k8s:deepcopy-gen
- raw
enables or disables object interface & deepcopy implementation generation for this package
- raw
- k8s:deepcopy-gen
- raw
overrides enabling or disabling deepcopy generation for this type
- raw
- k8s:deepcopy-gen:interfaces
- string
enables object interface implementation generation for this type
- string
- kubebuilder:object:generate
- bool
enables or disables object interface & deepcopy implementation generation for this package
- bool
- kubebuilder:object:generate
- bool
overrides enabling or disabling deepcopy generation for this type
- bool
- kubebuilder:object:root
- bool
enables object interface implementation generation for this type
- bool
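The sketch below approximates what deepcopy generation produces for a simple type. The `Widget` type is hypothetical, and the methods are written by hand here purely for illustration; in a real project controller-gen would emit them into a generated file, so you would not write them yourself.

```go
package main

import "fmt"

// Widget is a hypothetical type. The marker asks controller-gen to generate
// deepcopy methods for it; they are hand-written below only as a sketch.
// +kubebuilder:object:generate=true
type Widget struct {
	Name string
	Tags []string
}

// DeepCopyInto copies the receiver into out, allocating a new backing array
// for the slice so the two values no longer share memory.
func (in *Widget) DeepCopyInto(out *Widget) {
	*out = *in
	if in.Tags != nil {
		out.Tags = make([]string, len(in.Tags))
		copy(out.Tags, in.Tags)
	}
}

// DeepCopy returns a newly allocated copy of the receiver, or nil for nil.
func (in *Widget) DeepCopy() *Widget {
	if in == nil {
		return nil
	}
	out := new(Widget)
	in.DeepCopyInto(out)
	return out
}

func main() {
	a := &Widget{Name: "a", Tags: []string{"x"}}
	b := a.DeepCopy()
	b.Tags[0] = "changed"
	// The original is unaffected by changes to the copy.
	fmt.Println(a.Tags[0], b.Tags[0])
}
```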
RBAC
These markers cause an RBAC ClusterRole to be generated. This allows you to describe the permissions that your controller requires alongside the code that makes use of those permissions.
- kubebuilder:rbac
- groups
- string
- namespace
- string
- resources
- string
- urls
- string
- verbs
- string
specifies an RBAC rule to allow access to some resources or non-resource URLs.
- groups
- string
specifies the API groups that this rule encompasses.
- namespace
- string
specifies the scope of the Rule. If not set, the Rule belongs to the generated ClusterRole. If set, the Rule belongs to a Role, whose namespace is specified by this field.
- resources
- string
specifies the API resources that this rule encompasses.
- urls
- string
specifies the non-resource URLs that this rule encompasses.
- verbs
- string
specifies the (lowercase) kubernetes API verbs that this rule encompasses.
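As an example of keeping permissions next to the code that needs them, the sketch below places RBAC markers above a reconcile method. The group and resource names are assumptions based on the Guestbook quick start, and `GuestbookReconciler` is a simplified stand-in that does not use controller-runtime.

```go
package main

import "fmt"

// GuestbookReconciler is a hypothetical, dependency-free stand-in for a
// scaffolded reconciler.
type GuestbookReconciler struct{}

// Each marker below becomes one rule in the generated ClusterRole.
// +kubebuilder:rbac:groups=webapp.my.domain,resources=guestbooks,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=webapp.my.domain,resources=guestbooks/status,verbs=get;update;patch

// Reconcile stands in for the method whose API calls need the permissions
// declared above; here it only prints the name it was given.
func (r *GuestbookReconciler) Reconcile(name string) error {
	fmt.Println("reconciling", name)
	return nil
}

func main() {
	r := &GuestbookReconciler{}
	if err := r.Reconcile("guestbook-sample"); err != nil {
		panic(err)
	}
}
```

Adding `namespace=<name>` to a marker would, per the field description above, place the rule in a namespaced Role instead of the ClusterRole.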
controller-gen CLI
Kubebuilder makes use of a tool called controller-gen for generating utility code and Kubernetes YAML. This code and config generation is controlled by the presence of special “marker comments” in Go code.
controller-gen is built out of different “generators” (which specify what to generate) and “output rules” (which specify how and where to write the results).
Both are configured through command line options specified in marker format.
For instance,
controller-gen paths=./... crd:trivialVersions=true rbac:roleName=controller-perms output:crd:artifacts:config=config/crd/bases
generates CRDs and RBAC, and specifically stores the generated CRD YAML in config/crd/bases. For the RBAC, it uses the default output rules (config/rbac). It considers every package in the current directory tree (as per the normal rules of the go ... wildcard).
Generators
Each generator is configured through a CLI option. Multiple generators may be used in a single invocation of controller-gen.
- webhook
generates (partial) {Mutating,Validating}WebhookConfiguration objects.
- schemapatch
- manifests
- string
- maxDescLen
- int
patches existing CRDs with new schemata.
For legacy (v1beta1) single-version CRDs, it will simply replace the global schema. For legacy (v1beta1) multi-version CRDs, and any v1 CRDs, it will replace schemata of existing versions and clear the schema from any versions not specified in the Go code. It will not add new versions, or remove old ones. For legacy multi-version CRDs with identical schemata, it will take care of lifting the per-version schema up to the global schema. It will generate output for each available “CRD Version” (the API version of the CRD type itself, e.g. apiextensions/v1beta1 and apiextensions/v1).
- manifests
- string
contains the CustomResourceDefinition YAML files.
- maxDescLen
- int
specifies the maximum description length for fields in CRD's OpenAPI schema.
0 indicates drop the description for all fields completely. n indicates limit the description to at most n characters and truncate the description to closest sentence boundary if it exceeds n characters.
- rbac
- roleName
- string
generates ClusterRole objects.
- roleName
- string
sets the name of the generated ClusterRole.
- object
- headerFile
- string
- year
- string
generates code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations.
- headerFile
- string
specifies the header text (e.g. license) to prepend to generated files.
- year
- string
specifies the year to substitute for " YEAR" in the header file.
- crd
- crdVersions
- string
- maxDescLen
- int
- preserveUnknownFields
- bool
- trivialVersions
- bool
generates CustomResourceDefinition objects.
- crdVersions
- string
specifies the target API versions of the CRD type itself to generate. Defaults to v1beta1.
The first version listed will be assumed to be the “default“ version and will not get a version suffix in the output filename. You‘ll need to use “v1“ to get support for features like defaulting, along with an API server that supports it (Kubernetes 1.16+).
- maxDescLen
- int
specifies the maximum description length for fields in CRD's OpenAPI schema.
0 indicates drop the description for all fields completely. n indicates limit the description to at most n characters and truncate the description to closest sentence boundary if it exceeds n characters.
- preserveUnknownFields
- bool
- trivialVersions
- bool
indicates that we should produce a single-version CRD.
Single “trivial-version“ CRDs are compatible with older (pre 1.13) Kubernetes API servers. The storage version‘s schema will be used as the CRD‘s schema. Only works with the v1beta1 CRD version.
Output Rules
Output rules configure how a given generator outputs its results. There is always one global “fallback” output rule (specified as output:<rule>), plus per-generator overrides (specified as output:<generator>:<rule>).
For brevity, the per-generator output rules (output:<generator>:<rule>) are omitted below. They are equivalent to the global fallback options listed here.
- output:artifacts
- code
- string
- config
- string
outputs artifacts to different locations, depending on whether they're package-associated or not.
Non-package associated artifacts are output to the Config directory, while package-associated ones are output to their package‘s source files‘ directory, unless an alternate path is specified in Code.
- code
- string
overrides the directory in which to write new code (defaults to where the existing code lives).
- config
- string
points to the directory to which to write configuration.
- output:dir
- string
outputs each artifact to the given directory, regardless of if it's package-associated or not.
- string
- output:none
skips outputting anything.
- output:stdout
outputs everything to standard-out, with no separation.
Generally useful for single-artifact outputs.
Other Options
- paths
- string
represents paths and go-style path patterns to use as package roots.
- string
Artifacts
Kubebuilder publishes test binaries and container images in addition to the main binary releases.
Test Binaries
You can find all of the test binaries at https://go.kubebuilder.io/test-tools.
You can find individual test binaries at https://go.kubebuilder.io/test-tools/${version}/${os}/${arch}.
Container Images
You can find all container images for your os at https://go.kubebuilder.io/images/${os} or at gcr.io/kubebuilder/thirdparty-${os}.
You can find individual container images at https://go.kubebuilder.io/images/${os}/${version} or at gcr.io/kubebuilder/thirdparty-${os}:${version}.
Writing controller tests
Testing Kubernetes controllers is a big subject, and the boilerplate testing files generated for you by kubebuilder are fairly minimal.
Writing and Running Integration Tests documents steps to consider when writing integration tests for your controllers, and available options for configuring your test control plane using envtest.
Until more documentation has been written, your best bet to get started is to look at some existing examples, such as:
- Azure Databricks Operator: see their fully fleshed-out suite_test.go as well as any *_test.go file in that directory, like this one.
The basic approach is that, in your generated suite_test.go file, you will create a local Kubernetes API server, instantiate and run your controllers, and then write additional *_test.go files to test it using Ginkgo.
Using envtest in integration tests
controller-runtime offers envtest (godoc), a package that helps write integration tests for your controllers by setting up and starting an instance of etcd and the Kubernetes API server, without kubelet, controller-manager or other components.
Using envtest in integration tests follows the general flow of:
import (
	"path/filepath"

	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

// specify testEnv configuration: load CRDs from the generated manifests
testEnv = &envtest.Environment{
	CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
}

// start testEnv; cfg is the rest.Config used to build clients
cfg, err = testEnv.Start()

// write test logic

// stop testEnv
err = testEnv.Stop()
kubebuilder does the boilerplate setup and teardown of testEnv for you, in the Ginkgo test suite that it generates under the /controllers directory.
Logs from the test runs are prefixed with test-env.
Configuring your test control plane
You can use environment variables and/or flags to specify the api-server and etcd setup within your integration tests.
Environment Variables
Variable name | Type | When to use |
---|---|---|
USE_EXISTING_CLUSTER | boolean | Instead of setting up a local control plane, point to the control plane of an existing cluster. |
KUBEBUILDER_ASSETS | path to directory | Point integration tests to a directory containing all binaries (api-server, etcd and kubectl). |
TEST_ASSET_KUBE_APISERVER, TEST_ASSET_ETCD, TEST_ASSET_KUBECTL | paths to, respectively, api-server, etcd and kubectl binaries | Similar to KUBEBUILDER_ASSETS, but more granular. Point integration tests to use binaries other than the default ones. These environment variables can also be used to ensure specific tests run with expected versions of these binaries. |
KUBEBUILDER_CONTROLPLANE_START_TIMEOUT and KUBEBUILDER_CONTROLPLANE_STOP_TIMEOUT | durations in format supported by time.ParseDuration | Specify timeouts different from the default for the test control plane to (respectively) start and stop; any test run that exceeds them will fail. |
KUBEBUILDER_ATTACH_CONTROL_PLANE_OUTPUT | boolean | Set to true to attach the control plane’s stdout and stderr to os.Stdout and os.Stderr. This can be useful when debugging test failures, as output will include output from the control plane. |
Flags
Here’s an example of modifying the flags with which to start the API server in your integration tests, compared to the default values in envtest.DefaultKubeAPIServerFlags:
customApiServerFlags := []string{
	"--secure-port=6884",
	"--admission-control=MutatingAdmissionWebhook",
}
apiServerFlags := append([]string(nil), envtest.DefaultKubeAPIServerFlags...)
apiServerFlags = append(apiServerFlags, customApiServerFlags...)
testEnv = &envtest.Environment{
	CRDDirectoryPaths:  []string{filepath.Join("..", "config", "crd", "bases")},
	KubeAPIServerFlags: apiServerFlags,
}
Testing considerations
Unless you’re using an existing cluster, keep in mind that no built-in controllers are running in the test context. In some ways, the test control plane will behave differently from “real” clusters, and that might have an impact on how you write tests. One common example is garbage collection; because there are no controllers monitoring built-in resources, objects do not get deleted, even if an OwnerReference is set up.
To test that the deletion lifecycle works, test the ownership instead of asserting on existence. For example:
expectedOwnerReference := v1.OwnerReference{
	Kind:       "MyCoolCustomResource",
	APIVersion: "my.api.example.com/v1beta1",
	UID:        "d9607e19-f88f-11e6-a518-42010a800195",
	Name:       "userSpecifiedResourceName",
}
Expect(deployment.ObjectMeta.OwnerReferences).To(ContainElement(expectedOwnerReference))
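The same ownership check can be sketched without a matcher library. In the example below, `OwnerReference` is a local stand-in for the Kubernetes type with only the fields being compared, and `hasOwner` is a hypothetical helper. It matches on API version, kind, and name, which is one possible choice when the UID is not known ahead of time.

```go
package main

import "fmt"

// OwnerReference is a local stand-in for the Kubernetes OwnerReference type,
// reduced to the fields compared in this sketch.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
}

// hasOwner reports whether refs contains an entry whose API version, kind,
// and name match want. The UID is deliberately not compared.
func hasOwner(refs []OwnerReference, want OwnerReference) bool {
	for _, ref := range refs {
		if ref.APIVersion == want.APIVersion && ref.Kind == want.Kind && ref.Name == want.Name {
			return true
		}
	}
	return false
}

func main() {
	refs := []OwnerReference{{
		APIVersion: "my.api.example.com/v1beta1",
		Kind:       "MyCoolCustomResource",
		Name:       "userSpecifiedResourceName",
		UID:        "d9607e19-f88f-11e6-a518-42010a800195",
	}}
	want := OwnerReference{
		APIVersion: "my.api.example.com/v1beta1",
		Kind:       "MyCoolCustomResource",
		Name:       "userSpecifiedResourceName",
	}
	fmt.Println(hasOwner(refs, want))
}
```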
Metrics
By default, controller-runtime builds a global prometheus registry and publishes a collection of performance metrics for each controller.
Protecting the Metrics
These metrics are protected by kube-rbac-proxy by default if using kubebuilder. Kubebuilder v2.2.0+ scaffolds a ClusterRole, which can be found at config/rbac/auth_proxy_client_clusterrole.yaml.
You will need to grant permissions to your Prometheus server so that it can scrape the protected metrics. To achieve that, you can create a ClusterRoleBinding to bind the ClusterRole to the service account that your Prometheus server uses.
You can run the following kubectl command to create it. If using kubebuilder, <project-prefix> is the namePrefix field in config/default/kustomization.yaml.
kubectl create clusterrolebinding metrics --clusterrole=<project-prefix>-metrics-reader --serviceaccount=<namespace>:<service-account-name>
Exporting Metrics for Prometheus
Follow the steps below to export the metrics using the Prometheus Operator:
- Install Prometheus and Prometheus Operator. We recommend using kube-prometheus in production if you don’t have your own monitoring system. If you are just experimenting, you can install only Prometheus and Prometheus Operator.
- Uncomment the line - ../prometheus in the config/default/kustomization.yaml. It creates the ServiceMonitor resource which enables exporting the metrics.
# [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.
- ../prometheus
Note that, when you install your project in the cluster, it will create the ServiceMonitor to export the metrics. To check the ServiceMonitor, run kubectl get ServiceMonitor -n <project>-system. For example:
$ kubectl get ServiceMonitor -n monitor-system
NAME AGE
monitor-controller-manager-metrics-monitor 2m8s
Also note that the metrics are exported through port 8443 by default, so you can check them in the Prometheus dashboard. To verify, search for the metrics exported from the namespace where the project is running, using the selector {namespace="<project>-system"}.
Publishing Additional Metrics
If you wish to publish additional metrics from your controllers, this can be easily achieved by using the global registry from controller-runtime/pkg/metrics.
One way to achieve this is to declare your collectors as global variables and then register them using init().
For example:
import (
	"github.com/prometheus/client_golang/prometheus"
	"sigs.k8s.io/controller-runtime/pkg/metrics"
)

var (
	goobers = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "goobers_total",
			Help: "Number of goobers processed",
		},
	)
	gooberFailures = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "goober_failures_total",
			Help: "Number of failed goobers",
		},
	)
)

func init() {
	// Register custom metrics with the global prometheus registry
	metrics.Registry.MustRegister(goobers, gooberFailures)
}
You may then record metrics to those collectors from any part of your reconcile loop, and those metrics will be available for Prometheus or other OpenMetrics-compatible systems to scrape.
TODO
If you’re seeing this page, it’s probably because something’s not done in the book yet. Go see if anyone else has found this or bug the maintainers.