feat: Add ragengine controller scaffolding code and chart (#600)
This PR adds the scaffolding code and chart for the ragengine
controller.

- Create a new directory in /cmd to hold the ragengine main package.
Move the workspace main package to /cmd/workspace.
- Create pkg/controllers/ragengine_controller.go with the minimum code
required to run a controller.
- Add Makefile entries for building the binary and the image.
- Add a new helm chart for installing the ragengine controller deployment.
-- The major change is adding `fullnameOverride: ragengine` in
values.yaml so that the existing workspace chart can be largely
reused.

Testing:

I can build kaito/ragengine:0.0.1 in my ACR by running `make
docker-build-ragengine`.
```
helm install ragengine ./charts/kaito/ragengine --set image.repository=guofei.azurecr.io/kaito/ragengine --set image.tag=0.0.1 --namespace ragengine --create-namespace
NAME: ragengine
LAST DEPLOYED: Thu Sep 19 10:31:04 2024
NAMESPACE: ragengine
STATUS: deployed
REVISION: 1
TEST SUITE: None

 k get pod -n ragengine
NAME                         READY   STATUS    RESTARTS   AGE
ragengine-5756f8f58f-smmhl   1/1     Running   0          73s

k logs ragengine-5756f8f58f-smmhl -n ragengine
I0919 17:31:14.940028       1 main.go:153] "starting manager"
2024-09-19T17:31:14Z    INFO    controller-runtime.metrics      Starting metrics server
2024-09-19T17:31:14Z    INFO    controller-runtime.metrics      Serving metrics server  {"bindAddress": ":8080", "secure": false}
2024-09-19T17:31:14Z    INFO    starting server {"name": "health probe", "addr": "[::]:8081"}
2024-09-19T17:31:14Z    INFO    Starting EventSource    {"controller": "ragengine", "controllerGroup": "kaito.sh", "controllerKind": "RAGEngine", "source": "kind source: *v1alpha1.RAGEngine"}
2024-09-19T17:31:14Z    INFO    Starting Controller     {"controller": "ragengine", "controllerGroup": "kaito.sh", "controllerKind": "RAGEngine"}
2024-09-19T17:31:16Z    INFO    Starting workers        {"controller": "ragengine", "controllerGroup": "kaito.sh", "controllerKind": "RAGEngine", "worker count": 5}
```
Fei-Guo authored Sep 21, 2024
1 parent a06cf97 commit ba1a62d
Showing 22 changed files with 917 additions and 5 deletions.
20 changes: 20 additions & 0 deletions Makefile
@@ -186,6 +186,15 @@ docker-build-kaito: docker-buildx
		--pull \
		--tag $(REGISTRY)/$(IMG_NAME):$(IMG_TAG) .

.PHONY: docker-build-ragengine
docker-build-ragengine: docker-buildx
	docker buildx build \
		--file ./docker/ragengine/Dockerfile \
		--output=$(OUTPUT_TYPE) \
		--platform="linux/$(ARCH)" \
		--pull \
		--tag $(REGISTRY)/$(IMG_NAME):$(IMG_TAG) .

.PHONY: docker-build-adapter
docker-build-adapter: docker-buildx
	docker buildx build \
@@ -309,6 +318,17 @@ LOCALBIN ?= $(shell pwd)/bin
$(LOCALBIN):
	mkdir -p $(LOCALBIN)

## --------------------------------------
## RAGEngine
## --------------------------------------
.PHONY: build-ragengine
build-ragengine: manifests generate fmt vet
	go build -o bin/rag-engine-manager cmd/ragengine/*.go

.PHONY: run-ragengine
run-ragengine: manifests generate fmt vet
	go run ./cmd/ragengine/main.go

##@ Deployment
ifndef ignore-not-found
ignore-not-found = false
25 changes: 25 additions & 0 deletions charts/kaito/ragengine/Chart.yaml
@@ -0,0 +1,25 @@
apiVersion: v2
name: ragengine
description: A Helm chart to install AI Toolchain Operator RAGEngine
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.0.1

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: 0.0.1
home: https://github.com/Azure/kaito
sources:
  - https://github.com/Azure/kaito
maintainers:
  - name: Fei-Guo
    email: vrgf2003@gmail.com
  - name: helayoty
    email: hebaelayoty@gmail.com
  - name: ishaansehgal99
    email: ishaanforthewin@gmail.com
35 changes: 35 additions & 0 deletions charts/kaito/ragengine/README.md
@@ -0,0 +1,35 @@
# KAITO RAGEngine Helm Chart

## Install

```bash
export REGISTRY=mcr.microsoft.com/aks/kaito
export IMG_NAME=ragengine
export IMG_TAG=0.0.1
helm install ragengine ./charts/kaito/ragengine \
  --set image.repository=${REGISTRY}/${IMG_NAME} --set image.tag=${IMG_TAG} \
  --namespace ragengine --create-namespace
```

## Values

| Key | Type | Default | Description |
|------------------------------------------|--------|-----------------------------------------|---------------------------------------------------------------|
| affinity | object | `{}` | |
| image.pullPolicy | string | `"IfNotPresent"` | |
| image.repository | string | `mcr.microsoft.com/aks/kaito/ragengine` | |
| image.tag | string | `"0.3.0"` | |
| imagePullSecrets | list | `[]` | |
| nodeSelector | object | `{}` | |
| podAnnotations | object | `{}` | |
| podSecurityContext.runAsNonRoot | bool | `true` | |
| presetRegistryName | string | `"mcr.microsoft.com/aks/kaito"` | |
| replicaCount | int | `1` | |
| resources.limits.cpu | string | `"500m"` | |
| resources.limits.memory | string | `"128Mi"` | |
| resources.requests.cpu | string | `"10m"` | |
| resources.requests.memory | string | `"64Mi"` | |
| securityContext.allowPrivilegeEscalation | bool | `false` | |
| securityContext.capabilities.drop[0] | string | `"ALL"` | |
| tolerations | list | `[]` | |
| webhook.port | int | `9443` | |
269 changes: 269 additions & 0 deletions charts/kaito/ragengine/crds/kaito.sh_ragengines.yaml
@@ -0,0 +1,269 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.15.0
  name: ragengines.kaito.sh
spec:
  group: kaito.sh
  names:
    categories:
    - ragengine
    kind: RAGEngine
    listKind: RAGEngineList
    plural: ragengines
    singular: ragengine
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        description: RAGEngine is the Schema for the ragengine API
        properties:
          apiVersion:
            description: |-
              APIVersion defines the versioned schema of this representation of an object.
              Servers should convert recognized schemas to the latest internal value, and
              may reject unrecognized values.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
            type: string
          kind:
            description: |-
              Kind is a string value representing the REST resource this object represents.
              Servers may infer this from the endpoint the client submits requests to.
              Cannot be updated.
              In CamelCase.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
            type: string
          metadata:
            type: object
          spec:
            properties:
              compute:
                description: Compute specifies the dedicated GPU resource used by
                  an embedding model running locally if required.
                properties:
                  count:
                    default: 1
                    description: Count is the required number of GPU nodes.
                    type: integer
                  instanceType:
                    default: Standard_NC12s_v3
                    description: |-
                      InstanceType specifies the GPU node SKU.
                      This field defaults to "Standard_NC12s_v3" if not specified.
                    type: string
                  labelSelector:
                    description: LabelSelector specifies the required labels for the
                      GPU nodes.
                    properties:
                      matchExpressions:
                        description: matchExpressions is a list of label selector
                          requirements. The requirements are ANDed.
                        items:
                          description: |-
                            A label selector requirement is a selector that contains values, a key, and an operator that
                            relates the key and values.
                          properties:
                            key:
                              description: key is the label key that the selector
                                applies to.
                              type: string
                            operator:
                              description: |-
                                operator represents a key's relationship to a set of values.
                                Valid operators are In, NotIn, Exists and DoesNotExist.
                              type: string
                            values:
                              description: |-
                                values is an array of string values. If the operator is In or NotIn,
                                the values array must be non-empty. If the operator is Exists or DoesNotExist,
                                the values array must be empty. This array is replaced during a strategic
                                merge patch.
                              items:
                                type: string
                              type: array
                              x-kubernetes-list-type: atomic
                          required:
                          - key
                          - operator
                          type: object
                        type: array
                        x-kubernetes-list-type: atomic
                      matchLabels:
                        additionalProperties:
                          type: string
                        description: |-
                          matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
                          map is equivalent to an element of matchExpressions, whose key field is "key", the
                          operator is "In", and the values array contains only "value". The requirements are ANDed.
                        type: object
                    type: object
                    x-kubernetes-map-type: atomic
                  preferredNodes:
                    description: |-
                      PreferredNodes is an optional node list specified by the user.
                      If a node in the list does not have the required labels or
                      the required instanceType, it will be ignored.
                    items:
                      type: string
                    type: array
                required:
                - labelSelector
                type: object
              embedding:
                description: |-
                  Embedding specifies whether the RAG engine generates embedding vectors using a remote service
                  or using a embedding model running locally.
                properties:
                  local:
                    description: Local specifies how to generate embeddings for index
                      data using a model run locally.
                    properties:
                      image:
                        description: Image is the name of the containerized embedding
                          model image.
                        type: string
                      imagePullSecret:
                        type: string
                      modelAccessSecret:
                        description: ModelAccessSecret is the name of the secret that
                          contains the huggingface access token.
                        type: string
                      modelID:
                        description: |-
                          ModelID is the ID of the embedding model hosted by huggingface, e.g., BAAI/bge-small-en-v1.5.
                          When this field is specified, the RAG engine will download the embedding model
                          from huggingface repository during startup. The embedding model will not persist in local storage.
                          Note that if Image is specified, ModelID should not be specified and vice versa.
                        type: string
                    type: object
                  remote:
                    description: |-
                      Remote specifies how to generate embeddings for index data using a remote service.
                      Note that either Remote or Local needs to be specified, not both.
                    properties:
                      accessSecret:
                        description: AccessSecret is the name of the secret that contains
                          the service access token.
                        type: string
                      url:
                        description: URL points to a publicly available embedding
                          service, such as OpenAI.
                        type: string
                    required:
                    - url
                    type: object
                type: object
              indexServiceName:
                description: |-
                  IndexServiceName is the name of the service which exposes the endpoint for user to input the index data
                  to generate embeddings. If not specified, a default service name will be created by the RAG engine.
                type: string
              inferenceService:
                properties:
                  accessSecret:
                    description: AccessSecret is the name of the secret that contains
                      the service access token.
                    type: string
                  url:
                    description: URL points to a running inference service endpoint
                      which accepts http(s) payload.
                    type: string
                required:
                - url
                type: object
              queryServiceName:
                description: |-
                  QueryServiceName is the name of the service which exposes the endpoint for accepting user queries to the
                  inference service. If not specified, a default service name will be created by the RAG engine.
                type: string
              storage:
                description: |-
                  Storage specifies how to access the vector database used to save the embedding vectors.
                  If this field is not specified, by default, an in-memory vector DB will be used.
                  The data will not be persisted.
                type: object
            required:
            - embedding
            - inferenceService
            type: object
          status:
            description: RAGEngineStatus defines the observed state of RAGEngine
            properties:
              conditions:
                items:
                  description: "Condition contains details for one aspect of the current
                    state of this API Resource.\n---\nThis struct is intended for
                    direct use as an array at the field path .status.conditions. For
                    example,\n\n\n\ttype FooStatus struct{\n\t // Represents the
                    observations of a foo's current state.\n\t // Known .status.conditions.type
                    are: \"Available\", \"Progressing\", and \"Degraded\"\n\t //
                    +patchMergeKey=type\n\t // +patchStrategy=merge\n\t // +listType=map\n\t
                    \ // +listMapKey=type\n\t Conditions []metav1.Condition `json:\"conditions,omitempty\"
                    patchStrategy:\"merge\" patchMergeKey:\"type\" protobuf:\"bytes,1,rep,name=conditions\"`\n\n\n\t
                    \ // other fields\n\t}"
                  properties:
                    lastTransitionTime:
                      description: |-
                        lastTransitionTime is the last time the condition transitioned from one status to another.
                        This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
                      format: date-time
                      type: string
                    message:
                      description: |-
                        message is a human readable message indicating details about the transition.
                        This may be an empty string.
                      maxLength: 32768
                      type: string
                    observedGeneration:
                      description: |-
                        observedGeneration represents the .metadata.generation that the condition was set based upon.
                        For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
                        with respect to the current state of the instance.
                      format: int64
                      minimum: 0
                      type: integer
                    reason:
                      description: |-
                        reason contains a programmatic identifier indicating the reason for the condition's last transition.
                        Producers of specific condition types may define expected values and meanings for this field,
                        and whether the values are considered a guaranteed API.
                        The value should be a CamelCase string.
                        This field may not be empty.
                      maxLength: 1024
                      minLength: 1
                      pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
                      type: string
                    status:
                      description: status of the condition, one of True, False, Unknown.
                      enum:
                      - "True"
                      - "False"
                      - Unknown
                      type: string
                    type:
                      description: |-
                        type of condition in CamelCase or in foo.example.com/CamelCase.
                        ---
                        Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
                        useful (see .node.status.conditions), the ability to deconflict is important.
                        The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
                      maxLength: 316
                      pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
                      type: string
                  required:
                  - lastTransitionTime
                  - message
                  - reason
                  - status
                  - type
                  type: object
                type: array
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}
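
Reviewer note (illustrative only, not part of this commit): the schema above requires `spec.embedding` and `spec.inferenceService.url`, and requires `labelSelector` whenever `compute` is set. A minimal manifest that should satisfy those constraints might look like the sketch below. The resource name, namespace, label, and inference URL are hypothetical placeholders; the instance type and model ID are taken from the defaults and examples in the schema descriptions.

```yaml
# Hypothetical example manifest; none of these values come from the commit itself.
apiVersion: kaito.sh/v1alpha1
kind: RAGEngine
metadata:
  name: ragengine-example              # hypothetical name
  namespace: default                   # hypothetical namespace
spec:
  compute:
    instanceType: Standard_NC12s_v3    # schema default for instanceType
    labelSelector:                     # required whenever compute is specified
      matchLabels:
        apps: ragengine-example        # hypothetical label
  embedding:
    local:                             # specify either local or remote, not both
      modelID: BAAI/bge-small-en-v1.5  # example ID quoted in the schema description
  inferenceService:
    url: http://inference.example.svc/v1/chat   # hypothetical endpoint
```

Because this commit only adds scaffolding, the controller would be expected to observe such an object without reconciling it into running services yet.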