KServe Autoscalers: KPA vs. HPA

KServe supports two types of autoscaler: the Knative Pod Autoscaler (KPA) and the Horizontal Pod Autoscaler (HPA).

KPA is enabled by default when Knative Serving is installed, whereas HPA requires a separate installation and configuration.

KPA

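Supports scale to zero.
Part of the Knative Serving core, so it is enabled by default when Knative Serving is installed.
Does not support CPU- or memory-based autoscaling; it scales on request metrics (concurrency or requests per second).
Optimized for HTTP-based workloads.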
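
As a rough sketch of the KPA side, the InferenceService below leaves the autoscaler class at its default (KPA), scales on request concurrency, and allows scale to zero. The name iris-kpa and the values 5 and 3 are illustrative assumptions; the service account and storageUri are simply reused from the HPA example at the end of this post.

```yaml
# Sketch only: KPA-based autoscaling with scale to zero.
# No class annotation is set, so the default (KPA) applies.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "iris-kpa"            # hypothetical name
spec:
  predictor:
    serviceAccountName: s3-sa
    minReplicas: 0            # 0 lets KPA scale the predictor down to zero
    maxReplicas: 3
    scaleMetric: concurrency  # request-based metric handled by KPA
    scaleTarget: 5            # assumed target: about 5 in-flight requests per pod
    model:
      modelFormat:
        name: sklearn
      storageUri: "s3://sandbox/iris_svm/v1/"
```

With minReplicas set to 0, the first request after an idle period has to wait for a pod to start, so this setting suits workloads that can tolerate cold starts.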

HPA

Must be installed separately after Knative Serving is installed.
Does not support scale to zero.
Supports CPU- and memory-based autoscaling.

Installing the HPA autoscaler

Adjust `knative-v1.12.4` in the URL to match the Knative Serving version in use, then install.

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.12.4/serving-hpa.yaml

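Once the installation completes, an autoscaler-hpa pod is created in the knative-serving namespace, as shown below (k is assumed here to be a shell alias for kubectl).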

[root@km ~]# k get pod -n knative-serving | grep hpa
autoscaler-hpa-d9f489c5f-9nrs9          2/2     Running     0          47h

Configuration

The autoscaler implementation (KPA or HPA) can be selected with an annotation.

The pod-autoscaler-class entry in the config-autoscaler ConfigMap sets the default autoscaler to kpa, so KPA is used whenever the annotation is omitted.

[root@km ~]# k get cm -n knative-serving config-autoscaler -oyaml | grep pod-autoscaler-class | head -n 2
    # pod-autoscaler-class specifies the default pod autoscaler class
    pod-autoscaler-class: "kpa.autoscaling.knative.dev"
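
If HPA should be the default for every revision rather than something set per service, the same ConfigMap key can be changed. The fragment below is a sketch, assuming the key sits directly under data as in a stock Knative Serving install. Only the relevant key is shown, so merge it into the existing ConfigMap (for example with kubectl edit) instead of replacing the whole object.

```yaml
# Sketch only: change the cluster-wide default autoscaler class.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # The shipped default is kpa.autoscaling.knative.dev
  pod-autoscaler-class: "hpa.autoscaling.knative.dev"
```

A service can still opt back into KPA by setting its own class annotation to kpa.autoscaling.knative.dev.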

To use HPA, set metadata.annotations.autoscaling.knative.dev/class to hpa.autoscaling.knative.dev, as shown below.

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "cpu"
  annotations:
    autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
spec:
  predictor:
    serviceAccountName: s3-sa
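    scaleTarget: 75    # target average CPU utilization, in percent
    scaleMetric: cpu   # scale on CPU usage (needs the HPA class above)
    maxReplicas: 3     # upper bound on predictor replicas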
    resources:
      requests:
        cpu: "100m"
        memory: "100Mi"
      limits:
        cpu: "1"
        memory: "200Mi"
    model:
      modelFormat:
        name: sklearn
      storageUri: "s3://sandbox/iris_svm/v1/"

kyeongseo.oh
