Pre-installation Configuration

Deploy Service Mesh

Since Alauda AI leverages Service Mesh capabilities for model inference services, Service Mesh must be deployed in the cluster before deploying Alauda AI. For detailed deployment procedures, refer to Create Service Mesh v1.

INFO

After completing the Prerequisites on the Create Service Mesh page, proceed to the Creating a Service Mesh page and follow the on-screen instructions to finalize the deployment of the Service Mesh.

Preparing the GitLab Service

In Alauda AI, GitLab is the core component for Model Management. Before deploying Alauda AI, you must prepare a GitLab service.

Deployment Options

1. GitLab service requirements

Regardless of deployment method, all GitLab instances must satisfy:

  • Version: Must be v15 or later.
  • Protocol: Must use HTTPS. For setup instructions, refer to Configure HTTPS.
  • Git LFS: Must be enabled. For setup instructions, refer to Managing Large Files with LFS.
  • Hosting: Must be self-hosted (public cloud-hosted GitLab services are not supported).
  • Access Tokens: Disable expiration dates for access tokens.
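To check the version requirement against a running instance, you can query the GitLab version API. The hostname is a placeholder, and the call requires any valid access token:

```shell
# Query the GitLab version via the REST API (requires a valid access token).
# Replace gitlab.example.com and ${TOKEN} with your own values.
curl --silent --header "PRIVATE-TOKEN: ${TOKEN}" \
  "https://gitlab.example.com/api/v4/version"
```

A compliant instance reports a version of 15.0 or later, e.g. `{"version":"15.11.0",...}`.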

2. Use the platform-provided plugin

Deploy a new GitLab service using the 'Alauda Build of GitLab' plugin. For instructions, refer to: Deploy Alauda Build of GitLab.

3. Use your own GitLab service

Alternatively, you can use a self-managed GitLab instance, but it must meet the GitLab service requirements.

GitLab Configuration

Once a GitLab service is available, complete the following configuration steps before deploying Alauda AI.

1. Disable expiration dates for access tokens

If GitLab is running v17.0 or later, you must disable expiration dates for access tokens.

WARNING

If access token expiration remains enabled, the admin token must be refreshed manually at least once a year, or Alauda AI may stop functioning.

To disable expiration dates for new access tokens:

  1. On the left sidebar, at the bottom, select Admin.
  2. Select Settings > General.
  3. Expand Account and limit.
  4. Uncheck the Personal / Project / Group access token expiration checkbox.
  5. Select Save changes.

2. Generate new token

To generate an impersonation token for the admin user:

  1. On the left sidebar, at the bottom, select Admin.
  2. Select Overview > Users.
  3. Select the admin user (Administrator for example).
  4. On the top navigation bar, select Impersonation Tokens.
  5. Select Add new token.
  6. In the popup form:
    1. Enter a Token name for Alauda AI (for example, aml-root).
    2. Remove the Expiration date (select the "x" icon to clear it).
    3. Check ALL scopes (especially the api scope) under Select scopes.
  7. Select Create impersonation token.
  8. Copy the newly generated token shown under Your new impersonation token; you will need it later.

WARNING

Make sure you save the newly generated token; you won't be able to access it again.
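Before proceeding, you can verify that the token works by calling the GitLab REST API as the admin user. The hostname below is illustrative; substitute your own GitLab address and the token you just saved:

```shell
# Verify the impersonation token by fetching the authenticated user's profile.
# Replace gitlab.example.com and ${TOKEN} with your own values.
curl --silent --header "PRIVATE-TOKEN: ${TOKEN}" \
  "https://gitlab.example.com/api/v4/user"
```

A valid token with the api scope returns the admin user's JSON profile; an invalid or expired token returns a 401 error.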

3. Create a Kubernetes secret for the admin token

Next, create a secret named aml-gitlab-admin-token for the GitLab admin token in the cpaas-system namespace:

# Please replace ${TOKEN} with real token saved previously
kubectl create secret generic aml-gitlab-admin-token \
  --from-literal="password=${TOKEN}" \
  -n cpaas-system
  1. Creates a GitLab admin token secret named aml-gitlab-admin-token.
  2. The token is stored under the password key; replace ${TOKEN} with the token you saved earlier.
  3. The secret is created in the cpaas-system namespace.
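You can confirm the secret was created correctly by decoding the stored token. Note that this prints the token in plain text, so run it only in a trusted session:

```shell
# Confirm the secret exists and decode the stored token for a quick sanity check
kubectl get secret aml-gitlab-admin-token -n cpaas-system \
  -o jsonpath='{.data.password}' | base64 -d
```

The output should match the impersonation token you generated earlier.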

Frequently Asked Questions (FAQ)

1. How to optimize GitLab 18.5 and later configuration for large LFS objects?

Problem: When pushing large LFS objects to GitLab 18.5 and later, you may encounter an HTTP 413 error. AI model management often requires uploading large model files via LFS, and these uploads can exceed the default proxy-body-size limit (typically 512m) of the Nginx ingress controller. (The Nginx ingress annotations below are generally version-agnostic and also apply to other GitLab versions that hit LFS upload size limits.)

The following is actual diagnostic output from the Git LFS client; the %!!(string=...) fragments are raw Go formatting artifacts and can be ignored. Focus on the HTTP 413 response, which is the actionable error.

 git push origin main
Locking support detected on remote "origin". Consider enabling it with:
  $ git config lfs.https://gitlab-18-5-aml.alaudatech.net/mlops-demo-ai-test/amlmodels/qa.git/info/lfs.locksverify true
LFS: Client error &{%!!(string=https) %!!(string=) %!!(*url.Userinfo=<nil>) %!!(string=gitlab-18-5-aml.alaudatech.net) %!!(string=/mlops-demo-ai-test/amlmodels/qa.git/gitlab-lfs/objects/fdf756fa7fcbe7404d5c60e26bff1a0c8b8aa1f72ced49e7dd0210fe288fb7fe/988097824) %!!(string=) %!!(bool=false) %!!(bool=false) %!!(string=) %!!(string=) %!!(string=)}s(MISSING) from HTTP 413
Uploading LFS objects:   0% (0/1), 0 B | 0 B/s, done.
error: failed to push some refs to 'https://gitlab-18-5-aml.alaudatech.net/mlops-demo-ai-test/amlmodels/qa.git'

Solution: To handle large file uploads and improve overall performance, you need to configure specific Nginx Ingress annotations on your GitLab service.

Ingress Annotation Parameters

Below is a list of recommended Ingress parameters and their functionality:

  • nginx.ingress.kubernetes.io/proxy-body-size (recommended value: "0"): Disables the client request body size limit, allowing arbitrarily large file uploads (crucial for AI models).
  • nginx.ingress.kubernetes.io/proxy-buffering (recommended value: "off"): Disables proxy buffering, improving response times for large requests and allowing data to stream directly to the client/server.
  • nginx.ingress.kubernetes.io/proxy-read-timeout (recommended value: "3600"): Increases the timeout (in seconds) for reading a response from the proxied server to 1 hour, preventing timeouts during long-running operations.
  • nginx.ingress.kubernetes.io/proxy-request-buffering (recommended value: "off"): Disables buffering of the client request body, passing data directly to the upstream server to reduce memory usage on the ingress controller.
  • nginx.ingress.kubernetes.io/proxy-send-timeout (recommended value: "3600"): Increases the timeout (in seconds) for transmitting a request to the proxied server to 1 hour, supporting prolonged uploads.

Configuration Steps

You can apply these optimizations by updating the GitLabOfficial Custom Resource (CR).

1. Apply via kubectl patch command

Use the following command to directly update the ingress annotations in your GitLabOfficial CR:

# Update GitLabOfficial CR with optimized ingress annotations
kubectl patch gitlabofficial your-instance-name -n your-instance-namespace --type=merge -p '{
  "spec": {
    "helmValues": {
      "global": {
        "ingress": {
          "annotations": {
            "nginx.ingress.kubernetes.io/proxy-body-size": "0",
            "nginx.ingress.kubernetes.io/proxy-buffering": "off",
            "nginx.ingress.kubernetes.io/proxy-read-timeout": "3600",
            "nginx.ingress.kubernetes.io/proxy-request-buffering": "off",
            "nginx.ingress.kubernetes.io/proxy-send-timeout": "3600"
          }
        }
      }
    }
  }
}'
  1. Replace your-instance-name with the name of your GitLabOfficial instance (e.g., gitlab-aml).
  2. Replace your-instance-namespace with the namespace where your GitLabOfficial instance is deployed (e.g., gitlab-system-aml).
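After patching, you can confirm the annotations were merged into the CR and propagated to the rendered Ingress objects. The instance name and namespace below are the same placeholders used above:

```shell
# Inspect the merged annotations on the GitLabOfficial CR
kubectl get gitlabofficial your-instance-name -n your-instance-namespace \
  -o jsonpath='{.spec.helmValues.global.ingress.annotations}'

# Once the ingress is re-rendered, confirm the body-size annotation
# reached the Ingress objects in the instance namespace
kubectl get ingress -n your-instance-namespace \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.metadata.annotations.nginx\.ingress\.kubernetes\.io/proxy-body-size}{"\n"}{end}'
```

Each ingress serving Git traffic should report a proxy-body-size of "0".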

2. YAML Hierarchy Reference

For reference, the hierarchical structure of the ingress.annotations within the GitLabOfficial CR spec is as follows:

apiVersion: gitlab.alauda.io/v1alpha1
kind: GitLabOfficial
metadata:
  name: gitlab-aml
  namespace: gitlab-system-aml
spec:
  # ... other specs ...
  helmValues:
    global:
      ingress:
        annotations:
          nginx.ingress.kubernetes.io/proxy-body-size: "0"
          nginx.ingress.kubernetes.io/proxy-buffering: "off"
          nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
          nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
          nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
  1. These optimizations ensure GitLab 18.5 and later can seamlessly handle large AI model uploads via Git LFS and improve overall data transfer stability.
  2. We highly recommend applying these configurations during the initial GitLab deployment to prevent post-deployment operational issues.
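Once the annotations take effect, you can verify large uploads end to end with a throwaway LFS push from any repository on the instance. The file name, size, and branch below are illustrative:

```shell
# Generate a ~600 MB test file, track it with Git LFS, and push it.
# Run inside a clone of a repository on your GitLab instance.
git lfs install
git lfs track "*.bin"
dd if=/dev/zero of=model-test.bin bs=1M count=600
git add .gitattributes model-test.bin
git commit -m "test: large LFS upload"
git push origin main
```

If the push completes without an HTTP 413 error, the ingress configuration is working; remember to remove the test file afterwards.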