
Idle Usage #63

Closed
chris-schra opened this issue Dec 19, 2021 · 9 comments
Labels
documentation Improvements or additions to documentation

Comments

@chris-schra

Dear All,

The pricing page says:
"You can optionally choose to keep a minimum capacity always running in idle mode to optimize response times based on your needs. This capacity is charged at a reduced rate when you are not processing any requests."

Where/how do I opt-in to that?

@johnnyruz

I believe this is just controlled via the minReplicas configuration on the container app. If you set minReplicas to anything greater than 0, Container Apps will keep that number of instances always running and you will be charged for them. However, if these running replicas are not currently processing any API calls or background jobs, they are considered idle and are billed at the lower rate.

As an example, I have set up a container app to act as a build agent for Azure DevOps Pipelines builds. I have minReplicas set to 1 so that I always have one agent in the pool that can serve requests immediately. However, if I'm not currently executing any builds, that container is considered "idle" and is billed at the lower rate.
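
As a minimal sketch of that setting, here is the relevant fragment in the same `scale` schema as the container app YAML posted further down in this thread. The `maxReplicas` value is only an illustrative assumption, and the billing comment restates the reading of the pricing page given above rather than confirmed behavior.

```yaml
properties:
  template:
    scale:
      minReplicas: 1    # keep one replica running at all times (never scale to zero)
      maxReplicas: 10   # illustrative upper bound (assumption)
```

With `minReplicas: 0` the app can scale to zero instead, which trades the always-on replica for a cold start on the first request.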

@cmenzi

cmenzi commented Dec 20, 2021

@johnnyruz I've implemented a GitLab Runner scaler. GitLab provides a gitlab-runner run-single command, which waits (blocks) for a new job and then exits.

My first question:

  • What is considered "idle"? If the replicas all count as idle, we could also set minReplicas to 2, 3, and so on. 🤔

Second question:

  • How did you solve long-running executions? The scale-down operation could kill your job at any time. Does the HPA send a SIGTERM, wait just 30 seconds, and then send a SIGKILL?

I've also tried the Azure Queue scaler, but it has some delay because of the pollingInterval. I've also checked the implementation of the Azure Pipelines scaler, which likewise just polls for queue changes. So scaling like this is not very nice! Scaling Jobs seems to me to be the only right solution.
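
To make the second question concrete, here is a minimal bash sketch of a wrapper that does not exit on SIGTERM but keeps waiting for the running job. `run_job_with_grace` is a hypothetical helper name, not part of gitlab-runner or Container Apps, and the sketch only helps if the job finishes before the platform's grace period ends and SIGKILL follows.

```bash
#!/usr/bin/env bash
# Sketch: let a running job finish when the wrapper shell receives SIGTERM.
# run_job_with_grace is a hypothetical helper, not a gitlab-runner feature.

run_job_with_grace() {
  local job_pid

  # Install the handler before starting the job so an early SIGTERM is not lost.
  trap 'echo "SIGTERM received, letting the current job finish"' TERM

  # Start the job (the function's arguments) in the background.
  "$@" &
  job_pid=$!

  # A trapped signal makes `wait` return early, so loop until the
  # job process is really gone. The job's exit status is not propagated.
  while kill -0 "$job_pid" 2>/dev/null; do
    wait "$job_pid" || true
  done

  # Restore the default SIGTERM behavior.
  trap - TERM
}
```

A call such as `run_job_with_grace gitlab-runner run-single --max-builds 1` would keep the container alive through SIGTERM until the build ends or the grace period expires, whichever comes first.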

@tomkerkhove
Member

@cmenzi Is that scaler open-source? If so, are you willing to contribute it to KEDA core? If not, are you willing to list it on Artifact Hub?

What you need here is indeed a job, not a daemon, because jobs allow you to run to completion. The Azure Pipelines scaler supports jobs as well, so that should be good to go; but we digress.
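
For reference, this is roughly what a run-to-completion setup looks like in plain KEDA on Kubernetes. It is a minimal sketch only: the resource name, image, pool ID, and environment variable names are placeholders, and nothing here implies that Azure Container Apps accepts this resource.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: pipelines-agent-job              # hypothetical name
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: agent
          image: myacr.azurecr.io/build-agent:latest   # placeholder image
        restartPolicy: Never             # run once to completion, do not restart
  pollingInterval: 30                    # seconds between queue checks
  maxReplicaCount: 10
  triggers:
  - type: azure-pipelines
    metadata:
      poolID: "1"                                  # placeholder agent pool ID
      organizationURLFromEnv: "AZP_URL"            # assumed env var name
      personalAccessTokenFromEnv: "AZP_TOKEN"      # assumed env var name
```

Because each queued build becomes its own Kubernetes Job, there is no scale-down step that could terminate an agent in the middle of a build.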

Idle is typically when there is no work, so this is mainly for queue-like workloads or when there are no HTTP workloads. We have recently introduced idleReplicaCount to decouple it from minReplicaCount but I'm not sure where Azure Container Apps is in supporting this.

@ahmelsayed do you know?
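
As an illustration of the idleReplicaCount setting mentioned above, here is a minimal KEDA ScaledObject sketch for plain Kubernetes. The object, Deployment, queue, and environment variable names are hypothetical, and to my understanding KEDA currently only accepts 0 as the idle value.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker-scaler      # hypothetical name
spec:
  scaleTargetRef:
    name: queue-worker           # hypothetical Deployment
  idleReplicaCount: 0            # replicas while no trigger is active
  minReplicaCount: 1             # floor once any trigger becomes active
  maxReplicaCount: 10
  triggers:
  - type: azure-queue
    metadata:
      queueName: jobs                          # hypothetical queue
      queueLength: "5"                         # target messages per replica
      connectionFromEnv: STORAGE_CONNECTION    # hypothetical env var
```

Whether Azure Container Apps exposes this field is exactly the open question in this comment.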

@cmenzi

cmenzi commented Dec 20, 2021

@tomkerkhove The scaler I've implemented is currently "a hack" to check capabilities. I've created an Azure Function that subscribes to the GitLab "Job Events" webhook; for each new job, it adds a blob to Azure Storage. Once the job is finished, it removes the blob, so the app scales down. :-)

But if you're saying the right way is to implement a scaler for KEDA core, I would love to invest in this.

Could you please elaborate on how the Azure Pipelines scaler supports jobs within Azure Container Apps? When I look at the code, it just implements the scaler interface.

When I look at the Azure Container Apps YAML spec, it looks to me like it doesn't support jobs.

kind: containerapp
location: northeurope
name: gitlab-scaler
resourceGroup: my-resource-group
type: Microsoft.Web/containerApps
tags:
  version: 0.1.0-beta0002
properties:
  kubeEnvironmentId: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-resource-group/providers/Microsoft.Web/kubeEnvironments/my-containerappenv
  configuration:
    activeRevisionsMode: Single
    secrets:
    - name: acr-password-1
      value: my-acr-password-1
    - name: gitlab-registration-token
      value: my-gitlab-registration-token
    - name: gitlab-scaler-storage
      value: DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=1234==;EndpointSuffix=core.windows.net
    registries:
    - server: myacr.azurecr.io
      username: myacr
      passwordSecretRef: acr-password-1
  template:
    containers:
    - image: myacr.azurecr.io/build-agent-linux-aca:latest
      name: build-agent-linux
      env:
      - name: REGISTRATION_TOKEN
        secretRef: gitlab-registration-token
      - name: CI_SERVER_URL
        value: https://my-gitlab-server/
      - name: RUNNER_TAG_LIST
        value: aca,lnx
      resources:
        cpu: 0.5
        memory: 1.0Gi
    scale:
      minReplicas: 0
      maxReplicas: 10
      rules:
      - name: azure-blob-based-scaling
        custom:
          type: azure-blob
          metadata:
            accountName: my-storage-account
            blobContainerName: gitlab-jobs
            blobCount: 1
          auth:
          - secretRef: gitlab-scaler-storage
            triggerParameter: connection
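
To make the effect of this rule concrete, here is a small bash sketch of the replica arithmetic. It assumes the azure-blob rule behaves like a standard KEDA/HPA average-value target, that is, the ceiling of (blob count / blobCount) clamped to the configured minimum and maximum. `desired_replicas` is a hypothetical helper, and the sketch ignores polling delays and scale-down stabilization.

```bash
#!/usr/bin/env bash
# desired_replicas BLOBS TARGET MIN MAX
# Hypothetical helper: estimate the replica count for a blob-count rule,
# assuming HPA-style scaling: ceil(BLOBS / TARGET), clamped to [MIN, MAX].
desired_replicas() {
  local blobs=$1 target=$2 min=$3 max=$4
  local desired=$(( (blobs + target - 1) / target ))   # integer ceiling
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Values from the YAML above: blobCount 1, minReplicas 0, maxReplicas 10.
desired_replicas 3 1 0 10    # three pending jobs, prints 3
desired_replicas 25 1 0 10   # more jobs than maxReplicas, prints 10
```

With blobCount set to 1, every blob (one per pending job) maps to one runner replica until maxReplicas is reached.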

Here is what I've done so far, but it's really a PoC.

Startup.cs

namespace Buhler.IoT.GitLabScaler
{
    using Azure.Storage.Blobs;
    using Microsoft.Azure.Functions.Extensions.DependencyInjection;
    using Microsoft.Extensions.DependencyInjection;

    using NGitLab;
    using NGitLab.Models;
    using System;
    using System.Linq;

    internal class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            SetupGitLabWebHooks(builder);
        }

        private static void SetupGitLabWebHooks(IFunctionsHostBuilder builder)
        {
            var context = builder.GetContext();

            var storageAccountConnectionString = context.Configuration["GITLAB_SCALER_STORAGE"];

            var gitlabScalerFunctionWebHookUrl = new Uri(context.Configuration["GITLAB_SCALER_FUNCTION_URL"]);

            var gitLabUrl = context.Configuration["GITLAB_URL"];
            var gitLabToken = context.Configuration["GITLAB_TOKEN"];
            var gitLabGroups = context.Configuration["GITLAB_GROUPS"].Split(",");
            var gitLabSingleProject = context.Configuration["GITLAB_SINGLE_PROJECT"];

            var gitLabClient = new GitLabClient(gitLabUrl, gitLabToken);
            var blobContainerClient = new BlobContainerClient(storageAccountConnectionString, "gitlab-jobs");

            _ = blobContainerClient.CreateIfNotExists();

            _ = builder.Services.AddSingleton<IGitLabClient>(gitLabClient)
                                .AddSingleton(blobContainerClient);

            var allProjects = Enumerable.Empty<Project>();

            if (!string.IsNullOrEmpty(gitLabSingleProject))
            {
                var singleProject = gitLabClient.Projects[gitLabSingleProject];
                allProjects = allProjects.Append(singleProject);
            }
            else
            {
                allProjects = gitLabGroups.SelectMany(x => gitLabClient.Groups[x].Projects);
            }

            foreach (var project in allProjects)
            {
                var repository = gitLabClient.GetRepository(project.Id);
                var allProjectHooks = repository.ProjectHooks.All.Where(x => x.Url == gitlabScalerFunctionWebHookUrl);
                if (!allProjectHooks.Any())
                {
#pragma warning disable S106 // Standard outputs should not be used directly to log anything
                    Console.WriteLine($"Creating web hook for project '{project.NameWithNamespace}'...");
#pragma warning restore S106 // Standard outputs should not be used directly to log anything
                    _ = repository.ProjectHooks.Create(new ProjectHookUpsert { Url = gitlabScalerFunctionWebHookUrl, JobEvents = true });
                }
            }
        }
    }
}

GitLabWebHook.cs

namespace Buhler.IoT.GitLabScaler
{
    using System;
    using System.IO;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Azure.WebJobs.Extensions.Http;
    using Microsoft.AspNetCore.Http;
    using Microsoft.Extensions.Logging;
    using Newtonsoft.Json;
    using System.Linq;
    using Newtonsoft.Json.Linq;
    using Azure.Storage.Blobs;
    using System.Diagnostics.CodeAnalysis;

    public class GitLabWebHook
    {
        private const int MaxQueueSize = 10;

        private readonly BlobContainerClient _blobContainerClient;
        private readonly ILogger<GitLabWebHook> _logger;

        public GitLabWebHook(
            BlobContainerClient queueClient,
            ILogger<GitLabWebHook> logger)
        {
            _blobContainerClient = queueClient;
            _logger = logger;
        }

        [FunctionName("job")]
        [SuppressMessage("Major Code Smell", "S4457:Parameter validation in \"async\"/\"await\" methods should be wrapped", Justification = "No Issue")]
        public async Task<IActionResult> OnJobAsync(
            [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req)
        {
            if (req is null)
            {
                throw new ArgumentNullException(nameof(req));
            }

            if (!req.Headers.TryGetValue("X-Gitlab-Event", out var values))
            {
                _logger.LogError("The header 'X-Gitlab-Event' not present.");

                return new UnprocessableEntityResult();
            }

            if (!values.Contains("Job Hook"))
            {
                _logger.LogError("The event is not of type 'Job Hook'.");

                return new UnprocessableEntityResult();
            }

            using var streamReader = new StreamReader(req.Body);
            var requestBody = await streamReader.ReadToEndAsync().ConfigureAwait(false);

            var data = JsonConvert.DeserializeObject<JObject>(requestBody);
            var projectId = data.Value<int>("project_id");
            var pipelineId = data.Value<int>("pipeline_id");
            var jobId = data.Value<int>("build_id");
            var jobStatus = data.Value<string>("build_status");

            var blobName = $"{projectId}-{pipelineId}-{jobId}";

            if (jobStatus == "pending")
            {
                _logger.LogInformation($"Sending scale up request for job {jobId} on pipeline '{pipelineId}' on project '{projectId}'...");

                if (_blobContainerClient.GetBlobs().Count() <= MaxQueueSize)
                {
                    _ = _blobContainerClient.UploadBlob(blobName, BinaryData.FromString(blobName));
                }

                _logger.LogInformation($"Sending scale up request for job {jobId} on pipeline '{pipelineId}' on project '{projectId}' successfully submitted.");
            }
            else
            {
                _logger.LogInformation($"Sending scale down request for job {jobId} on pipeline '{pipelineId}' on project '{projectId}'...");

                // Every non-pending status lands here, so the blob may not exist (yet or anymore).
                _ = _blobContainerClient.DeleteBlobIfExists(blobName);

                _logger.LogInformation($"Sending scale down request for job {jobId} on pipeline '{pipelineId}' on project '{projectId}' successfully submitted.");
            }

            return new OkResult();
        }
    }
}

Then the Dockerfile for the image where I installed the gitlab-runner:

FROM mcr.microsoft.com/dotnet/core/sdk:3.1.416-focal

RUN apt-get update && apt-get install -y git gpg && rm -rf /var/lib/apt/lists/*
RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash

# uploading artifacts/cache needs local gitlab runner executable
RUN curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | os=ubuntu dist=focal bash
RUN apt-get update && apt-get install -y git-lfs gitlab-runner && rm -rf /var/lib/apt/lists/*

RUN mkdir /builds

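# Copy the entrypoint into /build so that it matches the CMD path below
WORKDIR /build

COPY aca-entrypoint.sh .

RUN chmod +x aca-entrypoint.sh

CMD ["/build/aca-entrypoint.sh"]

aca-entrypoint.sh

#!/usr/bin/env bash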
set -e

export REGISTER_NON_INTERACTIVE=true
export REGISTER_LOCKED=false

DYNAMIC_NAME=$(cat /dev/urandom | tr -cd 'a-f0-9' | head -c 8)

export RUNNER_NAME="Azure Dynamic GitLab Runner ($DYNAMIC_NAME)"
export RUNNER_EXECUTOR=shell
export RUNNER_BUILDS_DIR=/builds
export RUNNER_SHELL=bash

cleanup() {
  print_header "Removing $RUNNER_NAME..."
  gitlab-runner unregister -n "$RUNNER_NAME"
  kill -s SIGQUIT "${glr_pid}"
  wait "${glr_pid}"
  print_success "Runner $RUNNER_NAME successfully stopped and removed."
}

print_header() {
  lightcyan='\033[1;36m'
  nocolor='\033[0m'
  echo -e "${lightcyan}$1${nocolor}"
}

print_success() {
  lightgreen='\033[1;32m'
  nocolor='\033[0m'
  echo -e "${lightgreen}$1${nocolor}"
}

print_header "Registering $RUNNER_NAME..."
gitlab-runner register

CI_SERVER_TOKEN=$(cat /etc/gitlab-runner/config.toml | grep -i "token = \".*\"" | sed 's/\s\stoken =\s\"\(.*\)\"$/\1/')

trap 'cleanup;' SIGINT SIGTERM SIGQUIT

print_header "Running $RUNNER_NAME..."
gitlab-runner run-single -t "$CI_SERVER_TOKEN" --max-builds 1 &

glr_pid="$!"

wait "${glr_pid}"

@johnnyruz

@cmenzi hey, to be fair I honestly don't have the inside information as to what is "idle" in the case of something like the agent scenario. I'm also not an expert on the KEDA/Azure Pipeline scaler stuff, more just testing things out.

However, following the MS guides on configuring a container agent & applying the KEDA Azure Pipeline scaler seems to have the desired effect for my scenario. I always have one agent in the pool, and if there are multiple builds in the queue then multiple instances are scaled up to execute simultaneous builds. My agents are set with the --once flag so once they finish a job they tear themselves down, and it appears that the original container is re-started to comply with the minReplicas = 1 setting.

A couple of things that you've addressed that I don't have an answer for:

  • While the agent listener is connected and polling Azure Pipelines queues, is it truly "idle"? I don't know, and in the current preview that information doesn't appear to be readily available through the portal or CLI. I was hoping I could check billing, but that also seems to have limited data at this stage.
  • What happens in a long-running job? I haven't run into this problem yet, as the base container/build agent just seems to hang around and stay available in the agent pool. As I mentioned, my agents are configured to destroy themselves after a build completes, so if there are more builds in the queue the scaler will restart them; otherwise it will only restart agents to maintain minReplicas.
  • You're right that this works a lot more cleanly for HTTP or message-based queuing, and in my tests I've found that after 5 minutes of no messages or HTTP requests, a scale-down event is triggered for any replicas above the minReplicas count. I have not seen any documentation on how to adjust this for Container Apps.

Seems to still be a long way to go as far as features and documentation for Container Apps, but I'm definitely looking forward to seeing enhancements as this service offering gets built out.

@kendallroden
Contributor

@chris-schra we are working on pricing documentation which will provide insight into what is considered Idle!

kendallroden added the documentation label on Feb 23, 2022
@BigMorty
Member

@epomatti

@BigMorty small issue so I'll keep it here in this thread.

I was doing some estimates in the calculator, and I found the section "Create dedicated instances". I thought it was a concept of the service and tried to find it in the APIs, Portal and docs but couldn't find anything.

"Create a dedicated instance" sounds like a tier selection or a capability, when in fact these are related to the Idle concept.

Perhaps the naming in the calculator can be changed, or a link to the Idle section of the pricing page could be added there. Another option would be to add the concept of dedicated instances to the Idle section in the docs, because they are the same thing, but this is not stated anywhere (or at least I didn't find it).


@BigMorty
Member

Agreed, this is a little confusing. I will talk with someone about this (I have to figure out who that is first). As you figured out, this sets the minimum scaling instance count rather than letting the app scale to zero. These "idling" replicas are cheaper, but not free like when scaling to zero.


7 participants