
R&D: Horizon

jay vyas edited this page Sep 7, 2018 · 4 revisions

This is an R&D page. These features are not yet supported in OpsSight, but they are available as prototypes if you're interested.

OpsSight 3.x will scale horizontally to cover upwards of 10,000 images.

A typical OpenShift cluster may have 50,000 or more images, if we estimate that it is reasonable to have 500 to 1000 nodes, each having 50 or more images. Even at the lower end (50 nodes, 50 pods per node), we rapidly approach 5000 images, which can amount to several terabytes of scan data.
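The estimate above is a simple product of node count and images per node. A minimal sketch of that arithmetic follows; the function name is illustrative, and it assumes no image is shared between nodes, so the result is an upper bound rather than a measured figure.

```python
def estimate_images(nodes, images_per_node):
    """Upper-bound image count for a cluster.

    Assumption (not from the page): every node holds distinct images,
    so nothing is deduplicated across nodes.
    """
    if nodes < 0 or images_per_node < 0:
        raise ValueError("nodes and images_per_node must be non-negative")
    return nodes * images_per_node


# The node counts and per-node figures quoted in the text.
large_low = estimate_images(500, 50)
large_high = estimate_images(1000, 50)
small = estimate_images(50, 50)
```

In a real cluster many nodes run the same images, so the number of distinct images to scan may be lower than this product.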

I/O levels for an OpsSight loaded database may look like this when running in production:

OpsSight 3.x uses an ephemeral but resilient storage model.

Scanning thousands of containers is disk I/O intensive. It is also very demanding on the Postgres database that OpsSight's hub relies on.

OpsSight 3.x testing notes: Initial results.

Variable scan throughput over time is inevitable across multiple hubs. However, overall throughput is constant with OpsSight 3.x:

  • Given 5 hubs, spun up dynamically with protoform, each with a minimal amount of resources:
	smallWebappCPULimit     = "1"
	smallWebappMemoryLimit  = "2560M"
	smallWebappHubMaxMemory = "2048m"
	smallScanReplicas     = 1
	smallScanMemoryLimit  = "2560M"
	smallScanHubMaxMemory = "2048m"
	smallJobRunnerReplicas     = 1
	smallJobRunnerMemoryLimit  = "4608M"
	smallJobRunnerHubMaxMemory = "4096m"
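As a rough sizing aid, the container memory limits listed above can be summed per hub and multiplied by the number of hubs. The sketch below covers only the three containers named in the list (webapp, scan, job runner) and assumes one job runner and one scan replica as shown; any other containers a hub runs, such as its database, are not included. The helper names are illustrative.

```python
import re

# Memory limits and replica counts copied from the list above.
LIMITS = {
    "webapp": ("2560M", 1),
    "scan": ("2560M", 1),
    "jobrunner": ("4608M", 1),
}


def parse_megabytes(limit):
    """Parse a limit such as "2560M" into an integer number of megabytes."""
    match = re.fullmatch(r"(\d+)[Mm]", limit)
    if match is None:
        raise ValueError(f"unsupported limit format: {limit!r}")
    return int(match.group(1))


def hub_memory_mb(limits=LIMITS):
    """Sum of the listed container memory limits for a single hub."""
    return sum(parse_megabytes(size) * replicas for size, replicas in limits.values())


def fleet_memory_mb(num_hubs, limits=LIMITS):
    """Total of the listed memory limits across num_hubs identical hubs."""
    return num_hubs * hub_memory_mb(limits)
```

With these figures, `hub_memory_mb()` gives 9728 MB per hub and `fleet_memory_mb(5)` gives 48640 MB for the five hubs in the test.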

We are able to achieve constant throughput for vulnerabilities, even when we are killing hubs over time. Images can be updated in new, ephemeral hubs at any given time, so the loss of a hub isn't significant.

Compare this with starvation scenarios, where progress may flat-line for a long period of time when a given hub goes down.
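The difference between the two scenarios can be illustrated with a toy model. This is not the vulnerability simulator referenced on this page; it is a minimal sketch in which every hub scans at a fixed, made-up rate and an outage simply removes that hub's contribution for the affected steps.

```python
def simulate_progress(num_hubs, rate_per_hub, steps, outages=None):
    """Return cumulative scans completed after each time step.

    outages maps a hub index to the set of steps during which that hub
    is down. Rates and outage windows are illustrative assumptions.
    """
    outages = outages or {}
    progress = []
    scanned = 0
    for step in range(steps):
        hubs_up = sum(
            1 for hub in range(num_hubs) if step not in outages.get(hub, set())
        )
        scanned += hubs_up * rate_per_hub
        progress.append(scanned)
    return progress


# One large hub versus five small hubs with the same combined rate.
# In both cases hub 0 is unavailable during steps 3 to 6.
outage = {0: {3, 4, 5, 6}}
single = simulate_progress(num_hubs=1, rate_per_hub=5, steps=10, outages=outage)
multi = simulate_progress(num_hubs=5, rate_per_hub=1, steps=10, outages=outage)
```

In this model the single hub's progress stays flat for the whole outage, while the five-hub setup keeps advancing at four fifths of its normal rate.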

In order to accommodate this, OpsSight 3.0 will horizontally scale hub instances and manage its scanning SLA by continually renewing storage and performance expectations, rather than vertically scaling the Hub's resource requirements.
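One way to reason about horizontal scaling is to divide the image count by the number of images a single small hub can handle. The sketch below assumes such a per-hub capacity is known; the page does not state one, so `images_per_hub` is a hypothetical parameter and the value used in the example is a placeholder.

```python
import math


def hubs_needed(total_images, images_per_hub):
    """Number of hub instances required to cover total_images.

    images_per_hub is a hypothetical capacity figure, not a measured
    OpsSight value.
    """
    if total_images < 0:
        raise ValueError("total_images must be non-negative")
    if images_per_hub <= 0:
        raise ValueError("images_per_hub must be positive")
    return math.ceil(total_images / images_per_hub)


# Example with a placeholder capacity of 2000 images per hub.
example = hubs_needed(10_000, 2_000)
```

Rounding up matters here: a partially filled hub still has to be provisioned in full.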

How much load do we have?

This can be visualized with our vulnerability simulator. The bottom vulnerability chart represents performance when you have multiple hubs in place.

What's the cost of multiple hubs?

Roughly the same, normalized to performance:
