[QT-437] Optimize matrix operations (#77)
This work was spawned by reports of people encountering context deadline errors when trying to list scenarios. We previously had a hard timeout of 5 seconds for listing, which we have changed to use the scenario-level `--timeout` flag going forward. This solved the immediate problem but raised the question of why scenario listing was taking nearly that long at all.

While investigating, I found several reasons that listing was slow:
  * We were always decoding and evaluating all scenarios when listing, even though we only needed the reference information. If you're using `enos scenario list` to validate that the configuration for all scenarios is correct, that makes sense, but most people are probably using it to see which scenarios are available and don't necessarily want to validate all of them every time the command is invoked.
  * When filtering, we always fully decoded every scenario before applying the filter, so we paid the full decoding cost even for scenarios that were then discarded.
  * Every time we compared a matrix vector we made a copy of it. The problem compounded exponentially as we added additional vectors and elements to the matrix. Nearly three quarters of the CPU time we spent when listing scenarios was actually being used here for allocations, garbage collection, and sorting after copy.

To improve our situation we do the following:
  * Add support to the decoder for shallow decoding of scenarios to the reference level.
  * Add filtering support to the decoder. Now we can optionally shallow decode, then filter, before fully decoding a scenario.
  * Rewrite our matrix vector implementation. Instead of an array alias, the matrix Vector type is now a struct, which gives us room for new efficiency gains. We now always pass references to Vectors instead of passing them by value and creating copies. We also lazily keep track of a sorted copy of each vector to allow faster repeat comparisons. This has drastically reduced our allocations and garbage collection.
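
As a rough illustration of the vector rewrite described above, here is a minimal sketch of a struct-based vector that is passed by pointer and builds its sorted copy lazily. The names `Element`, `Vector`, `NewVector`, `Add`, and `EqualUnordered` are assumptions made for this example and are not taken from the enos source.

```go
package main

import (
	"fmt"
	"sort"
)

// Element is a hypothetical key/value pair stored in a matrix vector.
type Element struct {
	Key, Val string
}

// Vector is a hypothetical struct-based vector. The sorted copy is created
// on the first comparison and reused until the vector is modified.
type Vector struct {
	elements []Element
	sorted   []Element // nil until first needed
}

// NewVector returns a pointer so callers share one Vector instead of copying it.
func NewVector(elms ...Element) *Vector {
	return &Vector{elements: elms}
}

// Add appends an element and invalidates the cached sorted copy.
func (v *Vector) Add(e Element) {
	v.elements = append(v.elements, e)
	v.sorted = nil
}

// sortedElements returns the cached ordered copy, building it if necessary.
// The original element order is left untouched.
func (v *Vector) sortedElements() []Element {
	if v.sorted == nil {
		v.sorted = make([]Element, len(v.elements))
		copy(v.sorted, v.elements)
		sort.Slice(v.sorted, func(i, j int) bool {
			a, b := v.sorted[i], v.sorted[j]
			if a.Key != b.Key {
				return a.Key < b.Key
			}
			return a.Val < b.Val
		})
	}
	return v.sorted
}

// EqualUnordered reports whether both vectors hold the same elements,
// ignoring order. Repeat calls reuse the cached sorted copies.
func (v *Vector) EqualUnordered(other *Vector) bool {
	if v == nil || other == nil {
		return v == other
	}
	a, b := v.sortedElements(), other.sortedElements()
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	a := NewVector(Element{"backend", "raft"}, Element{"arch", "amd64"})
	b := NewVector(Element{"arch", "amd64"}, Element{"backend", "raft"})
	fmt.Println(a.EqualUnordered(b)) // true

	b.Add(Element{"edition", "ent"})
	fmt.Println(a.EqualUnordered(b)) // false
}
```

The trade-off in this sketch is one extra slice per vector in exchange for sorting at most once between modifications, rather than copying and sorting on every comparison.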

Combined, these changes improve the listing time by at least an order of magnitude. A drawback to this approach is that listing no longer validates the full configuration in a flight plan, that is, the scenarios and all of their variants. To handle that we introduce a new `validate` sub-command that fully decodes all matched scenarios to ensure that the flight plan is valid. We also introduce the hidden `--profile` flag, which turns on CPU and memory profiling and writes the pprof files into the current directory. These profiles were useful in determining the bottleneck of the prior implementation, so we'll leave the flag in place for possible future use.

* Add support for reference level scenario decoding
* Add support for filtering during decoding
* Rewrite matrix vector implementation to reduce allocations and GC
  * Use references to vectors instead of passing by value
  * Lazily create ordered copies of vectors when comparing
* Add `--profile` hidden flag to enable CPU and memory profiling
* Add `validate` sub-command for validating configuration of a flight plan.
* Bump version

Signed-off-by: Ryan Cragun <me@ryan.ec>
ryancragun authored Nov 30, 2022
1 parent a4ec981 commit 9a833b5
Showing 33 changed files with 1,878 additions and 1,236 deletions.
2 changes: 1 addition & 1 deletion acceptance/scenario_check_test.go
@@ -127,7 +127,7 @@ func TestAcc_Cmd_Scenario_Check_WithWarnings(t *testing.T) {
path, err := filepath.Abs(filepath.Join("./scenarios", "scenario_generate_has_warnings"))
require.NoError(t, err)

cmd := fmt.Sprintf("scenario validate --chdir %s --out %s --format json", path, outDir)
cmd := fmt.Sprintf("scenario check --chdir %s --out %s --format json", path, outDir)
if failOnWarnings {
cmd = fmt.Sprintf("%s --fail-on-warnings", cmd)
}
48 changes: 48 additions & 0 deletions acceptance/scenario_validate_test.go
@@ -0,0 +1,48 @@
package acceptance

import (
"context"
"fmt"
"path/filepath"
"testing"

"github.com/stretchr/testify/require"
"google.golang.org/protobuf/encoding/protojson"

"github.com/hashicorp/enos/proto/hashicorp/enos/v1/pb"
)

func TestAcc_Cmd_Scenario_Validate(t *testing.T) {
enos := newAcceptanceRunner(t)

for _, test := range []struct {
dir string
out *pb.ValidateScenariosConfigurationResponse
fail bool
}{
{
dir: "scenario_list_pass_0",
out: &pb.ValidateScenariosConfigurationResponse{},
},
{
dir: "scenario_list_fail_malformed",
fail: true,
},
} {
t.Run(test.dir, func(t *testing.T) {
path, err := filepath.Abs(filepath.Join("./scenarios", test.dir))
require.NoError(t, err)
cmd := fmt.Sprintf("scenario validate --chdir %s --format json", path)
fmt.Println(path)
out, err := enos.run(context.Background(), cmd)
if test.fail {
require.Error(t, err)
return
}

require.NoError(t, err)
got := &pb.ValidateScenariosConfigurationResponse{}
require.NoError(t, protojson.Unmarshal(out, got))
})
}
}
4 changes: 2 additions & 2 deletions go.mod
@@ -17,7 +17,7 @@ require (
github.com/mitchellh/go-wordwrap v0.0.0-20150314170334-ad45545899c7
github.com/olekukonko/tablewriter v0.0.5
github.com/spf13/cobra v1.4.0
github.com/stretchr/testify v1.7.0
github.com/stretchr/testify v1.8.0
github.com/zclconf/go-cty v1.11.0
golang.org/x/term v0.0.0-20220411215600-e5f449aeb171
golang.org/x/text v0.3.8
@@ -65,5 +65,5 @@ require (
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f // indirect
google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013 // indirect
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 // indirect
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
9 changes: 6 additions & 3 deletions go.sum
@@ -166,12 +166,14 @@ github.com/spf13/cobra v1.4.0/go.mod h1:Wo4iy3BUC+X2Fybo0PDqwJIv3dNRiZLHQymsfxlB
github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.0 h1:nwc3DEeHmmLAfoZucVR881uASk0Mfjw8xYJ99tb5CcY=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.8.0 h1:pSgiaMZlXftHpm5L7V1+rVB+AZJydKsMxsQBIJw4PKk=
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/vmihailenco/msgpack v3.3.3+incompatible/go.mod h1:fy3FlTQTDXWkZ7Bh6AcGMlsjHatGryHQYUTf1ShIgkk=
github.com/vmihailenco/msgpack/v4 v4.3.12/go.mod h1:gborTTJjAo/GWTqqRjrLCn9pgNN+NXzzngzBKDPIqw4=
github.com/vmihailenco/tagparser v0.1.1/go.mod h1:OeAg3pn3UbLjkWt+rN9oFYB6u/cQgqMEUPoW2WPyhdI=
@@ -258,7 +260,8 @@ gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.3.0/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c h1:dUUwHk2QECo/6vqA44rthZ8ie2QXMNeKRTHCNY2nXvo=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4=
69 changes: 69 additions & 0 deletions internal/command/enos/cmd/root.go
@@ -4,7 +4,11 @@ import (
"context"
"errors"
"fmt"
"io"
"os"
"path/filepath"
"runtime"
"runtime/pprof"
"strings"
"time"

@@ -37,6 +41,8 @@ type rootStateS struct {
enosServer *server.ServiceV1
enosConnection *client.Connection
operatorConfig *pb.Operator_Config
profile bool
cpuProfileOut io.ReadWriteCloser
}

var rootState = &rootStateS{
@@ -60,6 +66,9 @@ func Execute() {
rootCmd.PersistentFlags().StringVar(&rootState.stdoutPath, "stdout", "", "Path to write output. (default $STDOUT)")
rootCmd.PersistentFlags().StringVar(&rootState.stderrPath, "stderr", "", "Path to write error output. (default $STDERR)")
rootCmd.PersistentFlags().Int32Var(&rootState.operatorConfig.WorkerCount, "worker-count", 4, "Number of scenario operation workers")
rootCmd.PersistentFlags().BoolVar(&rootState.profile, "profile", false, "Enable Go profiling")
_ = rootCmd.PersistentFlags().MarkHidden("profile")

if err := rootCmd.Execute(); err != nil {
var exitErr *status.ErrExit
if errors.As(err, &exitErr) {
@@ -130,9 +139,54 @@ func setupCLIUI() error {
return err
}

func startCPUProfiling() error {
wd, err := os.Getwd()
if err != nil {
return err
}

rootState.cpuProfileOut, err = os.Create(filepath.Join(wd, "cpu.pprof"))
if err != nil {
return err
}

if err := pprof.StartCPUProfile(rootState.cpuProfileOut); err != nil {
return err
}

return nil
}

func runMemoryProfiling() error {
wd, err := os.Getwd()
if err != nil {
return err
}

m, err := os.Create(filepath.Join(wd, "memory.pprof"))
if err != nil {
return err
}
defer m.Close()

runtime.GC()

if err := pprof.WriteHeapProfile(m); err != nil {
return err
}

return nil
}

func rootCmdPreRun(cmd *cobra.Command, args []string) error {
cmd.SilenceErrors = true // we handle this ourselves

if rootState.profile {
if err := startCPUProfiling(); err != nil {
return err
}
}

// Setup our UI configuration first
err := setupCLIUI()
if err != nil {
@@ -155,12 +209,27 @@ func rootCmdPreRun(cmd *cobra.Command, args []string) error {
}

func rootCmdPostRun(cmd *cobra.Command, args []string) {
if rootState.profile {
if rootState.cpuProfileOut != nil {
defer rootState.cpuProfileOut.Close()
}
defer pprof.StopCPUProfile()
}

if rootState.enosServer != nil {
err := rootState.enosServer.Stop()
if err != nil {
_ = ui.ShowError(err)
}
}

// Run memory profiling after we've shut down everything but
// our UI
if rootState.profile {
if err := runMemoryProfiling(); err != nil {
_ = ui.ShowError(err)
}
}

ui.Close()
}
8 changes: 2 additions & 6 deletions internal/command/enos/cmd/scenario.go
@@ -75,6 +75,7 @@ func newScenarioCmd() *cobra.Command {
scenarioCmd.AddCommand(newScenarioRunCmd())
scenarioCmd.AddCommand(newScenarioExecCmd())
scenarioCmd.AddCommand(newScenarioOutputCmd())
scenarioCmd.AddCommand(newScenarioValidateConfigCmd())

return scenarioCmd
}
@@ -104,12 +105,7 @@ func scenarioCmdPreRun(cmd *cobra.Command, args []string) error {
// scenarioCmdPostRun is the scenario sub-command post-run. We'll use it to shut
// down the server.
func scenarioCmdPostRun(cmd *cobra.Command, args []string) {
if rootState.enosServer != nil {
err := rootState.enosServer.Stop()
if err != nil {
_ = ui.ShowError(err)
}
}
rootCmdPostRun(cmd, args)
}

// setupDefaultScenarioCfg sets up default scenario configuration
1 change: 0 additions & 1 deletion internal/command/enos/cmd/scenario_check.go
@@ -13,7 +13,6 @@ import (
func newScenarioCheckCmd() *cobra.Command {
cmd := &cobra.Command{
Use: "check [FILTER]",
Aliases: []string{"validate"}, // old name of the check command
Short: "Check that scenarios are valid",
Long: fmt.Sprintf("Check that scenarios are valid by generating the Scenario's Terraform Root Module, initializing it, validating it, and planning. %s", scenarioFilterDesc),
RunE: runScenarioCheckCmd,
5 changes: 1 addition & 4 deletions internal/command/enos/cmd/scenario_list.go
@@ -1,9 +1,6 @@
package cmd

import (
"context"
"time"

"github.com/spf13/cobra"

"github.com/hashicorp/enos/internal/diagnostics"
@@ -24,7 +21,7 @@ func newScenarioListCmd() *cobra.Command {

// runScenarioListCmd runs a scenario list
func runScenarioListCmd(cmd *cobra.Command, args []string) error {
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
ctx, cancel := scenarioTimeoutContext()
defer cancel()

sf, err := flightplan.ParseScenarioFilter(args)
46 changes: 46 additions & 0 deletions internal/command/enos/cmd/scenario_validate_config.go
@@ -0,0 +1,46 @@
package cmd

import (
"github.com/spf13/cobra"

"github.com/hashicorp/enos/internal/diagnostics"
"github.com/hashicorp/enos/internal/flightplan"
"github.com/hashicorp/enos/proto/hashicorp/enos/v1/pb"
)

func newScenarioValidateConfigCmd() *cobra.Command {
return &cobra.Command{
Use: "validate [FILTER]",
Short: "Validate configuration",
Long: "Validate all scenario and variant configurations",
RunE: runScenarioValidateCfgCmd,
ValidArgsFunction: scenarioNameCompletion,
}
}

// runScenarioValidateCfgCmd is the function that validates all flight plan configuration
func runScenarioValidateCfgCmd(cmd *cobra.Command, args []string) error {
ctx, cancel := scenarioTimeoutContext()
defer cancel()

sf, err := flightplan.ParseScenarioFilter(args)
if err != nil {
return ui.ShowScenariosValidateConfig(&pb.ValidateScenariosConfigurationResponse{
Diagnostics: diagnostics.FromErr(err),
})
}

res, err := rootState.enosConnection.Client.ValidateScenariosConfiguration(
ctx, &pb.ValidateScenariosConfigurationRequest{
Workspace: &pb.Workspace{
Flightplan: scenarioState.protoFp,
},
Filter: sf.Proto(),
},
)
if err != nil {
return err
}

return ui.ShowScenariosValidateConfig(res)
}
7 changes: 0 additions & 7 deletions internal/command/enos/main.go

This file was deleted.
