Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a new AWS ECS Health Check Extension #4451

Closed
wants to merge 17 commits into from

Conversation

skyduo
Copy link
Contributor

@skyduo skyduo commented Aug 5, 2021

Description:
This is a new health check extension feature for AWS/ECS, as we are going to monitor the OT collector health through the return status from endpoint of the new extension, detailed design in the design doc below.

Link to tracking Issue:
open-telemetry/opentelemetry-collector#2573

Testing:
make otelcontribcol and run executable file

Documentation:
https://docs.google.com/document/d/1SpUMsWA2DeaoVazeQ8uEc1Wvu5LphmQU_TjzLmuJ4QM/edit#heading=h.rs1luwizct2w

original pr which has some discussion with @bogdandrutu and code is almost same as this pr:
open-telemetry/opentelemetry-collector#3614

@skyduo skyduo requested review from a team and owais August 5, 2021 05:53
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Aug 5, 2021

CLA Signed

The committers are authorized under a signed CLA.

@skyduo
Copy link
Contributor Author

skyduo commented Aug 11, 2021

Hi @bogdandrutu ! could you help review this pr? thanks!

@hossain-rayhan
Copy link
Contributor

@Aneurysm9 can you please help to review this one from AWS side? Thanks.
CC: @alolita

@skyduo skyduo requested a review from Aneurysm9 August 18, 2021 22:58
Aneurysm9
Aneurysm9 previously approved these changes Aug 18, 2021
Copy link
Contributor

@hossain-rayhan hossain-rayhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a README explaining the config and rotation mechanism.

@bogdandrutu
Copy link
Member

Please rebase :)

extension/awsecshealthcheckextension/exporter.go Outdated Show resolved Hide resolved
extension/awsecshealthcheckextension/exporter.go Outdated Show resolved Hide resolved
extension/awsecshealthcheckextension/extension.go Outdated Show resolved Hide resolved
extension/awsecshealthcheckextension/extension.go Outdated Show resolved Hide resolved
extension/awsecshealthcheckextension/exporter.go Outdated Show resolved Hide resolved
@skyduo
Copy link
Contributor Author

skyduo commented Aug 20, 2021

Hey @bogdandrutu ! please have a review again, and point out if any changes needed.

@skyduo skyduo requested a review from bogdandrutu August 24, 2021 22:37
@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

# Conflicts:
#	extension/awsecshealthcheckextension/go.mod
#	extension/awsecshealthcheckextension/go.sum
#	internal/components/components.go
@PettitWesley
Copy link

This should not be marked as stale, we still want it released.

@anuraaga
Copy link
Contributor

anuraaga commented Sep 3, 2021

There doesn't seem to be anything ECS-specific in this code. Shouldn't the ability to become not-ready when there are an excessive number of exporter failures be added to the normal healthcheckextension instead of a new extension? This is broadly applicable.

Related is there is interest in rewriting that extension to use a system such as https://github.com/alexliesenfeld/health @jpkrohling @bogdandrutu

@jpkrohling
Copy link
Member

Related is there is interest in rewriting that extension to use a system such as https://github.com/alexliesenfeld/health

Yes, I would prefer to have this in place of the current health check extension.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 1, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 1, 2021
@bogdandrutu bogdandrutu removed the Stale label Oct 6, 2021
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.


func TestLoadConfig(t *testing.T) {
factories, err := componenttest.NopFactories()
assert.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert.NoError(t, err)
require.NoError(t, err)

If this fails, it would create a flaky test with a nil pointer exception


// ECSHealthCheckExporter is a struct implement the exporter interface in open census that could export metrics
type ecsHealthCheckExporter struct {
mu sync.Mutex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mu sync.Mutex
rw sync.RWMutex

Using a RW Mutex will allow better performance.


viewNum := len(e.exporterErrorQueue)
currentTime := time.Now()
for i := 0; i < viewNum; i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it is managing a sliding window so index values could be stored inside the struct, improving performance.

"testing"
"time"

"gotest.tools/v3/assert"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could i ask you https://pkg.go.dev/github.com/stretchr/testify/assert instead? It heavily used within the project and keeps testing patterns the same.

)

func TestECSHealthCheckExporter_ExportView(t *testing.T) {
exporter := &ecsHealthCheckExporter{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I ask you to us t.Parallel() in tests?

Would allow to speed up testing throughout the project 🙏🏽

default:
}

if err := hc.server.Serve(ln); err != http.ErrServerClosed && err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use errors.Is ?

"github.com/stretchr/testify/require"
"go.uber.org/zap"

"go.opencensus.io/stats/view"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correctly "import-ed" ?

ExporterErrorLimit: 1,
}

hcExt := newServer(config, zap.NewNop())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
hcExt := newServer(config, zap.NewNop())
hcExt := newServer(config, zaptest.NewTestLogger(t))

}

func TestHealthCheckExtensionPortAlreadyInUse(t *testing.T) {
endpoint := "localhost:13134"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has potential to cause bind issues since it matches above, would be better to use httptest.NewServer ?

Interval: "5m",
ExporterErrorLimit: 5,
}
hcExt := newServer(config, zap.NewNop())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
hcExt := newServer(config, zap.NewNop())
hcExt := newServer(config, zaptest.NewTestLogger(t))

@MovieStoreGuy
Copy link
Contributor

Can you resolve the conflicts please?

@github-actions
Copy link
Contributor

github-actions bot commented Dec 4, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 4, 2021
@Aneurysm9 Aneurysm9 self-requested a review December 6, 2021 22:33
@Aneurysm9 Aneurysm9 dismissed their stale review December 6, 2021 22:35

Comments from @MovieStoreGuy require reassessment.

@github-actions
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Dec 14, 2021
hex1848 pushed a commit to hex1848/opentelemetry-collector-contrib that referenced this pull request Jun 2, 2022
* `configgrpc`: Update `ToDialOptions` with settings

Adding the ability to pass TelemetrySettings to ToDialOptions to configure underlying instrumentation library.

Fixes open-telemetry#4424

* update changelog

* add telemetrysetting to exporter
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants