Add Elasticsearch Log Exporter with tests #18

Closed
wants to merge 20 commits into from


@MarkSeufert MarkSeufert commented Dec 11, 2020

This PR adds an Elasticsearch exporter for logs, which is a requirement from issue #337. This includes:

  • Specifying the host, port, and index/data stream of the Elasticsearch instance
  • Sending an HTTP POST request to Elasticsearch using the HTTP Client Interface, and parsing the response message to ensure it was successfully received
  • Sending batches of LogRecords using the Bulk API
  • Timeout logic, such that the Export() method doesn't block indefinitely
  • Unit tests for the above functionality

Some code currently overlaps with the OStream exporter PR (open-telemetry#430); this will be addressed in a future PR.

Notes:

  • This exporter currently does not support CMake builds, as there is no CMake support for the Nlohmann/JSON library yet - will try to find a way around this using a backport for it (cc @ maxgolov)
  • Changes to logger.h and log_record.h can be ignored, as this PR will soon be changed to use a recordable instead of an API LogRecord.

cc @xukaren @alolita

@MarkSeufert MarkSeufert force-pushed the logs-elasticsearch-pr branch from 2da8cda to 4255ac1 Compare December 11, 2020 22:18

codecov-io commented Dec 11, 2020

Codecov Report

Merging #18 (056e9cc) into master (e4d0b2c) will increase coverage by 0.07%.
The diff coverage is 97.12%.


@@            Coverage Diff             @@
##           master      #18      +/-   ##
==========================================
+ Coverage   94.38%   94.45%   +0.07%     
==========================================
  Files         185      187       +2     
  Lines        8014     8244     +230     
==========================================
+ Hits         7564     7787     +223     
- Misses        450      457       +7     
Impacted Files Coverage Δ
.../include/opentelemetry/common/key_value_iterable.h 100.00% <ø> (ø)
api/include/opentelemetry/nostd/function_ref.h 77.77% <ø> (ø)
api/include/opentelemetry/nostd/shared_ptr.h 100.00% <ø> (ø)
api/include/opentelemetry/nostd/span.h 89.36% <ø> (+1.36%) ⬆️
api/include/opentelemetry/nostd/string_view.h 97.50% <ø> (ø)
api/include/opentelemetry/nostd/unique_ptr.h 100.00% <ø> (ø)
api/include/opentelemetry/nostd/utility.h 83.33% <ø> (ø)
api/include/opentelemetry/trace/span.h 100.00% <ø> (ø)
api/test/logs/provider_test.cc 100.00% <ø> (ø)
api/test/nostd/string_view_test.cc 100.00% <ø> (ø)
... and 24 more

@alolita alolita changed the title Add Elasticsearch Log Exporter + Tests Add Elasticsearch Log Exporter with tests Dec 11, 2020
@kxyr kxyr force-pushed the master branch 2 times, most recently from 0d650af to 68e510c Compare December 12, 2020 03:34

@alolita alolita left a comment

  1. Missing copyright header: None of the source files have the OpenTelemetry authors copyright. Please make sure all source files filed as PRs have this copyright header.
  2. Add header comments noting purpose of each source file
  3. Add inline function headers noting purpose of each function

@MarkSeufert MarkSeufert force-pushed the logs-elasticsearch-pr branch 3 times, most recently from 79e6591 to e1c34b1 Compare December 15, 2020 04:16
@MarkSeufert MarkSeufert force-pushed the logs-elasticsearch-pr branch from e1c34b1 to c2ef1af Compare December 15, 2020 04:29
record->SetAttribute("key2", "value2");

// Write the log record to the exporter, and time the duration
auto t1 = std::chrono::high_resolution_clock::now();
@kxyr kxyr Dec 15, 2020

would std::chrono::system_clock::now() suffice?

{}

std::unique_ptr<sdklogs::Recordable> ElasticsearchLogExporter::MakeRecordable() noexcept
{

nit: if the exporter is shut down, does it still need to make a recordable? ie. could it return nullptr?

Author

Saw your discussion on ostream, leaving as is

request->SetUri(options_.index_ + "/_bulk?pretty");
request->SetMethod(http_client::Method::Post);
request->AddHeader("Content-Type", "application/json");
request->SetTimeoutMs(std::chrono::milliseconds(1000 * options_.response_timeout_));
@kxyr kxyr Dec 15, 2020

nit: could just use microseconds instead of multiplying by 1000?

Author

I was using seconds because http timeouts can be fairly large (like 30 seconds), so typing 30 makes more sense than 30000 imo.

@kxyr kxyr Dec 15, 2020

I mean would request->SetTimeoutMs(std::chrono::microseconds(options_.response_timeout_)); work?
Edit: ignore this, would need std::chrono::duration_cast
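
The unit question in this thread can be sketched with plain `std::chrono`, independent of the HTTP client. A coarser duration such as seconds converts to milliseconds implicitly because nothing is lost, while microseconds to milliseconds can truncate and so needs an explicit `std::chrono::duration_cast`. The helper names `SecondsToMs` and `MicrosToMs` are hypothetical and are not part of the exporter under review.

```cpp
#include <chrono>

// Hypothetical helper: a timeout in whole seconds converts to milliseconds
// implicitly, because the conversion is lossless.
inline std::chrono::milliseconds SecondsToMs(std::chrono::seconds timeout)
{
  return timeout;
}

// Hypothetical helper: microseconds to milliseconds may drop precision, so the
// standard library requires an explicit duration_cast (it truncates toward zero).
inline std::chrono::milliseconds MicrosToMs(std::chrono::microseconds timeout)
{
  return std::chrono::duration_cast<std::chrono::milliseconds>(timeout);
}
```

With helpers like these, a 30 second timeout could be passed around as `std::chrono::seconds(30)` and converted only at the call site that needs milliseconds.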

static_cast<ElasticSearchRecordable *>(record.release()));
body += json_record->GetJSON().dump() + "\n";
}
std::vector<uint8_t> body_vec(body.begin(), body.end());
@kxyr kxyr Dec 15, 2020

question: why is body set as a uint8_t?

Author

in Lalit's http client interface the body is represented internally as a vector of uint8_t, so I want to keep consistent with that

*/
std::string GetResponseBody()
{
if (!response_received_)
@kxyr kxyr Dec 15, 2020

nit: is it possible to check if response_received before entering this function instead? not sure if setting a random response body here is good practice or

Author

yea it's being checked in the export method already, I had this as a just in case kinda thing tho. I'll take it out tho since it's never gonna get called

bool response_received_ = false;

// A string to store the response body
std::string body_ = "";
@kxyr kxyr Dec 15, 2020

i don't think this needs a default initialization? the default string ctor should be fine :)

request->SetTimeoutMs(std::chrono::milliseconds(1000 * options_.response_timeout_));

// Add the request body
std::string body = "";
@kxyr kxyr Dec 15, 2020

is this variable not already initialized as a private member variable? (see comment above) does this overwrite it?

Author

nope, the body member variable is for the response classes response body, and this is for the request body

@kxyr kxyr Dec 15, 2020

ohh i see, could they maybe be named like response_body and request_body? idk if you think thats necessary

{
// Create options for the elasticsearch exporter
logs_exporter::ElasticsearchExporterOptions options("localhost", -1, "logs", 5, true);
options.response_timeout_ = 10; // Wait 10 seconds to receive a response
@kxyr kxyr Dec 15, 2020

is this overkill? (if the CI needed to wait 10 seconds while running this test, I'm just wondering if this will be a problem)

Author

Considering Lalit's http unit test does this except with 30 seconds, I think it's fine :)

Ohh I see yeah, I get wait during the CI and I'm always afraid it's deadlocking or sth, gotcha gotcha

auto exporter =
std::unique_ptr<sdklogs::LogExporter>(new logs_exporter::ElasticsearchLogExporter);
bool shutdownResult = exporter->Shutdown();
ASSERT_TRUE(shutdownResult);
@kxyr kxyr Dec 15, 2020

nit: EXPECT_TRUE(exporter->Shutdown());?


// Ensure the timeout is within the range of the timeout specified ([10, 10 + 1] seconds)
auto duration = std::chrono::duration_cast<std::chrono::seconds>(t2 - t1).count();
ASSERT_TRUE((duration >= options.response_timeout_) &&

nit: i'm confused why the timeout here has such a strict lowerbound? (i thought anything greater than 0 would be acceptable) - is it sufficient to just check ASSERT_TRUE(duration < options.response_timeout_ +1)?

Author

Here I'm checking that it waits for the timeout instead of immediately returning failure. It's important for this exporter to wait for at least the timeout to give time for a connection/response to come.
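
A minimal sketch of the window check described here, assuming a blocking call that returns only once its timeout has expired. `BlockUntilTimeout` is a hypothetical stand-in for `Export()`, and the sketch uses `std::chrono::steady_clock` rather than the `high_resolution_clock` in the test under review; that substitution is an assumption of this sketch, chosen because a monotonic clock is not affected by wall-clock adjustments.

```cpp
#include <chrono>
#include <thread>

// Hypothetical stand-in for an Export() call that blocks until its timeout expires.
inline void BlockUntilTimeout(std::chrono::milliseconds timeout)
{
  std::this_thread::sleep_for(timeout);
}

// Time the blocking call with a monotonic clock.
inline std::chrono::milliseconds TimeBlockingCall(std::chrono::milliseconds timeout)
{
  auto t1 = std::chrono::steady_clock::now();
  BlockUntilTimeout(timeout);
  auto t2 = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1);
}

// True when the elapsed time falls in [timeout, timeout + slack]. The lower bound
// catches a call that fails immediately; the upper bound catches one that hangs.
inline bool WithinTimeoutWindow(std::chrono::milliseconds elapsed,
                                std::chrono::milliseconds timeout,
                                std::chrono::milliseconds slack)
{
  return elapsed >= timeout && elapsed <= timeout + slack;
}
```

In a unit test the two bounds could be asserted separately, so that a failure shows whether the call returned too early or too late.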

* @param host The host of the Elasticsearch instance
* @param port The port of the Elasticsearch instance
* @param index The index/shard that the logs will be written to
* @param response_timeout The maximum time the exporter should wait after sending a request to

nit: specify whether this timeout is in milliseconds, microseconds, etc?

ElasticsearchExporterOptions(std::string host = "localhost",
int port = 9200,
std::string index = "logs",
int response_timeout = 30,

related nit: would it be better just to use std::chrono::milliseconds/microseconds here instead of int?

@kxyr kxyr Dec 15, 2020

Also curious- what's the difference/purpose of passing in the params separately into the constructor, as opposed to just the plain old constructor with the options struct?
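
One way to picture the two suggestions above (a typed duration instead of `int`, and a plain options struct) is the sketch below. `SketchOptions` and `TimeoutMs` are hypothetical names; the field names loosely follow the parameters quoted in this review and are not the exporter's actual definition.

```cpp
#include <chrono>
#include <string>

// Hypothetical options struct. The timeout's unit lives in its type, so the
// doc comment no longer has to say whether the number means seconds or ms.
struct SketchOptions
{
  std::string host = "localhost";
  int port         = 9200;
  std::string index = "logs";
  std::chrono::seconds response_timeout{30};
  bool console_debug = false;
};

// The request side can ask for milliseconds; seconds convert implicitly.
inline std::chrono::milliseconds TimeoutMs(const SketchOptions &options)
{
  return options.response_timeout;
}
```

A caller would default-construct the struct and overwrite only the fields it cares about, for example `options.response_timeout = std::chrono::seconds(10);`.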

/**
* Exports a vector of log records to the Elasticsearch instance. Guaranteed to return after a
* timeout specified from the options passed from the constructor.
* @param records A list of log records to send to Elasticsearch.
@kxyr kxyr Dec 15, 2020

nit: because it might be confusing log records with the LogRecord recordable: change comment to say just a list of "logs"?

// Return failure if this exporter has been shutdown
if (isShutdown_)
{
if (options_.console_debug_)

nit: add {} around this if block too?

{
// TODO: retry logic

if (options_.console_debug_)
@kxyr kxyr Dec 15, 2020

{} here (i forget which place i commented on in the upstream PR :p)

* localhost:9200/logs with a timeout of 30 seconds and disabled console debugging
* @param host The host of the Elasticsearch instance
* @param port The port of the Elasticsearch instance
* @param index The index/shard that the logs will be written to
@kxyr kxyr Dec 15, 2020

just wondering - should this index variable also be used in the body_ variable? which currently has the default index of index:{}

Author

the way elasticsearch works is you connect to it with an endpoint, like localhost:3000/logs for example. The /logs part is the index, so having the index:{} means that it will use the index specified in the endpoint instead of a specific one.
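
As a rough illustration of the body format being discussed: the Bulk API takes newline-delimited JSON in which each document line is preceded by an action line, and every line ends with a newline. An action of `{"index":{}}` names no index, so the one in the request path (for example `logs/_bulk`) applies. `BuildBulkBody` is a hypothetical helper that takes already-serialized documents; it is a sketch of the format, not the exporter's code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build a Bulk API request body from JSON documents that
// are already serialized, one per element.
inline std::string BuildBulkBody(const std::vector<std::string> &documents)
{
  std::string body;
  for (const auto &doc : documents)
  {
    body += "{\"index\":{}}\n";  // action line: no _index, so the path's index is used
    body += doc + "\n";          // source line: the document itself
  }
  return body;
}
```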


private:
// Stores if this exporter had its Shutdown() method called
bool isShutdown_ = false;
@kxyr kxyr Dec 15, 2020

another nit: is_shutdown_ for google naming convention

I'm also just wondering though (relates to the ostream exporter too) - do you think this var should be initialized in the ctor instead of here?

Author

hmm having it here means we don't have to set it in all the different constructors (this one has 2), but I'm also not sure about if OTEL encourages this... I'll ask during the next maintainers meeting!
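
The trade-off mentioned here can be sketched with a small class that has two constructors. A default member initializer (C++11) applies in every constructor that does not explicitly initialize that member, so neither constructor has to repeat it. `SketchExporter` and its members are hypothetical and only mirror the shape of the class under review.

```cpp
#include <string>
#include <utility>

// Hypothetical exporter-like class with two constructors.
class SketchExporter
{
public:
  SketchExporter() = default;
  explicit SketchExporter(std::string host) : host_(std::move(host)) {}

  bool Shutdown() noexcept
  {
    is_shutdown_ = true;
    return true;
  }

  bool IsShutdown() const noexcept { return is_shutdown_; }
  const std::string &Host() const noexcept { return host_; }

private:
  // Default member initializers: used by both constructors unless a
  // constructor's own initializer list overrides them (as host_ is above).
  std::string host_ = "localhost";
  bool is_shutdown_ = false;
};
```

Initializing in the constructor body or initializer list instead would work equally well; it just has to be repeated, or delegated, for each constructor.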

* @param index The index/shard that the logs will be written to
* @param response_timeout The maximum time the exporter should wait after sending a request to
* Elasticsearch
* @param console_debug Print the status of the exporter methods in the console
@kxyr kxyr Dec 15, 2020

nit: rename var to console_debug_on to indicate it's a bool? and maybe the comment could be: "whether or not this exporter prints [...]", so it indicates this is a flag to be toggled

Author

To me, if there's a bool variable called console_debug it's pretty explicit that turning it on is going to enable console debugging. So I think leaving it as is is cool

/**
* Returns a JSON object containing the log information
*/
nlohmann::json GetJSON() noexcept { return json_; };

nit: GetJSONRecord()?


kxyr commented Dec 15, 2020

Sorry for all the comments - "nit" has also become my favorite word :)

@kxyr kxyr force-pushed the logs-elasticsearch-pr branch 7 times, most recently from f7dda89 to 6fd1289 Compare December 22, 2020 10:19
@kxyr kxyr force-pushed the logs-elasticsearch-pr branch from 056e9cc to 0526050 Compare December 22, 2020 10:23

4 participants