Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logging tree support #146

Open
FedericoCeratto opened this issue Apr 16, 2019 · 28 comments
Open

logging tree support #146

FedericoCeratto opened this issue Apr 16, 2019 · 28 comments

Comments

@FedericoCeratto
Copy link
Member

FedericoCeratto commented Apr 16, 2019

Problem: the current logging module is not designed to be used in libraries.
If a library uses ConsoleLogger or FileLogger there is no way for the application to configure its logging behavior. As such, libraries do not do any logging.

To encourage good logging across the library ecosystem:

  • The logger library should implement a logger hierarchy where the the main application is at the root and can control the behavior of the imported modules (e.g. changing logging level only for a submodule)
  • Provide an abstract logger var logger = newLogger() that is backend-independent and not opinionated on the logging format and is muted by default
  • Logging procs accept a string, key-value pairs and typed data
  • Custom converters for typed data can be passed. By default $ is used.
  • Silent loggers should have negligible performance impact (e.g. do not perform string formatting)
  • Provide procs to discover and configure the logging tree
  • Allow external libraries to handle logging and provide other backends
  • The documentation should encourage library developers to use newLogger / debug / info / error
  • Distinguish the main VS library use-cases in the documentation

[1] nim-lang/Nim#3270

@andreaferretti
Copy link

Whatever does not mimic what happened in the JVM space is fine for me. I have written fairly complex applications, and every single time I have to fight the logger configuration, ovverride the imports for libraries, remove duplicated versions of loggin libraries and finally give up and ask a colleague

@Araq
Copy link
Member

Araq commented Apr 17, 2019

I try to avoid "logging" as far as possible, I fill Sqlite DBs instead. Either the data is important, then it should end up in a database or it is not, then it's just noise filling up hard disks everywhere and slowing down the programs.

Having said that, others believe in "logging" and the stdlib's facilities are pretty weak indeed. I'm not sure that a "hierarchy of logging configuration" will end up any better than a "hierarchy for styling" (CSS). Let's maybe instead add some "subsystem: string" idea that can be used to en/disable things? That would be on a "subsystem" level, flat, no hierarchy involved.

@FedericoCeratto
Copy link
Member Author

FedericoCeratto commented Apr 17, 2019

It's not uncommon to store structured logging into databases. journald does that on localhost and remote logging is often stored into log databases with search and alarming capabilities.
Allowing pluggable backends for an abstract logger would help with this.
Example: https://github.com/FedericoCeratto/nim-morelogging#systemds-journald-logger

With a hierarchy similar to python , suppose my application uses httpclient directly and through a 3rd-party library for AWS and I want to debug the latter. Using logger.setThreshold("aws.httpclient", debug) I can enable the logs I care about without getting noise from other httpclient calls. By default the path is made from the name of each module.

@dom96
Copy link
Contributor

dom96 commented Apr 18, 2019

Problem: the current logging module is not designed to be used in libraries.

AFAIK it is, you just can't do everything in your list of bullet points, in particular "changing logging level only for a submodule". Everything else is pretty much there AFAIK. I'm using logging across httpbeast/jester and it works fine.

@FedericoCeratto
Copy link
Member Author

Is this meant to integrate a library's logger with the application logger? https://github.com/dom96/jester/blob/master/jester.nim#L456 If so there should be a helper to do that and it should be recommended for library authors.

@krux02
Copy link
Contributor

krux02 commented May 8, 2019

Well, logging is really a solved problem for some people, and for other people it is an unsolvable problem. But I don't see why this is a language RFC. Logging is one of the easiest things that can be implemented in a nimble package, you can start by porting your favorite logging module from languagX to nim. When a good logging module has been written, it might be worth to check if it could be taken over as the standard logging package that will come with Nim. But I don't think the development of a logging module should happen in the Nim project.

@andreaferretti
Copy link

@krux02 The disadvantage of putting logging in a Nimble package instead is that there will be multiple conflicting logging packages, and you can be sure that the various libraries you import will use all of them. So at some point, someone will standardize an interface, and newer libraries will use that interface and there will arise packages that allow compatibility between this interface and existing logging libraries. This all happened in the JVM and the result is that there are people who are able to, say, train a machine learning algorithm on a distributed cluster but will have fear to configure the logger.

My view is that the stdlib should:

  • provide something like a standard interface (using concepts?)
  • provide a default, even if slow and simplistic, implementation
  • baptize a way for third party packages to perform logging without directly depending on a logger library

This way we can hope to keep the whole thing sane

@krux02
Copy link
Contributor

krux02 commented May 8, 2019

There is a slow and simplistic implementation of a logger: echo. Never had problems with it, fulfills all my needs for a logging "framework". It uses a standard interface ($), does not need concepts.

As I said, everybody has a different perspective on how logging should be done.

@Araq
Copy link
Member

Araq commented May 8, 2019

baptize a way for third party packages to perform logging without directly depending on a logger library

Third party "packages" should not log to begin with. Logging belongs into the application layer and nowhere else. Otherwise every complex subsystem will start to log some crap somewhere and you're just doomed. Using interfaces instead didn't help Java, it won't help Nim and is a symptom of mitigating a misdesign. Logging doesn't need to become simpler to use, it needs to go away entirely.

@andreaferretti
Copy link

I agree with the sentiment. Still, I don't think that logging in libraries will go away anytime soon...

@FedericoCeratto
Copy link
Member Author

@andreaferretti there are external logging libraries e.g. https://github.com/FedericoCeratto/nim-morelogging but, as you said, we need a standard interface that all applications and libraries can use. Hence this RFC.
Logging and metric generation are crucial for complex and critical systems that require continuous monitoring and the ability to be inspect and trace past events. Unfortunately the lack of observability in open source libraries pushes some organizations to reimplement a lot of things internally.

@dom96
Copy link
Contributor

dom96 commented May 8, 2019

Otherwise every complex subsystem will start to log some crap somewhere and you're just doomed.

All the packages I write log using the logging module...

@Araq
Copy link
Member

Araq commented May 8, 2019

All the packages I write log using the logging module...

Well stop it and feel bad for doing that. :P

@dom96
Copy link
Contributor

dom96 commented May 8, 2019

@Araq I don't even know what to say to that. I will say this with as much respect as I can: you need more experience writing and maintaining long-running and scalable services to appreciate the importance of good logging.

I'm really puzzled why you're against the logging module, didn't you design it?

@Araq
Copy link
Member

Araq commented May 8, 2019

The services/web crawlers I wrote worked for months and did run 24 hours every day without crashes so no, I don't need more experience about how to write "long-running and scalable services". And they did use logging and then I wrote a program to analyse these logs... These days I would simply fill a database instead and skip the silly toString/parsing business. The data that I logged was crucial too, so using a database would have been justified.

I'm really puzzled why you're against the logging module, didn't you design it?

I'm not against the logging module, I'm against logging itself.

@zah
Copy link
Member

zah commented May 9, 2019

FWIW, our Chronicles library supports "logging to a database", allows you to disable the logging code on a per-library basis and has many other bells and whistles.

@dom96
Copy link
Contributor

dom96 commented May 9, 2019

@Araq Database require a lot more resources and maintenance than flat files. Sure, they are the ideal, and why I've created the prometheus package, but not every user wants to set up a DB for this.

@Clyybber
Copy link

Clyybber commented May 9, 2019

IMO if a library wants to give me a way to log what it's doing it should allow me to pass a proc to it, with the arguments of that proc defined by the library, but typically include the logging level (an int or enum) and whatever the library gives me to log.
This doesn't require settling on one-almighty logging library; when I want to use a different one, I just change the proc I passed to the library.

So in short: I don't think the stdlib needs a logging module, but rather the libraries should provide an API that lets me plug in my own logging solution.

@zah
Copy link
Member

zah commented May 11, 2019

@Clyybber, enabling and disabling logging and choosing the output format at compile-time is nicer though. With the "injected proc" model, the API will be a bit more limited, because it will have to rely on type erasure to reduce all values that can be logged to something like a formatted string. And the libraries will still face the dilemma whether they should have logging at all, because it will still produce run-time code even when not used. With the model provided by Chronicles, you can just completely eliminate the logging from a certain library with a compile-time switch.

@FedericoCeratto
Copy link
Member Author

A real-life example of a Python application using the boto3 library leads to the following nested loggers: application -> botocore -> s3transfer -> urllib3. Setting all loggers are set to "debug" level proved to be crucial to investigate an issue around a transfer. In other times the application level is set to "debug" while botocore and its children are set to "info" to prevent its very verbose logging.

@zah
Copy link
Member

zah commented Jul 19, 2019

@arnetheduck gave another good argument against the "logging doesn't belong in libraries" philosophy. The best software nowadays is such that everything is a library. For example, is Clang a compiler or a library? If you examine their build scripts, you'll see that the "application" there consist of a single main file, everything else is a reusable library.

The key enabler for having some logging in every library is the ability to turn it off with a compile time switch. Then it can have zero cost in the default configuration, but if you have to debug an issue, you have a richer set of tools at your disposal.

@FedericoCeratto
Copy link
Member Author

FedericoCeratto commented Jul 19, 2019

@zah better a runtime switch. The majority of software is not run and debugged by upstream developers, especially libraires.

@zah
Copy link
Member

zah commented Jul 19, 2019

Chronicles have options to either remove all logging at compile-time or to leave a run-time switch. You can control this for the individual third party libraries that you are using (as long as they play along and also use Chronicles).

@Araq
Copy link
Member

Araq commented Jul 19, 2019

For example, is Clang a compiler or a library? If you examine their build scripts, you'll see that the "application" there consist of a single main file, everything else is a reusable library.

Same for the Nim compiler these days, it's usable as a library. It also doesn't log, so there, you can write sizable applications without logging.

@arnetheduck
Copy link

but nim does log: it prints compile errors, warnings and hints.

clang also logs: it allows you to pass in a log collector (they call it a diagnostic collector) to the compilation process, thus avoiding the ugly global that a lot of logging ends up using (in fact, if doing a good library, this would likely be a cleaner approach rather than the global)

server applications often log - that's often their only ui. not all applications are created equal and need the same amount of logging. that's the nature of the beast, just like not all applications benefit from a gui - nim for example doesn't have a gui. does that make guis less useful in the right context?

the only argument against using logging in a library that I see is that it is that it leads to the information in the log not being exposed in the API so it's harder to get at it programmatically - in return, you get a more simple API with less noise.

all this said, this is a prime example of something that can live in a separate library - there's no reason to have it in the std lib really.. like https://github.com/status-im/nim-chronicles for example.

@Araq
Copy link
Member

Araq commented Jul 19, 2019

the only argument against using logging in a library that I see is that it is that it leads to the information in the log not being exposed in the API so it's harder to get at it programmatically - in return, you get a more simple API with less noise.

Ok, that's a good answer, however that's also what I was getting at: If you expose it in the API it will likely be some callback mechanism and these callbacks can have a typesafe, specialized interface, it's not some textblob you later have to parse.

@disruptek
Copy link

I want to offer a few opinions because I find myself both agreeing and disagreeing with a lot of what I see here, and GitHub rightly disallows offering both a 👍 and a 👎 on a single post. 😉

  • Semantically, all logging should be the same as far as the user is concerned. If I'm writing software and I deem something worth logging, I want the means by which I do that to be invariant, regardless of whether I am authoring library or application. Why should style at time of development, in a static source code document, affect a side-effect of running machine code through silicon? It's runtime at which I want to log, by definition, so that's where my lever should be pitched -- not during development, which happens once, by myriad persons, over an indeterminant time period.

  • I don't want to contemplate the possible? theoretical? weight of logging in that moment in which the thought, "Gee, maybe I should log this." occurs to me, which happens only during development. If I, the expert, at the time the code is written, when I understand it best, think that someone less informed of state might be interested in same, then yes, I need to be able to provide that to my user at minimal/zero cost. We're talking about what is essentially runtime comments. They should be just as considered, but designed for what may be a different audience and what is clearly a different application. Nevertheless, we want them to be just as easy to add, because of how very valuable they may be -- just like source comments.

  • I want to weigh the cost of logging at compile time, when I'm best prepared to do something expensive (changing binary) and I can best reason about my compile-time needs. Similarly, I want to be able to compile-in the option of enabling logging at runtime, because there are very real circumstances in which I may wish to turning logging on and off in a program that is only available in situ binary, due to environmental or developmental or monitoring constraints.

If those qualities can exist in a stdlib solution, regardless of how simple (or incapable) it may be, I think you have a real winner. To me, the rest of it is, really, gravy.

Finally, and since issue #146 is titled as such, with just a small amount of introspection, libraries could be automatically tree'd into a logging system at compile-time, right? Doesn't the Python system use just a simple string, like "moduleA.submoduleB.subsubmoduleC"?

@FedericoCeratto
Copy link
Member Author

FedericoCeratto commented Oct 25, 2019

I don't think the stdlib needs a logging module, but rather the libraries should provide an API that lets me plug in my own logging solution.

That's what a logging module should do. Provide an API so that libraries can send log into something. Support pluggable backends so that application developer can send logs where needed at the required verbosity.
To do that you need a logging tree.

Also, the API to create and extract log events can be typed and support key-value pairs. Logs are not necessarily opaque, raw text blobs. [Edit: I updated the RFC text to clarify this]
Many log management tools support name-value pairs and can do indexing/searching on them. Many logging protocols including syslog and journald support that (https://www.rfc-editor.org/rfc/rfc5424)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants