[WIP] Flexible and convenient logging #30

c42f · 2017-02-24T15:42:02Z

As discussed in JuliaLang/julia#13515, there's still some desire to iterate on julia's logging interface to make it capable of supporting production needs.

So far, I've tried to write down some desired features, without focusing on a particular implementation strategy, though certainly informed by a prototype which covers most of what is mentioned here.

This is still very much a work in progress, but I'm posting it here for early feedback.

This is an attempt to write down some desired features of a julian logging API, without referring to a particular implementation strategy.

StefanKarpinski · 2017-02-24T16:20:59Z

Excellent start – can't wait to see more!

c42f · 2017-04-01T22:50:18Z

Logging.md

+* It should be possible to log to a user defined log context; automatically
+  choosing a context for zero setup logging may not suit all cases.  For
+  example, in some cases we may want to use a log context explicitly attached to
+  a user-defined data structure.


This is basically the "dependency injection" case - needs some more thought about how loggers are associated with contexts.

StefanKarpinski · 2017-05-13T03:59:30Z

Should we merge this?

rofinn · 2017-05-13T04:40:48Z

Sorry, I forgot to comment on this. FWIW, I think starting with a really simple AbstractLogger interface and leaving the more opinionated design choices (like log record types, filters, etc) up to libraries to handle would be a prudent choice.

StefanKarpinski · 2017-05-13T06:02:38Z

I'd like to have a minimal set of standard log record types that should work everywhere at least, but they could certainly be subtyped/extended by more opinionated packages.

c42f · 2017-05-13T10:28:25Z

To give you an update on where my thinking is going with this one -

What I want out of this Julep is to standardize an easy to use frontend interface which will be used by package authors. To me this means specifying a minimum of syntax and documenting some clear conventions for structured log records. After a lot of prototyping, I've now got a rather concrete proposal to cover these design requirements. I'm happy to update the current PR with these and we can get into discussing the detail.

I don't intend to design a full logging backend library with complete functionality - this can be further explored in logging packages outside of Base. However, I want the interface to be sufficiently general to support such packages and I've been trying to anticipate a bunch of the requirements they'll have by examining other logging libraries and checking that the same basic functionality is supported in the prototype.

c42f · 2017-05-13T10:41:42Z

As to whether this should be merged, I suppose I somewhat prefer to continue in this PR until I've got the concrete design written down.

c42f · 2017-05-13T11:38:45Z

@rofinn agreed that we should design an abstract logger interface here (though I'm not yet sure we actually need an AbstractLogger type).

@StefanKarpinski I was coming to the conclusion that a log record is:

* A log level, taken from the name of the logging macro
* A message (of any type?)
* An optional set of key value pairs
* Metadata extracted automatically from the call site (line number, file name, module, unique message id, etc)

This allows for a nice minimum of syntax at the use site

@info "A message"

@warn "Something bad happened" my_var=100

@info "Algorithm A" progress=i/100

@info "You only see this twice" max_log=2

I was hoping to document some particularly useful keys as having a specific meaning, for instance progress reporting via the progress keyword.
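To make the four parts of a record concrete, here is a minimal sketch. It assumes the `Logging` standard library that ships with Julia 1.x rather than the MicroLogging prototype discussed in this thread, and `RecordingLogger` is a hypothetical name introduced only for illustration. It stores each record as a named tuple so the level, message, key value pairs and call-site metadata can be inspected.

```julia
using Logging

# Hypothetical in-memory logger, used only to inspect what a record contains.
struct RecordingLogger <: Logging.AbstractLogger
    records::Vector{Any}
end
RecordingLogger() = RecordingLogger(Any[])

# Accept every record, whatever its level or origin.
Logging.min_enabled_level(::RecordingLogger) = Logging.Debug
Logging.shouldlog(::RecordingLogger, args...) = true

# Called once per accepted record: positional arguments carry the level,
# the message and the call-site metadata; keyword arguments carry the
# user-supplied key value pairs.
function Logging.handle_message(logger::RecordingLogger, level, message,
                                _module, group, id, file, line; kwargs...)
    push!(logger.records, (level=level, message=message, _module=_module,
                           group=group, id=id, file=file, line=line,
                           kwargs=Dict{Symbol,Any}(kwargs)))
    return nothing
end

logger = RecordingLogger()
with_logger(logger) do
    @info "A message"
    @warn "Something bad happened" my_var=100
end

for r in logger.records
    println(r.level, ": ", r.message, " ", r.kwargs, " (line ", r.line, ")")
end
```

The keyword spelling for limiting repeats differs between libraries, so `max_log` is left out of this sketch.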

ChrisRackauckas · 2017-05-13T11:58:15Z

+💯 for a standardized progress setup. It would be really nice if this was standardized, because as it stands right now you have to implement multiple packages to use progress tracking differently (ProgressMeter.jl, progress in Juno, etc.), so having a single one that they are supposed to extend off of would be nice. But that's sufficiently complicated to be its own discussion, see

JunoLab/Juno.jl#29

and you'd want to contact @pfitzseb and see what would be required for it to at least handle what Juno needs, otherwise I think it would be ignored.

MikeInnes · 2017-05-13T12:06:30Z

Would be great to make this pluggable like the display system is. It'd be a huge boon to have things like a UI that collates deprecation warnings and such, especially if it can be extended to user logging as well.

c42f · 2017-05-13T12:16:28Z

@ChrisRackauckas @MikeInnes exactly what I was thinking! I want programs like Juno to be able to suck in logs and display them however seems most appropriate. When a progress keyword is included, this would naturally be a progress bar. My general plan is that programs like Juno should install a global logging backend to intercept all logs and do what they see fit with them.

I've already played with making ProgressMeter.@showprogress use MicroLogging as a backend, and it seems to work quite nicely.

ChrisRackauckas · 2017-05-13T12:18:24Z

> I've already played with making ProgressMeter.@progress use MicroLogging as a backend, and it seems to work quite nicely.

But does it do the layered loops and layered progress monitoring? That's what I am asking. It's nice to have a basic system which everything extends off of, but if it doesn't actually capture the functionality that already exists and is useful, it would be tough for things to change and use it.

c42f · 2017-05-13T12:27:04Z

> But does it do the layered loops and layered progress monitoring?

Unlikely, as I didn't know this existed - but that makes this the best kind of question :-) I assume you refer to JunoLab/Juno.jl#19, I'll have a look.

MikeInnes · 2017-05-13T13:11:45Z

It handles concurrency, nested loops etc, so it's pretty sweet. We'd like to extend it so that you can cancel tasks as well.

It does seem like this could be unified with logging infrastructure somehow. You could do something like l = Logger() per task, sending and possibly receiving arbitrary messages. Progress is then just a subset of that, and on the receiving end you can do things like differentiate between parallel tasks.

c42f · 2017-05-13T13:28:56Z

The Juno composable progress stuff looks great. I think MicroLogging would already support being a transport layer for this (it's got arbitrary log message types and arbitrary key value pairs), but maybe not in a really natural way. The main thing seems to be making sure there's a way to communicate that the progress step is finished (the done() function) so the progress step can be removed from the stack.

I've put a bit of thought into how the logging stuff would work with tasks. As it exists so far, I've decided that dynamic scoping via the task local storage is the best way to look up the current logger (falling back to a global logger). This is quite a departure from logging libraries in most languages which usually associate loggers with modules or classes. However, dynamically scoped logging appears to have some interesting advantages, especially for multithreaded code and capturing logs while unit testing. Example:

using MicroLogging

@info "This goes to the global logger"

with_logger(MyLogger()) do
    @info "This goes to the MyLogger instance which is looked up via task local storage"
    foo() # Any log messages in here also go to the MyLogger instance
end
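The unit-testing advantage can be sketched as follows. This assumes the `Logging` standard library from Julia 1.x (not MicroLogging itself), where `SimpleLogger` writes records to any `IO` and tasks created inside `with_logger` inherit the logger of the task that created them. The function name `foo` follows the example above; everything else is illustrative.

```julia
using Logging

# Capture log output in memory instead of printing it.
io = IOBuffer()
logger = SimpleLogger(io, Logging.Info)

function foo()
    @info "inside foo"
end

with_logger(logger) do
    foo()                      # foo() needs no logger argument
    t = @async begin           # child task created inside the block
        @info "inside a child task"
    end
    wait(t)
end

captured = String(take!(io))
print(captured)
```

Because the lookup is dynamic, `foo` does not need to know which logger is installed, and a test can check `captured` without touching global state.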

MikeInnes · 2017-05-13T13:34:17Z

Yes, you'd have to treat the logger as something like a channel and close it when done.

I don't think the object-y approach is necessarily incompatible with the dynamically scoped version (which does look useful). If the lowest level API looks like:

l = Logger()
@info l "foo"
close(l)

You could easily build with_logger on top of that, as well as having @info fall back to TLS with one argument etc. That gives you a ton of flexibility. In Juno we could build @progress for on top of that api as well.

c42f · 2017-05-13T13:59:25Z

The object-y version was exactly what I had early on, but I removed it as an "unnecessary" feature when I discovered the TLS approach :-) The reasoning being that package authors generally won't want and shouldn't be required to thread the logger context through their function calls.

I could add it back in again without any difficulty. Do you have a compelling use case in mind?

c42f · 2017-05-13T14:07:38Z

> As it exists so far, I've decided that dynamic scoping via the task local storage is the best way to look up the current logger. This is quite a departure from logging libraries in most languages which usually associate loggers with modules or classes

To expand - the TLS is used for finding the initial entry point for log dispatch. However, the module is captured as part of the log record metadata so further dispatch could redirect to a more traditional module-based logger.

rofinn · 2017-05-13T14:19:38Z

Also, how would hierarchical logging fit into this API? This seems like something folks would want and perhaps it can mostly be left up to libraries like MiniLogging.jl and Memento.jl, but it might be useful to consider this functionality when designing the API.

c42f · 2017-05-13T14:39:19Z

> Yes, you'd have to treat the logger as something like a channel and close it when done.

@MikeInnes Actually, I'm not quite sure I understood your use case very well. Is it that you want to turn

@progress "name" for x in range
   body
end

into something like

p = ProgressLogger()
try
    for (i,x) in enumerate(range)
        body
        @info p "name" progress=i/length(range)
    end
finally
    done(p)
end

? Actually having written this, it doesn't make a lot of sense to me (generally packages shouldn't be messing with loggers unless they're in the outermost scope, otherwise they'll prevent the logger from being injected by the user) so I think I'm still confused. Here's what I'm thinking might make sense:

id = $(gensym())  # defined before `try` so that `finally` can also see it
try
    for (i,x) in enumerate(range)
        body
        @info "name" progress=i/length(range) id=id
    end
finally
    @info "name" progress=1.0 id=id  # Or possibly progress_done=true ?
end

with the assumption that achieving a progress of 1.0 terminates the step. Possibly you'd want the gensym'd message id to be overridden if you didn't want to key the progress step off "name". (All log messages generated by the lower level MicroLogging.@logmsg have a unique message id as a key for things like log limiting.)
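On the receiving end, the convention could look roughly like the sketch below. It assumes the `Logging` standard library from Julia 1.x, where the message id is overridden with the `_id` keyword (the `id=id` spelling above is the prototype's). `ProgressTracker` and `work` are hypothetical names, and treating a progress of 1.0 as "step finished" is the assumption stated above, not an established rule.

```julia
using Logging

# Hypothetical backend: tracks open progress steps keyed by message id.
struct ProgressTracker <: Logging.AbstractLogger
    active::Dict{Any,Float64}   # id => latest progress fraction
    finished::Vector{Any}       # ids of completed steps, in completion order
end
ProgressTracker() = ProgressTracker(Dict{Any,Float64}(), Any[])

Logging.min_enabled_level(::ProgressTracker) = Logging.Info
Logging.shouldlog(::ProgressTracker, args...) = true

function Logging.handle_message(tracker::ProgressTracker, level, message,
                                _module, group, id, file, line;
                                progress=nothing, kwargs...)
    progress === nothing && return nothing   # not a progress record
    fraction = Float64(progress)
    if fraction >= 1.0
        # Assumed convention: reaching 1.0 terminates the step. Only a step
        # that is still open is moved to `finished`, so a repeated 1.0 is ignored.
        if haskey(tracker.active, id)
            delete!(tracker.active, id)
            push!(tracker.finished, id)
        end
    else
        tracker.active[id] = fraction
    end
    return nothing
end

function work(n)
    step_id = gensym("progress")   # unique id shared by all records of this step
    try
        for i in 1:n
            @info "crunching" progress=i/n _id=step_id
        end
    finally
        @info "crunching" progress=1.0 _id=step_id
    end
end

tracker = ProgressTracker()
with_logger(tracker) do
    work(4)
    @info "other step" progress=0.5 _id=:still_running
end
println("finished steps: ", length(tracker.finished))
println("open steps: ", collect(keys(tracker.active)))
```

A UI such as Juno could draw one bar per entry in `active` and drop the bar when the id moves to `finished`.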

c42f · 2017-05-13T14:49:43Z

> how would hierarchical logging fit into this API

@rofinn Good question! Here's where I need your advice on how you're using this in production, because while I see a lot of other libraries implement it, I admit I don't really get why this is so useful. I'm worried about that, and I'd like someone to explain it to me :-) Especially with a handful of use cases so I can understand whether it's a necessity, or whether there are other equally good ways of achieving the same end goal.

c42f · 2017-05-13T15:02:06Z

By the way, I did start out with something resembling hierarchical logging (following python, which I understand follows log4j1). But things diverged from there into the TLS based system as I tried to understand how you can solve the problem of controlling logging in the multithreaded case, while also avoiding the burden of package authors keeping track of the log context. The authors of log4j2 just use loggers in a global hierarchy, and handle threading issues by inserting a piece of TLS (the "fish tagging" approach https://logging.apache.org/log4j/2.x/manual/thread-context.html). I went a step further, and put the logger itself in the TLS...

Anyway, which features of the hierarchy do you use, and how do you use them?

rofinn · 2017-05-13T15:19:08Z

Typically, I use hierarchical logging to add extra logging behaviour for specific modules. For example, if I'm interacting with 3rd party services like aws I find it convenient to add an extra handler for writing to a separate file or pushing to a log server for later audit. The nice part of it being hierarchical is that I can have a bunch of fine grained loggers floating around, but I only need to configure the child loggers if I want additional behaviour (on top of what I've configured in the root logger for my application). I've also occasionally set the child logger to not propagate messages if I want to tightly control the behaviour of log messages within a given module (this isn't very often though).

c42f · 2017-05-13T23:49:54Z

Thanks @rofinn, the hierarchical thing definitely makes sense for controlling filtering, formatting and dispatch for groups of log messages. I had it in my head that there's more to it than that, but perhaps not.

My current plan was that a logging backend could deal with each record by looking up whatever filter/format/dispatch rules using the log metadata. A hierarchy is one convenient way to do this, but I'm not sure we should bake it into the system - sometimes I think it'd be useful to have rules which step outside the hierarchy. As I've implemented it so far, the relevant metadata for "rule matching" are

* log level
* Module
* filename / group
* line number / id

(roughly increasing order of fine-grainedness)

Even a minimal backend in Base will need some kind of configuration system (eg, for depwarn control independent of other warnings).

The idea of "log group" is a speculative user-defined grouping based on a conventional group keyword. It might look something like

function foo()
    @warn "You shouldn't use this, use bar() instead" group=:deprecated
end
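As a rough sketch of rule matching on the group, the wrapper below silences one group while forwarding everything else. It assumes the `Logging` standard library from Julia 1.x, where the group is set with the `_group` keyword (the plain `group=` spelling above is the speculative one from this proposal). `GroupFilterLogger` is a hypothetical name.

```julia
using Logging

# Hypothetical wrapper: drops records whose group is muted, forwards the rest.
struct GroupFilterLogger{L<:Logging.AbstractLogger} <: Logging.AbstractLogger
    inner::L
    muted::Set{Symbol}
end

Logging.min_enabled_level(f::GroupFilterLogger) = Logging.min_enabled_level(f.inner)

# Early filter: runs before the message is even constructed.
Logging.shouldlog(f::GroupFilterLogger, level, _module, group, id) =
    !(group in f.muted) && Logging.shouldlog(f.inner, level, _module, group, id)

Logging.handle_message(f::GroupFilterLogger, args...; kwargs...) =
    Logging.handle_message(f.inner, args...; kwargs...)

function foo()
    @warn "You shouldn't use this, use bar() instead" _group=:deprecated
end

io = IOBuffer()
logger = GroupFilterLogger(SimpleLogger(io, Logging.Info), Set([:deprecated]))
with_logger(logger) do
    foo()                          # muted: group is :deprecated
    @warn "An unrelated warning"   # passes through to the inner logger
end
output = String(take!(io))
print(output)
```

This is the depwarn-style control mentioned above: deprecation warnings are switched off independently of other warnings at the same level.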

rofinn · 2017-05-14T00:38:23Z

Hmmm, okay. Sorry, I'm still confused how a backend that provides hierarchical logging would plug in to this API though.

Let's say I configure my application with:

# I'm gonna use Memento syntax just cause that's what I know pretty well
using Memento

# Configure info logging for the entire app with info level logging and only a stdout handler
Memento.config("info")

# Get an "aws" logger and specifically setup a handler using syslog for it.
aws_logger = get_logger("AWS")
add_handler(aws_logger, DefaultHandler(Syslog(:local0, "julia"), DefaultFormatter("{level}: {msg}")))

And then later on in the AWS module I have:

logger = get_logger("AWS")
info(logger, "Brought up a new instance")

I'm guessing the former would be specific to the backend and the latter would equate to @info "Brought up a new instance" group=:AWS, but I'm unclear how this would result in executing the handler for syslog only when group=:AWS. Would you have a global list of filters to iterate over or would it call Backend.log(record) and leave it up to the backend to call info(get_logger(record.group), "Brought up a new instance")?

c42f · 2017-05-14T01:06:48Z

Nice, let's try to figure out how that concrete example would work (figuring out how this fits in with hierarchical backends was one of the next things on my list). I'll try to create a shim to make Memento a backend for MicroLogging, and we can see how it looks.

By the way I'm really happy to have all these good questions about how the design will work for various use cases - keep them coming!

c42f · 2017-05-14T02:25:33Z

@rofinn ok, as a very rough WIP, how about this: https://github.com/c42f/MicroLogging.jl/blob/master/examples/memento.jl

rofinn · 2017-05-14T17:50:00Z

@c42f Awesome, thanks for taking the time to work through that example. Yeah, it might be nice to have a distinct type hierarchy and API for Backends and Loggers to avoid confusion, but otherwise I really like that the example doesn't require a lot of changes to the application code.

pfitzseb · 2017-05-14T18:08:15Z

@c42f Regarding the progress stuff: As far as I can tell, having a unique ID for each message is fine -- there's no need for an explicit ProgressLogger() like we use in Juno.
All in all, this looks pretty good to me -- having a logging/messaging frontend in Base that supports custom backends is awesome, especially if it exposes metadata and stuff like that! +💯

c42f · 2017-05-15T07:23:36Z

> it might be nice to have a distinct type hierarchy and API for Backends and Loggers to avoid confusion

@rofinn Yes, we have a bit of a terminology clash here. Personally I think the term "logger" is appropriate for the first piece of user-configurable code which has the opportunity to process a log record. This is the same way that Memento uses the term, but in the design above Memento.Logger is now plugged in at a slightly later stage. What we're doing here is clarifying and disentangling the two roles that Memento.Logger currently has:

* A front end interface to directly receive user messages
* A hierarchical filtering and dispatch strategy for log records

As for the second point here, I think an explicit hierarchy of objects is just one way to do this. I'm thinking of the set of log records as a database table (columns being required/optional metadata - module, file name, etc - and each row being a record). A logger hierarchy can then be seen as a particular way of arranging and optimizing queries against this table, but it's not hard to imagine more general queries which don't fit into this scheme.
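One way to picture the "queries against a table" view is a flat list of rules, each pairing a predicate over the record metadata with a sink. The sketch below assumes the `Logging` standard library from Julia 1.x and its `_group` keyword. `Rule` and `RuleLogger` are hypothetical names, and the `:AWS` group mirrors the Memento example earlier in the thread, not any real configuration.

```julia
using Logging

# Hypothetical rule: a predicate over record metadata plus a sink logger.
struct Rule
    matches::Function
    sink::Logging.AbstractLogger
end

# Hypothetical router: a record goes to every sink whose rule matches.
struct RuleLogger <: Logging.AbstractLogger
    rules::Vector{Rule}
end

Logging.min_enabled_level(::RuleLogger) = Logging.Debug
Logging.shouldlog(::RuleLogger, args...) = true

function Logging.handle_message(router::RuleLogger, level, message,
                                _module, group, id, file, line; kwargs...)
    # The metadata "row" that each rule queries.
    meta = (level=level, _module=_module, group=group, id=id, file=file, line=line)
    for rule in router.rules
        if rule.matches(meta) && level >= Logging.min_enabled_level(rule.sink)
            Logging.handle_message(rule.sink, level, message, _module, group,
                                   id, file, line; kwargs...)
        end
    end
    return nothing
end

app_io, aws_io = IOBuffer(), IOBuffer()
router = RuleLogger([
    Rule(meta -> true,                SimpleLogger(app_io, Logging.Info)),  # root-like rule
    Rule(meta -> meta.group === :AWS, SimpleLogger(aws_io, Logging.Info)),  # extra AWS sink
])

with_logger(router) do
    @info "Application started"
    @info "Brought up a new instance" _group=:AWS
end
app_log = String(take!(app_io))
aws_log = String(take!(aws_io))
print(aws_log)
```

A logger hierarchy corresponds to rules that match on a module or group prefix, while a rule such as `meta -> meta.level >= Logging.Warn && meta.group === :AWS` steps outside any single hierarchy.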

@pfitzseb Excellent :-) If you do have any time to experiment with the prototype implementation - especially in the context of Juno - I'd greatly appreciate feedback on things which are and aren't working for you.


tkelman · 2017-06-23T10:48:54Z

Logging.md

+automatically extracted and stored with each log record.  Some of this is
+extracted from the lexical environment or generated by the logging frontend
+macro, including code location (module, file, line number) and a unique message
+identifier.  The rest is dynamic state which can generally on demand by the


can be generated?

c42f · 2017-06-23

Yes, I generally don't know how this happened!

tkelman · 2017-07-11

still missing "be"

cdluminate · 2017-07-08T06:04:51Z

Can't wait to use it :-)

StefanKarpinski · 2017-07-11T20:51:47Z

Is this ready enough to be merged? It can remain WIP and get more updates and feedback, of course.

c42f · 2017-09-17T01:47:53Z

@StefanKarpinski excellent question, sorry for the long delay. I pushed a refactor of my thoughts, and I'm happy for this to be merged now.

Despite appearances, the project isn't abandoned - I've definitely been thinking about this, writing code, and trying to use it, both in the prototype, and now tentatively in Base over at JuliaLang/julia#23712.

Logging julep - initial WIP

66734c2

This is an attempt to write down some desired features of a julian logging API, without referring to a particular implementation strategy.

c42f mentioned this pull request Feb 24, 2017

Basic logging in Base JuliaLang/julia#13515

Closed

c42f mentioned this pull request Mar 6, 2017

WIP: Caller location for macros JuliaLang/julia#20895

Closed

c42f commented Apr 1, 2017


Mention threads and structured logging

fdae066

oxinabox mentioned this pull request Jun 6, 2017

More logging/warnings and more configurable logging/warnings JuliaML/MLDataPattern.jl#13

Closed

Rewrite abstract

cc73d4a

c42f force-pushed the logging-api branch from 3ea93ed to cc73d4a Compare June 22, 2017 23:08

c42f added 2 commits June 22, 2017 17:45

Start describing proposed implementation

a0df6d4

* Quickstart for frontend * What is a log record? * Some work on inspiration section

Minor fixes

a4f2f97

tkelman reviewed Jun 23, 2017


c42f added 2 commits June 23, 2017 09:01

Fix typo

39a5769

Refactor section describing the tension in the design problem

e27d6d3

pfitzseb mentioned this pull request Jul 25, 2017

Progress monitoring fallback to the console JunoLab/Juno.jl#29

Closed

HarrisonGrodin mentioned this pull request Aug 22, 2017

Make --quiet and --banner independent (fix #23380) JuliaLang/julia#23399

Merged

c42f mentioned this pull request Sep 14, 2017

[WIP] More flexible and convenient logging JuliaLang/julia#23712

Closed

More refactoring of the background section.

070ab4f

StefanKarpinski merged commit 3d876d6 into JuliaLang:master Sep 17, 2017

c42f mentioned this pull request Dec 15, 2017

Base logging todo JuliaLang/julia#25109

Closed

15 tasks
