An application log is text output that describes what is happening inside an application. A thoughtfully designed logging strategy, applied consistently across all applications in your system, helps ensure that developers can debug issues when they arise (and they will!).
There are two extremes: an expensive, verbose stream of mostly useless messages, or silence until everything breaks. The challenge is to find the middle ground between the two. Moreover, it can be very important to log security-relevant actions, such as account-related activities, data writes, and changes in state or ownership, while keeping personally identifiable information out of the logs.
Of course, it is not only about a thoughtful log message and its place in the source code; it is also about its severity level. So let's start with the logging levels. Note that every language/runtime has its specifics (e.g. Python's standard logging module has no Trace level), so this list aims to give a general idea.
Trace
- Duplicates an entire request/event payload
- Expected to cause severe performance degradation (too slow for production)
Debug
- May cause performance degradation (too slow for production)
- Not as excessively noisy as Trace
Info
- Somewhat verbose during startup (e.g. config dump)
- Does not log frequently expected events
Warn
- Indicates a degraded experience for customers while the system continues to function
- Issues may be fixed with a config change
- Should be collected for later triage
Error
- Indicates a blocking issue for customers
- Indicates issues that may require code changes (a bug)
- Should be surfaced in monitoring and addressed soon
Fatal / Critical
- The system should be considered offline (if not self-healing)
- Requires immediate attention
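To make the list a bit more concrete, here is a minimal sketch of how these generic levels could map onto Python's standard `logging` module. The `TRACE` level, its numeric value 5, the `LEVELS` table, and the logger name `levels_demo` are assumptions made for illustration; they are not part of the standard library.

```python
import logging

# Assumption: Python's logging module has no built-in TRACE level, so we
# register a hypothetical one below DEBUG (10). The value 5 is arbitrary.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

# Generic level names from the list above mapped to stdlib numeric levels.
LEVELS = {
    "trace": TRACE,
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warn": logging.WARNING,
    "error": logging.ERROR,
    "fatal": logging.CRITICAL,
}

logger = logging.getLogger("levels_demo")  # hypothetical logger name
logger.setLevel(LEVELS["info"])


def trace(msg, *args):
    """Log at the custom TRACE level, skipping all work when it is disabled."""
    if logger.isEnabledFor(TRACE):
        logger.log(TRACE, msg, *args)
```

With the logger set to Info, calls to `trace(...)` are dropped early, which is roughly what you want for payload dumps that are too slow for production.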
As for the default log level per stage, a good starting point is Debug on DEV and Info on PROD.
More context is better than less context.
Again, to save yourself time (and face) in the future, do not put any sensitive information, such as personally identifiable information or authentication credentials, in log messages. Pay attention to the libraries you use; sometimes you need to explicitly exclude auth headers from being logged.
Also, giving a bit more detail and diagnostic context, and not making a log message depend on the previous one, is what makes logs useful. For example, compare:
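One possible way to wire up per-stage defaults in Python is sketched below. The environment variable names `APP_STAGE` and `LOG_LEVEL`, the stage names, and both helper functions are hypothetical; adapt them to however your deployment exposes its stage.

```python
import logging
import os

# Assumption: the stage comes from a hypothetical APP_STAGE variable.
DEFAULT_LEVEL_BY_STAGE = {"dev": logging.DEBUG, "prod": logging.INFO}

# Names accepted by the optional, equally hypothetical LOG_LEVEL override.
LEVEL_BY_NAME = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env=None):
    """Pick the level: stage default first, then an explicit override if valid."""
    env = os.environ if env is None else env
    stage = env.get("APP_STAGE", "prod").lower()
    # Unknown stages fall back to the quieter production default.
    level = DEFAULT_LEVEL_BY_STAGE.get(stage, logging.INFO)
    override = env.get("LOG_LEVEL", "").upper()
    # Unrecognized override values are ignored rather than raising.
    return LEVEL_BY_NAME.get(override, level)


def configure_logging(env=None):
    """Apply the resolved level to the root logger and return it."""
    level = resolve_log_level(env)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    return level
```

Calling `configure_logging()` once at startup is enough. The override lets you temporarily raise verbosity on PROD without a code change.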
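As a safety net, you could also scrub messages right before they are written. The sketch below uses a `logging.Filter` from Python's standard library. The regular expression, the `[REDACTED]` placeholder, and the names `redact` and `RedactingFilter` are assumptions for illustration, and the pattern is far from exhaustive, so treat it as a last line of defense rather than a replacement for not logging secrets in the first place.

```python
import logging
import re

# Assumption: secrets show up as "key: value" or "key=value" pairs.
# This pattern is illustrative and will not catch every format.
_SENSITIVE = re.compile(
    r"(?i)\b(authorization|password|token)(\s*[:=]\s*)"
    r"(?:(?:bearer|basic)\s+)?[^\s,;]+"
)


def redact(text):
    """Replace the value part of known sensitive key/value pairs."""
    return _SENSITIVE.sub(r"\1\2[REDACTED]", text)


class RedactingFilter(logging.Filter):
    """Hypothetical filter that masks credentials in the formatted message."""

    def filter(self, record):
        # Format first so secrets passed as %-style arguments are covered too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it
```

Attaching it with `handler.addFilter(RedactingFilter())` applies the masking to everything that handler emits. Note that the filter rewrites the record in place, so handlers that run afterwards see the redacted text as well.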
User account action succeeded
User 123456 account has been deleted
Sending message failed
Sending update message id 123456 failed. Service is unavailable. Will retry in 5 seconds.
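In code, the more useful variant of the second pair might look like the sketch below. The function `log_send_failure`, the logger name `messaging`, and the `message_id` extra field are hypothetical. Warn is chosen here because the system keeps functioning and will retry.

```python
import logging

logger = logging.getLogger("messaging")  # hypothetical logger name


def log_send_failure(message_id, reason, retry_in_seconds):
    """Emit one self-contained line: what failed, why, and what happens next."""
    # %-style arguments are only formatted if the level is enabled.
    logger.warning(
        "Sending update message id %s failed. %s. Will retry in %d seconds.",
        message_id,
        reason,
        retry_in_seconds,
        # Diagnostic context as a structured field, usable by formatters
        # or log shippers that understand record attributes.
        extra={"message_id": message_id},
    )
```

A call such as `log_send_failure(123456, "Service is unavailable", 5)` yields the message from the comparison above, and the line can be understood without reading the one before it.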
- choose a reasonable severity level for logs that your applications write
- keep the severity levels and their verbosity consistent across your system
- don't log personally identifiable information or credentials
- give as much detail and context as possible
- ... and keep costs in mind (retention, indexing, etc.)