Logging and debugging

System information

To monitor system health and analyse different system issues, Aos Service Manager provides the following system information:

  • System logs
  • A list of folders and their size in case of disk usage alerts
  • A list of processes and their memory usage in case of memory usage alerts
  • Aos Service Manager stack trace in case of segmentation faults

System logs

Aos Service Manager uses the systemd journal as the system log; the journal also includes dmesg and syslog messages. Aos Service Manager monitors the journal and collects events with priorities "emerg" (0), "alert" (1), "crit" (2), and "err" (3). These events are usually rare, but the number of events sent to AosCloud should still be limited, for example to a maximum count per defined time interval. [TBD]
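The filtering and limiting described above can be sketched as follows. This is a minimal illustration, not the Aos Service Manager implementation: it assumes journal entries arrive as JSON lines with a "PRIORITY" field (the shape produced by journalctl's JSON output), and the names is_reportable, WindowLimiter and select_events, as well as the sliding-window policy itself, are hypothetical because the actual limit is still TBD.

```python
import json
import time
from collections import deque

MAX_PRIORITY = 3  # syslog levels: emerg=0, alert=1, crit=2, err=3


def is_reportable(entry: dict) -> bool:
    """Return True if a journal entry has a priority in the range 0..3."""
    try:
        return 0 <= int(entry.get("PRIORITY", "")) <= MAX_PRIORITY
    except (TypeError, ValueError):
        return False  # entries without a numeric priority are skipped


class WindowLimiter:
    """Allow at most max_events events per sliding window (assumed policy)."""

    def __init__(self, max_events, window_sec, clock=time.monotonic):
        self.max_events = max_events
        self.window_sec = window_sec
        self.clock = clock
        self._sent = deque()  # timestamps of events already allowed

    def allow(self) -> bool:
        now = self.clock()
        # Forget timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_sec:
            self._sent.popleft()
        if len(self._sent) >= self.max_events:
            return False
        self._sent.append(now)
        return True


def select_events(json_lines, limiter):
    """Yield parsed entries that pass the priority filter and the limiter."""
    for line in json_lines:
        entry = json.loads(line)
        if is_reportable(entry) and limiter.allow():
            yield entry
```

Such lines could be fed from a command like `journalctl -p err -o json -f`, which already restricts output to "err" and more severe levels; the filter is kept in the sketch so that it also works on an unfiltered stream.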

List of folders with size

In case of system disk usage alerts, Aos Service Manager collects and stores a list of folders with their sizes in descending order. The list length should be limited by a configured value. The list contains folders from the partition on which the working directory is located. Aos Service Manager sends the list to AosCloud either on request or immediately after the list is ready. [TBD]
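A folder list of this kind could be built roughly as shown below. This is a sketch under several assumptions that the text does not settle: it reports only the direct subfolders of a given root, it measures apparent file size (st_size) rather than allocated blocks, and the function names folder_sizes and tree_size are hypothetical. Staying on one partition is approximated by comparing st_dev values.

```python
import os


def _same_device(path: str, dev: int) -> bool:
    """True if path exists and lives on the device identified by dev."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False  # vanished or unreadable: treat as not ours


def tree_size(path: str, dev: int) -> int:
    """Sum the apparent size of all files under path on device dev."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        # Prune mount points of other partitions so os.walk skips them.
        dirnames[:] = [
            d for d in dirnames if _same_device(os.path.join(dirpath, d), dev)
        ]
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file removed while scanning
    return total


def folder_sizes(root: str, limit: int):
    """Return up to limit (path, bytes) pairs for the direct subfolders
    of root, largest first, ignoring folders on other partitions."""
    root_dev = os.stat(root).st_dev
    with os.scandir(root) as entries:
        children = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
    sizes = [
        (child, tree_size(child, root_dev))
        for child in children
        if _same_device(child, root_dev)
    ]
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:limit]
```

Calling `folder_sizes(working_dir, 10)` would then give the ten largest subfolders of a hypothetical `working_dir`; the limit plays the role of the configured list length.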

List of processes with memory usage

In case of system memory usage alerts, Aos Service Manager collects and stores a list of processes with their memory usage in descending order. The list length should be limited by a configured value. Aos Service Manager sends the list to AosCloud either on request or immediately after the list is ready. [TBD]
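One way to build such a process list on Linux is to read the VmRSS field from each /proc/<pid>/status file. The sketch below assumes that resident set size is the metric of interest, which the text does not specify; the function names and the returned dictionary layout are hypothetical.

```python
import os


def parse_field(status_text: str, key: str):
    """Return the value of a 'Key:<tab>value' line from a status file."""
    for line in status_text.splitlines():
        if line.startswith(key + ":"):
            return line.split(":", 1)[1].strip()
    return None


def read_processes(proc_root: str = "/proc"):
    """Collect pid, name and resident memory (kB) for every process."""
    procs = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            path = os.path.join(proc_root, entry, "status")
            with open(path, errors="replace") as f:
                text = f.read()
        except OSError:
            continue  # process exited while we were scanning
        rss = parse_field(text, "VmRSS")
        if rss is None:
            continue  # kernel threads have no VmRSS line
        procs.append({
            "pid": int(entry),
            "name": parse_field(text, "Name"),
            "rss_kb": int(rss.split()[0]),  # value looks like "1234 kB"
        })
    return procs


def top_processes(limit: int, proc_root: str = "/proc"):
    """Return the limit largest processes by resident memory, descending."""
    procs = read_processes(proc_root)
    procs.sort(key=lambda p: p["rss_kb"], reverse=True)
    return procs[:limit]
```

The `proc_root` parameter exists only so the scan can be pointed at a test directory; in normal use it stays at its default.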

Aos Service Manager stack trace

On a segmentation fault, Aos Service Manager stores the stack trace in a file. After restarting, if the file exists, Aos Service Manager sends the stack trace to AosCloud and deletes the file (mechanism TBD).
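Since the mechanism is still TBD, the following is only an illustration of the store-then-send-on-restart pattern. It uses Python's standard faulthandler module as a stand-in for whatever the service manager will actually use; the trace path, the `send` callback and both function names are hypothetical.

```python
import faulthandler
import os


def send_pending_trace(path, send):
    """If a non-empty trace file is left over from a previous run, pass
    its content to send() and delete the file. Returns True if sent."""
    try:
        with open(path, "r", errors="replace") as f:
            trace = f.read()
    except FileNotFoundError:
        return False  # previous run ended without a crash dump
    sent = False
    if trace.strip():
        send(trace)  # if this raises, the file is kept for the next start
        sent = True
    os.remove(path)
    return sent


def install_crash_handler(path):
    """Make fatal signals such as SIGSEGV dump the stack trace to path.
    The returned file object must stay open for the process lifetime."""
    trace_file = open(path, "w")
    faulthandler.enable(file=trace_file, all_threads=True)
    return trace_file
```

At startup the order matters: call `send_pending_trace` first, then `install_crash_handler`, because opening the file for writing truncates it. An empty file is treated as "no crash", since the handler creates the file even when nothing goes wrong.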

Services

  • Send the service log from journalctl on request.
  • Collect some amount of the latest logs and send them on request.
  • Collect some amount of the latest logs after some event (a disk or memory alert).
  • Collect some amount of the latest logs when a service stops unexpectedly.
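Fetching the latest log entries of a service could look like the sketch below. It assumes that each service runs as a systemd unit whose name is known to the caller, which the text does not state; the unit name in the usage note and the function names are hypothetical. Only standard journalctl options are used (--unit, --lines, --since, --output, --no-pager).

```python
import subprocess


def journalctl_args(unit, lines=None, since=None):
    """Build a journalctl command line for one unit.

    lines limits output to the most recent N entries; since takes any
    timestamp accepted by journalctl, e.g. "2024-01-01 12:00:00".
    """
    args = ["journalctl", "--no-pager", "--output=short-iso", f"--unit={unit}"]
    if lines is not None:
        args.append(f"--lines={int(lines)}")
    if since is not None:
        args.append(f"--since={since}")
    return args


def fetch_service_log(unit, lines=None, since=None):
    """Run journalctl and return the selected log text.

    Raises subprocess.CalledProcessError if journalctl exits with an error.
    """
    result = subprocess.run(
        journalctl_args(unit, lines=lines, since=since),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `fetch_service_log("aos-demo.service", lines=200)` would return the last 200 entries of a hypothetical unit; the same call could be triggered by a request, by a disk or memory alert, or when the service stops unexpectedly.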