Extreme Thinking
Python: convert a Linux syslog to CSV and make an HTML view


Task ….




Here's your challenge: Write a script that generates two reports, one ranking the errors generated by the system and one giving the usage statistics for each user of the service. You'll write the script on your own, but we'll guide you throughout.

First, import all the Python modules that you'll use in this Python script. After importing the necessary modules, initialize two dictionaries: one for the number of different error messages and another to count the number of entries for each user (splitting between INFO and ERROR).

Now, parse through each log entry in the syslog.log file by iterating over the file.

For each log entry, first check whether it matches the INFO or ERROR message format, using regular expressions. On a successful match, add one to the corresponding value in the per_user dictionary. If the entry is an ERROR message, also add one to the count for that message in the error dictionary.
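
The matching and counting step can be sketched as follows. This is a minimal sketch, not the reference solution: the sample log lines and the usernames alice and bob are hypothetical, and the single combined pattern LINE_RE and the helper count_entries are names introduced here for illustration only. The real syslog.log may be formatted differently.

```python
import re

# Hypothetical sample lines; the real syslog.log may be formatted differently.
SAMPLE_LOG = [
    "Jan 31 00:09:39 ubuntu.local ticky: INFO Created ticket [#4217] (alice)",
    "Jan 31 00:16:25 ubuntu.local ticky: ERROR Timeout while retrieving information (bob)",
    "Jan 31 00:21:30 ubuntu.local ticky: ERROR Timeout while retrieving information (alice)",
]

# Assumed layout: "ticky: <INFO|ERROR> <message> (<username>)" at the end of the line.
LINE_RE = re.compile(r"ticky: (INFO|ERROR) (.*) \((\S+)\)$")


def count_entries(lines):
    """Return (error, per_user) counters built from an iterable of log lines."""
    error = {}     # error message -> number of occurrences
    per_user = {}  # username -> {"INFO": n, "ERROR": m}
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # skip lines that are neither INFO nor ERROR entries
        severity, message, user = match.groups()
        counts = per_user.setdefault(user, {"INFO": 0, "ERROR": 0})
        counts[severity] += 1
        if severity == "ERROR":
            error[message] = error.get(message, 0) + 1
    return error, per_user


error, per_user = count_entries(SAMPLE_LOG)
print(error)
print(per_user)
```

Using one pattern with a capture group for the severity means the file is read only once, whereas the script further down scans it once for ERROR and once for INFO.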

After you've processed the log entries from the syslog.log file, you need to sort both the per_user and error dictionaries before creating the CSV report files.

Keep in mind that:

The error dictionary should be sorted by the number of errors from most common to least common.
The user dictionary should be sorted by username.
Insert the column names ("Error", "Count") at index zero of the sorted error list, and the column names ("Username", "INFO", "ERROR") at index zero of the sorted per_user list. Sorting a dictionary's items produces a list of tuples, so the header can be inserted like any other row.
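
The sorting rules above can be sketched like this. The counts are hypothetical stand-ins for the dictionaries built from the log, and the per-user shape (a nested dictionary with INFO and ERROR keys) is an assumption made for this example.

```python
import operator

# Hypothetical counts standing in for the dictionaries built from syslog.log.
error = {"Connection to DB failed": 2, "Timeout while retrieving information": 5}
per_user = {"bob": {"INFO": 0, "ERROR": 3}, "alice": {"INFO": 4, "ERROR": 1}}

# Most common error first: sort the (message, count) pairs by count, descending.
sorted_errors = sorted(error.items(), key=operator.itemgetter(1), reverse=True)

# Users in alphabetical order, flattened to (username, info_count, error_count).
sorted_users = [
    (user, counts["INFO"], counts["ERROR"])
    for user, counts in sorted(per_user.items())
]

# sorted() returns lists, so the header rows can be inserted at index 0.
sorted_errors.insert(0, ("Error", "Count"))
sorted_users.insert(0, ("Username", "INFO", "ERROR"))

print(sorted_errors)
print(sorted_users)
```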

After sorting these dictionaries, store them in two different files: error_message.csv and user_statistics.csv.
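
One way to store the sorted rows is Python's csv module, which handles quoting if a message ever contains a comma. This is a sketch under assumptions: the rows are hypothetical, write_report is a helper name introduced here, and the files go into a temporary directory so that running the example does not overwrite real reports.

```python
import csv
import os
import tempfile


def write_report(path, rows):
    """Write an iterable of row tuples to path as CSV."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# Hypothetical rows, header first, as they would look after sorting.
error_rows = [("Error", "Count"), ("Timeout while retrieving information", 5)]
user_rows = [("Username", "INFO", "ERROR"), ("alice", 4, 1)]

# A temporary directory is used here only to keep the example side-effect free.
out_dir = tempfile.mkdtemp()
error_path = os.path.join(out_dir, "error_message.csv")
user_path = os.path.join(out_dir, "user_statistics.csv")

write_report(error_path, error_rows)
write_report(user_path, user_rows)
```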


#!/usr/bin/env python3
import sys
import os
import re
import operator
def error_search(log_file, er):
  """Return every line of log_file that matches the pattern er."""
  returned_errors = []
  error_patterns = [er]
  with open(log_file, mode='r', encoding='UTF-8') as file:
    for log in file.readlines():
      if all(re.search(error_pattern, log) for error_pattern in error_patterns):
        returned_errors.append(log)
  return returned_errors
if __name__ == "__main__":
  ErrorCount = {}
  data_error = {}
  data_info = {}
  data = {}
  log_file = 'syslog.log'

  returned_errors = error_search(log_file, 'ERROR')
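  for line in returned_errors:
    # Count each distinct error message. The character class also allows
    # apostrophes, in case a message contains one.
    a = re.findall(r"ticky: ERROR ([\w' ]*) ", line)
    if a[0] in ErrorCount:
      ErrorCount[a[0]] += 1
    else:
      ErrorCount[a[0]] = 1
    # Count ERROR entries per user; the username is in the trailing parentheses.
    a = re.findall(r"\((\D+)\)$", line)
    if a[0] in data_error:
      data_error[a[0]] += 1
    else:
      data_error[a[0]] = 1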

  returned_errors = error_search(log_file, 'INFO')
  for line in returned_errors:
    # Count INFO entries per user.
    a = re.findall(r"\((\D+)\)$", line)
    if a[0] in data_info:
      data_info[a[0]] += 1
    else:
      data_info[a[0]] = 1

  new_ErrorCount = sorted(ErrorCount.items(), key = operator.itemgetter(1), reverse=True)

  # Merge both counters into data[username] = [INFO count, ERROR count].
  for key_info in data_info:
    data[key_info] = [data_info[key_info],0]
  for key_error in data_error:
    if key_error in data:
      data[key_error][1] += data_error[key_error]
    else:
      data[key_error] = [0,data_error[key_error]]
  new_data = sorted(data.items())

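  with open('error_message.csv', 'w') as f:
    f.write('Error,Count\n')
    for message, count in new_ErrorCount:
      f.write('{},{}\n'.format(message, count))
  with open('user_statistics.csv', 'w') as f:
    f.write('Username,INFO,ERROR\n')
    for username, counts in new_data:
      f.write('{},{},{}\n'.format(username, counts[0], counts[1]))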