A Brief Introduction to the Maintenance and Operation Management System
Author/Zhang Naizhong [Issue Date: 2016/5/5]
IT maintenance and operation means maintaining and operating the IT environment so that equipment remains effective and the quality of externally facing services is preserved. However, large numbers of IT devices and software assets require complex maintenance, and the difficulty of managing that maintenance leads to low processing efficiency. To solve these problems of traditional management, a maintenance and operation management system assists the responsible personnel with their complex work. The system's window functions present management statistics as information, which can improve the efficiency of IT maintenance and operation.
Features of the Maintenance and Operation System
IT maintenance and operation tasks are numerous and complex. Without systematic integration, they usually lead to overloaded system engineers, disorganized management, and difficult work handovers.
The maintenance and operation management system integrates these work items to improve maintenance and operation procedures and the overall efficiency of the machine room. It also provides a monitoring service platform that combines the integrated data with information from device monitoring, so that staff obtain correct information and avoid difficulties in maintenance and operation. The system also handles backup, periodic inspection, and report output.
Forms and Processes
To simplify maintenance and operation, we integrate a large number of existing work items and treat them as a single request. Through forms and a process engine, this makes IT maintenance and operation an electronic process. In addition to windows for standard functions, we support user-created and user-defined windows that users can tailor to the needs of their own units. Using these windows, users are able to standardize various maintenance and operation tasks.
The system detects a variety of incidents, covering compliance inspection, system monitoring, service monitoring, backup schedules, abnormal logs, and so on. When it receives an abnormal incident, the system automatically opens a ticket for it according to the established rules, and the process then assigns front-line processing personnel to handle it by following the process flow.
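The rule-based handling described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the article does not describe the system's real rule format, so the `RULES` table, the `Incident` and `Ticket` classes, and the round-robin assignment are all hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Incident:
    source: str   # which detector raised it, e.g. "backup"
    message: str


@dataclass
class Ticket:
    incident: Incident
    priority: str
    assignee: str


# Hypothetical rules: incident source -> ticket priority.
RULES = {"service_monitor": "high", "backup": "medium", "compliance": "low"}


def make_dispatcher(staff):
    """Return a function that opens a ticket and assigns it round-robin."""
    rotation = cycle(staff)

    def dispatch(incident):
        # Unknown sources fall back to the lowest priority.
        priority = RULES.get(incident.source, "low")
        return Ticket(incident, priority, next(rotation))

    return dispatch


dispatch = make_dispatcher(["alice", "bob"])
t1 = dispatch(Incident("backup", "nightly job failed"))
t2 = dispatch(Incident("service_monitor", "HTTP 503 on portal"))
print(t1.priority, t1.assignee)  # medium alice
print(t2.priority, t2.assignee)  # high bob
```

A real deployment would presumably assign by skill or workload rather than simple rotation; round-robin is used here only to keep the sketch short.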
On receiving an incident window, front-line processing personnel can rapidly understand the situation by following the problem management and control procedure defined in the process flow. This improves processing quality and speed, and the device and asset database together with the change records lets them quickly grasp the cause of the incident. In addition, managing equipment changes reduces the occurrence of abnormal incidents. Through the maintenance and operation report, the management director can quickly understand service quality and staff efficiency and then improve services. The system records the handling process and the solutions, which helps shorten processing time in the future. Response time to a problem can be used to measure the level and quality of service. The stability of machine room maintenance and operation is supported by a multi-in-one monitoring mode (local host / third party), backup services, safety inspection, and log collection.
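As a small worked example of using response time as a service measure, the sketch below computes the average response time and the share of tickets answered within a target. The sample records and the 30-minute target are invented for illustration and do not come from the system described here.

```python
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"
TARGET_MINUTES = 30  # assumed service target, for illustration only

# (opened, first response) pairs; sample data, not real measurements.
records = [
    ("2016-05-01 09:00", "2016-05-01 09:10"),
    ("2016-05-01 13:30", "2016-05-01 14:15"),
    ("2016-05-02 08:05", "2016-05-02 08:10"),
]


def response_minutes(opened, responded):
    """Minutes between a ticket being opened and the first response."""
    delta = datetime.strptime(responded, FMT) - datetime.strptime(opened, FMT)
    return delta.total_seconds() / 60


times = [response_minutes(o, r) for o, r in records]
avg = mean(times)
within = sum(t <= TARGET_MINUTES for t in times) / len(times)

print(f"average response: {avg:.1f} min")  # average response: 20.0 min
print(f"within target: {within:.0%}")      # within target: 67%
```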
Process Engine and Machine Agent
Different functions have different processes. The user can define and establish a process in a few simple steps. A machine agent makes process automation possible: the agent automatically issues default commands and drives the process by operating the window through the window's engine. All operations are recorded by the window's record function, so that when an unexpected situation or unusual result occurs, front-line staff can intervene at any time, fix the problem, and then switch back to agent operation.
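The interplay of agent automation, recording, and manual intervention might look like the following sketch. It is an assumption-laden illustration, not the system's actual engine: the step names and the `run_process`, `agent`, and `operator` functions are hypothetical.

```python
def run_process(steps, agent, operator):
    """Run each step with the agent, recording every operation.

    If the agent fails on a step, the operator handles that step,
    and control returns to the agent for the steps that follow.
    """
    log = []
    for name in steps:
        try:
            result = agent(name)
            log.append((name, "agent", result))
        except Exception as exc:
            log.append((name, "agent-failed", str(exc)))
            result = operator(name)
            log.append((name, "operator", result))
    return log


def agent(step):
    # Hypothetical agent: one step is made to fail for the demonstration.
    if step == "restart_service":
        raise RuntimeError("service did not come back")
    return "ok"


def operator(step):
    # Stand-in for a front-line engineer fixing the step by hand.
    return "fixed manually"


log = run_process(["check_disk", "restart_service", "verify"], agent, operator)
for entry in log:
    print(entry)
```

Because every operation lands in the same log, whether performed by the agent or by a person, the record shows exactly where the handover happened.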
The system architecture supports load balancing (LB) and high availability (HA) mechanisms. The back-end database uses its own HA mechanism, and components exchange commands with one another through the agent. The agent is able to control syslog-ng and write log messages to the database. Agentless monitoring likewise uses the agent to execute commands in order to carry out the monitoring function. In addition, the window itself provides a REST API for calls from external systems and command line interface (CLI) tools for opening windows and modifying their actions; the system interfaces and integrates with other external systems through these two methods.
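To make the log-to-database step concrete, here is a minimal sketch that parses syslog-style lines and stores them in a table. It is not the system's implementation: the line format is a simplified traditional syslog layout, the table schema is invented, and an in-memory SQLite database stands in for the real back end.

```python
import re
import sqlite3

# Simplified traditional syslog layout: "Mon DD HH:MM:SS host program[pid]: message"
LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<prog>[^:\[\s]+)(?:\[\d+\])?: (?P<msg>.*)$"
)

# Sample lines, invented for illustration.
lines = [
    "May  5 10:15:02 db01 sshd[811]: Failed password for root",
    "May  5 10:16:40 web01 backup: job finished with errors",
    "not a syslog line",
]

conn = sqlite3.connect(":memory:")  # stand-in for the back-end database
conn.execute("CREATE TABLE logs (ts TEXT, host TEXT, program TEXT, message TEXT)")

rows = []
for line in lines:
    m = LINE.match(line)
    if m:  # lines that do not match the layout are skipped
        rows.append((m["ts"], m["host"], m["prog"], m["msg"]))

conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
hosts = [r[0] for r in conn.execute("SELECT host FROM logs ORDER BY host")]
print(count, hosts)  # 2 ['db01', 'web01']
```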
Maintenance and operation management tasks are many, varied, and complex. Using customized windows and processes, process automation mechanisms, and integrated monitoring, this system integrates these tasks and simplifies many cumbersome steps, thereby improving the reliability and speed of services. Additionally, the system provides real-time, correct information to front-line personnel and management executives. This gives them a clear understanding of situations and problems during maintenance and operation and, in the end, improves services.