Failure of tasks 'H2E Operation on host #xxx' causes Control Panel outage
Modified on: Fri, 17 Nov 2023 12:08 PM
2020-01-22
Symptoms
During brand synchronization, several 'H2E Operation on host #xxx' tasks fail with the following output:
Task name H2E Operation on host #xxx
Internal error: /sbin/service /sbin/service httpd graceful failed with code 1 saying: STDOUT: '' STDERR 'Job for httpd.service invalid.
At the same time, the httpd service becomes unavailable on the Branding UI hosts, causing an outage of the control panel. All other OA services continue to work correctly.
The following errors are logged in /var/log/messages:
abrt-hook-ccpp: Process 19637 (httpd) of user 0 killed by SIGSEGV - dumping core
systemd: httpd.service: main process exited, code=dumped, status=11/SEGV
abrt-server: Generating core_backtrace
httpd: httpd not running, trying to start
kill: kill: cannot find process ""
systemd: httpd.service: control process exited, code=exited status=1
The Apache error_log contains:
AH00060: seg fault or similar nasty error detected in the parent process
The version of the httpd package is httpd-2.4.6-67.el7.centos.6.
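The symptoms above can be checked from a shell on a Branding UI host. The following is a minimal sketch, not an official diagnostic: the function name count_httpd_segv is hypothetical, and it assumes the crash records have the same format as the abrt-hook-ccpp line in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: count httpd SIGSEGV crash records in a syslog file.
# Assumes the abrt-hook-ccpp message format shown in the log excerpt.
count_httpd_segv() {
    log_file="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c '(httpd) of user [0-9]* killed by SIGSEGV' "$log_file" || true
}
```

A non-zero result from count_httpd_segv /var/log/messages, together with rpm -q httpd reporting the version above, suggests that the host is affected.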
Cause
Brand synchronization performs a series of rapid Apache service reloads on all Branding UI nodes as part of the 'H2E Operation on host #xxx' task. Because of a software issue on the Red Hat side, Apache crashes after several reloads, which results in the control panel outage.
Red Hat has marked this issue as 'Fixed', but it can still be reproduced with the httpd package supplied by the repository.
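To confirm that a given host is affected, the rapid reloads can be imitated on a test node. This is a minimal sketch, assuming that back-to-back systemctl reload httpd calls approximate what the task does. The function name rapid_reload and the RELOAD_CMD variable are hypothetical, and the number of reloads needed to trigger the crash is not stated in this article. Do not run it on a production host.

```shell
#!/bin/sh
# Hypothetical reproduction helper: issue N reloads back to back and
# report the first one that fails. RELOAD_CMD is an illustrative override.
RELOAD_CMD="${RELOAD_CMD:-systemctl reload httpd}"

rapid_reload() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        # Unquoted on purpose so the command string is split into words
        if ! $RELOAD_CMD; then
            echo "reload $i failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count reloads succeeded"
}
```

For example, rapid_reload 20 stops at the first reload that returns a non-zero exit code, which is the point at which /var/log/messages can be checked for the SIGSEGV record.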
Resolution
Apply the following workaround on all affected Branding UI hosts to prevent httpd from crashing:
- Modify the file /usr/lib/systemd/system/httpd.service by changing the line:
  ExecReload=/usr/sbin/httpd $OPTIONS -k graceful
  to
  ExecReload=/bin/sleep 0.5 ; /usr/sbin/httpd $OPTIONS -k graceful
- Propagate the changes to systemd:
  # systemctl daemon-reload
- Start httpd (if it is stopped):
  # service httpd start
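When several hosts need the change, the file edit from the first step can be scripted. This is a minimal sketch, assuming GNU sed (as shipped with CentOS 7) and an unmodified ExecReload line. The function name patch_exec_reload and the .bak backup suffix are hypothetical choices, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: prepend "/bin/sleep 0.5 ; " to the ExecReload line
# of the unit file given as the first argument.
patch_exec_reload() {
    unit_file="$1"
    # Nothing to do if the workaround is already in place
    if grep -q '^ExecReload=/bin/sleep 0\.5 ; ' "$unit_file"; then
        return 0
    fi
    # Keep a backup copy next to the original
    cp -p "$unit_file" "$unit_file.bak" || return 1
    # GNU sed in-place edit; "\$" matches the literal "$" of $OPTIONS
    sed -i 's|^ExecReload=/usr/sbin/httpd \$OPTIONS -k graceful$|ExecReload=/bin/sleep 0.5 ; /usr/sbin/httpd $OPTIONS -k graceful|' "$unit_file"
}
```

After running patch_exec_reload /usr/lib/systemd/system/httpd.service as root, continue with systemctl daemon-reload and service httpd start as described in the steps. Files under /usr/lib/systemd/system belong to the package, so the change may need to be reapplied after an httpd update.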