Thursday, 30 October 2014

Microsoft Azure Operational Insights – Hyper-V Errors (EventID 1000 & 1026)

I’ve been trialling the next evolution of the System Center Advisor service, recently renamed to Microsoft Operational Insights.

This is an amazing evolution of the old Advisor system into an analytics tool that harnesses the power of Azure to churn through events and logs and analyse the data, without the need for large on-premises compute resources or huge data warehouse stores.

It's also extensible through Intelligence Packs, which bring new features and analysis for what matters most to your organisation, and it is now customisable using refined searches and the "My Dashboard" feature.



It is currently free to use while it’s in preview, and I thoroughly recommend you take a look.
You can sign up and find more information here: https://preview.opinsights.azure.com

You can use the service either standalone, by deploying the Microsoft Monitoring Agent to your servers, or by integrating it with a System Center 2012 R2 Operations Manager infrastructure.
I'm using the integrated deployment, and it was because of this that I noticed something strange.

After enabling the recently added Intelligence Pack for Change Tracking, my Hyper-V hosts in SCOM started to flip in and out of a monitored state.
When I started to dig into it, I saw that the Application event log was repeatedly logging errors (Event IDs 1000 and 1026) reporting that the MonitoringHost.exe application was terminating unexpectedly.
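If you want to check a host for the same symptoms, the sketch below builds a wevtutil query for those two event IDs in the Application log. It is only an illustration, not what I ran at the time: the function names and the default count of 20 are my own assumptions, and the query is not filtered to MonitoringHost.exe, so you still need to read the output to confirm which process crashed.

```python
import subprocess

# Event IDs from the Application log errors described above.
CRASH_EVENT_IDS = (1000, 1026)


def build_xpath(event_ids):
    """Build an XPath filter that matches any of the given event IDs."""
    clause = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    return f"*[System[({clause})]]"


def build_wevtutil_command(event_ids=CRASH_EVENT_IDS, count=20):
    """Return the wevtutil argument list for the most recent matching events."""
    return [
        "wevtutil", "qe", "Application",   # query the Application log
        f"/q:{build_xpath(event_ids)}",    # XPath filter on event ID
        "/f:text",                         # human-readable output
        f"/c:{count}",                     # maximum number of events
        "/rd:true",                        # newest events first
    ]


def recent_crash_events(event_ids=CRASH_EVENT_IDS, count=20):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_wevtutil_command(event_ids, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On an affected host, calling recent_crash_events() and searching the returned text for MonitoringHost.exe should show whether the agent is the process that is terminating.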


Since MonitoringHost.exe is one of the key parts of the Microsoft Monitoring Agent used by SCOM, its crashes explained why the hosts were dropping in and out of a monitored state inside SCOM.
But what was causing it to crash?
Thankfully the Operational Insights product team came to my rescue and tracked it down with amazing speed.
I’m (as per best practice) using Server Core for my Hyper-V hosts. The Change Tracking feature of OpInsights makes a DLL call into part of the Application Experience Program Inventory component, and it turns out this isn’t installed by default on a Server Core deployment.
Microsoft are aware and will no doubt be looking for a solution, but in the meantime there’s an easy workaround:
Find another server running the same OS that is not Server Core (i.e. has a GUI), locate the aeinv.dll file in its %SystemRoot%\System32 directory, and copy it into the same folder on the affected server.
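If you have several hosts to fix, the copy can be scripted. The following is a minimal sketch only, assuming you can reach both System32 folders from wherever the script runs; the function name and its parameters are hypothetical, and it deliberately leaves an existing aeinv.dll untouched.

```python
import shutil
from pathlib import Path

DLL_NAME = "aeinv.dll"


def copy_dll_if_missing(source_dir, target_dir, dll_name=DLL_NAME):
    """Copy dll_name from source_dir to target_dir unless it already exists.

    Returns True if the file was copied, False if the target already had it.
    """
    source = Path(source_dir) / dll_name
    target = Path(target_dir) / dll_name
    if target.exists():
        # Never overwrite a file that is already present on the host.
        return False
    if not source.is_file():
        raise FileNotFoundError(f"{dll_name} not found in {source_dir}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return True
```

Here source_dir would be the System32 folder of the GUI server (for example via an administrative share) and target_dir the System32 folder of the Server Core host. Both paths are yours to supply, and the script needs administrative rights on the target.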


While this is a quick workaround for now, the service is still in preview, so this will possibly be fixed at some point, and I’ll update this post when things change.

Also, while I experienced this with Hyper-V on Server 2012 R2, chances are it will affect other versions and roles running on Server Core.
I also recommend checking out the superb series written by fellow MVP Stanislav on the various Intelligence Packs for Microsoft Azure Operational Insights, here:
https://cloudadministrator.wordpress.com/2014/10/23/microsoft-azure-operational-insights-preview-series-sql-assessment-part-7/
