Dude where is my Availability Report data from the SCOM DWH ??

23 Mar

” Huston we got a problem!” when I run a availability report  the data isn’t complete. I’m missing a huge number of days. The graph shows UP (Monitoring unavailable) But I am really sure the server was up and monitored !!

image

 

Analyze:

Don’t panic, we are going to solve this. (I hope..) First we are going to look up the days we are missing. Simply click on the white bar. And the detail report will be rendered.

image

Okay looks like we are missing the most data from of 4-3-2012. And we see strange gaps of data that is present.

Okay that’s what the report says, but I am a core stuff guy I check it this way:

Open an SQL session and connect to the DWH db. Run this query. The last aggregated data will be on the first row. So you know what the last data date is you have. We change the DateTime to the same datetime we used in the report.

SELECT     ManagedEntity.FullName, vStateHourly.*
FROM         ManagedEntityMonitor INNER JOIN
                      ManagedEntity ON ManagedEntityMonitor.ManagedEntityRowId = ManagedEntity.ManagedEntityRowId INNER JOIN
                      State.vStateHourly ON ManagedEntityMonitor.ManagedEntityMonitorRowId = State.vStateHourly.ManagedEntityMonitorRowId
WHERE     (ManagedEntity.FullName LIKE ‘Microsoft.Windows.Computer:opsrms01%’)
AND vStateHourly.DateTime between  ‘20120301’ and ‘20120401’
order by vStateHourly.DateTime desc

The output will be:

image

So the last successful hourly aggregation was 02-03-2012 (dd-mm-yyyy). Hmmmm but when I look at the rendered report I see periods of data after this date ??? I must confess I really don’t have a idea now why Knipogende emoticon

Now we have to find the root cause and fix this missing aggregations. Luckily we can enable debug information for the aggregation process so we can see more what going wrong.

Open SQL and run the query below to enable debugging for the State aggregation .

UPDATE [OperationsManagerDW].[dbo].[StandardDataset]
   SET [DebugLevel] = 5

     WHERE [SchemaName] = ‘State’
GO

Now we can see the debug date with this query:

SELECT     TOP (100) DATEADD(hh, 1, DebugMessage.MessageDateTime) AS CET_datetime, StandardDataset.SchemaName , DebugMessage.MessageLevel, DebugMessage.MessageText
FROM         DebugMessage WITH (nolock) INNER JOIN
                      StandardDataset ON DebugMessage.DatasetId = StandardDataset.DatasetId
WHERE    (StandardDataset.SchemaName = N’State’)
order by messagedatetime desc

The output will be as below:

image

It looks like my aggregation process is way behind !

Since the maintenance for the DWH has a sequence run it means when some procedure before fails (lets say the event staging) the other won’t be hit. So I look in the debug table for other messages with ‘failed’ in the message.

Notice that we are now going a little of track , we main problem was the State report incomplete , but now we are looking at the Events. Just follow me.

SELECT     TOP (100) DATEADD(hh, 1, DebugMessage.MessageDateTime) AS CET_datetime, StandardDataset.SchemaName , DebugMessage.MessageLevel, DebugMessage.MessageText
FROM         DebugMessage WITH (nolock) INNER JOIN
                      StandardDataset ON DebugMessage.DatasetId = StandardDataset.DatasetId
where messagetext like ‘%Failed%’
order by messagedatetime desc

O no this is not good:

image

It looks like the event staging is broken. The error is:

Failed to process stagingarea for data set. Error 777971002, Procedure EventProcessStaging, Line 398, Message: Sql execution failed. Error 515, Level 16, State 2, Procedure DebugMessageInsert, Line 15, Message: Cannot insert the value NULL into column ‘MessageText’, table ‘OperationsManagerDW.dbo.DebugMessage’; column does not allow nulls. INSERT fails.

Mmmm When I look at the error I see the debug procedure that writes to the debug log has a problem writing a debug message. Strange this error… So we have to find the real error.  So I open the stored procedure “EventProcessStaging”. And there I found a BUG .. brrrr. The variable @InsertTableName is not set to a value before it is used as part of the debug message variable @MessageText. Because you can’t concat NULL to a string variable an exception is raised. I fixed this by moving the sql where this variable @InsertTableName is assigned to above the first use of the @InsertTableName variable.  (this is for SCOM 2007 and 2012!) I raised already a bug request @Microsoft throughout the TAP program. This only occurs when you set the debuglevel > 3.

For you it simple means don’t set it above 3  or fix this stp own your own risk. (as I have done ;-0 ) In our case the debug level was already above 3 for the state dataset for the last month. So because the event processing was braking the total maintenance (bad architecture , sorry) all my state staging  was stopped. And caused my empty reports.

So now we know it we can go back to the real issue. The missing states fix.

Fix:

This is now very simple. And if you Bing you can find plenty of info on it. One I found out in the beginning very helpful was this one from the Microsoft SCOM team ( http://blogs.technet.com/b/operationsmgr/archive/2011/09/06/standard-dataset-maintenance-troubleshooter-for-system-center-operations-manager-2007.aspx )

What I do is most of the time :

Set a enable = false override on the rule “Standard Data Warehouse Data Set maintenance rule” for all instances of “Standard Data Set”.

SNAGHTML15ff6a4a

Now I am really sure no maintenance process is running.

And I run my own maintenance process every 1 min. Because I know catching up the state data aggregation will take some time and I don’t want to create problem’s for the other datasets (performance , events ..) I will also run the important ones in the same script.

Open a query to the DWH and run:

USE [OperationsManagerDW]
DECLARE @DataSet uniqueidentifier

Print ‘starting loop of StandardDatasetMaintenance jobs’   
Print ‘Processing dataset:’   
Print @DataSet

while (1=1)
begin
    PRINT getdate()
    Print ‘Start StandardDatasetMaintenance’   
SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = ‘Perf’)
    EXEC StandardDatasetMaintenance @DataSet

SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = ‘Exchange2010’)
    EXEC StandardDatasetMaintenance @DataSet

SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = ‘State’)
    EXEC StandardDatasetMaintenance @DataSet

SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = ‘Event’)
   
    EXEC StandardDatasetMaintenance @DataSet
    –EXEC StandardDatasetProcessStaging @Dataset
    PRINT getdate()
    PRINT ‘END StandardDatasetMaintenance’
    WAITFOR DELAY ’00:01′
end

now you check the debug log on regularly base to see if the state aggregation is completed.

You can also use the query below:

———————————————————————–
— check first and last aggregation time from still to be processed  data
— first and last date must be equal
———————————————————————–
Declare @DataSet as uniqueidentifier
Set @DataSet = (Select DataSetId From StandardDataSet Where SchemaName = ‘State’)
Select AggregationTypeId, COUNT(*) as ‘Count’, MIN(AggregationDateTime) as ‘First’, MAX(AggregationDateTime) as ‘Last’ From StandardDataSetAggregationHistory
Where DataSetId = @DataSet AND LastAggregationDurationSeconds IS NULL
group by AggregationTypeId

So lets check if the process is running okay. Simply rerun the report. the output will be:

image

looks like its all going to be alright. Just be patient.

DO NOT FORGET:

after the states are complete to remove the overrides , other wise you will have for sure the same and more , problem again.

Not to be continued:

I really hope not. Because in my case we have a DWH size almost against 1TB and because of this size it can be very complex and tricky to solve this sort of problems. So if mr. Murphy is reading this , skip my place please …

Happy SCOMMING,

Michel Kamp

Advertisements

34 Responses to “Dude where is my Availability Report data from the SCOM DWH ??”

  1. Jenn July 31, 2012 at 20:30 #

    Hello Michel,

    Thank you very much for the article! I am have the same issue and trying to fix it. Would you be so kind to advise where to perform the following steps?

    “Set a enable = false override on the rule “Standard Data Warehouse Data Set maintenance rule” for all instances of “Standard Data Set”.”

    Thank you very much in advance!

    • Jenn July 31, 2012 at 21:43 #

      There is no “Type: Standard Data Set” in the Authoring/Management Pack Objects/Rules.

      Also, is it better to create a new management pack or use the default management pack?

      Thank you!

      • Michel Kamp August 1, 2012 at 13:36 #

        Hi,

        Best practice is to create a new one and don’t ever use the default one. I advise to create one override mp per product group. So sql.overrides, iis.overrides ect..

        Michel

        Verzonden met mijn Windows Phone

    • Michel Kamp August 1, 2012 at 13:34 #

      Hi,

      Yes you are correct. This will disable all maintenance for all datasets. After you have fixed the problem don’t forget to enable it again.

      Michel

      Verzonden met mijn Windows Phone

      • Alex August 1, 2012 at 19:34 #

        Hello Michel!

        Would you be so kind to look at that issue in the Technet forum: http://goo.gl/2Zyb6 ? We would highly appreciate your input into the troubleshooting!

        Thank you very much for your time and attention.

      • Alex August 3, 2012 at 14:36 #

        Hi Michel,

        As you advised we executed the query:
        – checking the to be processed aggregations ————–
        SELECT COUNT(*) AS Aggr_behind, Dataset.DatasetDefaultName
        FROM StandardDatasetAggregationHistory INNER JOIN
        Dataset ON StandardDatasetAggregationHistory.DatasetId = Dataset.DatasetId
        WHERE (StandardDatasetAggregationHistory.DirtyInd = 1)
        GROUP BY Dataset.DatasetDefaultName

        Here is an output from the query (8 rows):
        Aggr_behind DatasetDefaultName
        1 Client Monitoring data set
        1 Microsoft.Exchange.2010.Dataset.AlertImpact
        1 Microsoft.Exchange.2010.Reports.Dataset.Availability
        1 Microsoft.Exchange.2010.Reports.Dataset.TenantMapping
        1 Microsoft.Exchange.2010.Reports.Transport.ActiveUserMailflowStatistics.Data
        1 Microsoft.Exchange.2010.Reports.Transport.ServerMailflowStatistics.Data
        1 Performance data set
        1 State data set

        Would you recommend to have a look at the Case 2 of your post:
        https://michelkamp.wordpress.com/2012/04/10/scom-dwh-aggregations-data-loose-tip-and-tricks/ ?

        Thank you!

  2. Stephen (@StephenTL) December 3, 2012 at 20:39 #

    When running SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE SchemaName = ‘Exchange2010’)
    EXEC StandardDatasetMaintenance @DataSet

    I got:

    Msg 512, Level 16, State 1, Line 2
    Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, , >= or when the subquery is used as an expression.

    There were multiple data sets for Exchange 2012. So I grabbed the DataSetID’s from Select * From StandardDataSet WHERE SchemaName = ‘Exchange2010’ and ran them manually.

    • Stephen (@StephenTL) December 3, 2012 at 20:51 #

      Keep in mind I’m on OpsMgr 2012 and on the latest Exchange 2010 MP.

    • Michel Kamp December 4, 2012 at 08:24 #

      Hi,

      Looks like you have 2 ‘Exchange2010’ schemas. Do you have more than 1 management group reporting to this DWH ? If so you must make the query more specific to only return 1 row or you will have to specify the id by hand.

      Michel

      Verzonden met mijn Windows Phone ________________________________

  3. Kerrie September 5, 2013 at 16:27 #

    Selecting the” right” way, resulting in power, will
    be key evidence when Pistorius goes news on trial for premeditated murder.

    With summer just ten weeks long, getting discount tickets to Lagoon with your stay.
    Unfortunately, al-Shabaab has continued to tarnish his brand with his unprofessional antics only to have his
    very lucrative deal ended, all in the sessions.

  4. Jesty March 18, 2014 at 11:19 #

    Hi Micheal,

    I ran this query in 2007 R2 DW but unable to retrieve any data from it( have amended the dates). Please advise.

    SELECT ManagedEntity.FullName, vStateHourly.*
    FROM ManagedEntityMonitor INNER JOIN
    ManagedEntity ON ManagedEntityMonitor.ManagedEntityRowId = ManagedEntity.ManagedEntityRowId INNER JOIN
    State.vStateHourly ON ManagedEntityMonitor.ManagedEntityMonitorRowId = State.vStateHourly.ManagedEntityMonitorRowId
    WHERE (ManagedEntity.FullName LIKE ‘Microsoft.Windows.Computer:opsrms01%’)
    AND vStateHourly.DateTime between ’20120301′ and ’20120401′
    order by vStateHourly.DateTime desc

    • Michel Kamp March 18, 2014 at 12:09 #

      Hi, You will have to change the opsrms01 to the server you want the info for.

      Michel

      • Jesty March 18, 2014 at 12:58 #

        Hi Michel,

        I have changed the server name to our DW name but still the query gets executed without any output. kindly assist.

      • Michel Kamp March 18, 2014 at 13:56 #

        DW name ??

        Change it to the server you want the info from. For example the server where you want to make the report of. Let’s say the AD server.

  5. Jesty March 18, 2014 at 14:47 #

    Thanks Michel…

  6. Jesty March 19, 2014 at 08:00 #

    Hi Michel,

    We are facing issues while fetching reports for SCOM Webconsole availability. The uptime % is 100% but the availability tracker seems to be missing for few days for past month and is currently null for the current month. Could you please help us through some light on the issue.
    What could be the issue on the missing availability tracker or availability %. Kindly assist.

    • Anonymous March 19, 2014 at 14:19 #

      Any suggestions?

    • Jesty March 19, 2014 at 14:20 #

      Any suggestions

      • Jesty March 21, 2014 at 09:16 #

        – checking the to be processed aggregations ————–
        1 Client Monitoring data set
        649 Microsoft.Exchange.2010.Dataset.AlertImpact
        29 Microsoft.Exchange.2010.Reports.Dataset.Availability
        1 Microsoft.Exchange.2010.Reports.Dataset.TenantMapping
        2 Microsoft.Exchange.2010.Reports.Transport.ActiveUserMailflowStatistics.Data
        1 Microsoft.Exchange.2010.Reports.Transport.ServerMailflowStatistics.Data
        4 Performance data set
        702 State data set

        – check first and last aggregation time from still to be processed data
        – first and last date must be equal
        ———————————————————————–
        Declare @DataSet as uniqueidentifier
        Set @DataSet = (Select DataSetId From StandardDataSet Where SchemaName = ‘State’)
        Select AggregationTypeId, COUNT(*) as ‘Count’, MIN(AggregationDateTime) as ‘First’, MAX(AggregationDateTime) as ‘Last’ From StandardDataSetAggregationHistory
        Where DataSetId = @DataSet AND LastAggregationDurationSeconds IS NULL
        group by AggregationTypeId

        20 674 2014-02-21 05:00:00 2014-03-21 06:00:00
        30 29 2014-02-21 05:00:00 2014-03-21 04:00:00

        So from the queries i feel like aggregations are running fine? Any suggestions.

      • Michel Kamp March 21, 2014 at 09:46 #

        Hi,

        NO you have a real ‘Huston’ problem .

        It looks like you aggr are behind on Exchange Alertimpact (649x) and State data (702x).

        You will have to fix this using the methods described in my blog posts. Look also on my other data warehouse posts.

        Michel

  7. Jesty March 21, 2014 at 10:08 #

    Thanks Michel… Also once the debugging is enabled … how do we set it back to 0 once the issue is resolved?

  8. basem shalabi May 19, 2014 at 11:38 #

    Hi,

    Thank you for a great post on scom availability.

    How can I fill up and design availability report with these data.

    Regards,
    Basem

    • Michel Kamp May 19, 2014 at 11:54 #

      Hi,

      Can you explain more details on this ?

      Michel

      • basem shalabi May 19, 2014 at 12:00 #

        Hi

        Thank you for replay.. which parameters should I use to build like above graph and how can I show uptime in percent in the report.

        Hope you understand what I mean.

        Thank you in advance.

        Basem

      • basem shalabi May 19, 2014 at 13:40 #

        Hi Michel,

        Actually I am working on customized report and I want to include the UPTIME for the servers in my customized report,,, how can achieve that ?

        Regards,
        Basem

  9. Twila August 23, 2014 at 19:30 #

    The World Wide Web is an important marketing medium and to gain success in this arena a marketer
    should have well planned online marketing strategy.

    Before engaging an online marketing consultant therefore, make
    sure they have all of these skills and more, in order for them to offer you a great value proposition. When looking to jump into the internet marketing realm,
    first thing is first; set up shop.

  10. Marcelo October 9, 2014 at 10:02 #

    Hi Michel, excellent post!

    I have two questions.

    1. I have the same problem, in all reports. The same period time. What can I do?
    2. What means when you tell “DO NOT FORGET: after the states are complete to remove the overrides , other wise you will have for sure the same and more , problem again.” How to do this?

    It’s my first time working in SCOM 2012 (and in all scom) and I don’t understand any of these problems.

    Thanks in advances,

    Marcelo.-

    • Jesty October 16, 2014 at 14:24 #

      Hi Marcelo,

      We are manually processing all the data using the script given above by Michel. Before running the manual aggregation we need to disable the automatic SCOM aggregation.

      For that please navigate Authoring Pane–>Rules–> Scope to Standard Data set

      Find the rule Standard Data Warehouse Data Set maintenance rule and override for the all the instances of the class and put it to false.

      Once the manual aggregation has been caught up please make sure you revert back the changes in the SCOM rule so that automatic aggregation happens.

      Jesty

  11. Jesty October 16, 2014 at 14:16 #

    Hi Michel,

    This blog has helped us to catch up the aggregation manually. We ran this against the State and Exchange datasets and all have been processed.
    But after that we could see blank reports in our Exchange 2010 reports with no data. How could we correct these.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: