Analyzing dnn EventLog                 Print      Add Favorite     Add Watch     Contact Author

Creator: host   12/7/2011 3:21:50 AM    Author: Peter Donker   Source: http://www.dotnetnuke.com/Resources/Blogs/EntryId/3235/Analyzing-an-exploding-EventLog.aspx   Views: 412    0    0  
Tags:
event log dnn event log Dnn development


Murphy’s law has it that real issues in DotNetNuke (core. modules, skins) don’t manifest themselves until you’ve rolled out in production. We can test all we want, but the fact of the matter is that this has become a highly complex eco-system with many interacting bits and it is next to impossible to test for all scenarios. And I’m pretty sure the kind folks at the Corp’s helpdesk will agree with me on this as well. So let’s not kid ourselves: sh*t happens. It’s our ability to deal with it that separates the good from the bad (or experienced from the inexperienced), IMHO. It is also the reason we have something called dogfooding. This is the last phase of testing where you use your own product (dogfood) in your own production environment. This is after you’ve tested it on a staging server of course. It’s rare, but sometimes it does give rise to issues. After I rolled out DNN 6.1.1 on my own site and had began dogfooding my own DMX 6 my server ground to a halt. Lucky for me I did this over thanksgiving so it was fairly quiet on the interwebs. The first major stumbling block in analyzing what was happening was the fact that the eventlog had exploded. I mean, my 250 Mb odd SQL database had become 4 Gb and that was just one table: the EventLog table.

First aid

As the site ground to a halt I needed to do 2 things: get the site running again and analyze what was happening. But at this point the system is more or less frozen with SQL clipping at 99% CPU. OK, first I brought down the site to get the server back down to earth. Then I proceeded to remove all but the last 100000 records from the EventLog. I could possibly have deleted more, but this brought the DB to a reasonable size (something like 1 Gb). Backup the DB, zip this and bring down to my dev environment for analysis. Final change to the DB: truncate the EventLog table and shrink the DB again. Now to get the site working I got it back online and disabled the GENERAL_EXCEPTION in the EventLog (go to the settings for the eventlog and edit this row). Now the SQL is no longer working overtime. At this point we have to realize the error is still happening but the fact that it is no longer logged prevents SQL server going wild.

Analysis

For those of you who’ve not been here before, the Eventlog table has one major drawback: the actual details of the errors are in an NTEXT field as an XML blob. This prevents us from doing any SQL analysis on the data. For one I’d like to get an idea of which error is prevalent. This requires comparing the details between errors. We need to get to the contents of the XML. Given that it’s in NTEXT, SQL server will refuse to operate on it as it is stored as blobs and can’t be compared. Converting it to NVARCHAR(MAX) is impossible on a serious volume of data (I tried and the process just blocked). I also tried exporting to XML to open up in XML Spy for instance, but this lacks the analysis capabilities that I need. It was useful to run through the first few records to get a first idea, though. You can export to XML using a utility called “bcp”, or bulk copy. This is a tool that comes with SQL server, so you run it from a cmd window:

"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp" 
"select top 100 LogProperties from new_bring2mind_net.dbo.PortalCore_EventLog" queryout 
D:\Websites\local.bring2mind.net\eventlog.xml -U "test" -P "test" -c

Here “new_bring2mind_net” is the name of the database, “test” and “test” are username and password. Now all you need to do is to open the resulting file and stick a <foo> and </foo> around the contents to make it a valid XML file. Open in XML Spy (or similar tool) and you can walk through the log. But you’ll quickly notice that the actual details of the event (in the “message” property) are so large, they make it difficult to “see” what is happening in grid view.

So how to actually use the eventlog for analysis? Well, we need to get the XML data to SQL so we can run queries. And this SQL 2005+ can do. The approach I have now pursued is to (1) create a table for the log properties, (2) run a cursor over the EventLog and (3) read the LogProperties as XML and push that into the new table. So let’s look at some code. First we create the new table:

1 CREATE TABLE PortalCore_EventLogProperties (
2  LogEventID INT NOT NULL,
3  PropertyName VARCHAR(50) NOT NULL,
4  PropertyValue NVARCHAR(MAX)
5 )
6 GO

This sets up the table that will receive our data. Now for our bit of magic:

01 DECLARE @LogProperties XML
02 DECLARE @LogEventID INT
03  
04 DECLARE c CURSOR FAST_FORWARD FOR
05 SELECT LogProperties, LogEventID FROM PortalCore_EventLog
06  
07 OPEN c
08 FETCH NEXT FROM c
09 INTO @LogProperties, @LogEventID
10  
11 WHILE @@FETCH_STATUS = 0
12 BEGIN
13  PRINT @LogEventID
14  
15  INSERT INTO PortalCore_EventLogProperties(LogEventID, PropertyName, PropertyValue)
16  SELECT
17   @LogEventID,
18   Tbl.Col.value('(PropertyName)[1]', 'VARCHAR(50)'),
19   Tbl.Col.value('(PropertyValue)[1]', 'NVARCHAR(MAX)')
20  FROM
21   @LogProperties.nodes('//LogProperty') Tbl(Col)
22  WHERE NOT EXISTS (
23   SELECT * FROM PortalCore_EventLogProperties
24   WHERE LogEventID=@LogEventID
25    AND PropertyName=Tbl.Col.value('(PropertyName)[1]', 'VARCHAR(50)')
26  )
27  
28  FETCH NEXT FROM c
29  INTO @LogProperties, @LogEventID
30 END
31  
32 CLOSE c
33 DEALLOCATE c
34 GO

This migrates the logproperties to its own table. Note that this can take a while depending on the number of records (I ran 30K records in about 45 minutes while typing this). Optionally you can limit the amount by adding “TOP 1000” or another limiting instruction to the SELECT statement. But now we have the data where we want it. Ready for analysis. Let’s run a few queries.

1 SELECT * FROM
2 (SELECT DISTINCT(elp.PropertyValue),
3  (SELECT COUNT(*) FROM PortalCore_EventLogProperties elp1
4   WHERE elp1.PropertyName='RawURL' AND elp1.PropertyValue=elp.PropertyValue) AS cnt
5  FROM PortalCore_EventLogProperties elp
6  WHERE elp.PropertyName='RawURL') tmp
7 ORDER BY cnt DESC

This will show you a list of requested URLs with the number of occurrences in descending order. It may give you a hint where (i.e. which page) is giving you a headache. Now replace the value “RawURL” with “Message” and you’ll get the same but for the actual errors. The message is in effect the stacktrace. So this is the best way forward in determining where an error is occurring. You can of course also export this to a text file so you can work through it.

"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp" "SELECT cnt, PropertyValue FROM (SELECT DISTINCT(elp.PropertyValue), (SELECT COUNT(*) FROM new_bring2mind_net.dbo.PortalCore_EventLogProperties elp1 WHERE elp1.PropertyName='Message' AND elp1.PropertyValue=elp.PropertyValue) AS cnt FROM new_bring2mind_net.dbo.PortalCore_EventLogProperties elp WHERE elp.PropertyName='Message') tmp ORDER BY cnt DESC" queryout D:\Websites\local.bring2mind.net\StackTraces.txt -U "test" -P "test" –c

Good luck debugging!


Rating People: 8   Average Rating:     
Comment List:


  No Record 


Post your comment

Your Name: Required
Your Mail: Email is used only to display Gravatar
Your Site:
CAPTCHA image
Enter the code shown above in the box below
Comment Info:         

     DnnModule.com is build to provide quality modules and skins, Some of them are free,some not. We wish these stuffs( free or not ) can be useful to you.

    Besides that, We also provide a full range of professional services, ranging from web site build, seo, system management, administration, support, senior consultancy and security services.   We act as if your development project or network was ours, with care and respect. We are not satisfied until it works the way you want it to, and we don't silently ignore found issues as somebody else's problem.