Managed Stack Explorer using ClrMD

How often we run into an issue in the field where we just want to see the managed call-stack where the exception is or where the thread is hung. One of the options is debugger or something like ETW.

So I built a managed stack explorer

Managed Stack Explorer provides call stack for .NET applications using managed code with thread local variables. The API as of now does not provide values for the local variables. When it is available we can update it.

It is a single executable without any other dll’s. It uses Costura to embed dll.

The debug shim works specific to processor and would not be able to get call-stacks if it is not. So x86 exe cannot get call-stacks of x64. That’s reason for x86 and x64 specific exe’s. There is no difference in code other than how it is compiled.

It is work in progress and I would love some feedback and code contributions.

The github site also has link to download the pre-complied executables.

Measure GC Allocations and Collections using TraceEvent

In this post I will explore  how we could use TraceEvent to measure our code (even at function level) for GC Allocations and Collections.

Save this with “.linq” extension and then open in  linqpad.

Here is the TL;DR

Why would I want to know GC events on a function level? Doesn’t the PerfMon counter  provide that information on an application level? Isn’t Premature optimization root of all evil?

Yes, for most of the part Premature optimization is not necessary. And PerfMon GC counter’s would give answers for the whole application. But it is usually after we build the application and when we start running into performance issues we start looking at them.

The motivation behind this are two things Measure Early and Often for Performance and Essential Truths Everyone Should Know about Performance in a Large Managed Codebase

If you haven’t read or watched the above video please do it. It let’s us know why and how to do it.

In the above video Dustin talks about Roslyn code base and how they used Perfview to measure Roslyn code base and identify potential bottlenecks early to avoid Perf issues.

One of the key differences between managed code and native code with respect to performance is GC. If GC is working hard then your application might not be able to get the performance that you are expecting. GC is good but if we don’t know which calls allocate what amount of data then it is an issue. Especially if you have a section code that is hit very often and which requires a lot of Perf, it is good know where the allocations are coming from. It is not explicit always.

In the above video Dustin shows few examples of Roslyn code where they were able to identify subtle issues that could allocate a lot when you are trying to get the most out of the code.There is also Roslyn Heap Allocation Analyzer which looks at the code help us identify allocations which isn’t necessary. It is a cool project.

I took one of the examples from the video as a motivation to check if  I could measure and make it a utility in my toolbox to help me when I need one.

In the above example I am trying look for a word “pede” in the lorem ipsum text. The code could get it using “foreach” or using the “Any” operator. I would like to run this few times to check what are the allocations and how long does it take. I used LINQPad as a scratch pad.

Here is the result of GC Allocations of using “Any” for 500 iterations and NOT the foreach


The were 118 allocations of Enumerator and 146 allocations Func. GC usually allocates 100K each time it allocates that’s what is shown in the allocation amount column.

And here is GC Allocations when using “foreach”


There are hardly any new allocations compared to the previous one.

Here is the GC Collections when using  “Any”


There were 18 GC Collections using Any.

Here it is using foreach  and there were 0 collections.


Here is measure it time duration results using Any


And here it is using Foreach


Look ma I figured out why my ETW EventSource isn’t tracing

The EventSource class in the framework 4.5 helps in writing custom ETW tracing.

When using EventSource class built within the framework, if the order of the methods don’t match ordinal number position in the class it would fail generating ETW traces. The EventSource has dependency on the order of the methods in the class.

This code would produce a valid ETW traces

This one would fail producing any ETW Traces.

The difference between them are the order of the methods. If you notice in the failing ETW tracing class the FailedTraceEvent is Second and the FailedDetailedEvent is first which is causing the trace not to be generated. The actual exception text would

Event FailedDetailedEvent is givien event ID 1 but 2 was passed to WriteEvent.

It’s one of those quirks that I ran into when building ETW tracing.

How to troubleshoot these kind of failures?

By default these kind of failed ETW exceptions would not be raised to be handled by the client code. The reason being,in production if you enable ETW tracing and all of a sudden the application crashes would not be something that we would want.

To troubleshoot this, use ETW tracing for exceptions. Use ETW to trace custom ETW failures. How cool is this?  My choice of tool is Perfview.


Within Perfview in the Events Window I enter Test|Ex in the filter text box which is shown in the above picture. FYI filter text box supports Regular expression. So by entering Test|Ex, I am filtering events with Test or Ex which for exceptions. With that information I could filter all the TestEvent’s and any exceptions that have been raised which shows ArgumentException.

The call-stack of the ArgumentException shows on the static contructor .cctor of  FailedEvent.


Making an Image Easier to Debug

I am doing security review for a managed application which is obfuscated. So I am doing a lot of   disassembling code at runtime using Windbg. One of the issues is that code gets JIT optimized because of the retail build. This makes it harder for me debug when mapping it back. Realized  that I could turnoff  JIT Optimization’s using the ini file.

[.NET Framework Debugging Control]

Another use of feature which I guess wasn’t really intended for.


Updating .NET String in memory with Windbg

In this post I would show a simple trick to update .NET strings in memory with Windbg. The caveat is make sure the string that you’re updating is long enough to fit into the string buffer. If not there would be a memory corruption.

Here is a simple windows form application with title “Good”

The goal is to update the title from “Good” to “Bad”.

button1.Click += (s,b) => Text = _caption;

I am updating the title in the button click.

Here is the actual string object within the debugger

0:006> !do 0294d0a0
Name:        System.String
MethodTable: 59b9fb64
EEClass:     598d8bb0
Size:        22(0x16) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\
String:      Good
      MT    Field   Offset                 Type VT     Attr    Value Name
59ba2b30  40000ed        4         System.Int32  1 instance        4 m_stringLength
59ba1f80  40000ee        8          System.Char  1 instance       47 m_firstChar
59b9fb64  40000ef        8        System.String  0   shared   static Empty
    >> Domain:Value  004b0308:02941228 <<

I would be using the e  command to update the memory. The ezu command is used for updating  Null-terminated Unicode string .

Notice the first character starts in the 8th offset from the above. So we would have start updating the string only from the 8th offset. The first 8 bytes of object are for syncblock index and method table pointer.

Here is the command to update the string memory.

ezu 0294d0a0+8 “Bad”

And the updated form title.

Conditional BreakPoint based on callstack within Windbg – .NET

Someone recently asked me “How to have a break-point on a method based on certain function in the call-stack?”

Here is the sample code to demonstrate this

using System;
using System.Threading.Tasks;
using System.Data.SqlClient;
namespace Test
    class Program
        string connectionString = @"Data Source=.\sqlexpress;Initial Catalog=Tfs_Configuration;Integrated Security=True";
        public void Bar()
            using (var c = new SqlConnection(connectionString))
                var command = new SqlCommand(@"update [tbl_AccessMapping] set [DisplayName] = @param", c);
                command.Parameters.Add(new SqlParameter("param", "Bar"));
        public void Foo()
            using (var c = new SqlConnection(connectionString))
                var command = new SqlCommand(@"update [tbl_AccessMapping] set [DisplayName] = @param", c);
                command.Parameters.Add(new SqlParameter("param", "Foo"));
        static void Main(string[] args1)
            var s = new Program();
            Parallel.For(0, 2, (i) => s.Bar());
            Parallel.For(0, 2, (i) => s.Foo());

The requirement is to have a break-point on “ExecuteNonQuery” but it should break only if it is invoked from “Foo” and not from “Bar”.

Launched the exe within windbg and loaded sos,sosex and set a bp on System.Data.SqlClient.SqlCommand.ExecuteNonQuery suing !mbm

And when the break-point hits the first time updated the bp using

bs 0  $$>a<“d:\Debuggersx86\ConditionalBP.txt” Foo

Here are the contents of ConditionalBP.txt

ad /q Contains
aS /c Contains .shell -ci "!CLRStack" FINDSTR $arg1
.block {
            .if ($spat("${Contains}","*${$arg1}*"))
ad /q Contains

Saving Dynamic Assembly in .NET 4.0 using Windbg

I recently had to debug a .NET 4.0 process which was loading the dependent assemblies using the AppDomain.AssemblyResolve event. The dependent assemblies were stored within the executable. I couldn’t disassemble the code to look for the dependent assembly because the exe was obfuscated. FYI the dynamic assembly cannot be saved using !SaveModule and here is the reason for this read the comments especially from Evian. Unlike psscor2.dll the sos for .NET 4.0 does not have a !dumpdynamicassembly with a save option.

Here is the sample code to demonstrate this.

using System;
using System.Reflection;
using TestLib;
namespace Test
 class Foo1
 int[] s = new int[2];
 int v = 100;
 public Foo1()
 Console.WriteLine(new Class1().Foo());
 static void Main(string[] args1)
 AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
 String resourceName = "ConsoleApplication13." +new AssemblyName(args.Name).Name + ".dll";
 using (var stream = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName))
 Byte[] assemblyData = new Byte[stream.Length];
 stream.Read(assemblyData, 0, assemblyData.Length);
 return Assembly.Load(assemblyData);
 System.IO.File.Delete(@"C:\Users\naveen\Documents\Visual Studio 2010\Projects\ConsoleApplication13\bin\Debug\TestLib.dll");
 var s = new Foo1();

I knew the Assembly had to be loaded using  System.Reflection.Assembly.Load(Byte[]) ,so ended setting a break-point on the method using command !mbm *Assembly.Load* on the launch of the executable.

Here are the output of args and local variables for the above break-point

0:000> !mdv
Frame 0x0: (System.Reflection.Assembly.Load(Byte[])):
[A0]:rawAssembly:0x025fc374 (System.Byte[])

Notice the “rawAssembly” argument which has the assembly contents.  Here are the raw memory contents of the address using dd 0x025fc374

0:000> dd 0x025fc374
025fc374  6a764994 00001000 00905a4d 00000003
025fc384  00000004 0000ffff 000000b8 00000000
025fc394  00000040 00000000 00000000 00000000
025fc3a4  00000000 00000000 00000000 00000000
025fc3b4  00000000 00000080 0eba1f0e cd09b400
025fc3c4  4c01b821 685421cd 70207369 72676f72
025fc3d4  63206d61 6f6e6e61 65622074 6e757220
025fc3e4  206e6920 20534f44 65646f6d 0a0d0d2e

  1. 6a764994 :- Is the Array’s  Method Table
  2. 00001000 : – Is the Array size
  3. The rest are the array contents.

Unlike the reference type arrays, the value type arrays  don’t have a DWORD for Method table of its contents. With this information I could dump the contents from memory in to disk using .writemem command.

.writemem c:\temp\assembly.bin @ecx+8 L?(poi(@ecx+@$ptrsize)*@$ptrsize)

In x86 @ecx register contains argument for rawAssembly. The  @ecx+8 is the start  position of the first byte and that is the reason for using this as the start position for .writemem. The poi(@ecx+@$ptrsize) contains the array size which in our case is 0001000 and multiply it by @$ptrsize which is 4 in x86. The expression (poi(@ecx+@$ptrsize)*@$ptrsize) would in our case result to 4000 bytes.

The assembly.bin would contain data in hex format which has to be converted in to binary format. Here is the code to convert from Hex to Binary format.

Assembly.Load( File.ReadAllBytes(@"c:\temp\assembly.bin")
 .Select(x =>