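Why Visual Studio Cannot Evaluate Lambda Expressions While Debugging

1. Introduction

Many of you have probably been here: you are inspecting a List variable in the Visual Studio Watch window, the list holds too many elements, and you would like to filter it with LINQ's Where method, as in the code shown in the figure below: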

[Figure: code]

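Suppose execution is stopped at the breakpoint shown in the figure. Inspecting the personList variable in the Watch window works fine, but if I only want to see the elements of personList that satisfy Age == 20, Visual Studio reports an error: Expression cannot contain lambda expressions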

[Figure: watch]
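
The screenshots are not reproduced in this text. As a rough stand-in, here is a minimal sketch of the kind of program being described. The Person class, its Name and Age properties, and the sample data are assumptions made for illustration; only the personList name and the Age == 20 filter come from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type; the article only implies Name and Age.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class WatchDemo
{
    // Sample data invented for illustration.
    public static readonly List<Person> personList = new List<Person>
    {
        new Person { Name = "Alice", Age = 20 },
        new Person { Name = "Bob", Age = 35 },
    };

    public static void Main()
    {
        // Set a breakpoint on the next line, then enter this in the Watch window:
        //     personList.Where(x => x.Age == 20)
        // Per the article, versions before Visual Studio 2015 answer with
        // "Expression cannot contain lambda expressions".
        Console.WriteLine(personList.Count);
    }
}
```

The same Where call compiles and runs without trouble when it is part of the source file; the restriction applies only to expressions typed into the debugger.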

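This problem bothered me for a long time: why can't the Watch window in Visual Studio accept lambda expressions? Today I finally gave in and Googled it, and found that I am far from the only one who wants this. Plenty of people online do not understand why Visual Studio lacks the feature, to the point that a request on the official Visual Studio UserVoice site has collected nearly ten thousand votes asking for it. The good news is that the latest release, Visual Studio 2015, finally adds it.

2. A First Look at the Cause

Reading the replies under the UserVoice request, I realized why the feature took so long to arrive: it really is a complicated thing to do. There are also many related questions on Stack Overflow, and on one of them JaredPar gave an excellent reply in the comments. I quote his reply in full below: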

No you cannot use lambda expressions in the watch / locals / immediate window. As Marc has pointed out this is incredibly complex. I wanted to dive a bit further into the topic though. What most people don't consider with executing an anonymous function in the debugger is that it does not occur in a vacuum. The very act of defining and running an anonymous function changes the underlying structure of the code base. Changing the code, in general, and in particular from the immediate window, is a very difficult task. Consider the following code.
void Example() {
  var v1 = 42;
  var v2 = 56; 
  Func<int> func1 = () => v1;
  System.Diagnostics.Debugger.Break();
  var v3 = v1 + v2;
}
This particular code creates a single closure to capture the value v1. Closure capture is required whenever an anonymous function uses a variable declared outside its scope. For all intents and purposes v1 no longer exists in this function. The last line actually looks more like the following
var v3 = closure1.v1 + v2;
If the function Example is run in the debugger it will stop at the Break line. Now imagine if the user typed the following into the watch window
(Func<int>)(() => v2);
In order to properly execute this the debugger (or more appropriately the EE) would need to create a closure for variable v2. This is difficult but not impossible to do. What really makes this a tough job for the EE though is that last line. How should that line now be executed? For all intents and purposes the anonymous function deleted the v2 variable and replaced it with closure2.v2. So the last line of code really now needs to read
var v3 = closure1.v1 + closure2.v2;
Yet to actually get this effect in code requires the EE to change the last line of code which is actually an ENC action. While this specific example is possible, a good portion of the scenarios are not. What's even worse is executing that lambda expression shouldn't be creating a new closure. It should actually be appending data to the original closure. At this point you run straight on into the limitations of ENC. My small example unfortunately only scratches the surface of the problems we run into. I keep saying I'll write a full blog post on this subject and hopefully I'll have time this weekend.

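The gist is that lambda expressions involve C# closures, and because of how C# closures work, entering a lambda expression in the Watch window while debugging would modify the structure of the existing code. That is not impossible, but it is a very difficult and very large piece of engineering.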
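
To make "v1 no longer exists in this function" concrete, here is a minimal sketch of my own (the CaptureDemo and Run names are hypothetical, not from the quoted answer). It shows that a lambda captures the variable itself rather than a copy of its value, which is why the compiler has to move the variable out of the method's stack frame.

```csharp
using System;

public static class CaptureDemo
{
    public static int Run()
    {
        int v1 = 42;
        Func<int> func1 = () => v1; // captures the variable v1, not the value 42
        v1 = 100;                   // writes to the same storage the lambda reads
        return func1();             // observes the updated value
    }

    public static void Main()
    {
        Console.WriteLine(Run());   // prints 100
    }
}
```

Because the assignment after the lambda is visible through it, v1 cannot stay an ordinary local; both the method body and the lambda must share one storage location.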

3. Understanding C# Lambda Expressions and Closures

Out of curiosity I opened the generated exe file in .NET Reflector and saw the following code:

[CompilerGenerated]
private static bool <Main>b__0(Person x)
{
    return x.Age < 20;
}

[CompilerGenerated]
private static void <Main>b__1(Person x)
{
    Console.WriteLine(x.Name);
}

private static void Main(string[] args)
{
    personList.Where<Person>(Program.<Main>b__0).ToList<Person>().ForEach(Program.<Main>b__1);
    Console.Read();
}
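
In the decompiled output each lambda appears as an ordinary named method, and the call site passes that method where the lambda used to be. The following hand-written sketch illustrates the equivalence for a lambda that captures nothing. The names StaticLambdaDemo, IsUnder20 and SameResult are hypothetical, and a list of ints stands in for the list of Person objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StaticLambdaDemo
{
    // Hand-written stand-in for a compiler-generated method such as <Main>b__0.
    static bool IsUnder20(int age)
    {
        return age < 20;
    }

    public static bool SameResult(List<int> ages)
    {
        List<int> viaLambda = ages.Where(a => a < 20).ToList();
        List<int> viaMethod = ages.Where(IsUnder20).ToList(); // method group conversion
        return viaLambda.SequenceEqual(viaMethod);
    }

    public static void Main()
    {
        Console.WriteLine(SameResult(new List<int> { 12, 20, 35, 7 })); // prints True
    }
}
```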

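As you can see, the lambda expressions have been turned into static methods marked with the [CompilerGenerated] attribute. This means that for every lambda expression we type into the Watch window while debugging, Visual Studio would have to create a new static method on the fly and recompile. This is probably the simplest case: the debugger only has to insert a new method and recompile, and since Visual Studio already supports modifying code at run time (Edit and Continue, the ENC in the quote above), this case should be feasible to implement. The complication is that not every lambda expression is this simple. Once closures are involved, the code in a lambda expression is no longer just a context-free static method, and that is where the problem gets interesting. Consider the following sample code, which uses C# closures: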

static void TestOne()
{
    int x = 1, y = 2, z = 3;
    Func<int> f1 = () => x;
    Func<int> f2 = () => y + z;
    Func<int> f3 = () => 3;

    x = f2();
    y = f1();
    z = f1() + f2();

    Console.WriteLine(x + y + z);
}

Now look at the decompiled code:

[CompilerGenerated]
private sealed class <>c__DisplayClass6
{
    // Fields
    public int x;
    public int y;
    public int z;

    // Methods
    public int <TestOne>b__4()
    {
        return this.x;
    }

    public int <TestOne>b__5()
    {
        return this.y + this.z;
    }
}

[CompilerGenerated]
private static int <TestOne>b__6()
{
    return 3;
}

private static void TestOne()
{
    <>c__DisplayClass6 CS$<>8__locals7 = new <>c__DisplayClass6();
    CS$<>8__locals7.x = 1;
    CS$<>8__locals7.y = 2;
    CS$<>8__locals7.z = 3;
    Func<int> f1 = new Func<int>(CS$<>8__locals7.<TestOne>b__4);
    Func<int> f2 = new Func<int>(CS$<>8__locals7.<TestOne>b__5);
    Func<int> f3 = new Func<int>(Program.<TestOne>b__6);
    CS$<>8__locals7.x = f2();
    CS$<>8__locals7.y = f1();
    CS$<>8__locals7.z = f1() + f2();
    Console.WriteLine((int) (CS$<>8__locals7.x + CS$<>8__locals7.y + CS$<>8__locals7.z));
}

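The generated code shows that the compiler does a lot of work for us; lambda expressions are just syntactic sugar. A simple lambda expression is turned into a static method marked [CompilerGenerated]. A lambda expression that uses a closure is turned into a sealed class marked [CompilerGenerated]: every local variable it captures is moved into that class as a public field, and the expression itself is moved into the class as a public method. All subsequent operations on the captured variables are rewritten as operations on the fields of that class. So, returning to JaredPar's explanation above, when we enter the lambda expression (Func<int>)(() => v2) in the debugger, there are two possible ways to handle it:

If the method already has a closure, hoist the v2 variable into that closure class and add a () => v2 method to it;
Create a new closure class, move the v2 variable into it, and add a () => v2 method to it.

But that is still far from enough: every operation that touches the v2 variable has to be rewritten into something like CS$<>8__locals7.v2. Just thinking about it is exhausting.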
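
As an illustration of the rewrite involved, the sketch below spells out by hand what JaredPar's Example method roughly looks like before and after the debugger would have to honor () => v2, using the second option (a new closure class). The Closure1, Closure2 and HoistDemo names are hypothetical stand-ins for compiler-generated names, and the methods return v3 only so the result can be observed.

```csharp
using System;

// Hand-written stand-ins for compiler-generated closure classes.
public sealed class Closure1
{
    public int v1;
    public int GetV1() { return v1; }
}

public sealed class Closure2
{
    public int v2;
    public int GetV2() { return v2; }
}

public static class HoistDemo
{
    // Roughly what the compiler emits for Example() as written:
    // v1 is hoisted into a closure object, v2 stays an ordinary local.
    public static int Original()
    {
        var closure1 = new Closure1();
        closure1.v1 = 42;
        int v2 = 56;
        Func<int> func1 = closure1.GetV1;
        return closure1.v1 + v2;          // var v3 = closure1.v1 + v2;
    }

    // What the method body would have to become once () => v2 exists:
    // v2 is hoisted as well, and every later use of v2 is rewritten.
    public static int AfterWatchLambda()
    {
        var closure1 = new Closure1();
        closure1.v1 = 42;
        var closure2 = new Closure2();
        closure2.v2 = 56;
        Func<int> func1 = closure1.GetV1;
        Func<int> watch = closure2.GetV2; // the lambda typed into the Watch window
        return closure1.v1 + closure2.v2; // var v3 = closure1.v1 + closure2.v2;
    }

    public static void Main()
    {
        Console.WriteLine(Original());         // prints 98
        Console.WriteLine(AfterWatchLambda()); // prints 98
    }
}
```

Both versions compute the same value, but the second one requires editing code that is already running, which is exactly the Edit and Continue territory the quoted answer describes. Under the first option, v2 would instead become an extra field on Closure1.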

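4. Workarounds

Leave aside the fact that Visual Studio 2015 has already introduced this feature; I have not tried it yet and do not know how it is implemented. For now I am still using Visual Studio 2010, so getting this capability in an older version means finding another way. Below are some known workarounds. I am only noting them here and have not tried them either; once I have, I will write a new post about it.

Use dynamic LINQ. The reference links list several dynamic LINQ implementations; the most widely used is probably the LINQ Dynamic Query Library provided by Scott. With it, while debugging you can write personList.Where("Age = @0", 20) in place of personList.Where(x => x.Age == 20)
Use an open-source VS extension: Extended Immediate Window for Visual Studio
Upgrade to Visual Studio 2015 ;-)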
