2020-05-02

Programming to interfaces: have you considered the performance cost?



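Most of us program to interfaces in day-to-day development, because it makes dependency injection, polymorphism and similar techniques easy. That flexibility is paid for in performance, though. Everything has two sides, so weigh the trade-off against your own scenario.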

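I: Background

1. Motivation

While tuning the performance of a project, I found that many method signatures return the IEnumerable interface, as in the following code: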

    public static void Main(string[] args)
    {
        var list = GetHasEmailCustomerIDList();
        foreach (var item in list) { }
        Console.ReadLine();
    }

    // The array is returned behind the IEnumerable<int> interface.
    public static IEnumerable<int> GetHasEmailCustomerIDList()
    {
        return Enumerable.Range(1, 5000000).ToArray();
    }

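2. What is the problem?

At first glance this code has no obvious performance problem. Iterating with foreach is perfectly natural, so what is left to optimize?

<1> Looking for the problem in the MSIL

First, let's get as close to the real thing as we can. The simplified MSIL looks like this.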

    .method public hidebysig static
        void Main (
            string[] args
        ) cil managed
    {
        IL_0009: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
        IL_000e: stloc.1
        .try
        {
            IL_000f: br.s IL_001a
            // loop start (head: IL_001a)
                IL_0011: ldloc.1
                IL_0012: callvirt instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
                IL_0017: stloc.2
                IL_0018: nop
                IL_0019: nop
                IL_001a: ldloc.1
                IL_001b: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
                IL_0020: brtrue.s IL_0011
            // end loop
            IL_0022: leave.s IL_002f
        } // end .try
        finally
        {
            IL_0024: ldloc.1
            IL_0025: brfalse.s IL_002e
            IL_0027: ldloc.1
            IL_0028: callvirt instance void [mscorlib]System.IDisposable::Dispose()
            IL_002d: nop
            IL_002e: endfinally
        } // end handler
        IL_002f: ret
    } // end of method Program::Main

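The IL shows the standard get_Current, MoveNext and Dispose calls, plus a try/finally block. That is a lot of extra methods and keywords for what is just a simple foreach over an array. Does it really need to be this complicated? Every element costs two interface calls (MoveNext and get_Current), so how is this supposed to be fast on large data sets?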
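To make the IL above easier to read, here is a rough hand-written C# equivalent of what foreach over an IEnumerable<int> expands to. It is a minimal sketch, not the compiler's literal output: the class name ForeachLowering and the methods SumExpanded and SumForeach are hypothetical names chosen for illustration, and the loop body sums the elements only so that the two versions can be compared.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForeachLowering
{
    // Hand-written approximation of: foreach (var item in source) sum += item;
    public static long SumExpanded(IEnumerable<int> source)
    {
        long sum = 0;
        IEnumerator<int> e = source.GetEnumerator(); // interface call (GetEnumerator)
        try
        {
            while (e.MoveNext())          // interface call per element
            {
                int item = e.Current;     // interface call per element (get_Current)
                sum += item;
            }
        }
        finally
        {
            if (e != null) e.Dispose();   // matches the finally block in the IL
        }
        return sum;
    }

    // The same loop written with foreach, for comparison.
    public static long SumForeach(IEnumerable<int> source)
    {
        long sum = 0;
        foreach (var item in source) sum += item;
        return sum;
    }

    public static void Main()
    {
        IEnumerable<int> list = Enumerable.Range(1, 1000).ToArray();
        // Both methods walk the same sequence, so the sums agree.
        Console.WriteLine(SumExpanded(list) == SumForeach(list));
    }
}
```

The point of the sketch is only that the try/finally and the three interface calls in the IL come straight from this pattern.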

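There is one more oddity. Look closely at this piece of the IL: [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator(). GetEnumerator is qualified by the interface IEnumerable`1 rather than by a concrete iterator class. The IL names the interface because the static type of list is IEnumerable<int>, and the concrete implementation is only chosen at run time. Since the object is really an array, one might expect the call to end up in Array's GetEnumerator method, shown below.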

    [Serializable]
    [ComVisible(true)]
    [__DynamicallyInvokable]
    public abstract class Array : ICloneable, IList, ICollection, IEnumerable, IStructuralComparable, IStructuralEquatable
    {
        [__DynamicallyInvokable]
        public IEnumerator GetEnumerator()
        {
            int lowerBound = GetLowerBound(0);
            if (Rank == 1 && lowerBound == 0)
            {
                return new SZArrayEnumerator(this);
            }
            return new ArrayEnumerator(this, lowerBound, Length);
        }
    }

<2> Looking for the problem in windbg

The second issue found in the IL made me especially curious 😄, so let's go to the managed heap and see which concrete class actually serves the GetEnumerator() call.

Use !clrstack -l followed by !do xx to grab the locals of Main from the thread stack and dump the enumerator object.

    0:000> !clrstack -l
    000000229e3feda0 00007ff889e40951 *** WARNING: Unable to verify checksum for ConsoleApp2.exe
    ConsoleApp2.Program.Main(System.String[]) [C:\dream\Csharp\ConsoleApp1\ConsoleApp2\Program.cs @ 32]
        LOCALS:
            0x000000229e3fede8 = 0x0000019bf33b9a88
            0x000000229e3fede0 = 0x0000019be33b2d90
            0x000000229e3fedfc = 0x00000000004c4b40

    0:000> !do 0x0000019be33b2d90
    Name:        System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]]
    MethodTable: 00007ff8e8d36d18
    EEClass:     00007ff8e7cf5640
    Size:        32(0x20) bytes
    File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value Name
    00007ff8e7a98538  4002ffe        8       System.Int32[]  0 instance 0000019bf33b9a88 _array
    00007ff8e7a985a0  4002fff       10         System.Int32  1 instance          5000000 _index
    00007ff8e7a985a0  4003000       14         System.Int32  1 instance          5000000 _endIndex
    00007ff8e8d36d18  4003001        0 ...Int32, mscorlib]]  0   shared           static Empty
                                     >> Domain:Value dynamic statics NYI 0000019be1893a80:NotInit <<

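So there really is such a type: System.SZArrayHelper+SZGenericArrayEnumerator. It turns out the runtime is playing a trick here: for an array accessed through IEnumerable<int>, the CLR routes GetEnumerator to SZArrayHelper, which hands back an SZGenericArrayEnumerator. Next, let's print its method table and see which methods it contains.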

    0:000> !dumpmt -md 00007ff8e8d36d18
    EEClass:         00007ff8e7cf5640
    Module:          00007ff8e7a71000
    Name:            System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]]
    mdToken:         0000000002000a98
    File:            C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
    BaseSize:        0x20
    ComponentSize:   0x0
    Slots in VTable: 11
    Number of IFaces in IFaceMap: 3
    --------------------------------------
    MethodDesc Table
               Entry       MethodDesc    JIT Name
    00007ff8e7ff2450 00007ff8e7a78de8 PreJIT System.Object.ToString()
    00007ff8e800cc60 00007ff8e7c3b9b0 PreJIT System.Object.Equals(System.Object)
    00007ff8e7ff2090 00007ff8e7c3b9d8 PreJIT System.Object.GetHashCode()
    00007ff8e7fef420 00007ff8e7c3b9e0 PreJIT System.Object.Finalize()
    00007ff8e8b99fd0 00007ff8e7ebf388 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]].MoveNext()
    00007ff8e8b99f90 00007ff8e7ebf390 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]].get_Current()
    00007ff8e8b99f60 00007ff8e7ebf398 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]].System.Collections.IEnumerator.get_Current()
    00007ff8e8b99f50 00007ff8e7ebf3a0 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]].System.Collections.IEnumerator.Reset()
    00007ff8e8b99f40 00007ff8e7ebf3a8 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]].Dispose()
    00007ff8e8b99ef0 00007ff8e7ebf3b0 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]]..cctor()
    00007ff8e8b99ff0 00007ff8e7ebf380 PreJIT System.SZArrayHelper+SZGenericArrayEnumerator`1[[System.Int32, mscorlib]]..ctor(Int32[], Int32)

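As you can see, this is a standard iterator class with MoveNext, get_Current, Reset and Dispose, so every iteration goes through an extra heap-allocated object, and performance is dragged down once more.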
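The same observation can be made without a debugger by asking the enumerator for its runtime type. The sketch below is only an illustration: EnumeratorTypeProbe and its two methods are hypothetical names, and the exact type names printed depend on the runtime (the dump above is from .NET Framework 4.x, and other runtimes may name or organize these internal types differently), so no expected output is shown.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class EnumeratorTypeProbe
{
    // Runtime type of the enumerator obtained through the generic interface.
    public static Type GenericEnumeratorType(int[] arr)
    {
        using (IEnumerator<int> e = ((IEnumerable<int>)arr).GetEnumerator())
        {
            return e.GetType();
        }
    }

    // Runtime type of the enumerator obtained through the non-generic
    // interface, which is served by Array.GetEnumerator().
    public static Type NonGenericEnumeratorType(int[] arr)
    {
        IEnumerator e = ((IEnumerable)arr).GetEnumerator();
        return e.GetType();
    }

    public static void Main()
    {
        var arr = new[] { 1, 2, 3 };
        // The printed names are runtime-specific internal types.
        Console.WriteLine(GenericEnumeratorType(arr).FullName);
        Console.WriteLine(NonGenericEnumeratorType(arr).FullName);
    }
}
```

On the author's setup the first line would be expected to name the SZGenericArrayEnumerator type seen in the dump, but treat that as something to verify on your own runtime rather than a guarantee.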

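II: Optimizing performance

Putting the analysis above together, the problem seems to come from two things: foreach and IEnumerable<int>.

1. Replace IEnumerable with int[], and foreach with for

Knowing these two points, change the code as follows: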

    public static void Main(string[] args)
    {
        var list = GetHasEmailCustomerIDList();
        for (int i = 0; i < list.Length; i++) { }
        Console.ReadLine();
    }

    public static int[] GetHasEmailCustomerIDList()
    {
        return Enumerable.Range(1, 5000000).ToArray();
    }

The resulting IL:

    .method public hidebysig static
        void Main (
            string[] args
        ) cil managed
    {
        // (no C# code)
        IL_0000: nop
        // int[] hasEmailCustomerIDList = GetHasEmailCustomerIDList();
        IL_0001: call int32[] ConsoleApp2.Program::GetHasEmailCustomerIDList()
        IL_0006: stloc.0
        // for (int i = 0; i < hasEmailCustomerIDList.Length; i++)
        IL_0007: ldc.i4.0
        IL_0008: stloc.1
        // (no C# code)
        IL_0009: br.s IL_0011
        // loop start (head: IL_0011)
            IL_000b: nop
            IL_000c: nop
            // for (int i = 0; i < hasEmailCustomerIDList.Length; i++)
            IL_000d: ldloc.1
            IL_000e: ldc.i4.1
            IL_000f: add
            IL_0010: stloc.1
            // for (int i = 0; i < hasEmailCustomerIDList.Length; i++)
            IL_0011: ldloc.1
            IL_0012: ldloc.0
            IL_0013: ldlen
            IL_0014: conv.i4
            IL_0015: clt
            IL_0017: stloc.2
            IL_0018: ldloc.2
            // (no C# code)
            IL_0019: brtrue.s IL_000b
        // end loop
        // Console.ReadLine();
        IL_001b: call string [mscorlib]System.Console::ReadLine()
        // (no C# code)
        IL_0020: pop
        // }
        IL_0021: ret
    } // end of method Program::Main

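The IL above consists entirely of very basic instructions (ldloc, add, ldlen, clt, branches), most of which map almost directly onto CPU instructions. No method calls, no try/finally. Very clean, I love it.

One thing to note: I later found that foreach does not need to be changed to for at all. When the static type is an array, the C# compiler performs that conversion for us under the hood, so foreach is quite smart about arrays and knows how to optimize the loop. The code becomes: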

    public static void Main(string[] args)
    {
        var list = GetHasEmailCustomerIDList();
        //for (int i = 0; i < list.Length; i++) { }
        foreach (var item in list) { }
        Console.ReadLine();
    }

The resulting IL:

    .method public hidebysig static
        void Main (
            string[] args
        ) cil managed
    {
        // (no C# code)
        IL_0000: nop
        // int[] hasEmailCustomerIDList = GetHasEmailCustomerIDList();
        IL_0001: call int32[] ConsoleApp2.Program::GetHasEmailCustomerIDList()
        IL_0006: stloc.0
        // (no C# code)
        IL_0007: nop
        // int[] array = hasEmailCustomerIDList;
        IL_0008: ldloc.0
        IL_0009: stloc.1
        // for (int i = 0; i < array.Length; i++)
        IL_000a: ldc.i4.0
        IL_000b: stloc.2
        // (no C# code)
        IL_000c: br.s IL_0018
        // loop start (head: IL_0018)
            // int num = array[i];
            IL_000e: ldloc.1
            IL_000f: ldloc.2
            IL_0010: ldelem.i4
            // (no C# code)
            IL_0011: stloc.3
            IL_0012: nop
            IL_0013: nop
            // for (int i = 0; i < array.Length; i++)
            IL_0014: ldloc.2
            IL_0015: ldc.i4.1
            IL_0016: add
            IL_0017: stloc.2
            // for (int i = 0; i < array.Length; i++)
            IL_0018: ldloc.2
            IL_0019: ldloc.1
            IL_001a: ldlen
            IL_001b: conv.i4
            IL_001c: blt.s IL_000e
        // end loop
        // Console.ReadLine();
        IL_001e: call string [mscorlib]System.Console::ReadLine()
        // (no C# code)
        IL_0023: pop
        // }
        IL_0024: ret
    } // end of method Program::Main

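2. Benchmark

So much for the micro-level analysis. Now for a macro-level test of how far apart the two approaches really are. Each approach is timed 10 times over the same 5,000,000-element array.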

    public static void Main(string[] args)
    {
        // Case 1: iterate the array through its concrete type int[].
        var arr = GetHasEmailCustomerIDArray();
        for (int i = 0; i < 10; i++)
        {
            var watch = Stopwatch.StartNew();
            foreach (var item in arr) { }
            watch.Stop();
            Console.WriteLine($"i={i}, time: {watch.ElapsedMilliseconds}");
        }

        Console.WriteLine("---------------");

        // Case 2: iterate the same array through IEnumerable<int>.
        var list = arr as IEnumerable<int>;
        for (int i = 0; i < 10; i++)
        {
            var watch = Stopwatch.StartNew();
            foreach (var item in list) { }
            watch.Stop();
            Console.WriteLine($"i={i}, time: {watch.ElapsedMilliseconds}");
        }
        Console.ReadLine();
    }

    public static int[] GetHasEmailCustomerIDArray()
    {
        return Enumerable.Range(1, 5000000).ToArray();
    }

Output (times in milliseconds):

    i=0, time: 10
    i=1, time: 10
    i=2, time: 10
    i=3, time: 9
    i=4, time: 9
    i=5, time: 9
    i=6, time: 10
    i=7, time: 10
    i=8, time: 12
    i=9, time: 12
    ---------------
    i=0, time: 45
    i=1, time: 37
    i=2, time: 35
    i=3, time: 35
    i=4, time: 37
    i=5, time: 35
    i=6, time: 36
    i=7, time: 37
    i=8, time: 35
    i=9, time: 36

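Hard to believe, but the gap is 3 to 4 times: roughly 10 ms for the array against roughly 36 ms for the interface. That is the performance price paid for flexibility 😄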
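If a method has to keep an IEnumerable<int> parameter for flexibility, one possible compromise is to test for the concrete array type and take the fast path when it applies. The following is a sketch under that assumption, not something taken from the project discussed here: FastPathSum and Sum are hypothetical names, and summing is used only to give the loop a body.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FastPathSum
{
    public static long Sum(IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        long sum = 0;

        // Fast path: when the caller actually passed an int[], iterate it
        // through its concrete type so the compiler emits an index-based loop.
        var arr = source as int[];
        if (arr != null)
        {
            foreach (int item in arr) sum += item;
            return sum;
        }

        // General path: any other sequence goes through the enumerator.
        foreach (int item in source) sum += item;
        return sum;
    }

    public static void Main()
    {
        int[] array = Enumerable.Range(1, 100).ToArray();
        List<int> list = array.ToList();

        Console.WriteLine(Sum(array)); // takes the array fast path
        Console.WriteLine(Sum(list));  // takes the general path
    }
}
```

The signature stays interface-based, so callers are unaffected, while arrays avoid the per-element interface calls. Whether the extra type check is worth it depends on how hot the loop is, so measure before adopting it.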

That's all for this post. I hope it helps.

