A long while ago, I posted a detailed answer on what functions the default comparers actually were calling to get a feel for if they would apply or not answering delphi – What does the default TArray.Sort comparator actually do and when would you use it? – Stack Overflow.
I needed that information recently because of some sorting issues I bumped into (sorting generic records), so finally a blog post.
First some links to documentation for even more background information:
- System.Generics.Collections:
– TArray.Sort method. - System.Generics.Defaults:
– IComparer interface with the Compare method.
– IEqualityComparer with the Equals and GetHashCode methods.
– _LookupVtableInfo.
– BinaryCompare.
– BobJenkinsHash.
– TOrdinalIStringComparer. - Generics Defaults TComparer (Delphi) – RAD Studio Code Examples.
There is the answer I gave:
David did a great job of textually describing how the default comparers work, but for some of you it might be easier to follow when you see how the underlying code is structured (and decide if the default comparers do apply).
I will cover only the
Compare_
style of comparisons. TheEquals_
style works in a similar way.What happens is that
_LookupVtableInfo
selects anIComparer
interface forCompare_
style comparisons (and aIEqualityComparer
forEquals_
style).Underneath those interfaces are not ordinary interfaces, but interface wrappers around global functions of this form for
Compare_
style:function Compare_t<T>(Inst: Pointer; const Left, Right: T): Integer;
and global procedures of this form for
Equals_
style:function Equals_t<T>(Inst: Pointer; const Left, Right: T): Integer; function GetHashCode_t<T>(Inst: Pointer; const Left, Right: T): Integer;
The outcome of
Compare_
style functions is straightforward, but slightly different from -1, 0, +1 that some people might expect:< 0 for Left < Right = 0 for Left = Right > 0 for Left > Right
For the majority of cases, the implementation is very simple:
I have grouped the
Compare_
style functions by how they do this.
- Ordinal types (including enumerators and Int64).
- Floating point (Real) types (including Comp and Currency).
- Short strings (from the Turbo Pascal / Delphi 1 days).
- Wide strings (the OLE style ones).
- Methods.
- Pointers (including Classes, Interfaces, Class References and Procedures).
(Ordinal types outside the range of 1, 2, 4, 8 bytes, and real types outside the range of 4, 8, 10 bytes raise an error as they are illegal).
The first group just subtracts Left from Right: signed/unsigned integers of 1 or 2 bytes length
function Compare_I1(Inst: Pointer; const Left, Right: Shortint): Integer; function Compare_I2(Inst: Pointer; const Left, Right: Smallint): Integer; function Compare_U1(Inst: Pointer; const Left, Right: Byte): Integer; function Compare_U2(Inst: Pointer; const Left, Right: Word): Integer; Result := Left - Right;
The second group does a comparison:
function Compare_I4(Inst: Pointer; const Left, Right: Integer): Integer; function Compare_I8(Inst: Pointer; const Left, Right: Int64): Integer; function Compare_U4(Inst: Pointer; const Left, Right: LongWord): Integer; function Compare_U8(Inst: Pointer; const Left, Right: UInt64): Integer; function Compare_R4(Inst: Pointer; const Left, Right: Single): Integer; function Compare_R8(Inst: Pointer; const Left, Right: Double): Integer; function Compare_R10(Inst: Pointer; const Left, Right: Extended): Integer; function Compare_RI8(Inst: Pointer; const Left, Right: Comp): Integer; function Compare_RC8(Inst: Pointer; const Left, Right: Currency): Integer; function Compare_WString(Inst: PSimpleInstance; const Left, Right: WideString): Integer; function Compare_Pointer(Inst: PSimpleInstance; Left, Right: NativeUInt): Integer; type {$IFNDEF NEXTGEN} TPS1 = string[1]; TPS2 = string[2]; TPS3 = string[3]; {$ELSE NEXTGEN} OpenString = type string; TPS1 = string; TPS2 = string; TPS3 = string; {$ENDIF !NEXTGEN} function Compare_PS1(Inst: PSimpleInstance; const Left, Right: TPS1): Integer; function Compare_PS2(Inst: PSimpleInstance; const Left, Right: TPS2): Integer; function Compare_PS3(Inst: PSimpleInstance; const Left, Right: TPS3): Integer; // OpenString allows for any String[n], see http://my.safaribooksonline.com/book/programming/borland-delphi/1565926595/5dot-language-reference/ch05-openstring function Compare_PSn(Inst: PSimpleInstance; const Left, Right: OpenString): Integer; if Left < Right then Result := -1 else if Left > Right then Result := 1 else Result := 0; function Compare_Method(Inst: PSimpleInstance; const Left, Right: TMethodPointer): Integer; var LMethod, RMethod: TMethod; begin LMethod := TMethod(Left); RMethod := TMethod(Right); if LMethod < RMethod then Result := -1 else if LMethod > RMethod then Result := 1 else Result := 0; end;
Now we get to the interesting bits: the not-so-straightforward outcomes.
Strings use
CompareStr
. If you want something different, you can useTOrdinalIStringComparer
function Compare_LString(Inst: PSimpleInstance; const Left, Right: AnsiString): Integer; function Compare_UString(Inst: PSimpleInstance; const Left, Right: UnicodeString): Integer; Result := CompareStr(Left, Right);
BinaryCompare
is used for:
- binary data including unknown, Char/WChar, Set, Array, Record. Exception if the binary data is 1, 2 or 4 bytes size in x86 and x64 and 8 bytes in x64, it will be compared as integers.
- dynamic carrays (be careful when they are multi-dimensional!).
- variants as a last resort (see further below)
For records that can be compared, it makes sense to perform operator overloading, and have the comparer use those operators.
Binary data of 1, 2, 4 or 8 bytes is an exception, which will give strange results on little-endian machines (Intel x86 and x64, and bi-endian Arm in little-endian mode):
function Comparer_Selector_Binary(info: PTypeInfo; size: Integer): Pointer; begin case size of // NOTE: Little-endianness may cause counterintuitive results, // but the results will at least be consistent. 1: Result := @Comparer_Instance_U1; 2: Result := @Comparer_Instance_U2; 4: Result := @Comparer_Instance_U4; {$IFDEF CPUX64} // 64-bit will pass const args in registers 8: Result := @Comparer_Instance_U8; {$ENDIF} else Result := MakeInstance(@Comparer_Vtable_Binary, size); end; end;
The rest is pure binary:
function Compare_Binary(Inst: PSimpleInstance; const Left, Right): Integer; begin Result := BinaryCompare(@Left, @Right, Inst^.Size); end; function Compare_DynArray(Inst: PSimpleInstance; Left, Right: Pointer): NativeInt; var len, lenDiff: NativeInt; begin len := DynLen(Left); lenDiff := len - DynLen(Right); if lenDiff < 0 then Inc(len, lenDiff); Result := BinaryCompare(Left, Right, Inst^.Size * len); if Result = 0 then Result := lenDiff; end;
As usual,
Variants
are in a league of their own. FirstVarCompareValue
is tried. If that fails, thenCompare_UString
is tried. If that fails too,BinaryCompare
is tried. If that fails: tough luck.function Compare_Variant(Inst: PSimpleInstance; Left, Right: Pointer): Integer; var l, r: Variant; lAsString, rAsString: string; begin Result := 0; // Avoid warning. l := PVariant(Left)^; r := PVariant(Right)^; try case VarCompareValue(l, r) of vrEqual: Exit(0); vrLessThan: Exit(-1); vrGreaterThan: Exit(1); vrNotEqual: begin if VarIsEmpty(L) or VarIsNull(L) then Exit(1) else Exit(-1); end; end; except // if comparison failed with exception, compare as string. try lAsString := PVariant(Left)^; rAsString := PVariant(Right)^; Result := Compare_UString(nil, lAsString, rAsString); except // if comparison fails again, compare bytes. Result := BinaryCompare(Left, Right, SizeOf(Variant)); end; end; end;
–jeroen
Filed under: Ansi, Delphi, Delphi 2009, Delphi 2010, Delphi XE, Delphi XE2, Delphi XE3, Delphi XE4, Delphi XE5, Development, Encoding, Software Development, Unicode