Quantcast
Channel: Planet Object Pascal
Viewing all articles
Browse latest Browse all 1725

The Wiert Corner - irregular stream of stuff: jpluimers

$
0
0

A long while ago, I posted a detailed answer on what functions the default comparers actually were calling to get a feel for if they would apply or not answering delphi – What does the default TArray.Sort comparator actually do and when would you use it? – Stack Overflow.

I needed that information recently because of some sorting issues I bumped into (sorting generic records), so finally a blog post.

First some links to documentation for even more background information:

There is the answer I gave:

David did a great job of textually describing how the default comparers work, but for some of you it might be easier to follow when you see how the underlying code is structured (and decide if the default comparers do apply).

I will cover only the Compare_ style of comparisons. The Equals_ style works in a similar way.

What happens is that _LookupVtableInfo selects an IComparer interface for Compare_ style comparisons (and a IEqualityComparer for Equals_ style).

Underneath those interfaces are not ordinary interfaces, but interface wrappers around global functions of this form for Compare_ style:

function Compare_t<T>(Inst: Pointer; const Left, Right: T): Integer;

and global procedures of this form for Equals_ style:

function Equals_t<T>(Inst: Pointer; const Left, Right: T): Integer;
function GetHashCode_t<T>(Inst: Pointer; const Left, Right: T): Integer;

The outcome of Compare_ style functions is straightforward, but slightly different from -1, 0, +1 that some people might expect:

< 0 for Left < Right
= 0 for Left = Right
> 0 for Left > Right

For the majority of cases, the implementation is very simple:

I have grouped the Compare_ style functions by how they do this.

  • Ordinal types (including enumerators and Int64).
  • Floating point (Real) types (including Comp and Currency).
  • Short strings (from the Turbo Pascal / Delphi 1 days).
  • Wide strings (the OLE style ones).
  • Methods.
  • Pointers (including Classes, Interfaces, Class References and Procedures).

(Ordinal types outside the range of 1, 2, 4, 8 bytes, and real types outside the range of 4, 8, 10 bytes raise an error as they are illegal).

The first group just subtracts Left from Right: signed/unsigned integers of 1 or 2 bytes length

function Compare_I1(Inst: Pointer; const Left, Right: Shortint): Integer;
function Compare_I2(Inst: Pointer; const Left, Right: Smallint): Integer;
function Compare_U1(Inst: Pointer; const Left, Right: Byte): Integer;
function Compare_U2(Inst: Pointer; const Left, Right: Word): Integer;

  Result := Left - Right;

The second group does a comparison:

function Compare_I4(Inst: Pointer; const Left, Right: Integer): Integer;
function Compare_I8(Inst: Pointer; const Left, Right: Int64): Integer;
function Compare_U4(Inst: Pointer; const Left, Right: LongWord): Integer;
function Compare_U8(Inst: Pointer; const Left, Right: UInt64): Integer;
function Compare_R4(Inst: Pointer; const Left, Right: Single): Integer;
function Compare_R8(Inst: Pointer; const Left, Right: Double): Integer;
function Compare_R10(Inst: Pointer; const Left, Right: Extended): Integer;
function Compare_RI8(Inst: Pointer; const Left, Right: Comp): Integer;
function Compare_RC8(Inst: Pointer; const Left, Right: Currency): Integer;
function Compare_WString(Inst: PSimpleInstance; const Left, Right: WideString): Integer;
function Compare_Pointer(Inst: PSimpleInstance; Left, Right: NativeUInt): Integer;

type
{$IFNDEF NEXTGEN}
  TPS1 = string[1];
  TPS2 = string[2];
  TPS3 = string[3];
{$ELSE NEXTGEN}
  OpenString = type string;
  TPS1 = string;
  TPS2 = string;
  TPS3 = string;
{$ENDIF !NEXTGEN}

function Compare_PS1(Inst: PSimpleInstance; const Left, Right: TPS1): Integer;
function Compare_PS2(Inst: PSimpleInstance; const Left, Right: TPS2): Integer;
function Compare_PS3(Inst: PSimpleInstance; const Left, Right: TPS3): Integer;
// OpenString allows for any String[n], see http://my.safaribooksonline.com/book/programming/borland-delphi/1565926595/5dot-language-reference/ch05-openstring
function Compare_PSn(Inst: PSimpleInstance; const Left, Right: OpenString): Integer;

  if Left < Right then
    Result := -1
  else if Left > Right then
    Result := 1
  else
    Result := 0;

function Compare_Method(Inst: PSimpleInstance; const Left, Right: TMethodPointer): Integer;
var
  LMethod, RMethod: TMethod;
begin
  LMethod := TMethod(Left);
  RMethod := TMethod(Right);
  if LMethod < RMethod then
    Result := -1
  else if LMethod > RMethod then
    Result := 1
  else
    Result := 0;
end;

Now we get to the interesting bits: the not-so-straightforward outcomes.

Strings use CompareStr. If you want something different, you can use TOrdinalIStringComparer

function Compare_LString(Inst: PSimpleInstance; const Left, Right: AnsiString): Integer;
function Compare_UString(Inst: PSimpleInstance; const Left, Right: UnicodeString): Integer;

  Result := CompareStr(Left, Right);

BinaryCompare is used for:

  • binary data including unknown, Char/WChar, Set, Array, Record. Exception if the binary data is 1, 2 or 4 bytes size in x86 and x64 and 8 bytes in x64, it will be compared as integers.
  • dynamic carrays (be careful when they are multi-dimensional!).
  • variants as a last resort (see further below)

For records that can be compared, it makes sense to perform operator overloading, and have the comparer use those operators.

Binary data of 1, 2, 4 or 8 bytes is an exception, which will give strange results on little-endian machines (Intel x86 and x64, and bi-endian Arm in little-endian mode):

function Comparer_Selector_Binary(info: PTypeInfo; size: Integer): Pointer;
begin
  case size of
    // NOTE: Little-endianness may cause counterintuitive results,
    // but the results will at least be consistent.
    1: Result := @Comparer_Instance_U1;
    2: Result := @Comparer_Instance_U2;
    4: Result := @Comparer_Instance_U4;
    {$IFDEF CPUX64}
    // 64-bit will pass const args in registers
    8: Result := @Comparer_Instance_U8;
    {$ENDIF}
  else
    Result := MakeInstance(@Comparer_Vtable_Binary, size);
  end;
end;

The rest is pure binary:

function Compare_Binary(Inst: PSimpleInstance; const Left, Right): Integer;
begin
  Result := BinaryCompare(@Left, @Right, Inst^.Size);
end;

function Compare_DynArray(Inst: PSimpleInstance; Left, Right: Pointer): NativeInt;
var
  len, lenDiff: NativeInt;
begin
  len := DynLen(Left);
  lenDiff := len - DynLen(Right);
  if lenDiff < 0 then
    Inc(len, lenDiff);
  Result := BinaryCompare(Left, Right, Inst^.Size * len);
  if Result = 0 then
    Result := lenDiff;
end;

As usual, Variants are in a league of their own. First VarCompareValue is tried. If that fails, then Compare_UString is tried. If that fails too, BinaryCompare is tried. If that fails: tough luck.

function Compare_Variant(Inst: PSimpleInstance; Left, Right: Pointer): Integer;
var
  l, r: Variant;
  lAsString, rAsString: string;
begin
  Result := 0; // Avoid warning.
  l := PVariant(Left)^;
  r := PVariant(Right)^;
  try
    case VarCompareValue(l, r) of
      vrEqual:        Exit(0);
      vrLessThan:     Exit(-1);
      vrGreaterThan:  Exit(1);
      vrNotEqual:
      begin
        if VarIsEmpty(L) or VarIsNull(L) then
          Exit(1)
        else
          Exit(-1);
      end;
    end;
  except // if comparison failed with exception, compare as string.
    try
      lAsString := PVariant(Left)^;
      rAsString := PVariant(Right)^;
      Result := Compare_UString(nil, lAsString, rAsString);
    except  // if comparison fails again, compare bytes.
      Result := BinaryCompare(Left, Right, SizeOf(Variant));
    end;
  end;
end;

–jeroen


Filed under: Ansi, Delphi, Delphi 2009, Delphi 2010, Delphi XE, Delphi XE2, Delphi XE3, Delphi XE4, Delphi XE5, Development, Encoding, Software Development, Unicode

Viewing all articles
Browse latest Browse all 1725

Trending Articles