Firebird News: New implementation for wire protocol in Jaybird and all drivers need a faster wire protocol
From Zero To One: TValue and other “Variants” like implementation tests revised
Recently I got an inquiry about the speed of my TAnyValue implementation. For the record, TAnyValue is a TValue-like implementation using records and implicit operators. It lacks features compared to TValue, but it is faster and it works on older Delphi versions (Delphi 2005 and up). I also like to tinker with things like this, so it was fun to play around with it. I was inspired by TOmniValue, which is part of the OmniThreadLibrary. My goal was to eventually make it the fastest “variant” type implementation out there.
Back in 2010 I did the initial tests with my initial version of TAnyValue. You can find the results here. Back then TAnyValue was roughly on par with TOmniValue. It was quite a bit slower than Variants and a simple variable record (which is the fastest it can be and is a good comparison of how fast you really are). TValue was catastrophic back then, being by far the slowest solution because of the generics it used internally. I then improved my implementation over time, staying silent about it. But I came a long way and today my solution is the fastest out there. It must be said that all solutions today are fast; the differences only matter if you really use a lot of these values (assignments) per second and you absolutely need the speed.
When doing such an implementation, the tricky part is to get a good balance between memory consumption and speed. The simplest and most crude solution would be to store every possible data type you want to handle in the record as an internal variable (field). This way you don’t have to explicitly assign memory or copy bytes. The compiler does it all and this is the fastest possible way. But it is also terrible in regards to memory consumption. I will make the case with my TAnyValue and the types it supports at this moment. If I had the simplest and fastest solution I would cover these types (I assume a 32 bit compiler here; a sketch of such a record follows the list):
- Int64 (64)
- Integer (32)
- Extended (80)
- WideString (variable)
- AnsiString (variable)
- String (variable)
- Boolean (8)
- TObject (32)
- Pointer (32)
- Cardinal (32)
- TDateTime (80)
- IInterface (32)
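For illustration, a minimal sketch of that naive record (my own illustration, not actual TAnyValue code; TValueType stands for the type enumeration used throughout):

type
  TNaiveValue = record
    ValueType: TValueType;
    // every supported type gets its own dedicated field
    VInt64: Int64;
    VInteger: Integer;
    VExtended: Extended;
    VWideString: WideString;
    VAnsiString: AnsiString;
    VString: string;
    VBoolean: Boolean;
    VObject: TObject;
    VPointer: Pointer;
    VCardinal: Cardinal;
    VDateTime: TDateTime;
    VInterface: IInterface;
  end;

Every field is always present, so the compiler handles all memory management and copying, but every instance pays for all of the fields at once.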
Ok, this is a really dumb approach, but I want an upper memory consumption limit. An application with 10000000 such records, each holding one integer, consumed a whopping 706.244 KB of memory. A lot! On the other side of the spectrum you have two different solutions. You could use Variants for the inner data storage, but I really wanted to avoid them, because then what would be the point of using TAnyValue at all? Another very sleek approach is what TOmniValue has done. It uses one Int64 field for most of the data types, with very smart assignments, and one IInterface field for Extended and string types; basically for all types that need finalization, as you cannot have a destructor in a record. The approach TOmniValue uses is good, but if you work with strings and floating points a lot, it will be slow, as interfaces are notoriously slow. I wanted something fast and still not too hard on memory consumption. So I came to this:
TSimpleData = record
  case Byte of
    atInteger: (VInteger: Integer);
    atCardinal: (VCardinal: Cardinal);
    atBoolean: (VBoolean: Boolean);
    atObject: (VObject: TObject);
    atPointer: (VPointer: Pointer);
    atClass: (VClass: TClass);
    atWideChar: (VWideChar: WideChar);
    atChar: (VChar: AnsiChar);
  {$IFDEF AnyValue_UseLargeNumbers}
    atInt64: (VInt64: Int64);
    atExtended: (VExtended: Extended);
  {$ENDIF}
end;

TAnyValue = packed record
private
  FValueType: TValueType;
{$IFNDEF AnyValue_NoInterfaces}
  FInterface: IInterface;
{$ENDIF}
  FSimpleData: TSimpleData;
  FComplexData: array of Byte;
  ...
end;
I use three fields. I use IInterface only for interfaces, hence the conditional define, so you can easily turn them off if you don’t need them and save memory. The trick here is in the variable record. The good side of this record is that it only takes the amount of memory that the largest field does. In my case this is 32 bit, unless “AnyValue_UseLargeNumbers” is defined, in which case it is 80 bit. This way I cut down on size dramatically, by 48 bit per record. Finally, there is a dynamic array of bytes, for strings and floating point values if “AnyValue_UseLargeNumbers” is not defined. So let's look at the memory consumption compared to others (again holding 32 bit integers and 10000000 records):
TAnyValue (no defines): 128.960 KB
TAnyValue (AnyValue_NoInterfaces): 89.812 KB
TAnyValue (AnyValue_UseLargeNumbers): 246.376 KB
TAnyValue (AnyValue_NoInterfaces and AnyValue_UseLargeNumbers): 207.228 KB
TOmniValue: 128.960 KB
TValue: 246.376 KB (wow, not only is TValue slow, it also takes a lot of memory)
Variants: 158.308 KB
The only problem when not using “AnyValue_UseLargeNumbers” is that if you then actually use floating point types, you will consume more memory than you would with that directive defined. And you will be a little bit slower. You can still use floating points, but they should not be in the majority. So you can tweak TAnyValue to step up to the task at hand. I should also add that the numbers would naturally be different if another data type were used. But for the overall picture a 32 bit Integer is just fine.
Now let's look at the speed results. The test application is the same as it was last time.
Type | Variants | TValue | TAnyValue | TOmniValue | TVariableRec |
j := I | 178 | 3431 | 62 | 108 | 68 |
j := I/5 | 187 | 3968 | 230 | 4054 | 199 |
j := IntToStr(I) | 5491 | 10593 | 4342 | 6632 | 2728 |
ALL | 4479 | 19541 | 4158 | 10966 | 3140 |
As you can see TAnyValue gained a lot of speed; everything else is the same as it was back then. I also did the full test on XE3 to see if TValue improved in any way. The results are below.
Type | Variants | TValue | TAnyValue | TOmniValue | TVariableRec |
ALL | 3846 | 6497 | 3318 | 10613 | 2062 |
It is clear that they worked on TValue, which is now fast enough for general use. In fact it is very fast. Together with its flexibility, it is a powerful tool.
Probably a lot of you wonder why I even bother. I bother because I can. I like to tweak the code and see if I can make it even better. So if any of you have ideas on how to make it even smaller in regards to memory consumption and retain the speed throne, please let me know.
P.S.
If you are looking for TAnyValue you can find it as a part of Cromis Library in the downloads section.
From Zero To One: TValue and other “Variants” like implementation tests – finale
I had no intentions to write any additional posts about TAnyValue. I thought it was more than enough and in my mind the subject was closed. But then Stefan Glienke jumped in with great comments and tests of his own. And as I love such discussions and challenges, I had to reopen the case. It seemed that under XE, XE2 and XE3 TValue was a lot faster. I already mentioned that. But now it seemed it was actually faster than TAnyValue for integer and float types. Something was not right, and with Stefan's help I upgraded and tweaked the unit until I got something I feel is truly fast now and uses little memory out of the box. The comments and previous post can be found here. What we did was:
- Conditional defines were changed. Now out of the box TAnyValue uses very little memory (less than Variants for instance and the same as TOmniValue), way less than TValue. It trades a little bit of speed for that, but very little indeed as you will see in the tests. You can enable a speed boost by defining “AnyValue_UseLargeNumbers”, which uses more memory but is faster. It is advisable to enable this if you mainly use Int64, TDateTime or Extended types. You can disable interfaces as before and so save even more memory.
- Stefan proposed removing inline from implicit class operators
- Instead of using Move for copying Extended and Int64 values to the byte array, Stefan proposed the following solution
procedure TAnyValue.SetAsFloat(const Value: Extended);
begin
  FValueType := avtFloat;
{$IFDEF AnyValue_UseLargeNumbers}
  FSimpleData.VExtended := Value;
{$ELSE}
  if Length(FComplexData) <> SizeOf(Extended) then
    SetLength(FComplexData, SizeOf(Extended));
  PExtended(@FComplexData[0])^ := Value;
{$ENDIF}
end;

function TAnyValue.GetAsFloat: Extended;
begin
  case FValueType of
    avtInt64: Result := GetAsInt64;
    avtInteger: Result := GetAsInteger;
    avtCardinal: Result := GetAsCardinal;
    avtBoolean: Result := Integer(GetAsBoolean);
    avtString: Result := StrToFloat(GetAsString);
    avtAnsiString: Result := StrToFloat(string(GetAsAnsiString));
    avtWideString: Result := StrToFloat(string(GetAsWideString));
    avtFloat:
      begin
      {$IFDEF AnyValue_UseLargeNumbers}
        Result := FSimpleData.VExtended;
      {$ELSE}
        Result := PExtended(@FComplexData[0])^;
      {$ENDIF}
      end
    else
      raise Exception.Create('Value cannot be converted to Extended');
  end;
end;
Doing all that, the speed was still considerably slower. Stefan claimed he got down to 200ms on the Extended tests, while I could not drop below 700ms. Something smelled fishy there, so I asked him to send me the exact unit he used. When I got the unit I compared it with mine via KDiff3 (great tool by the way). I noticed he removed inline only from the following class operator
class operator Implicit(const Value: Extended): TAnyValue;
but left it on this one
class operator Implicit(const Value: TAnyValue): Extended; inline;
I did the same and sure enough the time dropped down to 200ms. I have no idea why there is such a difference with or without inline at that particular spot. Maybe someone more at home with assembler and the inline mechanism can cast some light onto the matter. When I saw the impact I removed other inline directives and gained a lot for strings also. So without further delay, here are the complete tests for 2010 and XE3. It is also worth mentioning that Stefan made a wrapper for the 2010 TValue implementation to bring it on par with XE3 speed. Maybe he will share it for those who have to use 2010 and TValue.
Delphi 2010 test (times are in ms for 10000000 operations):
Type | Variants | TValue | TAnyValue | TOmniValue | TVariableRec |
j := I | 157 | 3176 | 69 | 82 | 65 |
j := I/5 | 184 | 3308 | 214 | 3410 | 190 |
j := IntToStr(I) | 4562 | 10610 | 3857 | 6856 | 2917 |
ALL | 5471 | 18915 | 5025 | 10640 | 3167 |
Delphi XE3 test (times are in ms for 10000000 operations):
Type | Variants | TValue | TAnyValue | TOmniValue | TVariableRec |
j := I | 166 | 176 | 81 | 412 | 62 |
j := I/5 | 302 | 450 | 235 | 4007 | 328 |
j := IntToStr(I) | 3429 | 5184 | 1701 | 6082 | 1945 |
ALL | 4063 | 5731 | 3016 | 11407 | 2083 |
It is visible how much I gained under XE3 for strings by removing that inline. Why, I don't know, as I already said. I will wrap it up here. I have already bothered you too much with details and put too much of my own time into it. But hey, it was worth it.
The test application can be downloaded here.
From Zero To One: TAnyValue, an attempt to make the best variable data container
And the saga continues. If you don’t know what the hell I am talking about, read the first three articles in the series on this subject:
- TValue and other “Variants” like implementation tests
- TValue and other “Variants” like implementation tests – revised
- TValue and other “Variants” like implementation tests – finale
The latest TAnyValue implementation presented in the last article was good, fast and had a reasonable memory size (12 bytes). But two things bothered me to no end:
- I could not finalize the record when it goes out of scope, so I can’t properly clean up, resulting in non-optimal solutions for data fields (dynamic array, IInterface…)
- I had to use three fields for data. One variable record, one IInterface and one dynamic array. This was so because I needed to store strings and interfaces along with the Extended type and have all that cleaned up on its own when the record falls out of scope. Yes, I could use IInterface alone for all non-trivial data types, but then the speed goes down the drain. Having three fields was fine on 32 bit, resulting in a 12 byte record, but on x64 it goes up to 24 bytes. I was not happy at all.
I don’t give up so easily, so I pondered what my options were. I really do not understand why in all these years the Delphi development team did not make records with destructors and constructors available. It would just take them one simple call to a procedure for each. Maybe I am not seeing something, but it seems it would be trivial for them to do. Doing that would open vast possibilities for developers. So I had to find a way to fix that shortcoming. The only solution I could see working was hooking some of the procedures in System.pas. The main targets would be:
- _FinalizeRecord
- _CopyRecord
- _AddRefRecord
The most important here is _FinalizeRecord, which is called by the compiler when a record goes out of scope and that record contains a dynamic array, an IInterface or a string variable. Basically, each variable that needs finalization triggers the call to _FinalizeRecord. The problem is that this is compiler magic. The compiler knows at compile time what the record holds and calls the procedure if needed, when the record goes out of scope and is freed. So I had to hook it and make sure that it gets called for TAnyValue. I don’t like hooks; I see them as a last resort to solve a particular problem. Here this was a last resort, and it is not a system wide hook (which should be used only in very, very special cases). Before I would use the hook, I needed to see if certain conditions were met.
- The hook needs to be stable under 32 and 64 bit.
- There should be minimal to no overhead from the hook
- The hook should not interfere with other hooks and with original procedure
- You should avoid hooking procedures or functions with very high call frequency
To answer that:
- I used KOLDetours.pas which will become part of the Cromis library. The unit is very stable and works equally well under 32 and 64 bit. Also the licence is very liberal and I can easily include it in my library. Naturally I will retain the licence and make it perfectly clear who the author is and where the credit goes.
- The overhead is basically non-existent. All I do is one simple pointer comparison and then I either call my finalize or the original finalize.
- Because this is not a simple patch but a detour, I just call the original _FinalizeRecord if the record is not TAnyValue. Also this technique allows multiple hooks to coexist.
- _FinalizeRecord and _CopyRecord are not called with such a high frequency as GetMem, FreeMem and the like. They are called frequently, but because there is no overhead that is not an issue. I did not hook _AddRefRecord because I believe it is never even called for the solution I made.
Ok to the implementation then. How does it all work? First we hook the procedures and get the pointer to the TAnyValue TypeInfo.
initialization
  vTypeInfo := TypeInfo(TAnyValue);
  OldCopyRecord := InterceptCreate(GetCopyRecordAddress, @CustomCopyRecord);
  OldFinalizeRecord := InterceptCreate(GetFinalizeRecordAddress, @CustomFinalizeRecord);
Here we need the addresses of the functions to hook. We can’t get them from Pascal, because they are protected; the compiler won’t see them that way. But we can get them with assembler.
function GetFinalizeRecordAddress: Pointer;
asm
{$IFDEF CPUX64}
  mov rcx, offset System.@FinalizeRecord;
  mov @Result, rcx;
{$ELSE}
  mov @Result, offset System.@FinalizeRecord;
{$ENDIF}
end;
You can get any address this way. Now that we have the addresses, we can write our improved and specialized _FinalizeRecord and _CopyRecord for TAnyValue:
procedure FinalizeAnyValue(p: PAnyValue);
begin
  if p.ValueType <> avtNone then
  begin
    case p.ValueType of
      avtString, avtAnsiString, avtWideString:
        string(PValueData(@p.ValueData).VPointer) := '';
      avtInterface:
        IInterface(PValueData(@p.ValueData).VPointer) := nil;
    {$IFNDEF CPUX64}
      avtFloat:
        FreeMem(PValueData(@p.ValueData).VPointer);
    {$ENDIF}
    end;

    // set type to none and erase all data
    PValueData(@p.ValueData).VPointer := nil;
    p.ValueType := avtNone;
  end;
end;

procedure CopyAnyValue(dest, source: PAnyValue);
var
  dstData: PValueData;
  srcData: PValueData;
begin
  dstData := PValueData(@dest.ValueData);
  srcData := PValueData(@source.ValueData);

  if dest.ValueType <> source.ValueType then
  begin
    FinalizeAnyValue(dest);
    dest.ValueType := source.ValueType;

  {$IFNDEF CPUX64}
    case dest.ValueType of
      avtFloat: GetMem(dstData.VPointer, SizeOf(Extended));
    end;
  {$ENDIF}
  end;

  case source.ValueType of
  {$IFNDEF CPUX64}
    avtFloat: PExtended(dstData.VPointer)^ := PExtended(srcData.VPointer)^;
  {$ENDIF}
    avtInterface: IInterface(dstData.VPointer) := IInterface(srcData.VPointer);
    avtString, avtAnsiString, avtWideString:
      string(dstData.VPointer) := string(srcData.VPointer);
  else
    dstData^ := srcData^;
  end;
end;
All that is left are the detour functions.
procedure CustomCopyRecord(Dest, Source, TypeInfo: Pointer);
begin
  if vTypeInfo = TypeInfo then
    CopyAnyValue(PAnyValue(Dest), PAnyValue(Source))
  else
    OldCopyRecord(Dest, Source, TypeInfo);
end;

procedure CustomFinalizeRecord(p: Pointer; typeInfo: Pointer);
begin
  if vTypeInfo = typeInfo then
    FinalizeAnyValue(PAnyValue(p))
  else
    OldFinalizeRecord(p, typeInfo);
end;
See how little overhead there is. Basically none; I just compare two pointers and that is all. Also, for records other than TAnyValue I just call the original functions. Here I have to say that Eric Grange, who jumped into the discussion at the last article, helped me a lot with the hooks and with the record structure. So a public thanks to him for all the ideas and solutions he helped me with.
Ok, the dirty hooking details are behind us. Now let's see how we did with the record structure. Because we can now clean up properly, we can be much more creative with how we structure our record. I made the following structure.
PAnyValue = ^TAnyValue;
TAnyValue = packed record
private
  ValueData: IInterface;
{$HINTS OFF}
{$IFNDEF CPUX64}
  Padding: array [0..3] of Byte;
{$ENDIF}
{$HINTS ON}
  ValueType: TValueType;
  function GetAsInt64: Int64; inline;
  ...
You may now wonder why the IInterface variable is in the record. Well, it is there for two purposes. First, we need the compiler to trigger the _FinalizeRecord call, and IInterface does just that. On the other hand it provides us with 4 bytes of space we can use. Now you probably know why the additional array [0..3] of Byte is there. I need an additional 4 bytes on 32 bit to store Int64, Double etc. directly without calling GetMem. On 64 bit IInterface itself is 8 bytes, so no padding is needed. Quite a neat solution. It only takes 8 bytes per single record on both 32 bit and 64 bit (plus one byte for the type enumeration). Let's just look at how data is stored and read. For most types it's just like this:
procedure TAnyValue.SetAsInteger(const Value: Integer);
begin
  if ValueType <> avtInteger then
  begin
    Self.Clear;
    ValueType := avtInteger;
  end;

  // assign the actual value
  PValueData(@ValueData).VInteger := Value;
end;

function TAnyValue.GetAsInteger: Integer;
begin
  if ValueType = avtInteger then
    Result := PValueData(@ValueData).VInteger
  else
    Result := GetAsIntegerWithCast;
end;

function TAnyValue.GetAsIntegerWithCast: Integer;
begin
  case ValueType of
    avtBoolean: Result := Integer(GetAsBoolean);
    avtString: Result := StrToInt(GetAsString);
    avtAnsiString: Result := StrToInt(string(GetAsAnsiString));
    avtWideString: Result := StrToInt(string(GetAsWideString));
  else
    raise Exception.Create('Value cannot be converted to Integer');
  end;
end;
The separate getters are there because of the inlining and exceptions. Strings on the other hand are stored like this:
procedure TAnyValue.SetAsString(const Value: string);
begin
  Self.Clear;
  ValueType := avtString;
  string(PValueData(@ValueData).VPointer) := Value;
end;

function TAnyValue.GetAsString: string;
begin
  if ValueType = avtString then
    Result := string(PValueData(@ValueData).VPointer)
  else
    Result := GetAsStringWithCast;
end;

function TAnyValue.GetAsStringWithCast: string;
begin
  case ValueType of
    avtNone: Result := '';
    avtBoolean: Result := BoolToStr(AsBoolean, True);
    avtCardinal: Result := IntToStr(AsCardinal);
    avtInteger: Result := IntToStr(AsInteger);
    avtInt64: Result := IntToStr(AsInt64);
    avtFloat: Result := FloatToStr(AsFloat);
    avtDateTime: Result := DateTimeToStr(AsDateTime);
  {$IFDEF UNICODE}
    avtAnsiString: Result := string(AsAnsiString);
  {$ENDIF}
    avtWideString: Result := AsWideString;
  else
    raise Exception.Create('Value cannot be converted to string');
  end;
end;
This way we have reference counting, which is very important. The same goes for interfaces. The only special one is Extended. It looks like this:
procedure TAnyValue.SetAsFloat(const Value: Extended);
begin
  if ValueType <> avtFloat then
  begin
    Self.Clear;
    ValueType := avtFloat;
  {$IFNDEF CPUX64}
    GetMem(PValueData(@ValueData).VPointer, SizeOf(Extended));
  {$ENDIF}
  end;

{$IFNDEF CPUX64}
  PExtended(PValueData(@ValueData).VPointer)^ := Value;
{$ELSE}
  PValueData(@ValueData).VDouble := Value;
{$ENDIF}
end;

function TAnyValue.GetAsFloat: Extended;
begin
  if ValueType = avtFloat then
  begin
  {$IFNDEF CPUX64}
    Result := PExtended(PValueData(@ValueData).VPointer)^
  {$ELSE}
    Result := PValueData(@ValueData).VDouble
  {$ENDIF}
  end
  else
    Result := GetAsFloatWithCast;
end;
Because Extended is 10 bytes on 32 bit systems, I refused to add those 2 additional bytes. It messes with alignment and it's just not worth the trouble. The speed difference is negligible anyway. You might wonder why not just use Double? As you know, Extended is mapped to Double on 64 bit anyway. I do use Double; I have a special getter and setter for the Double type. The reason for using Extended is twofold:
- Some people may need Extended sometimes. The reasons may vary, but still, they may need it.
- The compiler resolves I / 5 to the Extended type, so direct assignment fails.
This won’t compile if you have only Double on the implicit operators
AnyValue := I / 5;
You have to write it like this
AnyValue.AsDouble := I / 5;
The final tests, taken again, only on 32 bit XE3 are
Delphi XE3 test (times are in ms for 10000000 operations):
Type | Variants | TValue | TAnyValue | TOmniValue | TVariableRec |
j := I | 149 | 266 | 87 | 270 | 62 |
j := I/5 | 236 | 316 | 101 | 4151 | 222 |
j := IntToStr(I) | 3291 | 4856 | 2185 | 5776 | 1648 |
ALL | 3933 | 5681 | 2462 | 12689 | 2205 |
The results show how close we came to the best speed that a full blown record has. Very close to the limit. I think this is really a very good variable container for data holding and transfer, way better than Variants. And it can be further developed almost without limitations now that we can finalize. I also think that the hooking is so stable and has so little overhead that this is a perfectly viable solution. I will test this further and when the code is clean and well tested it will become the official TAnyValue.
You can download the current code along with the test here. It has to be added that the test only tests assigning variables and reading them. It does not address memory allocations / releases. I have a new test for that, but that is already way beyond the scope of this article.
From Zero To One: TAnyValue and array handling
Until now, I was only focusing on speed improvements and memory consumption for TAnyValue. But if any of you think this is all I wanted to do with it, then you are wrong. All along I had plans to extend the functionality of TAnyValue and make a true powerhouse out of it. My goal is to make it an efficient all around data carrier and to make a powerful interface on top of it. Something that will blow Variant type and TValue away.
The first major add-on is array handling. TAnyValue now has premium array handling support. I will cut this short and just go to examples. Let me just mention that not only are arrays now supported, but I also added support for many basic types that were still missing, plus support for Variants. And there is more to come in the future. Ok, to the examples then.
I have defined a new record type in Cromis.AnyValue.pas. It's called TAnyArray and it is a powerful wrapper around TAnyValues, which is a dynamic array of TAnyValue and also new. TAnyArray can be used as a standalone variable with no problems, but it is also a part of TAnyValue. Let's say you need a new array stored inside TAnyValue (the type is avtArray). You simply write it like this:
AnyValue.EnsureAsArray.Create([5, '5', 1.22, AnyValues([nil, MyObject, 6])]);
This creates an array with values and an additional array nested inside the first one. Yes, arrays can be nested with no limitations. Also, each array is again a TAnyValue. Actually, Create does not create the array, it just initializes it. The creation comes from EnsureAsArray. For those familiar with my SimpleStorage, you know that Ensure returns something or creates a new instance if it is not already there. In this case it ensures that an array is returned. Also, if AnyValue contained some other data type, it clears it and sets the type to avtArray.
Now how do you access the array values? Simple, and actually you can do it in two ways. Let's say you want the second element:
AnyValue[1].AsString;
AnyValue.GetAsArray.Item[1].AsString;
You can treat AnyValue as a TAnyArray. The default property handles that for you. But if you want the more advanced stuff, you can call AnyValue.GetAsArray, which returns the array, or raises an exception if AnyValue is not of type avtArray. The array also has a neat function called GetAsString. It returns the whole array as a string and can be very useful. Let's say we have an empty array and we want to push some new items into it:
AnyValue.GetAsArray.Push([5, '4.5', AnyValues([7, '5', 3, AnyValues([1.2, 3, '5'])])]);
AnyValue.GetAsArray.GetAsString;
The result will be:
5,4.5,[7,5,3[1.2,3,5]]
You can also choose the delimiter. If you want a tab for the delimiter, just call it like this:
AnyValue.GetAsArray.GetAsString(#9);
Now let's see what other candy TAnyArray holds:
MaValue := AnyValue.GetAsArray.Pop;
MyClone := AnyValue.GetAsArray.Clone;
AnyValue.GetAsArray.Reverse;
MySlice := AnyValue.GetAsArray.Slice(2, 6);
AnyValue.GetAsArray.Sort(
  function(Item1, Item2: PAnyValue): Integer
  begin
    Result := StrToInt64Def(Item1.AsString, -1) - StrToInt64Def(Item2.AsString, -1);
  end);
AnyValue.GetAsArray.Clear;
AnyValue.GetAsArray.SetCapactiy(10000);
AnyValue.GetAsArray.IndexOf(5);
AnyValue.GetAsArray.Contains(5);
AnyValue.GetAsArray.Delete(5, True);
AnyValue.GetAsArray.Delete[5, '4', nil];
AnyValue.GetAsArray.SaveToStream(MyStream);
AnyValue.GetAsArray.LoadFromStream(MyStream);
As you can see, you can save and load from / to stream without being aware of types, arrays, nested depth, it just works. You can delete all instances of an item, you can delete many items at once, you can slice, clone etc…. All that comes in a lightweight package of a record with support for any type you wish to store in it.
And to finish, let me show the named values support, which is handled with arrays; each such value is a unique TAnyValue instance of type avtNamedValue. There is also a very flexible hash list, also with TAnyValue support.
If you want to have name-value type variables, all you have to do is:
AnyValue['Delphi'] := 'XE3';
or
AnyValue.GetAsArray.AddNamed('Delphi', 'XE3');
Both are again the same. The named item is added to the array if it does not yet exist. If it does, only the value is assigned. The value is a TAnyValue. Also, for all arrays you have enumeration support:
for Value in AnyValue do
  // do something
or
for Value in AnyValue.GetAsArray do
  // do something
And as a final example, the hash, which is much more convenient than the old hash where you could only add pointers and could only have strings for keys. Here both the key and the value are TAnyValue. Furthermore, it is very easy to make new types of hashes with different hashing functions. You only derive from TCustomHashTable and override some functions. Let's see how easy it was to make string and cardinal key hashing classes.
// hash table that uses cardinals as keys
TCardinalHashTable = class(TCustomHashTable)
protected
  function DoCalculateHash(const Key: TAnyValue): Cardinal; override;
public
  constructor Create(const Size: Cardinal = cDefaultHashSize);
end;

// hash table that uses strings as keys
TStringHashTable = class(TCustomHashTable)
protected
  function DoCalculateHash(const Key: TAnyValue): Cardinal; override;
public
  constructor Create(const Size: Cardinal = cDefaultHashSize);
end;
The whole implementation is here
{ TCardinalHashTable }

constructor TCardinalHashTable.Create(const Size: Cardinal);
begin
  inherited Create(Hash_SimpleCardinalMod, Size);
end;

function TCardinalHashTable.DoCalculateHash(const Key: TAnyValue): Cardinal;
var
  KeyData: Cardinal;
begin
  KeyData := Key.AsCardinal;
  Result := FHashFunction(@KeyData, FSize);
end;

{ TStringHashTable }

constructor TStringHashTable.Create(const Size: Cardinal);
begin
  inherited Create(Hash_SuperFastHash, Size);
end;

function TStringHashTable.DoCalculateHash(const Key: TAnyValue): Cardinal;
var
  KeyData: string;
begin
  KeyData := Key.AsString;

  if KeyData <> '' then
    Result := FHashFunction(@KeyData[1], Length(KeyData) * SizeOf(Char)) mod FSize
  else
    Result := 0;
end;
Simple, isn't it? You can make whatever type of hash table you want. Also, the already available string one uses SuperFastHash from Davy Landman. You can look at the licence for more details.
The use of the string hash, for example, is trivial:
MyHashTable.Add('Delphi', 'XE3');
MyHashTable.Add('Delphi', 2010);
MyHashTable.Add('Delphi', nil);
MyHashTable.Add(2013, 'Integer');
MyHashTable.Add(4.5, 'Float');
MyHashTable.Add([4.5, '2'], 'Even arrays would work');
All of this just works: the key is always taken as a string and the value is a TAnyValue. You see how much power there is in TAnyValue, all in a fast and memory friendly package. And you still retain type safety; every TAnyValue instance carries the value type inside, and if you try to convert to something you cannot, an exception is raised.
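A small illustration of that type safety, based on the getters shown earlier (the exact exception class depends on the failing conversion):

var
  Value: TAnyValue;
begin
  Value := 42;
  Writeln(Value.AsString);   // '42', converted through the string getter
  Value := 'Delphi';
  Writeln(Value.AsInteger);  // raises an exception: 'Delphi' cannot be converted to Integer
end;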
If you want to see the demo for the arrays and named-values support you can download it here along with the speed test. Cromis.Hashing which contains the hash table classes is already available if you download the whole Cromis Library. I will add TAnyValue and all the new features soon as a separate download, but I still have to write some demo programs and tests to be sure everything works as it should. Meanwhile all constructive criticism and ideas are very welcome.
It's a blong, blong, blong road...: Hello FireDAC
Yes, I’m sure you saw everyone else’s posts on the matter, but just to be sure, I thought I’d draw your attention to the newly launched FireDAC data access layer (just announced), the result of Embarcadero’s recent acquisition of DASoft’s AnyDAC.
It’s a free download if you have the Enterprise (or higher) version of Delphi, C++Builder or RAD Studio, or it can be purchased as an add-on (the FireDAC Client/Server Pack) if you have the Professional version (and yes, there’s an offer for current owners of AnyDAC or the XE2 Client/Server Pack).
Click here for the FireDAC FAQ and click here for the FireDAC documentation.
Dr.Bob's Delphi Notes: XE3 Enterprise Users - get your FireDAC for free
The Wiert Corner - irregular stream of stuff: ISO 8601: since 1988
Boy, am I glad with the xkcd: ISO 8601 post and image on the right.
One reason:
Please write dates and times so that everyone understands them, not just you.
The alt-text of the comic is hilarious (ISO 8601 was published on 06 05 88 and most recently amended on 12 01 04), showing the confusion of using 2 digit years and not knowing which field means which (I think XKCD author Randall Munroe and Mathematics of the ISO calendar got some of the dates wrong; see the PDF search dates below).
I found out in the mid 1980s that people I was communicating with internationally (back then the internet was forming and you already had BITNET Relay chat and email) were using different date formats than I did.
Ever since then, I’ve used the YYYY-MM-DD format for writing dates, encouraging others to use it as well, and as soon as I found out that it was a standard, I started to evangelize ISO 8601 (there is an ISO 8601 category on my blog), which – at the time of writing this – has had revisions in 1998 (on 1998-06-15), 2000 (on 2000-12-15) and 2004 (on 2004-12-01).
A lot later I found out that back in 1971, this date format was a recommendation, and in 1976 already a standard. Not nearly as old as Esperanto though (:
Speaking about languages:
At the end of the last century, after Delphi 5 added year 2000 support (which made the 16-bit Delphi 1 disappear from the box, given the effort needed to prove that the product, including all libraries, was year 2000 proof), Delphi went cross platform.
The Delphi team, working on both Kylix 1 and Delphi 6, also added a DateUtils unit which provides a lot of functionality, including support for week numbers. The first test version always assumed week 1 was the one with January first in it. As ISO 8601 also indicates how the first week of a year should be determined, a couple of people (Jeroen W. Pluimers, Glenn Crouch, Rune Moberg and Ray Lischner) provided code that fixed this and a few other things in the unit. We even got mentioned by Cary Jensen!
That code is now also part of the RemObjects ShineOn library. That DateUtils unit is now on GitHub.
A Delphi XE version of the code (and a Delphi 2007 one) are now at NickDemoCode (Thanks Nick Hodges!).
Delphi is not the only environment with ISO 8601 support. XML has it, .NET has it, etc.: it is now widespread.
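For illustration, a minimal Delphi sketch of writing an ISO 8601 style timestamp and ISO week number using only SysUtils and DateUtils (the format string is my own choice, not taken from any particular unit):

uses
  SysUtils, DateUtils;

begin
  // YYYY-MM-DD date with a 24 hour time, as ISO 8601 prescribes
  Writeln(FormatDateTime('yyyy-mm-dd"T"hh:nn:ss', Now));
  // ISO 8601 week number: week 1 is the week containing the first Thursday of the year
  Writeln('ISO week: ', WeekOfTheYear(Now));
end.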
So follow your tools, and start using it yourself as well (:
Too bad the ISO 8601 standard text is not available publicly:
I remember the Y2K preparation era, when the ISO 8601 standard was freely available at http://www.iso.ch/markete/8601.pdf; soon after the year 2000, the PDF got locked behind a payment engine.
ISO suffers from heavy link rot too, for instance the ISO 3166 country codes used to be at http://www.iso.org/iso/prods-services/iso3166ma, but are now at http://www.iso.org/iso/home/standards/country_codes.htm. What about HTTP 303 or 302 redirect here guys?
Luckily people keep cached copies:
- "ISO 8601" "First edition" "1988-06-15" filetype:pdf
- "ISO 8601" "Second edition" "2000-12-15" filetype:pdf
- "ISO 8601" "Third edition" "2004-12-01" filetype:pdf
–jeroen
via: xkcd: ISO 8601.
Behind the connection: Help Update 3 for Delphi, C++Builder and RAD Studio XE3
twm’s blog: updated assemble.exe in dxgettext repository
My colleague Daniel has added an option to the dxgettext assemble tool to specify a different directory where to look for the locale data. This is meant to be used, when you compile the executable to a different directory than the one in which the locale subdirectory is located. Since dxgettext would use the locale directory directly if it exists, you can never be sure where your translations come from. Compiling to a different directory solves this issue.
There isn’t a new executable release yet, but you can just check out the dxgettext sources and compile it yourself. It is located in the subdirectory dxgettext\tools\assemble. Alternatively I have a compiled executable in my buildtools on sourceforge, but beware: I take no responsibility regarding viruses or other problems this executable could cause.
From Zero To One: Is it possible to build a better dynamic array? Presenting sliced array.
After the last post about array handling in TAnyValue (and in general), I was looking to make IAnyArray as powerful and useful as I could. As I told you, my goal is to build a strong infrastructure around and on top of TAnyValue. And so I thought about it. A simple TList clone with more features was not something I was very thrilled about. So I searched the web for ways of improving upon the dynamic array structure. To my surprise, I found very little, and even that was not what I wanted. I presume most of you know what a dynamic array is and know its strengths and weaknesses. But let's just summarize.
1. Dynamic array
A very simple structure. It uses a contiguous chunk of memory holding elements of the same type (integers, strings, records, objects…). Its strengths:
- Adding element to the end: O(1)
- Accessing n-th element by index: O(1)
- Removing element from the end: O(1)
- Good caching, fast indexed walk-through because of caching.
And its weaknesses:
- Removing element from the middle: O(n)
- Adding element to the middle: O(n)
- Can have problems with resizing at large sizes because of memory fragmentation
2. Single linked list
Also a very simple structure. It uses pointers to link each record in one direction (doubly linked lists do it both ways). Its strengths:
- Adding element to the end: O(1)
- Removing element from the middle: O(1)
- Adding element to the middle: O(1)
- Removing element from the end: O(1)
- Easily re-sized, because we only need to allocate memory for one new element at a time.
And its weaknesses:
- Accessing n-th element by index: O(n)
- Bad caching, elements can be all over the memory
I also found this nice table comparing them both:
Operation | Array | Singly Linked List |
Read (anywhere) | O(1) | O(n) |
Add/Remove at end | O(1) | O(1) |
Add/Remove in the interior | O(n) | O(1) |
Resize | O(n) | N/A |
Find By position | O(1) | O(n) |
Find By target (value) | O(n) | O(n) |
So as you see, each has its strengths and its weaknesses. If we could just take the best of each one and make an efficient hybrid, all would be fine. Well, that is not so easy to do. If you go one way, you suffer on the other end and vice versa. But after a lot of thinking and testing, I found a good implementation of something I now call a sliced array. It is a hybrid that tries to take the best of both worlds, and I think it mostly succeeds at that. It has a lot of strengths and very, very few weaknesses. It looks like this:
I “sliced” the large array into several small arrays and then turned everything into a doubly linked list. I also added a look-up control array, which is very important as you will see later. So how does it all work? Let's look at how a single slice is implemented:
TArraySlice = record
  Count: Int64;
  Lookup: PSliceLookup;
  ArrayData: TAnyValues;
  NextSlice: PArraySlice;
  PrevSlice: PArraySlice;
end;
Simple actually. It holds an array of TAnyValue, pointers to the previous and next slice, a pointer to the look-up record and the count of elements it holds. The look-up record looks like this:
TSliceLookup = record
  Index: Integer;
  SumCount: Int64;
  Slice: PArraySlice;
end;
This is even simpler. It holds the index in the look-up array, a pointer to the slice, and the count of all elements in all the slices up to this one plus the elements in the slice that the look-up points to. Now for the strengths and weaknesses of the sliced array.
Its strengths:
- Adding element to the end: O(1)
- Removing element from the middle: O(SliceSize)
- Adding element to the middle: O(SliceSize)
- Removing element from the end: O(1)
- Good caching. Slices are contiguous in memory and so it supports locality of access, which is very important
- Easily re-sized because we only need to allocate memory for one new slice. No need to re-size whole memory space used by the structure.
And its weaknesses:
- Accessing n-th element by index: hard to tell without analyzing it mathematically, but it's still fast
So we only have one weakness, but it could be a huge weakness, because random access by index is very, very important. If we can't do it very fast, then the structure is not worth the trouble. After all, we want a general purpose structure, a workhorse like TList.
To make it fast, I implemented a few tricks, using the look-up structure. The dumbest way would be to just go over the look-up array and check whether the index we are searching for falls inside the lower and upper index bounds of the slice. Once the slice is found, we just do the same as for a classic dynamic array. So that would be O(SliceCount + 1). Not too shabby if we don't have too many slices, but not nearly good enough. Especially because the more slices we have, the more we trim down the internal insert and delete cost. I used two tricks to improve the search for the correct slice. The first one is a helper pointer that always points to the last slice accessed. So if we do a linear iteration from 0 to Count - 1, we never search for the correct slice. We just use the pointer to the last accessed slice and do the offset to the correct element. When we come to the end of the slice we just jump to the next one. Simple but very efficient. This way locality of access is also supported if slices are big enough.
The next trick (for truly random access) is to calculate where the index should be (which does not mean it is there). If we have 100 elements in each slice and we want to access the 350th element, we can assume that it is in the 4th slice. So we jump directly there. If we missed – and we can miss, as slices get degraded by deletions and insertions – we did not miss by much; we just check whether we need to go left or right and move until we find the correct slice. It turns out this is also very fast, as the predictions are good enough.
Furthermore, something helps us here. If we do no deletions and insertions, then the slices are in perfect condition and each prediction will be a hit. So we always have an O(2) operation. The tests will demonstrate that. If we have degradation it is slightly worse, but not by much, and we gained a lot on insertions and deletions. The key here is that we take care of degradation. I do this by merging slices if the count falls under half the slice size, and splitting a slice if the count rises to 2x the slice size. This way the structure is never too degraded.
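To make the prediction idea concrete, here is a minimal sketch of how such a slice search could look. It reuses the TSliceLookup and PArraySlice declarations from above; the helper itself is my own illustration, not the actual IAnyArray code:

// Hypothetical helper: find the slice that holds element Index by starting
// from the predicted slot and walking left or right if the guess was off.
function FindSlice(const Lookup: array of TSliceLookup;
  SliceSize, Index: Integer): PArraySlice;
var
  Guess: Integer;
begin
  // assume slices are in perfect condition: each one holds SliceSize elements
  Guess := Index div SliceSize;
  if Guess > High(Lookup) then
    Guess := High(Lookup);

  // correct the guess if insertions/deletions have degraded the slices
  while (Guess > 0) and (Index < Lookup[Guess].SumCount - Lookup[Guess].Slice.Count) do
    Dec(Guess);
  while (Guess < High(Lookup)) and (Index >= Lookup[Guess].SumCount) do
    Inc(Guess);

  Result := Lookup[Guess].Slice;
end;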
Ok, you may say, but all this only comes into play with a lot of elements. For small arrays it does not matter. True, and that is why the default slice size is 10.000 elements; under that you basically have a single dynamic array with one look-up pointer. I also do a check before doing indexed access, and if only one slice is present I just jump directly to it, no searching is done. The only real downside of the structure is its rather large complexity and the algorithm needed to manage it. Also, it will never be truly as fast for random indexed access, if nothing else because I simply execute more code compared to just returning the n-th element from an array.
Probably you won't take everything on my word, so here are the tests, made on Delphi 2010. The first one is for a small number of elements (5.000), so we only have one slice, and the other is for 1.000.000 elements. Here we have 100 slices (as the default size is 10.000). I tested my IAnyArray, which now uses this structure for data storage, and TList<TAnyValue>, as this is a fair reference point for the dynamic array comparison. Then I also tested TList<Variant> and TList<Integer> for the fun of it.
Small array test (5.000 elements) – times are in ms
Large array test (1.000.000 elements) - times are in ms
Just some clarifications:
- Deletes and inserts are 100 times faster. It makes sense: I have to move 100 times fewer elements in memory because I have 100 slices
- Adding elements is also faster, simply because I just create a new slice with 10.000 elements, while the dynamic array (TList) has to re-size the whole memory space
- Linear iteration by index or local access is as fast as with a dynamic array, because of the trick with the last accessed slice pointer
- True random access by index is slower by a factor of 2-3, but it depends on the number of slices
- Enumeration is very fast, especially by pointers
- I also did a “raw” access comparison where I work with PAnyValue, so I avoid record copies.
It is a very long post, sorry for that, but I really want to hear your honest opinion on this solution. Is it viable, what do you think of it? I find it very promising and very flexible in all usage scenarios. The only downside I see is complexity. Do you find truly random access too slow? Remember that this was 1.000.000 access operations over 1.000.000 elements.
There is also a big update of code on my download section:
- Latest TAnyValue with array support, name-value pairs etc.
- Hashing classes have been improved and are 2x faster now
- TTaskPool has been improved and is faster with less locking, also TThreadQueue. TLockFreeStack has been rewritten
- IPC is now faster because of all the changes. It went under 0.1 ms for Client->Server->Client cycle with full payload
Behind the connection: Application design ideas
Firebird News: Jaybird wiki content is updated
See Different: What is the feature of Pascal ?
Borland created the name Object Pascal as a marketing move to sell Delphi and to set a boundary between Borland Pascal and Delphi.
Even though Pascal has been constantly changing, and started doing so not long after it was first released in 1968, only real developers have learned the difference and know what is going on.
Since then we have seen huge leaps forward with technology and with the language itself. For example, the language itself supports thread based programming, the object oriented syntax that was added by Apple in 1982 is constantly expanding, and there is even support for things that languages such as Java or C# lack.
Pascal today is the only programming language that on the one hand offers very low level support for development – yes, even lower level than C – but also supports very high level development that is closer to dynamic languages – such as mixins, string management, iteration syntax, and much, much more…
Pascal today is also the only true multi-platform language that can provide rich applications for desktop, web and smartphone/tablet platforms with the same code base, and run natively there. Yes, that includes iOS, and Android with JVM/Dalvik.
Things look good from far away, but up close there are more than a few issues with modern Pascal (and for some of them, I blame Embarcadero):
- There are way too many dialects (FPC, Delphi, Delphi.NET, Oxygene, GNU Pascal [ISO Pascal] etc.)
- There is some sort of vendor lock due to the dialect variation (Usually either Delphi or FPC to choose from)
- Some of the syntax that has been added to the language is no longer Pascal, but closer to Java/C#/Ruby/D …
- There are additions such as closures – lambdas, anonymous functions – generics and more, that are not always implemented properly
- There is support for dependency injection, extended namespaces and much more
- There is not a single real standard, and there is a lack of cooperation between Delphi developers and the truly FOSS FPC project (due to the Embarcadero side)
The problem with the additions that I wrote about is that things are created not in the Pascal way, but look more like the way C++ implemented stuff – ticking a box to say "I have this technology".
For example, you now have mixins for data types, so I can create for an Integer a function that will translate it into a string, like so:
...
type
  TIntegerConvertHelper = type helper for integer
    function to_string: string;
  end;
...
  1000.to_string
...
Yeah, it's cool, but it's not Pascal! Neither is this code:
...
type
  TMyDynArray = array of integer;
...
var
  MyDynArray: TMyDynArray;
...
begin
  ...
  MyDynArray := TMyDynArray.create(100, -1, 500, another_value);
  ...
end;
...
We call the above example an "array constructor". If you see this in code, you'll have a hard time understanding that TMyDynArray is not a class but a dynamic array. Even if you do know that this feature exists in the language, it is less readable – not more.
And this less readable code, in both examples, not only creates un-Pascal-like syntax, but makes your code much harder to read and maintain.
You can no longer look at the code and guess what TMyDynArray is; you actually need to check, and that makes it so much harder to understand the code you are reading – the exact opposite of Pascal development, where most people complain that it's too readable for them and they wish to work harder on that.
Even though I am mainly focusing on FPC here, the problem is actually even worse with Delphi, where you can write a symbol name in a native language such as Russian, Arabic or Hebrew (just like with most dynamic languages that I know), making your code a lot less readable and maintainable, and opening a whole new set of problems that never existed in the first place.
I sent a question about it to the FPC mailing list that explains my way of thinking, and it seems others think the same. Feel free to read the whole discussion about it there; I find it very interesting.
Firebird News: And, in case you wonder, IB Objects works great in Lazarus
Behind the connection: Memory leaks detection
Leonardo's blog: Quick CGI SpellChecker
This simple spellchecker has a small API that does the very basics, that is: check a word's spelling, add a word to the dictionary, and delete a word from the dictionary. The client side can be done using any language capable of doing HTTP GET requests and handling JSON responses.
Prerequisites:
Of course, the first requisite is to install Aspell and one or more dictionaries. On apt-get based systems, you'll install them using this:
sudo apt-get install aspell
sudo apt-get install aspell-en_US
Then next step is to create a personal dictionary, this is a just a plain text file where new words will be added. The file must have just one line, containing this:
personal_ws-1.1 en 0
If the dictionary will be using, for example, spanish words, you must replace "personal_ws-1.1 en 0" by "personal_ws-1.1 es 0", and do:
sudo apt-get install aspell-es_ES (or es_AR for Argentina).
IMPORTANT: please set RW attributes to the file, to allow read/write by everyone.
How it works, the API:
All requests must be done using these commands:
/cgi-bin/cgiaspell/TSpellCheck/WordSpell?word=
/cgi-bin/cgiaspell/TSpellCheck/WordAdd?word=
/cgi-bin/cgiaspell/TSpellCheck/WordDelete?word=
Here "word" is any word to be spelled, added to or deleted from the dictionary.
Spell checking
For example, if you want to check the spelling of the word "houuse", you'll have to do this:
http://myserver/cgi-bin/cgiaspell/TSpellCheck/WordSpell?word=houuse
The result is this JSON string:
{ "replacements" : ["House", "house", "hose", "horse", "hours", "hoarse",
"hoes", "hues", "Hosea", "housed", "houses", "Hus", "hos", "horsey", "hour's",
"Ho's", "hows", "huhs", "Horus", "hoarser", "douse", "louse", "mouse", "rouse",
"souse", "Hausa", "Hesse", "hoe's", "hoers", "how's", "hussy", "Hui's",
"House's", "house's", "hue's", "hoar's", "hoer's", "Horus's"], "total" : 38 }
Adding a word to the personal dictionary
http://myserver/cgi-bin/cgiaspell/TSpellCheck/WordAdd?word=houuse
This will return "Ok." if the word was added correctly.
Deleting a word from the personal dictionary
http://myserver/cgi-bin/cgiaspell/TSpellCheck/WordDelete?word=houuse
This will return "Ok." if the word was removed correctly, or, a message saying it wasn't deleted.
The program
In Lazarus, just create a CGI Application (you'll need the WebLaz package), save the project as "cgiaspell.lpi", and rename unit1 to main.
Now, adapt your main.lfm to this :
object SpellCheck: TSpellCheck
OnCreate = DataModuleCreate
OldCreateOrder = False
Actions = <
item
Name = 'WordSpell'
Default = False
OnRequest = WordSpellRequest
Template.AllowTagParams = False
end
item
Name = 'WordAdd'
Default = False
OnRequest = WordAddRequest
Template.AllowTagParams = False
end
item
Name = 'WordDelete'
Default = False
OnRequest = WordDeleteRequest
Template.AllowTagParams = False
end>
CreateSession = False
Height = 150
HorizontalOffset = 250
VerticalOffset = 250
Width = 150
end
Then do the same to main.pas as this:
unit main;
{$mode objfpc}{$H+}
interface
uses
SysUtils, Classes, httpdefs, fpHTTP, fpWeb,
process,
fpjson;
type
{ TSpellCheck }
TSpellCheck = class(TFPWebModule)
procedure DataModuleCreate(Sender: TObject);
procedure WordAddRequest(Sender: TObject; ARequest: TRequest;
AResponse: TResponse; var Handled: Boolean);
procedure WordDeleteRequest(Sender: TObject; ARequest: TRequest;
AResponse: TResponse; var Handled: Boolean);
procedure WordSpellRequest(Sender: TObject; ARequest: TRequest;
AResponse: TResponse; var Handled: Boolean);
private
function ASpellToJSON(AAspellResult: string): string;
function SpellWord(AWord: string): string;
public
{ public declarations }
end;
var
SpellCheck: TSpellCheck;
const
cDictionary = '/home/leonardo/.aspell.es_AR.pws';
implementation
{$R *.lfm}
{ TSpellCheck }
procedure TSpellCheck.DataModuleCreate(Sender: TObject);
begin
end;
procedure TSpellCheck.WordAddRequest(Sender: TObject; ARequest: TRequest;
AResponse: TResponse; var Handled: Boolean);
var
lStr: TStringList;
lWord: string;
begin
if ARequest.QueryFields.IndexOfName('word') = - 1 then
raise Exception.Create('word param is not present')
else
lWord := ARequest.QueryFields.Values['word'];
// todo: this should be replaced by something
// more reliable. I.e.: what happens if cDictionary is blocked
// by another process, or LoadFromFile can be slow on big dictionaries.
lStr := TStringList.Create;
try
lStr.LoadFromFile(cDictionary);
if lStr.IndexOf( LowerCase(lWord) ) = -1 then
begin
lStr.Add(lWord);
lStr.SaveToFile(cDictionary);
AResponse.Content := 'Ok.';
end
else
AResponse.Content := lWord + ' already in dictionary.';
finally
lStr.Free;
end;
Handled:= True;
end;
procedure TSpellCheck.WordDeleteRequest(Sender: TObject; ARequest: TRequest;
AResponse: TResponse; var Handled: Boolean);
var
lStr: TStringList;
lIdx: Integer;
lWord: string;
begin
if ARequest.QueryFields.IndexOfName('word') = - 1 then
raise Exception.Create('word param is not present')
else
lWord := ARequest.QueryFields.Values['word'];
// todo: this should be replaced by something
// more reliable. I.e.: what happens if cDictionary is blocked
// by another process, or LoadFromFile can be slow on big dictionaries.
lStr := TStringList.Create;
try
lStr.LoadFromFile(cDictionary);
lIdx := lStr.IndexOf( LowerCase(lWord) );
if lIdx <> -1 then
begin
lStr.Delete(lIdx);
lStr.SaveToFile(cDictionary);
AResponse.Content := 'Ok.';
end
else
AResponse.Content := lWord + ' not in dictionary.';
finally
lStr.Free;
end;
Handled:= True;
end;
procedure TSpellCheck.WordSpellRequest(Sender: TObject; ARequest: TRequest;
AResponse: TResponse; var Handled: Boolean);
var
lWord: string;
begin
if ARequest.QueryFields.IndexOfName('word') = - 1 then
raise Exception.Create('word param is not present')
else
lWord := ARequest.QueryFields.Values['word'];
AResponse.Content := SpellWord(lWord);
Handled := True;
end;
function TSpellCheck.ASpellToJSON(AAspellResult: string): string;
var
lStr: TStringList;
lJSon: TJSONObject;
lJsonArray: TJSONArray;
I: Integer;
begin
Result := '';
lStr := TStringList.Create;
lJson := TJSONObject.Create;
try
if Pos(':', AAspellResult) > 0 then
lStr.CommaText:= Copy(AAspellResult, Pos(':', AAspellResult) + 1, Length(AAspellResult));
lJsonArray := TJSONArray.Create;
for I := 0 to lStr.Count - 1 do
lJsonArray.Add(lStr[I]);
lJSon.Add('replacements', lJsonArray);
lJson.Add('total', lStr.Count);
Result := lJSon.AsJSON;
finally
lJSon.Free;
lStr.Free;
end;
end;
function TSpellCheck.SpellWord(AWord: string): string;
var
lProcess: TProcess;
Buffer: array[0..2048] of char;
ReadCount: Integer;
ReadSize: Integer;
begin
lProcess := TProcess.Create(nil);
lProcess.Options := [poUsePipes,poStderrToOutPut];
lProcess.CommandLine := '/usr/bin/aspell -a --lang=es_AR -p ' + cDictionary;
lProcess.Execute;
lProcess.Input.Write(PAnsiChar(AWord)[0], Length(AWord));
lProcess.CloseInput;
while lProcess.Running do
Sleep(1);
ReadSize := lProcess.Output.NumBytesAvailable;
if ReadSize > SizeOf(Buffer) then
ReadSize := SizeOf(Buffer);
if ReadSize > 0 then
begin
ReadCount := lProcess.Output.Read(Buffer, ReadSize);
Result := Copy(Buffer,0, ReadCount);
Result := ASpellToJSon(Result);
end
else
raise Exception.Create(Format('Exit status: %d', [lProcess.ExitStatus]));
lProcess.Free;
end;
initialization
RegisterHTTPModule('TSpellCheck', TSpellCheck);
end.
Compile, copy to your Apache CGI directory and enjoy!
The Wiert Corner - irregular stream of stuff: jpluimers
A while ago I was involved in a C header file translation for the header files of the IBM WebSphere MQ family of products, and the table helped a lot for the base types:
A few C things missing there:
- unsigned short (Delphi: Word)
- unsigned char (Delphi: Byte)
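Putting the table and the missing entries together, a minimal sketch of such base-type aliases on the Delphi side might look like this (the alias names are my own for illustration; they are not taken from the actual WebSphere MQ headers):

type
  // C type              Delphi type
  t_char      = AnsiChar;   // char
  t_schar     = ShortInt;   // signed char
  t_uchar     = Byte;       // unsigned char
  t_short     = SmallInt;   // short
  t_ushort    = Word;       // unsigned short
  t_int       = Integer;    // int (and long on Windows)
  t_uint      = Cardinal;   // unsigned int (and unsigned long on Windows)
  t_longlong  = Int64;      // long long
  t_ulonglong = UInt64;     // unsigned long long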
These articles helped resolve the missing bits:
- Integer (computer science) – Wikipedia, the free encyclopedia.
- Internal Data Formats (Delphi) – RAD Studio XE2.
- c++ – What is an unsigned char? – Stack Overflow.
Now we can do SOA between System i (a.k.a. iSeries, a.k.a. AS/400) and Windows 7.
–jeroen
DelphiTools.info: Call for DWScript showcase
I’m considering setting up a DWScript showcase webpage. If you’re using DWScript and want to be on that page, please mail the following to “eric at delphitools.info”:
- short description of how DWScript is used
- screenshot of the application where it’s used
- website or product link
Use cases don’t have to be fancy or pretty; industrial and command-line uses are acceptable too! The screenshot doesn’t have to show the script, just be representative of the application.
Besides the showcase, this is also to get a feel for the variety of uses of DWScript out there, and to help figure out which way to go with future developments. So even if you don’t have anything I could post, don’t hesitate to drop a mail to let me know what you use it for.
Alternatively you can post a comment, but for spam prevention, comments don’t stay open very long on posts here.