Thursday, December 17, 2009

Mono: C# compiler bug with property inheritance

The bug appeared quite unexpectedly.
The code sample below compiled fine in Visual Studio, but the Mono C# compiler rejected it with: Compiler Error CS0546: 'Derived2.Accessor.set': cannot override because 'Base.Accessor' does not have an overridable set accessor.
It had to be a compiler bug, I thought to myself, and a search of the Mono Bugzilla confirmed it.

Here is the code sample that produces the error (the workaround is described below it):

abstract class Base
{
    public virtual string Accessor
    {
        get { return "base"; }
        set { }
    }
}

class Derived1 : Base
{
    public override string Accessor
    {
        get { return base.Accessor; }
    }
}

class Derived2 : Derived1
{
    public override string Accessor
    {
        set { }
    }
}
The workaround is to add a set accessor to the property in the Derived1 class:
    public override string Accessor
    {
        get { return base.Accessor; }
        set { base.Accessor = value; }
    }
Happy coding :)

Sunday, December 13, 2009

MSSQL DATEDIFF Equivalent in MySQL

Recently I was porting T-SQL (MSSQL) code to the SQL dialect used by MySQL.
The process went smoothly until I got stuck on dates, especially on intervals between two dates. In T-SQL the datediff function is used to get the interval between two datetime values.

Let us consider this T-SQL sample:

declare @d1 datetime;
declare @d2 datetime;
set @d1 = '2009-01-18 15:22:01'
set @d2 = '2009-01-19 14:22:01'
select datediff(hour, @d1, @d2) as hour, 
       datediff(day, @d1, @d2) as day,
       datediff(second, @d1, @d2) as second
Query results are:
hour  day  second
23    1    82800

After some searching I found that the MySQL equivalent is:
set @d1 = '2009-01-18 15:22:01';
set @d2 = '2009-01-19 14:22:01';
select timestampdiff(hour, @d1, @d2) as hour,
       timestampdiff(day, @d1, @d2) as day,
       timestampdiff(second, @d1, @d2) as second;
Query results are:
hour  day  second
23    0    82800
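The results are nearly the same; only the day difference differs. DATEDIFF in MSSQL counts the datepart boundaries crossed between the two values, so the single midnight between January 18 and January 19 counts as one day even though only 23 hours elapsed. TIMESTAMPDIFF in MySQL counts complete units, so it returns 0 days.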

In general we can think of timestampdiff (MySQL) as a close, but not exact, equivalent of datediff (MSSQL). To make the results truly equal, it is better to get the difference in seconds and then calculate the required interval (hours, days) from it.
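
The difference between the two counting rules can be sketched outside of SQL. The Python snippet below is only an illustration of the idea, not of how either database implements it; the function names are made up for this example. One function counts midnights crossed (the DATEDIFF-like rule for days), the other counts complete 24-hour periods (the TIMESTAMPDIFF-like rule).

```python
from datetime import datetime

SECONDS_PER_DAY = 86400

def days_by_boundaries(start: datetime, end: datetime) -> int:
    """Count midnights crossed between start and end (DATEDIFF-like)."""
    return (end.date() - start.date()).days

def days_by_elapsed_time(start: datetime, end: datetime) -> int:
    """Count complete 24-hour periods between start and end (TIMESTAMPDIFF-like)."""
    elapsed_seconds = (end - start).total_seconds()
    return int(elapsed_seconds / SECONDS_PER_DAY)  # int() truncates toward zero

d1 = datetime(2009, 1, 18, 15, 22, 1)
d2 = datetime(2009, 1, 19, 14, 22, 1)

print(days_by_boundaries(d1, d2))      # 1
print(days_by_elapsed_time(d1, d2))    # 0
print(int((d2 - d1).total_seconds()))  # 82800
```

Starting from the difference in seconds, as suggested above, gives the same number on both sides, and the day or hour count can then be derived with whichever rounding rule the application needs.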

Saturday, November 21, 2009

String Compare Performance in D

A while ago I measured the performance of string comparison in .NET. Today I played with the string type in the D programming language and decided to run similar tests.

D is a relatively new language with good prospects (by the way, Andrei Alexandrescu is involved in D development).

Here is the relatively simple code that I used for the tests:

import std.string;
import std.date;
import std.stdio;

string cmp1 = "SomeString";
string cmp2 = "Someotherstring";
auto iters = 100000;

void test_f()
{
    for(int i = 0; i < iters; i++) { auto res = icmp(cmp1, cmp2); }
}

void test_equal()
{
    for(int i = 0; i < iters; i++) { auto res = cmp1 == cmp2; }
}

void main()
{
    ulong[] results;
    int mean1 = 0;
    int mean2 = 0;
 
    for(int i = 0; i < 5; i++)
    {
        results = benchmark!(test_f, test_equal)(1, results);
        mean1 += results[0];
        mean2 += results[1];
        writefln("Test_f: %d", results[0]);
        writefln("Test_equal: %d", results[1]);
    }
 
    mean1 = mean1 / 5;
    mean2 = mean2 / 5;
 
    writefln("Mean Test_f: %d", mean1);
    writefln("Mean Test_equal %d", mean2);
}
The test results were interesting:
Test_f: 21
Test_equal: 2
Test_f: 21
Test_equal: 2
Test_f: 22
Test_equal: 2
Test_f: 14
Test_equal: 1
Test_f: 12
Test_equal: 1
Mean Test_f: 18
Mean Test_equal 1
18 milliseconds for 100000 iterations looks pretty nice. It is faster than the case-insensitive ordinal string comparison in .NET, which, if you remember, completed in about 26 milliseconds. The second test function just compares the two values for equality. I assume it does little more than a pointer and length check; the two strings have different lengths, so the comparison can most likely stop right there.

I have decided to give this relatively new language a try, so it is quite possible that there will be more posts about D on this blog :).

Wednesday, November 18, 2009

Local Computer Connection Failure When Using ActiveSync

Not so long ago I encountered a strange problem with Windows Mobile device connectivity when using ActiveSync.

Here is the long story cut short, and the solution I came up with.

The server software was located on Host1. A Windows Mobile device was supposed to connect to that server software.

When the device was cradled and connected to Host1's ActiveSync, it was unable to open a connection to the server software. However, it could connect perfectly well to the same server software located on Host2. Strange?

It turned out that for local connections (connections to the host where ActiveSync is running) ActiveSync replaces the remote address the device wants to connect to with 127.0.0.1, the loopback address.
The server software was listening on one particular IP address (at that time this was by design), so it never saw connections arriving on loopback.
I discovered this by using the netstat command while connecting with the Windows Mobile browser to the local IIS web server.
The command I used was: netstat -anbp tcp

The obvious fix: listen on 0.0.0.0 (in .NET it is IPAddress.Any).
Any time a connection to the local computer from a cradled device fails, check whether the software you are connecting to listens on all IP addresses or is configured to listen on loopback (127.0.0.1) too.
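
As a rough illustration of the fix, here is a small Python sketch (the original server was .NET; the helper names `open_listener` and `can_connect` are invented for this example). A listener bound to 0.0.0.0 accepts connections that arrive on the loopback address, which is what a cradled device ends up using.

```python
import socket

def open_listener(host="0.0.0.0", port=0):
    """Return a TCP socket listening on host:port (port 0 lets the OS pick one)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen(5)
    return listener

def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Usage: a listener on all addresses is reachable through loopback.
# listener = open_listener("0.0.0.0")
# port = listener.getsockname()[1]
# can_connect("127.0.0.1", port)
```

If the listener were bound to one specific non-loopback address instead, a connection to 127.0.0.1 on the same port would normally be refused, which matches the behaviour described above.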

Thursday, November 12, 2009

Performance Issues When Comparing .NET Strings

Every time you want to use string.Compare("str1", "str2", true) for case-insensitive string comparison, think twice.

To illustrate my point, here is an example:


// Requires: using System; using System.Diagnostics;
int iters = 100000;
string cmp1 = "SomeString";
string cmp2 = "Someotherstring";

// First: culture-sensitive, case-insensitive comparison
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < iters; i++)
{
    int res = string.Compare(cmp1, cmp2, true);
}
sw.Stop();

Console.WriteLine("First:" + sw.ElapsedMilliseconds);

// Second: ordinal, case-insensitive comparison
sw = Stopwatch.StartNew();
for (int i = 0; i < iters; i++)
{
    int res = string.Compare(cmp1, cmp2, StringComparison.OrdinalIgnoreCase);
}
sw.Stop();

Console.WriteLine("Second:" + sw.ElapsedMilliseconds);

Quick question: which method is faster, the first or the second?

...
...

Here is my result in milliseconds:
First:77
Second:26

Wow, the second sample is nearly 3 times faster!
This is because the first method uses culture-specific information to perform the comparison, while the second uses the ordinal compare method (it compares the numeric values of the characters in each string).

From this we can deduce a general rule of thumb: when culture-specific string comparison is not required, use the second way; otherwise use the first one.
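
To make "compares the numeric values of the characters" more concrete, here is a loose Python sketch of an ordinal, case-insensitive comparison. It is an approximation for illustration only, not the .NET implementation: the function name is invented, Python compares Unicode code points rather than UTF-16 code units, and the result is normalised to -1, 0 or 1.

```python
def ordinal_ignore_case_compare(a: str, b: str) -> int:
    """Compare two strings by character value after upper-casing both."""
    upper_a, upper_b = a.upper(), b.upper()
    # Python compares str values code point by code point, with no culture rules.
    return (upper_a > upper_b) - (upper_a < upper_b)

print(ordinal_ignore_case_compare("SomeString", "Someotherstring"))  # 1
print(ordinal_ignore_case_compare("name1", "Name1"))                 # 0
```

No culture tables are consulted here: after the case mapping, the comparison is plain arithmetic on character values, which is why this kind of comparison tends to be cheaper.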

Thursday, September 24, 2009

Howto: C++ Class Conversion Operator in .CPP file

In case someone does not know how to do this: it took me some time to figure out the right syntax for implementing a conversion operator in a .cpp file. Here is how to declare and define one.

In the header (.h) file we declare the TestCase class:

#include <string>

class TestCase
{
public:
    operator std::string ();
};
In the .cpp file, the definition must qualify the operator name with TestCase::.
The definition of the "to std::string" conversion operator looks like this:
TestCase::operator std::string()
{
   std::string msg("TestCase internals");
   return msg;
}
Now we can use this operator in code:
TestCase testClass;
std::string msg = testClass;
The msg variable will contain the string "TestCase internals".

Tuesday, June 23, 2009

Complex Keys In Generic Dictionary

Let us start with a quiz about the generic dictionary.

Dictionary<string, string> simpleDict = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
simpleDict["name1"] = "value";
simpleDict["Name1"] = "value2";
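What value will simpleDict["name1"] return? (It returns "value2": under the case-insensitive comparer both keys are equal, so the second assignment overwrites the first.)

Now let's get back to using complex keys in a generic dictionary.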

The .NET Framework provides the IEqualityComparer<T> interface, which a dictionary can use to distinguish between different keys.

Imagine we have a complex class that we want to use as a key in our dictionary.
public class ComplexKey
{
    public int Part1 { get; set; }
    public string Part2 { get; set; }
}
The implementation of the comparer looks like this:
public class ComplexKeyComprarer : IEqualityComparer<ComplexKey>
{
    public bool Equals(ComplexKey x, ComplexKey y)
    {
        return x.Part1.Equals(y.Part1) && x.Part2.Equals(y.Part2);
    }

    public int GetHashCode(ComplexKey obj)
    {
        return obj.Part1.GetHashCode() ^ obj.Part2.GetHashCode();
    }
}
Having created the comparer, we can now instantiate the dictionary and work with complex keys in the same way as with simple ones.
Dictionary<ComplexKey, string> complexDict =
    new Dictionary<ComplexKey, string>(new ComplexKeyComprarer());

ComplexKey ck1 = new ComplexKey() { Part1 = 1, Part2 = "name1" };
ComplexKey ck2 = new ComplexKey() { Part1 = 1, Part2 = "name2" };

complexDict[ck1] = "value1";
complexDict[ck2] = "value2";

Very convenient by the way :)
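
For comparison, the same idea can be sketched in Python, where the equality and hashing rules live on the key class itself rather than in a separate comparer. This is an analogy only, not .NET code; the attribute names `part1` and `part2` simply mirror the C# properties above.

```python
class ComplexKey:
    """A two-part key that is equal to another key when both parts match."""

    def __init__(self, part1: int, part2: str):
        self.part1 = part1
        self.part2 = part2

    def __eq__(self, other):
        if not isinstance(other, ComplexKey):
            return NotImplemented
        return (self.part1, self.part2) == (other.part1, other.part2)

    def __hash__(self):
        # Equal keys must produce equal hashes, so hash the same parts used in __eq__.
        return hash((self.part1, self.part2))

complex_dict = {}
complex_dict[ComplexKey(1, "name1")] = "value1"
complex_dict[ComplexKey(1, "name2")] = "value2"

# A new key object with the same parts finds the existing entry.
print(complex_dict[ComplexKey(1, "name1")])  # value1
```

The rule is the same in both languages: whatever fields take part in the equality check must also be the ones that feed the hash code.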

Thursday, May 07, 2009

Check If Local Port Is Available For TCP Socket

From time to time we need to check whether a given port is free. This can be part of a setup action where we install a server product and want to make sure that the TCP listener will start without any problems.

How do we check whether a port is busy? By trying to listen on it.

Version #1

bool IsBusy(int port)
{
    Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream,
        ProtocolType.Tcp);
    try
    {
        socket.Bind(new IPEndPoint(IPAddress.Any, port));
        socket.Listen(5);
        return false;
    }
    catch { return true; }
    finally { if (socket != null) socket.Close(); }
}
This version has a flaw. If another process is listening on the specified port, this code can still return false in some cases, because the bind does not request exclusive access. Our code then thinks the port is free while it is not. Remember, we are checking port availability, so we need exclusive access to the port.

Version #2
bool IsBusy(int port)
{
    Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream,
        ProtocolType.Tcp);
    try
    {
        // Request exclusive use of the address before binding.
        socket.SetSocketOption(SocketOptionLevel.Socket,
            SocketOptionName.ExclusiveAddressUse, true);

        socket.Bind(new IPEndPoint(IPAddress.Any, port));
        socket.Listen(5);
        return false;
    }
    catch { return true; }
    finally { if (socket != null) socket.Close(); }
}
This version of the code is much better. It tries to bind the endpoint with exclusive access. If some other process is listening on the port, or an established connection is bound to the port, an exception will be thrown.

And, of course, there is another way to perform the check, using classes from the System.Net.NetworkInformation namespace:
bool IsBusy(int port)
{
    IPGlobalProperties ipGP = IPGlobalProperties.GetIPGlobalProperties();
    IPEndPoint[] endpoints = ipGP.GetActiveTcpListeners();
    if (endpoints == null || endpoints.Length == 0) return false;
    for (int i = 0; i < endpoints.Length; i++)
        if (endpoints[i].Port == port)
            return true;
    return false;
}
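
The bind-and-listen check from Version #2 can also be sketched in Python. This is a rough equivalent under stated assumptions, not a port of the C# code: the `host` parameter is added for illustration, and the exclusive-use option is applied only where Python exposes `socket.SO_EXCLUSIVEADDRUSE` (Windows).

```python
import socket

def is_busy(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP listener cannot be started on host:port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # On Windows, ask for exclusive use of the address, as Version #2 does.
        if hasattr(socket, "SO_EXCLUSIVEADDRUSE"):
            probe.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
        probe.bind((host, port))
        probe.listen(5)
        return False
    except OSError:
        return True
    finally:
        probe.close()
```

Like any check of this kind, the answer is only valid at the moment of the call: another process can take the port between the check and the real listener starting.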

Monday, April 27, 2009

Discovering System Endianness


When doing network programming we send bytes to and receive bytes from peers. These bytes sometimes constitute complex protocols.

Let us assume we have a simple message exchange protocol with a header and some data. Like in the picture here.

The header contains the size of the message. When a peer receives bytes from the network, it first reads the header (which has a fixed length), then decodes the information in the header (data size etc.) and finally tries to read the specified number of bytes from the network.

Network applications usually have at least two peers, and these peers can be hosted on different systems. Say, peer1 runs on Windows, while peer2 is a Java app running in a Unix environment.
Our protocol contains an integer value that holds the size of the message's data. This value is 4 bytes long.

Windows on x86 hardware stores numbers in little-endian order, that is, the least significant byte is placed at the lowest address. Java does the opposite: its data streams write numbers in big-endian order, most significant byte first. This difference in byte order is known as endianness.

To transfer multi-byte values over the network in a uniform manner, a convention was established: multi-byte data is written to the network in big-endian byte order.

Earlier I said that Windows is a little-endian system. Do you really believe me?
If you do not, check for yourself. Here is how you can do this in C#.
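
As a small illustration of the protocol described above, the following Python sketch writes a 4-byte size header in network (big-endian) byte order in front of the data and reads it back. The function names and the exact header layout (a single unsigned 32-bit size) are assumptions made for this example.

```python
import struct

# "!I" = one unsigned 32-bit integer in network (big-endian) byte order
HEADER = struct.Struct("!I")

def encode_message(payload: bytes) -> bytes:
    """Prefix the payload with its size as a 4-byte big-endian integer."""
    return HEADER.pack(len(payload)) + payload

def decode_message(buffer: bytes) -> bytes:
    """Read the size from the header, then return exactly that many data bytes."""
    (size,) = HEADER.unpack_from(buffer, 0)
    body = buffer[HEADER.size:HEADER.size + size]
    if len(body) != size:
        raise ValueError("incomplete message")
    return body

frame = encode_message(b"hello")
print(frame)                  # b'\x00\x00\x00\x05hello'
print(decode_message(frame))  # b'hello'
```

Because both peers agree on the byte order of the header, it does not matter whether either machine is little-endian or big-endian internally.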

First approach:

int number = 0x00000001;
byte[] bytes = BitConverter.GetBytes(number);
bool isBigEndian = bytes[0] == 0x00;
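Second approach (for geeks; it needs an unsafe context and the /unsafe compiler switch):
int number = 0x00000001;
byte* p = (byte*)&number; // look at the first byte, not the whole int
bool isBigEndian = p[0] == 0x00;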
And finally the third one:
bool isBigEndian = !BitConverter.IsLittleEndian;
When you want to write self-contained code, the approaches above can be used to determine the endianness of the system your code runs on.

P.S. The third method is the best :).
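
For readers outside .NET, the first and third approaches translate almost directly to Python. This is a sketch for comparison; the function names are invented here, while `struct.pack` and `sys.byteorder` are standard library features.

```python
import struct
import sys

def is_big_endian_by_bytes() -> bool:
    """Pack the integer 1 in native byte order and inspect the first byte."""
    native_bytes = struct.pack("=I", 1)  # "=" means native byte order
    return native_bytes[0] == 0x00

def is_big_endian_by_flag() -> bool:
    """Ask the runtime directly, like BitConverter.IsLittleEndian in .NET."""
    return sys.byteorder == "big"

print(is_big_endian_by_bytes() == is_big_endian_by_flag())  # True
```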

Thursday, March 26, 2009

Windows Vista Defragmentation Tools

Windows Vista uses the NTFS file system by default. Sooner or later, files on it will start to fragment.

Fragmentation can lead to a significant decrease in disk I/O performance. The common way to handle this problem is a process called defragmentation.

It turns out that Vista's defragmentation tool is a little bit oversimplified.

Its user interface lacks volume fragmentation information, so it is hard to say whether my volume needs defragmentation.

Fear not, as Vista has a command-line tool: defrag.exe.

Windows Disk Defragmenter
Copyright (c) 2006 Microsoft Corp.
Description: Locates and consolidates fragmented files on local volumes to
improve system performance.

Syntax: defrag <volume> -a [-v]
defrag <volume> [{-r | -w}] [-f] [-v]
defrag -c [{-r | -w}] [-f] [-v]

Parameters:

Value Description

<volume> Specifies the drive letter or mount point path of the volume to
be defragmented or analyzed.

-c Defragments all volumes on this computer.

-a Performs fragmentation analysis only.

-r Performs partial defragmentation (default). Attempts to
consolidate only fragments smaller than 64 megabytes (MB).

-w Performs full defragmentation. Attempts to consolidate all file
fragments, regardless of their size.

-f Forces defragmentation of the volume when free space is low.

-v Specifies verbose mode. The defragmentation and analysis output
is more detailed.

-? Displays this help information.

Examples:

defrag d:
defrag d:\vol\mountpoint -w -f
defrag d: -a -v
defrag -c -v

Here's sample output for my system volume (C:):
defrag c: -a
Windows Disk Defragmenter
Copyright (c) 2006 Microsoft Corp.

Analysis report for volume C: VISTA

Volume size = 70.00 GB
Free space = 21.84 GB
Largest free space extent = 6.20 GB
Percent file fragmentation = 4 %

Note: On NTFS volumes, file fragments larger than 64MB are not included in the fragmentation statistics

You do not need to defragment this volume.

It is also possible to increase the amount of information returned by the tool: just specify the -v switch.

Now we have some information about disk fragmentation and can decide when to start the defragmentation process.

P.S. I had a strange feeling when writing this post. In an operating system like Windows Vista, with its redesigned user interface, it is awkward to use command-line tools to perform a common task like disk defragmentation.

Sunday, March 15, 2009

Image Watermarking



We all know that once an image is posted on the internet, it no longer belongs to you.

That is arguable, but nevertheless any user with a browser can simply save it to disk, and you can do nothing about it.

One way to control image distribution is digital watermarking.

In my case I had ~100 images that had to be watermarked. There are two ways to do this: manually (e.g. using Photoshop) or by writing some code to do the job automatically.

Simple watermarks with horizontal or vertical text are easy, but things become harder when the watermark text should be positioned diagonally.

After poking around the web I found this brilliant article. The code did its job well, so I wired it into a small console app, and voilà: 105 files watermarked in less than 20 seconds.

Monday, January 19, 2009

Searching for Similar Words. Similarity Metric



How can one find out whether two or more words are similar? I do not mean semantically similar (synonyms aren't taken into consideration), but visually similar.

Consider these two words: "sample1" and "sample2". Do they look similar? Well, at least they have the same start, "sample". The next two merely have some letters in common: "fox" and "wolf".

One of the methods that can be used to measure word similarity is Euclidean distance, which measures the distance between two points in space. Here each character code is treated as one coordinate.

The code to measure the similarity of 2 strings:

public static double Euclidian(string v1Orig, string v2Orig)
{
    // v1 is always the longer string, v2 the shorter one
    string v1, v2;
    if (v1Orig.Length > v2Orig.Length)
    {
        v1 = v1Orig; v2 = v2Orig;
    }
    else
    {
        v1 = v2Orig; v2 = v1Orig;
    }

    double d = 0.0;
    for (int i = 0; i < v1.Length; i++)
    {
        if (i < v2.Length)
            d += Math.Pow((v1[i] - v2[i]), 2);
        else
            d += Math.Pow((v1[i] - 0.0), 2); // missing characters count as 0
    }
    return Math.Sqrt(d);
}

Using the code above we can get numbers that measure word similarity:
"sample1" and "sample2" give 1.0, while "wolf" and "fox" give 104.10091258005379. Identical words give 0.0. Thus the smaller the number, the more similar the two words are.

In terms of Euclidean distance, "fox" and "wolf" are farther apart than "sample1" and "sample2".

This measurement approach can be used when searching for word groups in a text.
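
The arithmetic behind those numbers is easy to check by hand. For "wolf" and "fox" the character codes give (119 - 102)^2 + (111 - 111)^2 + (108 - 120)^2 + (102 - 0)^2 = 289 + 0 + 144 + 10404 = 10837, and the square root of 10837 is about 104.1. The Python sketch below repeats the same calculation; it is a re-implementation for checking the figures, with the function name chosen for this example.

```python
import math

def euclidean(a: str, b: str) -> float:
    """Euclidean distance between two strings, treating each character code
    as a coordinate and padding the shorter string with zeros."""
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    total = 0.0
    for i, ch in enumerate(longer):
        other = ord(shorter[i]) if i < len(shorter) else 0
        total += (ord(ch) - other) ** 2
    return math.sqrt(total)

print(euclidean("sample1", "sample2"))  # 1.0
print(euclidean("same", "same"))        # 0.0
```

Note how much the zero padding dominates the result for strings of different length: the single unmatched "f" in "wolf" contributes 10404 of the 10837 total.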