Archive for August, 2007

System.Collections.Generic.Dictionary.Insert

Code for System.Collections.Generic.Dictionary<TKey,TValue>.Insert(TKey, TValue, Boolean)

Try compare this to System.Collections.Hashtable.Insert .  They are similar in many ways.

 

private void Insert(TKey key, TValue value, bool add)
{
    int index;
    if (key == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }
    if (this.buckets == null)
    {
        this.Initialize(0);
    }
    int num = this.comparer.GetHashCode(key) & 0x7fffffff;
    for (int i = this.buckets[num % this.buckets.Length]; i >= 0; i = this.entries[i].next)
    {
        if ((this.entries[i].hashCode == num) && this.comparer.Equals(this.entries[i].key, key))
        {
            if (add)
            {
                ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
            }
            this.entries[i].value = value;
            this.version++;
            return;
        }
    }
    if (this.freeCount > 0)
    {
        index = this.freeList;
        this.freeList = this.entries[index].next;
        this.freeCount--;
    }
    else
    {
        if (this.count == this.entries.Length)
        {
            this.Resize();
        }
        index = this.count;
        this.count++;
    }
    int num4 = num % this.buckets.Length;
    this.entries[index].hashCode = num;
    this.entries[index].next = this.buckets[num4];
    this.entries[index].key = key;
    this.entries[index].value = value;
    this.buckets[num4] = index;
    this.version++;
}

System.Collections.Hashtable.Insert

Here is the source code for System.Collections.Hashtable.Insert(Object, Object, Boolean)

 

[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
private void Insert(object key, object nvalue, bool add)
{
    uint seed;
    uint incr;
    if (key == null)
    {
        throw new ArgumentNullException("key", Environment.GetResourceString("ArgumentNull_Key"));
    }
    if (this.count >= this.loadsize)
    {
        this.expand();
    }
    else if ((this.occupancy > this.loadsize) && (this.count > 100))
    {
        this.rehash();
    }
    uint num3 = this.InitHash(key, this.buckets.Length, out seed, out incr);
    int num4 = 0;
    int index = -1;
    int num6 = (int) (seed % this.buckets.Length);
Label_0071:
    if (((index == -1) && (this.buckets[num6].key == this.buckets)) && (this.buckets[num6].hash_coll < 0))
    {
        index = num6;
    }
    if ((this.buckets[num6].key == null) || ((this.buckets[num6].key == this.buckets) && ((this.buckets[num6].hash_coll & 0x80000000) == 0)))
    {
        if (index != -1)
        {
            num6 = index;
        }
        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;
        this.buckets[num6].val = nvalue;
        this.buckets[num6].key = key;
        this.buckets[num6].hash_coll |= (int) num3;
        this.count++;
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }
    else if (((this.buckets[num6].hash_coll & 0x7fffffff) == num3) && this.KeyEquals(this.buckets[num6].key, key))
    {
        if (add)
        {
            throw new ArgumentException(Environment.GetResourceString("Argument_AddingDuplicate__", new object[] { this.buckets[num6].key, key }));
        }
        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;
        this.buckets[num6].val = nvalue;
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }
    else
    {
        if ((index == -1) && (this.buckets[num6].hash_coll >= 0))
        {
            this.buckets[num6].hash_coll |= -2147483648;
            this.occupancy++;
        }
        num6 = (int) ((num6 + incr) % ((ulong) this.buckets.Length));
        if (++num4 < this.buckets.Length)
        {
            goto Label_0071;
        }
        if (index == -1)
        {
            throw new InvalidOperationException(Environment.GetResourceString("InvalidOperation_HashInsertFailed"));
        }
        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;
        this.buckets[index].val = nvalue;
        this.buckets[index].key = key;
        this.buckets[index].hash_coll |= (int) num3;
        this.count++;
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }
}

Social Engineering And Monkeys

Here’s how I heard it.  This is an example of social/psychological experimentation and there is something we can learn from it.

Researchers in a lab were carrying out an experiment.  There was a cage with monkeys, and there was bananas at the top of a ladder on a platform.  Every time a monkey tried to climb up the ladder, every monkey in the cage would get shocked (how cruel).  Eventually, if anyone tried to climb up the ladder, all the monkeys would get mad and tackle the monkey down to the floor and beat him up.  Eventually, what they did is they removed 1 of the monkeys, and put a new monkey in the cage.

Obviously, he had no idea what was going on and when he tried to climb up the ladder, the monkeys got mad and beat him up.  And then the researchers continued doing this until not a single monkey was there from the initial experiment (they were all green monkeys). 

Now, if anyone tried to climb up the ladder, they would all beat him up, yet none of them had a clue why they were doing it……

Sometimes at my job I am reminded of this.  We are doing things which make no sense, and if anyone asks why…. the answer is, well, "because that’s how we always did it."

Bananas anyone?

.NET Data Structures, Algorithms and Patterns Implementation

Here are some common data structures implemented in .NET C# that you can download for free and even contribute to the project if you like (open source).  Actual links to the source code for each of the below algorithms to be added in the near future.  Learning these Computer Science algorithms and patterns will allow you to more effectively write your code.  For example, if your data is already sorted, then it would make sense to use an algorithm such as Binary Search, that depends on a sorted list.  Also, these below implentations are not included in the original .NET 2.0 CLR, so if you are looking for a generalized Priority Queue or Red Black Tree you will need to download the below code.  Keep in mind it is not perfect, and it is NOT for everyone, but it is a good starting point.  Wikipedia has details on most of the below algorithms such as Merge Sort which will explain in which cases it would be good to use and which cases it would not be so good.

Actual links to the source code for each of the below algorithms to be added in the near future.  This page is still being improved.

Association<TKey, TValue>
VisitableHashTable<TKey, TValue>

Bubble Sort
Summary: Repeatedly swap elements until the order is correct.  Works fast if the list is almost sorted.  Doesn’t work so well if you have small elements near the bottom of the list.

Dijkstra’s Single Source Shortest Path algorithm
Summary: Graph "greedy" algorithm that solves the single-source shortest path problem for a directed graph with non negative edge weights.

Bag<T>

VisitableLinkedList<T>

Bucket Sort
Summary: Bucket sort is not a comparison sort, uses buckets, and in some cases can run in linear time


Prim’s Minimal Spanning Tree Algorithm
Summary: A graph algorithm that finds a minimum spanning tree.

BinaryTree<T>
Summary: A tree that each node has at most 2 children.  This implementation is a generic tree of type T.  For example you can create a Binary Tree of integers, or a Binary Tree of strings, or whatever you like.

VisitableList<T>

Cocktail Sort , Shaker Sort
Summary: Also known as bidirectional bubble sort, is both a stable sorting algorithm and a comparison sort.

BinarySearchTree<TKey, TValue>

VisitableQueue<T>

Comb Sort
Summary: An improved bubble sort that rivals the speed of Quick Sort

Deque<T>

VisitableStack<T>

Gnome Sort
Summary: A sort similar to insertion sort but uses a series of swaps to get the right order of elements.  Runs in O(n2) time.

Fibonacci number generation.
Summary: Fibonacci numbers represent a pattern of reproduction numbers found in nature, such as how the population of rabbits will grow over time.  Here are a few values:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …

GeneralTree<T> 
Summary: An implementation of a tree in .NET

Heap Sort
Summary: Usually slower than quicksort, a ‘selection sort’ algorithm that works worst case O(n log n).  As well its in-place but not a ‘stable sort’.

Euclid’s Algorithm  (also known as Euclidean algorithm)
Summary: An algorithm to find the GCD (greatest common divisor) of two numbers.  For example on 16, 18, the GCD is 2.

Graph<T> 
Summary: An implementation of a graph in .NET

Insertion Sort
Summary: Very simple but also very slow sort.

Heap<T>
Summary: A heap is a specialized tree type data structure. (more details to be added).

Merge Sort
Summary: A classic divide and conquer algorithm that works by breaking up the data into two parts and sorting each one recursively.  It runs in O(n log n) time.

Matrix 

Odd-Even Transport Sort 

Pascal Set
Summary: You can perform operations on set such as Set Union, Set Difference, Set Intersection, etc.

Quick Sort 
Summary: A popular sorting algorithm that works in O(n log n) time on average. 

Priority Queue<T> 

Selection Sort
Summary: An in place sorting algorithm (meaning it does not require extra storage space) that performs in O(n
2) time, meaning it is not usually practical for large lists, but is known for its simplicity.
SkipList<TKey, TValue> 

Sorted List<T> 

Shell Sort
Summary: An improvement over insertion sort.  Runs in O(n log 2 n)

Red Black Tree<T>
Summary: Self balancing binary search tree (i.e. even if you insert too much it will automatically reorganize itself into a binary search tree).  It can amazingly search, insert, and delete in O(log n) time, meaning if you have 1000 elements, it will take roughly 3 operations to complete!

ReadOnlyPropertyCollection <T, TProperty>

Object Matrix<T>   

Hash List<TKey, TValue> 

Complex Number
Summary: The complex number (also known as imaginary number) i has the value Square Root of -1.  You can do math on complex numbers using this class.

Circular Queue
Summary: A queue that loops around

Splay Tree
Summary: Another self balancing search tree.  Runs in O(log n) amortized time.

Kruskal’s Algorithm
Summary: A graph algorithm that finds a minimum spanning tree for a connected weighted graph.

Hypotenuse

Tons of stuff to the Matrix and ObjectMatrix classes.

Vectors

Topological Sort for the Graph data structure.
Summary: Sorts your graph topologically (i.e. a linear ordering of its nodes).

Radix Sort
Summary: Integer sorting algorithm that works on individual digits.


Download it here (ZIP format, binaries, Version 1.3 Alpha)

(Released under the Microsoft Permissive License)
Originally posted here, and the new page is here that includes the source code incase you want to play around with it and possibly submit a better implementation you might have thought up!

Are You a Sharp Developer?? (Part 2)

This is the second post in the series "What Makes a Sharp Developer"
The next article in this series will discuss keeping up with the bleeding edge – Visual Studio 2008 (Orcas), Silverlight, .NET 3.5, LINQ, .. and more

This post will focus on: balanced lifestyle, staying away from drugs, and keeping a healthy balance between learning and working.  Overall this article discusses how you can become a more valuable resource to your team and to your employers.
 

 

Lead a Balanced Lifestyle

Many of us in the development or business field unfortunately tend to overwork ourselves.  We work late nights at the office surviving on Java alone (caffeine, that is, not Sun Microsystems JavaR).  We take our work home in the evenings, or we might be doing stuff on the weekend that is work related so that we have a head start during the week.  I feel this tendency to overwork ourselves is part of our human nature.  We have to protect ourselves against this and yearn to keep a balance.  Although there is some overlap, I like to divide my time into three categories – Work time, Time for God, and Time for my family (don’t forget Time for Myself).  The difficult part is keeping the three in balance and perspective. 

If we start doing fifty or sixty hour weeks consistently, our quality of work will definitely go down.  It is unfortunate that some companies do not understand this, and push their developers to work longer and harder despite the fact that inevitably their performance will suffer.  As well if your family life suffers, that will not be good for you at work.  Your wife (or husband) might end up putting extra hot sauce in your sandwich because you didn’t spend time with her for the last six weeks and your stomach isn’t going to like that.

 

Stay away from the drugs

One of the keys to success is to stay away from drugs.  That includes alcohol, nicotine, and…. that includes caffeine, man (Bummer!).  Actually caffeine is one of my favourite drugs :(.  But keep it in moderation.  If you overdo it, or if you become dependent on caffeine to survive, you are going to be at a weakness to those who can survive without their daily injection.  Also it hides your tiredness, especially if you start to use it regularly even when you aren’t working, you might not realize just how tired you are.  If you notice you are coming home with bad headaches, or are feeling dizzy or other strange things are happening, THOSE ARE NOT A GOOD SIGNS.

 

A balance between working and learning

This one is a tricky one.  It depends a lot on your work environment and how tight your deadlines are.  I am suggesting that you make sure that no matter what, each day you learn something new.  Do not spend so much time on your task so that you are completing it as fast as possible without taking the time to investigate or find better solutions.  Don’t assume that just because you did this four times before, that the fifth time you should do the same thing.  It could be possible that you did it a silly way four times and you are just going to make it the fifth now!  There have been so many cases where I have been repeating my same noobness (i.e. lack of expertise) by declaring my SQL parameters as an ArrayList for example, whereas I could do it without the additional memory overhead and then conversion into an array by declaring my SQLParameters directly into an array with the exact size I need!  Sometimes employees think that they are not doing their company a favour if they "waste" time reading stuff like this. 

Infact, I beg to differ!!  On the contrary, you will be a MUCH more valuable employee if you insist on taking the time to learn things.  An employee who consistently learns is the last one to get fired, whereas the old dog who can’t learn new tricks is going to be the first one to have to sleep outside (so to speak).  Please, this is so important.  If your boss doesn’t give you time for this, either get a new job, or work late so that you can spend an hour a day on this.  As well, this will give you a bit of job security incase you end up being the one who is laid off, because you have been keeping up with the improvements and constantly sharpening your skills, you will be more in touch with the technology than the ones sitting on their old skills.

As a PS, I would also recommend looking at current job postings to see what technologies are in demand, and as well try to see if you can figure out trends (i.e. which direction the market is heading and what technologies are up and coming.)

Check if a record exists using IF EXISTS instead of COUNT(*) for increased performance

Summary:

  1. How to find out if a record exists in a table efficiently using the IF EXISTS keyword, and 
  2. How to test two queries to see which is faster using SQL Server Management Studio.  This articles teaches you how to tune and tweak your queries.

If you are writing a query to find out if a record exists in your table, you might not be doing it the fastest way if you are using COUNT(*).
i.e. SELECT COUNT(*) FROM USERS where USERID=@UID
and then from your code you are checking if the count is greater than 0.

This can be efficiently re-written as

IF EXISTS (SELECT * FROM USERS WHERE USERID=@UID) select 1 else select 0


That is very easy.  It is much more efficient.
How do we know its more efficient?

You can use SQL Server Management Studio to find out.  What you have to do is write your two queries,

--testing variables
declare @UID int
set @UID = 1
--actual query 1
SELECT COUNT(*) FROM USERS where USERID=@UID
--actual query 2
IF EXISTS (SELECT * FROM USERS WHERE USERID=@UID) select 1 else select 0

Select all, Click on the button to generate the query execution plan (screenshot below – click to enlarge)

Display Execution Plan (click to enlarge)

And you will see that it will give you a percentage breakdown for each query.  The first two lines, creating testing variables, you can ignore.  However, looking at the last two queries will show you the breakdown of effort required to run those queries.  It will be similar to 99% for query 1, and 1% for query 2.  This is a quick way to find which query should run faster and you can tweak the queries until you get the results you want.

Execution Plan Results (click to enlarge)

What that means is 98% of the work has to go into the first query (out of 2) and 2% for the second.  That gives you a rough … idea… of which one is faster.  Sometimes it will be more evenly broken down, closer to 50/50, and in that case they are roughly equivalent.

Optimization WordPress Plugins & Solutions by W3 EDGE