Monday, September 24, 2012

What's in a name? A rose by any other name would smell as sweet. Everything’s in a name when it comes to code and some other things.

I have two incidents from last week to share. The first was when a colleague complained about the quality of some existing code he was looking at, particularly the variable names. One variable he was trying to decipher when he came to me was named ‘cache_2008_temp’. I told him that when variables are badly named, I don’t make any assumptions about them and instead read the code as if the variables were named ‘a’, ‘b’ and the like. I mean, what was the developer thinking when he named a variable like that: a cache, from 2008, which is temporary? That is utter nonsense.

 
The other incident took place when I wanted to cook some quick pasta for lunch and asked my wife if we had any ready-made pasta sauce (sometimes I am too lazy to cook fresh pasta sauce). She said that it was in the refrigerator, and I found the bottle. Yes, that pasta sauce bottle is the one in the picture.
When I was ready to pour the sauce into the pan with the cooked fusilli, I opened the lid and dipped my spoon in to get some sauce. As I was about to mix it in, I realized that the texture and the smell of the sauce were different. I tasted a little bit and guess what? It wasn’t pasta sauce. It was in fact a hot red chili mix. My wife had washed and reused a pasta sauce bottle but had left the label on it. Imagine what would have happened if I had used the mix as it was.
 
Having made the point about the importance of names, let me quickly summarize why it is important to choose good names. These are things I have learned over the years from my own experience and from Uncle Bob’s book Clean Code.

Names shouldn’t spread disinformation.
In both the pasta example above and the variable ‘cache_2008_temp’, a certain amount of disinformation is being spread. If the variable had been named ‘a’ and the bottle had had no label, they wouldn’t have tempted anyone to make assumptions. Names which spread disinformation are dangerous because there is a chance that their intent will be read wrongly.
 
Good names make code self explanatory.
I have picked the following code snippet from a previous post of mine:
//code snippet 1
static void Main(string[] args)
{
    var now = DateTime.Now;
    var nyTimeZoneInfo = TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
    var nyNow = TimeZoneInfo.ConvertTime(now, nyTimeZoneInfo);
    var nyTodays17Hours = nyNow.Date.AddHours(17);
    var nyNext17Hours = nyNow < nyTodays17Hours ? nyTodays17Hours : nyTodays17Hours.AddDays(1);
    var localTimeWhenNyNext17HoursOccurs = TimeZoneInfo.ConvertTime(nyNext17Hours, nyTimeZoneInfo, TimeZoneInfo.Local);

    Console.WriteLine("localTimeWhenNyNext17HoursOccurs is " + localTimeWhenNyNext17HoursOccurs);

    Console.ReadKey();
}

The code calculates the local time at which 17:00 next occurs in New York. The variable names make the code quite self-explanatory. Names like localTimeWhenNyNext17HoursOccurs allow you to breeze through the code when you read it. Now let me rewrite the code using bad variable names.

 



//code snippet 2
static void Main(string[] args)
{
    var now = DateTime.Now;
    var nyTimeZoneInfo = TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
    var ny1 = TimeZoneInfo.ConvertTime(now, nyTimeZoneInfo);
    var ny2 = ny1.Date.AddHours(17);
    var ny3 = ny1 < ny2 ? ny2 : ny2.AddDays(1);
    var ny4 = TimeZoneInfo.ConvertTime(ny3, nyTimeZoneInfo, TimeZoneInfo.Local);

    Console.WriteLine("localTimeWhenNyNext17HoursOccurs is " + ny4);

    Console.ReadKey();
}

The code in snippet 2 does exactly the same thing as snippet 1; the only difference is the variable names, which make the code extremely difficult to understand. With code you tell a story: your variables, methods, properties and classes play characters in that story, and if they are badly named, they make your story look bad.

 

If your variable requires a comment to understand its intent then the variable has a bad name.
Variable names shouldn’t require a comment to understand them.



//code snippet 3
List<int> cache_2008_temp; // history of records in 2008


In code snippet 3, we see that the variable ‘cache_2008_temp’ has been commented to tell us that it is used to store the history of records in 2008. But what happens in the code where it is used? Anyone who reads the code using that variable is bound to get confused. It destroys the readability of the code because the name makes no sense; it spoils the code story. Then, after some time, a developer who maintains that piece of code decides that the 2008 history is no longer necessary, so she renames the variable to ‘cache_2009_temp’ and uses it to store the history of 2009 records, but forgets to update the comment. As code evolves, comments get outdated, and hence relying on comments to understand the code can be dangerous.

Names should be intention revealing.
So instead of naming the variable in snippet 3 ‘cache_2008_temp’, we could have named it ‘historicRecordsOf2008’ or just ‘recordsIn2008’. Names should convey the intent correctly. Such names bring clarity to the code. Code cannot achieve simplicity if its names are not implicitly understood.

If you can’t pronounce a name, then the variable has a bad name.
A variable name like ‘cache_2008_temp’ is difficult to pronounce and awkward to use in discussions, whereas ‘historicRecordsOf2008’ can easily be said aloud. Use names which you can pronounce, so that you can use them when discussing the code.

To have good names you need to think, and think hard. That takes time. But as Uncle Bob says, choosing good names takes time but saves more than it takes.

Tuesday, September 11, 2012

Why do we need events when we have delegates?

Ok I understand that the answer to this question is pretty simple. But, I have seen many people faltering in interviews when asked this question.

We have a class with a public delegate and a property which exposes an instance of the delegate.

Code Snippet
public class Temp
{
    public delegate void TestEventDelegate(string input);

    public TestEventDelegate TestEvent { get; set; }

    private void RaiseTestEvent(string arg)
    {
        // Invoking a null delegate throws, so check for subscribers first.
        var handler = TestEvent;
        if (handler != null)
        {
            handler(arg);
        }
    }
}

Now TestEvent is a multicast delegate. We can create an instance of the class Temp and add our method to TestEvent:

Code Snippet
var t = new Temp();
t.TestEvent += input => Console.WriteLine("{0} received", input);

Similarly, some other class might create another instance of Temp and add its own method to TestEvent:

Code Snippet
var p = new Temp();
p.TestEvent += input => Console.WriteLine("Hello {0}", input);

When an instance of Temp calls its RaiseTestEvent method, all the methods added to that instance’s multicast delegate TestEvent are called. Now think about this for a moment. Doesn’t an event do exactly the same thing? Subscribers add their event handlers, which get called when the event is raised. So why do we need events and their fancy syntax? The answer is that an event can be raised only by the class which defines it, whereas a public delegate can be invoked (or even reassigned) from outside the class by anyone holding an instance of the class in which it is defined. This is the main purpose of an event: the class that defines it controls when the event is raised.
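The difference can be sketched with two small classes. The names WithDelegate, WithEvent, Changed and Raise are made up for this illustration, and Action<string> stands in for the custom delegate type used above. The commented-out lines are the ones the compiler rejects (error CS0070) when they appear outside the class that declares the event.

```csharp
using System;

// Hypothetical class exposing a public delegate property.
public class WithDelegate
{
    public Action<string> Changed { get; set; }
}

// Hypothetical class exposing the same thing as an event.
public class WithEvent
{
    public event Action<string> Changed;

    // Only code inside WithEvent may invoke the event.
    public void Raise(string arg)
    {
        var handler = Changed;
        if (handler != null)
        {
            handler(arg);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var d = new WithDelegate();
        d.Changed += s => Console.WriteLine("delegate: " + s);
        d.Changed("invoked from outside"); // allowed: any caller can invoke it
        d.Changed = null;                  // allowed: any caller can drop all subscribers

        var e = new WithEvent();
        e.Changed += s => Console.WriteLine("event: " + s);
        // e.Changed("x");   // does not compile outside WithEvent
        // e.Changed = null; // does not compile outside WithEvent
        e.Raise("raised by the owner");
    }
}
```

Outside the declaring class, an event only allows += and -=, which is exactly the restriction a publisher usually wants.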

Sunday, September 9, 2012

Visual Studio 2012 Premium and Ultimate can work with any unit testing framework

I have never been a fan of MSTest, and the same goes for many other developers. Until now, Visual Studio had built-in support only for unit tests written using MSTest, and those using other frameworks had to resort to external tools to run tests inside Visual Studio. That is all set to change. With Visual Studio 2012 you can now do the following:

  • Run tests from multiple frameworks. This means you can write tests using xUnit and NUnit and run them using the tools provided in Visual Studio.
  • Get code coverage results with a single click. This is an excellent feature which until now was available only through paid tools.
  • Use the Test Runner, which can be configured to run all tests after each build.
  • With the Premium edition, get the Fakes framework, which seems to be able to fake any external dependency.

Friday, August 24, 2012

Coding Standard – Naming a type when declaring generic types

When declaring type parameters in a generic type declaration, I use full names instead of single letters like T or G.

So, in the example below, instead of using TC or C, I have used TCell, and instead of using TG or G, I have used TGrid.

Code Snippet
/// <summary>
/// interface to be implemented by a class
/// which will evolve cells in a grid
/// using the game rules
/// </summary>
/// <typeparam name="TCell"> </typeparam>
/// <typeparam name="TGrid"> </typeparam>
public interface IEvolution<TCell, in TGrid>
    where TCell : ICell
    where TGrid : IGrid<TCell>
{
    /// <summary>
    /// Applies game rules on the <paramref name="currentGrid"/>
    /// object to evolve its cells
    /// </summary>
    /// <param name="currentGrid"></param>
    void Execute(TGrid currentGrid);
}

Tuesday, August 21, 2012

What is Unicode? What is UTF-8 or UTF-16? or Why do we need to specify content="text/html;charset=UTF-8" in a html page?

Have you seen this tag <meta http-equiv="content-type" content="text/html;charset=UTF-8" /> in a html file and just ignored it?

Have you come across gibberish characters on a webpage?

If the answer to either of the above questions is yes and you still haven’t figured out why, then the one-word answer is Encoding, or Character Encoding.

What is Character Encoding?

Wikipedia says that -

A character encoding system consists of a code that pairs each character from a given repertoire with something else—such as a bit pattern, sequence of natural numbers, octets, or electrical pulses—in order to facilitate the transmission of data (generally numbers or text) through telecommunication networks or for data storage.

What this means is that we need a pattern to map characters. That is, if we want to transmit the character ‘A’ from one device to another (in the same or a different network), we first encode it using the encoding pattern. At the other end we decode it using the same pattern, and then we can see the character ‘A’.

What can be an example of a pattern?

ASCII is an example of a pattern which can be used. So, as in the above example, to transmit ‘A’ from one device to another using the ASCII encoding, we first look up the value of ‘A’ in the ASCII table. We see that it is 65. We then transmit this 65. At the other end, we receive 65, look it up in the ASCII table again and realize that the sender wanted to send ‘A’.
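The round trip described above can be tried in a few lines of C#. This is only a minimal sketch using the built-in System.Text.Encoding.ASCII class; the class name AsciiRoundTrip is made up for the example.

```csharp
using System;
using System.Text;

public static class AsciiRoundTrip
{
    public static void Main()
    {
        // Sender: look up the ASCII value of the character.
        byte[] encoded = Encoding.ASCII.GetBytes("A");
        Console.WriteLine(encoded[0]); // 65

        // Receiver: map the received value back to a character.
        string decoded = Encoding.ASCII.GetString(new byte[] { 65 });
        Console.WriteLine(decoded); // A
    }
}
```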

Simple, isn’t it? There isn’t much to encoding apart from understanding the fact that it is a pattern.

So what is UTF-8, UTF-16?

ASCII is sufficient if you just want to transmit the English alphabet (upper and lower case), digits and some special characters. Standard ASCII defines 128 characters using 7 bits; each character is usually stored in 1 byte (8 bits), and 8-bit extensions of ASCII can represent at most 256 characters. So, what happens if you want to transmit a character from the Hindi language? Well, you can’t do it using ASCII encoding. So there is another standard for encoding: Unicode. Unicode aims to map every character from a universal set of characters. Search for the Character Map utility on your Windows machine to see this mapping. Each character maps to something called a ‘code point’.

So if we search for the Unicode mapping of the letter ‘A’, we will see that it is U+0041.

Now what if we want to send the Hindi character अ? We look up the Unicode mapping and find that the code point is U+0905. That was not possible using ASCII encoding. If each character were stored in exactly 2 bytes, we could represent 65,536 characters; Unicode actually defines code points up to U+10FFFF, so some characters need more than 2 bytes.
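Code points can also be looked up in code instead of in a character map utility. The sketch below uses char.ConvertToUtf32 from the .NET base class library; the CodePoints class and its Describe helper are made-up names for this illustration.

```csharp
using System;

public static class CodePoints
{
    // Formats the first code point of the text in the usual U+XXXX notation.
    public static string Describe(string text)
    {
        int codePoint = char.ConvertToUtf32(text, 0);
        return string.Format("U+{0:X4}", codePoint);
    }

    public static void Main()
    {
        Console.WriteLine(Describe("A"));  // U+0041
        Console.WriteLine(Describe("अ")); // U+0905
    }
}
```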

So, what is the problem with using Unicode this way? Nothing, except that if you store every character in 2 (or more) bytes, you use double (or more) the memory compared to ASCII to transmit the same characters when you are using only English letters or numerals.

This is where UTF-8 comes into the picture. UTF-8 is an encoding of Unicode which can represent all characters in the Unicode scheme. It does this using 1 to 4 bytes. For the first 128 characters (which match ASCII, including English letters and digits) it uses 1 byte. For others it uses 2, 3 or 4 bytes. This means that UTF-8 uses a minimum of 1 byte. UTF-16 is another variable-size encoding scheme, which uses a minimum of 2 bytes per code point.
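To see these sizes for yourself, the following sketch counts the bytes each encoding needs for three sample characters. It relies on Encoding.UTF8 and Encoding.Unicode (the .NET name for little-endian UTF-16); the emoji U+1F600 is an extra sample chosen here because it lies outside the 2-byte range, and the class name EncodingSizes is made up.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    public static void Main()
    {
        // "A" (U+0041), "अ" (U+0905) and an emoji outside the 2-byte range (U+1F600).
        string[] samples = { "A", "अ", "\U0001F600" };

        foreach (string sample in samples)
        {
            int utf8Bytes = Encoding.UTF8.GetByteCount(sample);
            int utf16Bytes = Encoding.Unicode.GetByteCount(sample); // Encoding.Unicode is UTF-16
            Console.WriteLine("U+{0:X4}: UTF-8 = {1} bytes, UTF-16 = {2} bytes",
                char.ConvertToUtf32(sample, 0), utf8Bytes, utf16Bytes);
        }
        // U+0041: UTF-8 = 1 bytes, UTF-16 = 2 bytes
        // U+0905: UTF-8 = 3 bytes, UTF-16 = 2 bytes
        // U+1F600: UTF-8 = 4 bytes, UTF-16 = 4 bytes
    }
}
```

For plain English text UTF-8 is as compact as ASCII, while the Hindi character costs 3 bytes in UTF-8 but only 2 in UTF-16.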

Further reading on encoding:

http://www.joelonsoftware.com/articles/Unicode.html

Sunday, August 19, 2012

Game Of Life implementation in C#

I have uploaded my Game Of Life implementation to GitHub. You can download it by clicking here. It is a console-based application developed in C#. It is an OO implementation with high test coverage using NUnit.

Let’s sum up the rules of game of life:

1) A live cell with exactly two OR three live neighbors survives to the next generation. It dies in all other scenarios.

2) A dead cell with exactly three live neighbors becomes alive.
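The two rules fit in a single function. The sketch below is not taken from the uploaded implementation; LifeRules and IsAliveInNextGeneration are hypothetical names used only to show the rules in code.

```csharp
using System;

public static class LifeRules
{
    // Returns whether a cell is alive in the next generation,
    // given its current state and its number of live neighbors.
    public static bool IsAliveInNextGeneration(bool isAlive, int liveNeighbors)
    {
        if (isAlive)
        {
            // Rule 1: a live cell survives with exactly two or three live neighbors.
            return liveNeighbors == 2 || liveNeighbors == 3;
        }

        // Rule 2: a dead cell comes alive with exactly three live neighbors.
        return liveNeighbors == 3;
    }

    public static void Main()
    {
        Console.WriteLine(IsAliveInNextGeneration(true, 2));  // True
        Console.WriteLine(IsAliveInNextGeneration(true, 4));  // False
        Console.WriteLine(IsAliveInNextGeneration(false, 3)); // True
    }
}
```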

This is how our interfaces are lined up:


ClassDiagram1

Thursday, May 31, 2012

A method that never returns OR objects that disappear OR how threading issues can mess up your life OR never be confident that your code is thread safe

I wasn’t really sure what the title of this post should be, hence these four titles; take your pick. Recently, I came across a strange issue where some of our records were not being displayed. We have an application with a blotter that displays real-time updates. The records get cleared at the end of the business day. By accident, one of the developers launched two instances of the application before signing off for the day. The next day she discovered that the number of records in the two instances was different, and thus began our investigation of the missing records. Our first suspect was the multithreaded code, which we thought might be causing synchronization issues. This looked like a definite threading issue, but we discarded the idea because we were confident that our code was thread safe. Basically, we don’t use any synchronization techniques like lock and instead work with objects local to the thread, hence the overconfidence. Looking at the debug logs, we discovered that some of our objects were disappearing.

I will try to add a little more detail, with code which very roughly represents the problem we had:

Update
public class Update
{
    public int Id { get; set; }

    public string Details { get; set; }
}
OnNewThread
private void OnNewThread(string dataReceivedFromServer)
{
    var update = CreateUpdate(dataReceivedFromServer);
    logger.Debug(update.Id);

    ProcessDataReceivedFromServer(update);
    logger.Debug(update.Id); // we realize our update has disappeared

    // send update to the UI
}

So the problem was that we were losing some Update objects. The parameter we were passing to the method ProcessDataReceivedFromServer was local to the thread, hence the overconfidence that there were no synchronization issues. So what was going wrong? I took a look at the method ProcessDataReceivedFromServer.

ProcessDataReceivedFromServer
private Update ProcessDataReceivedFromServer(Update update)
{
    logger.Debug(update.Id);

    ProcessDataReceivedFromServer1(update);
    logger.Debug(update.Id);

    ProcessDataReceivedFromServer2(_update);
    logger.Debug(update.Id);

    ProcessDataReceivedFromServer3(_update);
    logger.Debug(update.Id);

    ProcessDataReceivedFromServer4(_update);
    logger.Debug(update.Id);

    //sometimes the object disappeared from here
    return update;
}

The method looked okay to me, so I looked into ProcessDataReceivedFromServer1, ProcessDataReceivedFromServer2, ProcessDataReceivedFromServer3 and ProcessDataReceivedFromServer4. None of these methods had synchronization issues, which baffled me. The log statements indicated that the object had been processed by each of these methods and returned correctly. Just before the return statement the object was present, and then it just disappeared. Moreover, this didn’t happen for all objects. It occurred once in 2000 updates.

I started reviewing the code line by line until I discovered that ProcessDataReceivedFromServer2, ProcessDataReceivedFromServer3 and ProcessDataReceivedFromServer4 were not being passed the thread-local variable update, but were instead receiving an instance variable, _update. ProcessDataReceivedFromServer1 was assigning update to _update, and this was then passed to each of the other methods. Because _update is shared by all threads, another thread could overwrite it between two of these calls. So it was indeed a synchronization issue. We really didn’t need _update, but I guess when we made our code thread safe we forgot to review it thoroughly and that piece of code remained. The problem was overlooked mainly because update and _update look so similar.
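The hazard can be replayed deterministically without starting any threads. The sketch below is not our production code: SharedFieldProcessor, Step1 and Step2 are hypothetical stand-ins for the methods above, and the calls in Main are ordered by hand to mimic one unlucky interleaving of two threads.

```csharp
using System;
using System.Collections.Generic;

public class Update
{
    public int Id { get; set; }
}

// Hypothetical processor that mirrors the bug: Step1 stashes its argument
// in an instance field, and Step2 reads that field instead of a parameter.
public class SharedFieldProcessor
{
    private Update _update; // shared by every thread using this instance

    public readonly List<int> Processed = new List<int>();

    public void Step1(Update update) { _update = update; }

    public void Step2() { Processed.Add(_update.Id); }
}

public static class Program
{
    public static void Main()
    {
        var processor = new SharedFieldProcessor();
        var first = new Update { Id = 1 };
        var second = new Update { Id = 2 };

        processor.Step1(first);  // "thread A" stores update 1
        processor.Step1(second); // "thread B" overwrites the field with update 2
        processor.Step2();       // "thread A" continues but now processes update 2
        processor.Step2();       // "thread B" processes update 2 as well

        Console.WriteLine(string.Join(",", processor.Processed)); // 2,2
    }
}
```

Update 1 is never processed, which is the kind of disappearance we saw in the logs. Passing the parameter through to every step, instead of going via the field, removes the shared state altogether.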