Coordinate system in HTML5 Canvas, drawing with y-axis value increasing upwards

by Miłosz Orzeł 19. May 2013 10:06

Coordinate system in HTML5 Canvas is set up in such a way that its origin (0,0) is in the upper-left corner. This solution is nothing new in the world of screen graphics (e.g. the same goes for Windows Forms and SVG). CRT monitors, which were standard in the past, displayed picture lines from top to bottom and image within a line was created from left to right. So locating origin (0,0) in the upper-left corner was intuitive and it made creating hardware and software for handling graphics easier.

Unfortunately sometimes default coordinate system in canvas is a bit impractical. Let’s assume that you want to create projectile motion animation. It seems natural that for ascending projectile, the value of y coordinate should increase. But it will result in a weird effect of inverted trajectory:

Default coordinate system (y value increases downwards)

You can get rid of this problem by modifying y value that is passed to drawing function:

context.fillRect(x, offsetY - y, size, size);

For y = 0, projectile will be placed in a location determined by offsetY (to make y = 0 be the very bottom of the canvas, set offsetY equal to height of the canvas). The bigger the value of y the higher a projectile will be drawn. The problem is that you can have hundreds of places in your code that use y coordinate. If you forget to use offsetY just once the whole image may get destroyed. 

Luckily canvas lets you make changes to coordinate system by means of transformations. Two transformation methods will be useful for us: translate(x ,y) and scale(x, y). The former allows us to move origin to an arbitrary place, the latter is for changing size of drawn objects, but it may also be used to invert coordinates.

Single execution of the following code will move origin of coordinate system to point (0, offsetY) and establish y-axis values as increasing towards the top of the screen:

context.translate(0, offsetY);
context.scale(1, -1);

Translation and scaling of coordinate system. Click to enlarge...

But there’s a catch: the result of providing -1 as scale’s method second argument is that the whole image is created for inverted y coordinate. This applies to text too (calling fillText will render letters upside-down). Therefore before writing any text, you have to restore default y-axis configuration. Because manual restoring of canvas state is awkward, methods save() and restore() exist. These methods are for pushing canvas state on the stack and popping canvas state from the stack, respectively. It is recommended to use save method before doing transformations. Canvas state includes not only transformations but also values such as fill style or line width...

context.save();
 
context.fillStyle = 'red';
context.scale(2, 2);
context.fillRect(0, 0, 10, 10);
 
context.restore();
 
context.fillRect(0, 0, 10, 10);

Above code draws 2 squares: 

First square is red and is drawn with 2x scale. Second square is drawn with default canvas settings (color black and 1x scale). This occurs because right before any changes to scale and color, canvas state was save on the stack, later on it was restored before second square drawing.

TortoiseSVN pre-commit hook in C# - save yourself some troubles!

by Miłosz Orzeł 13. January 2013 19:24

Probably everyone who creates or debugs a program happens to make temporary changes to the code that make current task easier but should never get into the repository. And probably everyone has accidentally put such code into next revision. If you are lucky enough, mistake will be revealed quickly and the only result will be a bit of shame, if not...

If only there was a way to mark “uncommitable” code...

You can do it and it’s pretty simple!

TortoiseSVN lets you set so-called pre-commit hook. It’s a program (or script) that is run when user clicks “OK” button in “SVN Commit” window. Hook can for example check content of modified files and block commit when deemed appropriate. Tortoise hooks differ from Subversion hooks in that they are executed locally and not on the server that host the repository. You therefore don’t have to worry whether your hook will be accepted by the admin or if it works on the server (e.g. server may not have .NET installed), you also don’t affect the experience of other users of the repository. Client-side hooks are quicker too.

Detailed description of hooks can be found in „4.30.8. Client Side Hook Scripts” chapter of Tortoises help file.

Tortoise supports 7 kinds of hooks: start-commit, pre-commit, post-commit, start-update, pre-update, post-update and pre-connect. We are concerned with pre-commit action. The essence of the hook is to check whether one of added or modified files contains temporary code marker. Our marker may be a “NOT_FOR_REPO” text put into a comment placed above temporary code.

This is whole hook’s code – simple console application, that may save your ass :)

using System;
using System.IO;
using System.Text.RegularExpressions;

namespace NotForRepoPreCommitHook
{
    class Program
    {
        const string NotForRepoMarker = "NOT_FOR_REPO";

        static void Main(string[] args)
        {              
            string[] affectedPaths = File.ReadAllLines(args[0]);

            Regex fileExtensionPattern = new Regex(@"^.*\.(cs|js|xml|config)$", RegexOptions.IgnoreCase);

            foreach (string path in affectedPaths)
            {
                if (fileExtensionPattern.IsMatch(path) && File.Exists(path))
                {
                    if (ContainsNotForRepoMarker(path))
                    {
                        string errorMessage = string.Format("{0} marker found in {1}", NotForRepoMarker, path);
                        Console.Error.WriteLine(errorMessage);    
                        Environment.Exit(1);  
                    }
                }
            }             
        }

        static bool ContainsNotForRepoMarker(string path)
        {
            StreamReader reader = File.OpenText(path);

            try
            {
                string line = reader.ReadLine();

                while (line != null)
                {
                    if (line.Contains(NotForRepoMarker))
                    {
                        return true;
                    }

                    line = reader.ReadLine();
                }
            }
            finally
            {
                reader.Close();
            }  

            return false;
        }
    }
}

TSVN calls pre-commit hook with four parameters. We are interested only in the first one. It contains a path to *.tmp file. In this file there is a list of files affected by current commit. Each line is one path. After loading the list, files are filtered by extension (for sure we don’t want to process files of all types). Checking if file exists is also important – the list from *.tmp file contains paths for deleted files too! Detection of the marker represented by NotForRepoMarker constant is realized by ContainsNotForRepoMarker method. Despite its simplicity it provides good performance. On mine (middle range) laptop, 100 MB file takes less than a second to process. If marker is found, program exits with error code (value different than 0). Before quitting, information about which file contains the marker is sent to standard error output (via Console.Error). This message will get displayed in Tortoise window.

The code is simple, isn’t it? In addition, hook installation is also trivial!

To attach hook, choose “Settings” item from Tortoise’s context menu. Then select “Hook scripts” element and click “Add…” button. Such window will appear:

TSVN hooks configuration window

Set „Hook Type” to „Pre-Commit Hook”. Fill “Working Copy Path” field with a path to the directory that contains local copy of the repo (different folders can have different hooks). In “Command Line To Execute” field, set path to the application that implements the hook. Check “Wait for the script to finish” and “Hide the script while running” options (the latter will prevent console window from showing). Press “OK” button and voila, hook is installed!

Now mark some code with “NOT_FOR_REPO” comment and try to execute commit. You should see something like that:

Operation blocked by pre-commit hook

Notice the „Retry without hooks” button – it allows commit to be completed by ignoring hooks.

We now have a hook that prevents from temporary code submission. One may also want to create a hook that enforces log message to be filled, blocks *.log files commits etc. Your private hooks – you decide! And if some of the hooks will be usefull for the whole team, you can always remake them as Subversion hooks.

Tested on TortoiseSVN 1.7.8/Subversion 1.7.6.

How to close pop-ups upon main window closure or logout?

by Miłosz Orzeł 4. November 2012 14:58

Imagine you have to provide support for some really old web application. The app has one main window and pop-up windows that show some sensitive information (for example payroll list). Client wants to ensure that all pop-ups are closed when user leaves main window or clicks “logout” button in this window...

So... how to close all the windows opened with window.open?

On the web this question comes up very often. Unfortunately, most common answer is really naive. Proposed solution is based on keeping references to opened pop-ups and subsequent invocation of close method: 

var popups = []; 
 
function openPopup() {
    var wnd = window.open('Home/Popup', 'popup' + popups.length, 'height=300,width=300');
     
    popups.push(wnd);
}
 
function closePopups() {
    for (var i = 0; i < popups.length; i++) {
        popups[i].close();
    }
 
    popups = [];
}

In practice this doesn’t work because the array of references is cleared at full page reload (for example after clicking on a link or upon postback)...

Other suggested solution is to give the pop-up a unique name (using the second parameter of the open method) and later acquisition of a reference to the window:

var wnd = window.open('', 'popup0');
wnd.close();

This is based on the fact, that window.open method works in two modes:

  1. If a window with a given name doesn’t exist, it is created.
  2. If a window with a given name does exist, it will not be recreated, instead a reference to that window will be returned (if non empty URL is passed to the open method pop-up will be reloaded).

The problem lies at point no. 1. If pop-up window with given name wasn’t previously opened, the call to open and close methods will cause the pop-up to be briefly visible. It sucks…

But maybe a reference to pop-up can be retained between page reloads?

If there is no need to support older browsers (unlikely for the old application) we can try to put reference to the pop-up window into localStorage. However, this will not work:

var popup = window.open('http://morzel.net', 'test');
localStorage.setItem('key', JSON.stringify(popup)); 
 
TypeError: Converting circular structure to JSON

Old tricks for keeping page state between reloads that are based on cookies or window.name will not work too.

 

So… what to do?

Even if you can’t afford to have a major change such as introducing frames, don’t give up :)

Pop-up windows have opener property that points to parent window (that is the window in which the call to window.open was placed). Pop-ups can therefore periodically check whether the main window still remains open. Additionally, pop-ups can also access variables from parent window. This can be used to enforce pop-ups closure when main window is closed or when user clicks on “logout” button in parent window. When user is logged-in (and only then!), a marker variable (i.e. loggedIn) should be set in main window.

Here is the JS code that should be placed on a page displayed in a pup-up:

window.setInterval(function () {
    try {
        if (!window.opener || window.opener.closed === true || window.opener.loggedIn !== true) {
            window.close();
        }
    } catch (ex) {
        window.close(); // FF may throw security exception when you try to access loggedIn (for external site)
    }
}, 1000);

Checking variable from the opener window has another advantage. If user moves away from our application in main window (for example by clicking back button or a link to an external website), then the pop-up window will detect the lack of monitored variable in window.opener and close automatically.

Well, it's not the kind of code you enjoy to write but it achieves the desired result despite the painful gaps in the browsers API. If only they provide us with window.exists('name') method...

Ref modifier for reference types and a bit of SOS

by Miłosz Orzeł 9. April 2012 23:30

Take a look at the following code and think what value will be displayed on the console (note that string is a reference type)?

using System;
               
class Program
{
    static void Test(string y)
    {
        y = "bbb";
    }

    static void Main()
    {
        string x = "aaa";
        Test(x);
        Console.WriteLine(x);
    }
}

The correct answer (aaa) is not all that obvious. You will see the words aaa, because without a ref modifier, a program written in C# provides a copy of the parameter value (for value types) or a copy of a reference (for reference types).

When parameter y in method Test receives a new text value, CLR does not modify the array of chars. Instead, a new string is created and a reference to it is assigned to variable y (more info here). Variable y contained in method Test is, however, just a copy of a reference hold under x variable from method named Main.

To actually change the text hidden under x variable, use the ref modifier (you have to set it both in the method declaration and its invocation - C# enforces such behavior for clarity):

using System;
               
class Program
{
    static void Test(ref string y)
    {
        y = "bbb";
    }

    static void Main()
    {
        string x = "aaa";
        Test(ref x);
        Console.WriteLine(x);
    }
}

After this change, console will show bbb text.

 

SOS

Way in which parameters are passed to a method can be examined by using tool called SOS (Son of Strike). We will use CLRStack -a command, which displays information about parameters and local variables on managed code stack (if you don't know how to use SOS look here and here, if you wonder where the name "Son of Strike" came from, click here)...

Below are the results of CLRStack -a command executed at the time of entry to the Test method.

For code without ref modifier:

!CLRStack -a
OS Thread Id: 0x176c (5996)
Child SP IP       Call Site
0031f114 00390104 Program.Test(System.String)
    PARAMETERS:
        y (0x0031f114) = 0x025cb948

0031f158 003900af Program.Main()
    LOCALS:
        0x0031f158 = 0x025cb948

0031f3c0 656721bb [GCFrame: 0031f3c0]

For code with ref modifier:

!CLRStack -a
OS Thread Id: 0x934 (2356)
Child SP IP       Call Site
001dee34 002f00f4 Program.Test(System.String ByRef)
    PARAMETERS:
        y (0x001dee34) = 0x001dee78

001dee78 002f00aa Program.Main()
    LOCALS:
        0x001dee78 = 0x027fb948

001df0ec 656721bb [GCFrame: 001df0ec]

An important difference that is exhibited by these results is the value of y parameter. In the case of code without ref modifier, it is the address of aaa string (0x025cb948). For the code with ref modifier, the value of y parameter is the address of x variable (0x001dee78) from Main method (that variable points to aaa string).

View State for TextBox and other controls that implement IPostBackDataHandler

by Miłosz Orzeł 8. January 2012 21:00

While reading the official training kit for 70-515 exam I came across this text: "With view state, data is stored within controls on a page. For example, if a user types an address into a TextBox and view state is enabled, the address will remain in the TextBox between requests.". If such statements can be found in recommended study guide, it should not come as a surprise, that confusion about the way ASP.NET Web Forms tries to cope with inherent statelessness of HTTP is so common… ;)

TextBox control from ASPX page:

<asp:TextBox ID="TextBox1" runat="server"></asp:TextBox>

is rendered on HTML page as an input tag:

<input name="TextBox1" type="text" id="TextBox1" />

If so, then the preservation of TextBox value between requests does not require any use of __VIEWSTATE hidden field. To illustrate this, create a simple page that contains TextBox and Button controls:

...
 
<body>
    <form id="form1" runat="server">
        <asp:TextBox ID="TextBox1" runat="server"></asp:TextBox>
        <asp:Button ID="Button1" runat="server" Text="Button" onclick="Button1_Click" /
    </form>
</body>
</html>

and add a handler for button’s Click event, which only task is to extend the text in TextBox1 control:

protected void Button1_Click(object sender, EventArgs e)
{
    TextBox1.Text += "X";
}

Then, run the page and activate a tool for monitoring communication between browser and server. We are interested in testing form data that is sent to the server at postback... If you are using IE, I can recommend you a debugging proxy called Fiddler. Under Firefox, use Firebug. You can also use built-in ASP.NET Trace feature – to do so, add Trace = "true" to @Page directive. I performed my tests using development tools provided with Chrome browser ("Network" tab).

The following screenshot shows what form data (HTTP POST request) was sent after first button press:

Dane formularza przy pierwszym postbacku

And here is data from second postback:

Dane formularza przy drugim postbacku

If you compare data from first and second requests, you will see that a change in the value of TextBox1.Text does not affect the value of __VIEWSTATE field. Expanding the field would be a waste of network resources if text is being sent to server in a separate field called TextBox1.

System.Web.UI.WebControls.TextBox class is one of several classes that implement IPostBackDataHandler interface. This interface requires LoadPostData method. After page initialization is completed (but before the Load event) loading of View State data is invoked (LoadViewState) and then (if the control implements IPostBackDataHandler), loading of form data is invoked (LoadPostData). Text property of a TextBox control can therefore be set even if View State mechanism is disabled (via EnableViewState = "false" setting).

So... Can we completely disable View State mechanism for TextBox controls and the like?

No. For example, View State is useful when TextChanged event is handled (for comparison between current and previous value). It can also be used when the value that is being set is other than the one related to control’s value (e.g. ForeColor).

Detection of loading an iframe created in Ext JS

by Miłosz Orzeł 5. November 2011 19:31

Suppose that you need to execute a block of code when iframe's content is loaded. In case when iframe is created statically in HTML markup, the solution is really simple. All you have to do is to connect some JavaScript function with load event:

<iframe src="http://wikipedia.org" width="600" height="400" onload="someFunction();" ></iframe>

Note: The load event (onload) is invoked when the entire contents of the document is loaded (including its external elements such as images). If you need to act earlier, that is at a time when the DOM is ready, use the other methods...

But what if the iframe is created with Ext JS code?

A simple way to set it up it is to use Ext.BoxComponent with correct autoEl property value. This gives you the ability to easily use iframe in Ext JS layout (e. g. as a child item of Ext.Window), without extending document tree with redundant elements. 

var iframeContainer = new Ext.BoxComponent({
    autoEl: {
        tag: 'iframe',
        frameborder: '0',
        src: 'http://wikipedia.org'
    },
    listeners: {
        afterrender: function () {
            console.log('rendered');

            this.getEl().on('load', function () {
                console.log('loaded');
            });
        }
    }
});

In the above code (Ext JS 3.2.1), really important thing is the time when iframe's load event is hooked. You can do it only after the control (BoxComponent) is rendered. If you try this earlier, then getEl() will return undefined and the code will fail. Prior to rendering, an Ext JS control is just a JavaScript objects, for which no document tree elements exist. Below are two screenshots showing the HTML snippets created by Ext.Window in which the only item was BoxComponent creating the iframe tag...

beforerender:

DOM beforerender

afterrender:

DOM afterrender

You can clearly see that premature connecting to load event is futile, becasue you simply cannot listen to events on something that does not exist.

Those screenshots come from Elements window of Chrome Developer Tools. A quick way to show that tool (of course in Google's browser) is to press F12 or Ctrl+Shift+I. Nice feature of CDT is the ability to show events being listened on a DOM element. To see the list you have to select DOM element and, on the right side menu, choose "Event Listeners" tab. On the screenshot below, you can see that iframe's load event is indeed used:

CDT Event Listeners

BadImageFormatException, x86 i x64

by Miłosz Orzeł 16. September 2011 21:36

Have you ever seen BadImageFormatException or “An attempt was made to load a program with an incorrect format” message?

If so, maybe the program that you tried to run hasn’t been compiled with /platform:x86 option. You are probably wondering why, while writing in C# you should think about what kind of machine (x86 or x64) your program will execute on. Well… in most cases you don’t have to care about it. If your application doesn’t contain any unsafe code and doesn’t import any native modules that are destined for specific CPU architecture, then you can forget about 32/64 bit dilemma. After all C# code gets compiled into intermediate language (CIL) and upon execution framework’s just-in-time compiler will emit native instructions suitable for current platform. Great, but there’s a catch…

Imagine a situation when you import 32 bit DLL in your application. If you run that software on 32 bit operating system everything works as expected. Unfortunately on x64 machine you can get BadImageFormatException while trying to load DLL. Why? Your assembly may be compiled with /platform:anycpu setting. It means that on 32 bit systems your code will be executed with 32 bit CLR version. On 64 bit systems a 64 bit process will be used – and it will cause problems with loading 32 bit DLL. If you provide /platform:x86 setting, then 32 bit version of the CLR will be choosen even on the x64 versions of Windows (utilizing WoW64: Windows 32-bit on Windows 64-bit).

In Visual Studio 2010 you can set targeted platform in project “Properties” window (on “Build” tab). To get there, right click on project file in Solution Explorer and choose “Properties” or use main menu “Project | <you project name> Properties…”.

Setting targeted platform in Visual Studio (click to enlarge)

Microsoft has created useful tool called CorFlags which can be used to show or set the targeted platform of an managed assembly. You can access this program by using Visual Studio Command Prompt or find it directly on your disk (I have it under C:\Program Files\Microsoft.NET\SDK\v2.0\Bin\CorFlags.exe).

Here are some examples of what you may see after checking an EXE file created with various /platform option values (use CorFlags file.name command to check a file):

anycpu x86 x64
Version   : v4.0.30319
CLR Header: 2.5
PE        : PE32
CorFlags  : 1
ILONLY    : 1
32BIT     : 0
Signed    : 0
Version   : v4.0.30319
CLR Header: 2.5
PE        : PE32
CorFlags  : 3
ILONLY    : 1
32BIT     : 1
Signed    : 0
Version   : v4.0.30319
CLR Header: 2.5
PE        : PE32+
CorFlags  : 1
ILONLY    : 1
32BIT     : 0
Signed    : 0

For the scope of this post 2 rows of the CorFlags output are important: PE and 32BIT.

  • PE: PE32 means that file can be executed on both x86 and x64
  • PE: PE32+ means that it can only be run on 64 bit version of OS
  • 32BIT: 1 means that program must be executed on x86 environment.

Understanding 32BIT: 1 meaning is really important if you want to avoid problems with importing 32 bit DLL on 64 bit systems. If 32BIT flag is set and you run PE32 file on x64, then your app will be executed in 32 bit process (under WoW), hence will have a chance to properly import 32 bit DLL. Without this setting 64 bit environment will be chosen which will cause problems.

The good news is that you can easily modify 32BIT flag with CorFlags tool, to do this execute that command:

CorFlags file.exe /32BIT+

And to remove it, try this

CorFlags file.exe /32BIT-

So even if you cannot recompile problematic assembly with proper /platform option you still have a chance of using legacy 32 bit DLLs on 64 bit Windows :)

Why use of GetPixel and SetPixel is so inefficient?

by Miłosz Orzeł 5. April 2011 23:28

Bitmap class provides two simple methods: GetPixel and SetPixel used respectively to retrieve a point of image (as the Color structure) and set a point of image. The following code illustrates how to retrieve/set all the pixels in the bitmap:

private void GetSetPixel(Bitmap image) {
   for (int x = 0; x < image.Width; x++) {
      for (int y = 0; y < image.Height; y++) {
         Color pixel = image.GetPixel(x, y);
         image.SetPixel(x, y, Color.Black);
      }
   } 
}

As shown, review and modification of pixels is extremely simple. Unfortunately behind the simplicity of the code lies a serious performance trap. While for a small number of references to image points, the speed at which GetPixel and SetPixel work is good enough, for larger images it is not the case. Graph presented below can serve as a proof of that. It shows results of 10 tests* which consisted of 10-fold invocation of previously shown GetSetPixel method for images 100x100 and 1000x1000 pixels in size.

Wyniki testów prędkości operacji na pikselach obrazu z użyciem metod GetPixel i SetPixel klasy Bitmap.

The average test time for an image measuring 100 by 100 pixels was 543 milliseconds. This speed is acceptable if the image processing is not done frequently. Performance problem is, however, clearly visible when you try to use an image of size 1000 per 1000 pixels. Execution of the test in this case takes an average of more than 41 seconds - more than 4 sec. on a single call to GetSetPixel (seriously!).

Why so slow?

Low efficiency is due to the fact that access to the pixel is not a simple reference to a memory area. Each getting or setting of color is associated with invocation of .NET Framework method, which is a wrapper for native function contained in gdiplus.dll. This call is through the mechanism of P/Invoke (Platform Invocation), which is used to communicate from managed code to unmanaged API (API outside of the .NET Framework). So for a bitmap of 1000x1000 pixels there will be 1 million calls to GetPixel method that besides the validation of parameters uses the native GdipBitmapGetPixel function. Before returning color information, GDI+ function has to perform such operations as calculating the position of bytes responsible for description of desired pixel… Similar situation occurs in the case SetPixel method.

Look at the following code of Bitmap.GetPixel method obtained with the .NET Reflector (System.Drawing.dll, .NET Framework 2.0):

public Color GetPixel(int x, int y) {
   int argb = 0;
   if ((x < 0) || (x >= base.Width)) {
      throw new ArgumentOutOfRangeException(“x”, SR.GetString(“ValidRangeX”));
   }
   if ((y < 0) || (y >= base.Height)) {
      throw new ArgumentOutOfRangeException(“y”, SR.GetString(“ValidRangeY”));
   }
   
   int status = SafeNativeMethods.Gdip.GdipBitmapGetPixel(new HandleRef(this, base.nativeImage), x, y, out argb);
   if (status != 0) {
      throw SafeNativeMethods.Gdip.StatusException(status);
   }
   return Color.FromArgb(argb);
}

Here is import of GDI + function:

[DllImport(“gdiplus.dll”, CharSet=CharSet.Unicode, SetLastError=true, 
ExactSpelling=true)]
internal static extern int GdipBitmapGetPixel(HandleRef bitmap, int x, int y, out int argb);

* I have tested on such laptop: HP Pavilion dv5, AMD Turion X2 Dual-Core Mobile RM-70, 3 GB RAM, Vista Home Premium

IBM.WMQ.MQMessage.ReadString and EndOfStreamException

by Miłosz Orzeł 18. April 2010 23:39

Recently during modification of a program to communicate with WebSphere MQ (v6.0.2.7) I noticed that logs contain some exceptions of type EndOfStreamException. Since the adapter code was rather complex it took a while before I found a trivial cause of the problems ;)

System.IO.EndOfStreamException: Nie można odczytać danych spoza końca
strumienia.
   w System.IO.__Error.EndOfFile()
   w System.IO.BinaryReader.ReadByte()
   w System.IO.BinaryReader.Read7BitEncodedInt()
   w System.IO.BinaryReader.ReadString()
   w IBM.WMQ.MQMessage.ReadString(Int32 length)

The error was reported, because sometimes in two different locations there was a call to ReadString method on the same MQMessage object:

string text = message.ReadString(message.MessageLength);

To get rid of the trouble simply add one line of code:

string text = message.ReadString(message.MessageLength);
message.Seek(0);

What's the problem?

ReadString is a method that scans stream of bytes and converts it to a string*. After successful reading of entire message content, marker of current position was left on the end of a stream. So next call to ReadString had to end up with EndOfStreamException exception. Why it had to happen? Internally ReadString (IBM.WMQ.MQMessage) uses data stored in MemoryStream object. While getting text, depending on current value of message’s DataLength property, there may be a  call to .NET Framework’s System.IO.BinaryReader.ReadString method. To read the text, it must first get the encoded length of that text  - this is why method Read7BitEncodedInt is visible on the stack trace. This routine in turn uses ReadByte method which after hitting the end of a stream throws discussed exception.

* Conversion takes place using message's CharacterSet (CCSID) property.

Why strings are immutable and what are the implications of it?

by Miłosz Orzeł 26. January 2010 23:41

String type (System.String) stores text values as a sequence of char (System.Char) elements that represent Unicode characters (encoded in UTF-16). Usually one char element stands for one symbol.

When working with text one has to remember that strings in .NET are immutable! This simply means that once created, strings cannot be modified (without reflection or unsafe code), and the methods that apparently modify a string, really return a new object with the desired value.

Immutability of strings has many advantages (more about it soon), but it can cause problems if programmer forgets that any "change" to the string actually causes creation of a new instance of String class. Although the CLR treats strings in a special way, they are still a reference type, for which the memory is allocated on the managed heap.

Operation of this loop will create 10 000 string variables, all of which except the last are trash that will need to be collected by Garbage Collector:

string s = string.Empty;

for (int i = 0; i < 10000; i++)
{
    s += "x";
}

The following picture shows part of the "Histogram by Size for Allocated Objects" window from CLR Profiler application. You can see how subsequent iterations caused heap allocations for ever larger strings.

To avoid creating many unwanted objects use StringBuilder class, which allows you to modify the text without making new String class instances.

StringBuilder sb = new StringBuilder();

for (int i = 0; i < 10000; i++)
{
    sb.Append("x");
}

string x = sb.ToString();

This simple change has a huge impact on the amount of allocated and freed memory. See the following comparison of fragments of Profiler’s “Summary” window:

Why do designers of .NET (just like the creators of Java) decided to implement immutable text strings?

For optimization reasons (mainly because of comparison speed), texts can be stored in a special table (intern pool) maintained by the CLR. The idea is to avoid creating a number of variables defining the same string. Below is a piece of code proving that variables that have the same string value can point to the same* object:

string a = "xx";
string b = "xx";
string c = "x";
string d = String.Intern(c + c);

Console.WriteLine((object)a == (object)b); // True
Console.WriteLine((object)a == (object)d); // True

If the strings were modifiable, change to the value of variable a, would also change the value of b and d.

Immutability of strings has positive significance in multithreaded applications – any text amendment causes creation of a new variable so there is no need to set up the lock to avoid conflicts while multiple threads simultaneously access text. This is really important because quite often authorization of some operations is based on particular string value (for example, you can block specific functionality of a service based on client’s address).

Another important reason for immutability is the widespread use of strings as keys in hash tables. Would calculation of position of the element in table make any sense if it would be possible to modify the value of a key? The following listing shows that the "change" to variable used as the key does not affect the operation of the hash table:

string key = "abc";
Hashtable ht = new Hashtable();
ht.Add(key, 123);

key = "xbc";

Console.WriteLine(key); // xbc
Console.WriteLine(ht["abc"]); // 123

What would happen in the case of modifiable strings can be seen through a block of code that uses unsafe mode to actually change the string used as key:

unsafe
{
    string key = "abc";
    Hashtable ht = new Hashtable();
    ht.Add(key, 123);

    fixed (char* p = key)
    {
        p[0] = 'x';
    }

    Console.WriteLine(key); // xbc
    Console.WriteLine(ht["abc"]); // Not found!
}

Immutability is also related to the fact that the string is stored internally as an array - the data structure representing a continuous address space (which for performance reasons does not support insert operation).

* Whether a text literal is automatically added to the pool may be dependent on the use of ngen.exe tool or CompilationRelaxations settings...

What for?

I can’t imagine working as a programmer without hundreds of web pages on which people "wasting" their free time share what they managed to find out. Therefore I will try to add a bit of useful information to the web’s resources myself...  - about me

Language

This blog is my first attempt to write in English so if you see any language mis- takes please let me know. I didn’t have enough time to translate most of my old posts but I will try to make new ones both in Polish and in English.
Znasz polski? Kliknij tutaj.