Tuesday, June 16, 2009

BlackBerry MDS Simulator

Just cracked how to use the latest RIM MDS Simulator on my Vista box. Install the simulator as usual, and go to the install directory (in my case it's C:\Program Files\Research In Motion\BlackBerry Email and MDS Services Simulators 4.1.2\MDS). Make sure your user has write access to this directory, otherwise MDS will fail to run. On Vista installs, this directory is read-only for your user (should be OK with XP, though). Now, for the final bit: open the run.bat file and replace the following:

!BMDS_CLASSPATH!;!BMDS_CLASSPATH2!

with:

%BMDS_CLASSPATH%;%BMDS_CLASSPATH2%

Now the MDS Simulator works and I can browse the internet from the BlackBerry simulator. When you are done, just press Ctrl-C on the MDS console and then exit it (so eighties, isn't it?).

Friday, February 27, 2009

SQL Compact Insert Performance

One of the performance bottlenecks of SQL Compact databases is data insertion, especially when large amounts of data must be inserted. Lots of people have struggled with this, especially when coming from SQL Server development, where everything is easier and so much faster. Using the same coding approaches as in the desktop SQL Server, SQL Compact will run very slowly when inserting large volumes of data. There are two main reasons for this:
  1. On heavily indexed tables, all data changes will be slower (this means inserting, updating and deleting data). In such cases you may get better performance by dropping all indexes and constraints before adding the data and recreating them at the end. Please note that SQL Compact uses indexes internally for PRIMARY KEY, UNIQUE, and FOREIGN KEY constraints.
  2. The query processor is the worst mechanism for inserting large volumes of data. Executing a SQL INSERT comand (even if it is prepared and has command parameters created by explicitly stating data type and size) is always slower than the best alternative: using a base table cursor. When your application uses the SQL Compact Query Processor to execute a command it always pays a performance price because the QP must parse, compile and optimize the command execution path. If you can, do this work by yourself.
I have recently talked about this on two presentations (TechEd EMEA 2008 and DevDays09) and used a simple demo to illustrate this. The sample solution (InsertDemo, see download link below) uses four different approaches for bulk data insertion:
  • Brute force - Each row is inserted by executing a SQL command created on the fly, using a string concatenation. If you look at the code you will see that I'm reusing both the SqlCeCommand and the StringBuilder objects so that I take the GC out of the performance measurement. This approach has the worst performance of the four and you can explain it through the heavy use of the query processor.
  • Prepared - Instead of executing a new command every time, you can create a command with parameters and prepare it (note that the .NET CF SqlCeCommand.Prepare does not prepare the command like the OLE DB version does, it merely marks the command for preparation upon first execution - this is a pitty because the OLE DB version allows you to parse the command and check for syntax errors before executing it). Note that here we have a tighter insertion loop because we only need to set the parameter's values and execute the command.
  • SqlCeResultSet - This is the best managed approach for inserting data in a table. By using the SqlCeResultSet class your code circumvents the query processor completely and inserts the data in the table by using a lower-level object: a base table cursor. Note that contrary to the OLE DB interfaces, there is no notion of a current row for insertionpurposes so you must create a SqlCeUpdatableRecord object to store the data for insertion. You can create just one instance of this object and reuse it to insert all the required rows in the table (see how I did it in the sample code).
  • OLE DB - This is the fastest approach, but it's not managed. Interestingly the relative performance difference to the SqlCeResultSet seems to be constant with the inserted sample size, which seems to point to the .NET CF runtime. Although the difference seems small for small data sets, it may become significant for larger data sets.
Have a go with the sample project (you need both in the same solution in order for the whole demo to work). There are a few things that you might be interested in testing with this demo:
  • Sample size - Inserting more rows will take longer, but does the total time grow linearly?
  • PRIMARY KEY - What's the impact of adding a PRIMARY KEY to the sample table? How does performance behave when the sample size increases?
  • Random keys - When using a PRIMARY KEY, what's the impact of inseting a sorted data set vs. a random data set?
  • Delete - What happens when you delete large amounts of data under the previous circumstances?
I hope you have some fun gathering data with this demo. It uses a very simple table (one integer column) so it's not real. But you can change it to model your real table and see how it behaves under the pressure of thousands of inserts.

Enjoy!
Sample code: InsertDemo.zip, OleDbInsert.zip

Saturday, February 7, 2009

How fast is fast?

I don't like to use the "optimization" word when talking about software. Coming from a mathematics and engineering background, the word "optimization" usually means the best possible result. Can we use this word to describe the performance of our code? Hardly. I have learned that there is always some way to make your code run a bit faster.

Sometimes the new approach requires a little change to your code, while other times it will require major rewrites making it an important investment. Is it worthwhile? Well, we have to measure the performance improvement to asses how much of an improvement (if at all) we are talking about.

There are three major tools to help you in the performance assessment and tuning of your application:
  • Code instrumentation - a set of programming tools that help you measure the timings of specific portions of your code.
  • The .NET CF Profiling tools - these help you take a bird's eye view on your whole application's performance drivers.
  • The Query Execution plan - this is a SQL Compact-specific tool that shows you how the Query Analyzer compiles your SQL command and how it plans to execute it. Some of the execution bottlenecks may become apparent and, with some skills you will be able to make your command run faster.
Code instrumentation is simplest to use and involves timing your code. The .NET CF helps you with System.Environment.TickCount and System.DateTime.Now to determine the timing when a specific piece of code runs. The first method is more precise and (anywhere from one tenth of a second to half a second) ad reports elapsed time in milliseconds. The second method is less precise and is accurate to the second. Both methods work by retrieving the time measurement of the process start and end and by subtracting both to get an approximate exucution time.

With the .NET CF 3.5, Microsoft introduced a much more precise time measurement intrument, the System.Diagnostics.StopWatch. Although more precise, I generally don't use this timing method because I don't need all the precision. Timing code involves running a fairly large number of trials and averaging out the timing at the end, so individual timing precision is less of an issue. Also, most of the time you are really concerned about shaving 5, 10 or more seconds from a particular piece of code, you won't care about improving it by 10 milliseconds.

I will be looking at these topics in the next few posts about SQL Compact performance.

Wednesday, February 4, 2009

Writing Mobile Code

I keep coming back to this book because it's an invaluable reference for .NET CF developers and mobile application designers alike. With a constant and consistent focus on performance, the author explores the various aspects of mobile software development with an emphasis on Microsoft platforms and the .NET Compact Framework. Although the book's examples are all written to target the .NET CF 1.1, the material seems to keep its original freshness due to the author's approach: writing mobile code is different than writing desktop code. A mobile software developer worth his salt knows that in an underpowered device with limited memory and storage, and with small screen sizes one has to think differently. And this book teaches you how.

SQL Compact Performance

Last November I delivered a presentation at TechED EMEA 2008 about SQL Compact performance tuning. I had some requests for the presentation source code but could not find the time to post the sources online and, because this was a managed code presentation it did not feel right to post the code on my native code blog. After receiving one more request for the presentation materials I thought that this a good time to start writing about my presentation's topic and share the presentation code.

Like every aspect of software development for Windows Mobile or Windows CE, performance is one of the top priorities for developers. Devices are inherently limited in memory, storage and processing power. This also limits the amount of horsepower you can house in such a device and SQL Compact is a great example of that: it's a small-sized relational database engine with a simple but effective security model that also runs on desktop computers and serves multiple clients. Due to its small size, SQL Compact does not include all the bells and whistles that his desktop cousin has to offer, so you should expect to do some of the optimization work that SQL Server does behind the scenes. You just cannot expect to bring your desktop development experience to a mobile device and expect it to perform in the same way. You have to adopt a device development mindset: prepare to work more.

Here are two quick examples of how things work so differently on SQL Compact:
  • There is no connection pooling mechanism so it is considered a bad practice (performance-wise) to quickly create, open and close a database connection. In the worst-case scenario you will be loading and unloading the database engine everytime you do this!
  • Don't use the INSERT command to insert data on a table. Even if you use command parameters on an INSERT command, you will still get less than optimal performance. To get the very best performance when inserting data, you need to use a base table cursor which is exposed throught the SqlCeResultSet class.
Interested? Stay tuned for more in the next post.