C++, Page 1

Is C# the Boost of C-family languages?

For all the cons of giving a single entity control over C#, one pro is that it gives the language an unmatched agility to try new things in the C family of languages. LINQ—both its language integration and its backing APIs—is an incredibly powerful tool for querying and transforming data with very concise code. I really can’t express how much I’ve come to love it.

The new async support announced at PDC10 is basically the holy grail of async coding, letting you focus on what your task is and not how you’re going to implement a complex async code path for it. It’s an old idea that many async coders have come up with but that, as far as I know, has never been successfully implemented, simply because it required too much language support.

The lack of peer review and a standards committee for .NET shows: there’s a pretty high rate of turnover as Microsoft tries to nail down the right way to tackle problems, and it results in a very large library with lots of redundant functionality. As much as this might hurt .NET, I’m starting to view C# as a sort of Boost for the C language family. Some great ideas are getting real-world use, and if other languages eventually feel the need to get something similar, they will have a bounty of experience to pull from.

C++, at least, is a terrifyingly complex language. Getting new features into it is an uphill battle, even when they address a problem that everyone is frustrated with. Getting complex new features like these into it would be a very long process, with a lot of arguing and years of delay. Any extra incubation time we can give them is a plus.

ClearType in Windows 7

One of my big pet peeves with ClearType prior to Windows 7 was that it only anti-aliased horizontally with sub-pixels. This is great for small fonts, because at such a small scale traditional anti-aliasing has a smudging effect, reducing clarity and increasing the font’s weight. For large fonts, however, it introduces some very noticeable aliasing on curves, as best seen in the ‘6’ and ‘g’ here:

"Int64.org" rendered with GDI

You’ve probably noticed this on websites everywhere, but have come to accept it. Depending on your browser and operating system, you can probably see it in the title here. This problem is solved in Windows 7 with the introduction of DirectWrite, which combines ClearType’s horizontal anti-aliasing with regular vertical anti-aliasing when using large font sizes:

"Int64.org" rendered with DirectWrite

Of course, DirectWrite affects more than just Latin characters. Any glyphs with very slight angles will see a huge benefit, such as hiragana:

"まこと" rendered with GDI and DirectWrite

Unfortunately, this isn’t a free upgrade. For whatever reason, Microsoft didn’t make the old GDI functions use DirectWrite’s improvements, so to benefit from them, all your old GDI and DrawText code will need to be upgraded to use Direct2D and DirectWrite directly. An old WM_PAINT procedure like this:

PAINTSTRUCT ps;
HDC hdc = BeginPaint(hwnd, &ps);

HFONT font = CreateFont(-96, 0, 0, 0, FW_NORMAL,
                        0, 0, 0, 0, 0, 0, 0, 0, L"Calibri");

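HGDIOBJ oldfont = SelectObject(hdc, (HGDIOBJ)font);

RECT rc;
GetClientRect(hwnd, &rc);

DrawText(hdc, L"Int64.org", 9, &rc,
         DT_SINGLELINE | DT_CENTER | DT_VCENTER);

SelectObject(hdc, oldfont); // restore the original font.
DeleteObject(font);         // avoid leaking a GDI font on every paint.

EndPaint(hwnd, &ps);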

Will turn into this:

ID2D1Factory *d2df;

D2D1CreateFactory(D2D1_FACTORY_TYPE_SINGLE_THREADED,
   __uuidof(ID2D1Factory), 0, (void**)&d2df);

IDWriteFactory *dwf;

DWriteCreateFactory(DWRITE_FACTORY_TYPE_SHARED,
   __uuidof(IDWriteFactory), (IUnknown**)&dwf);

IDWriteTextFormat *dwfmt;

dwf->CreateTextFormat(L"Calibri", 0, DWRITE_FONT_WEIGHT_REGULAR,
   DWRITE_FONT_STYLE_NORMAL, DWRITE_FONT_STRETCH_NORMAL,
   96.0f, L"en-us", &dwfmt);

dwfmt->SetTextAlignment(DWRITE_TEXT_ALIGNMENT_CENTER);
dwfmt->SetParagraphAlignment(DWRITE_PARAGRAPH_ALIGNMENT_CENTER);

RECT rc;
GetClientRect(hwnd, &rc);

D2D1_SIZE_U size = D2D1::SizeU(rc.right - rc.left,
                               rc.bottom - rc.top);

ID2D1HwndRenderTarget *d2drt;

d2df->CreateHwndRenderTarget(D2D1::RenderTargetProperties(),
   D2D1::HwndRenderTargetProperties(hwnd, size), &d2drt);

ID2D1SolidColorBrush *d2db;

d2drt->CreateSolidColorBrush(D2D1::ColorF(D2D1::ColorF::Black),
   &d2db);

D2D1_SIZE_F layoutSize = d2drt->GetSize();
D2D1_RECT_F layoutRect = D2D1::RectF(0.0, 0.0,
   layoutSize.width, layoutSize.height);

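d2drt->BeginDraw();
d2drt->DrawText(L"Int64.org", 9, dwfmt, layoutRect, d2db);
d2drt->EndDraw();

// release the COM objects. real code would also check each HRESULT
// and cache these objects instead of recreating them on every paint.
d2db->Release();
d2drt->Release();
dwfmt->Release();
dwf->Release();
d2df->Release();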

This is no small change, and considering this API won’t work on anything but Vista and Windows 7, you’ll be cutting out a lot of users if you specialize for it. While you could probably make a clever DrawText wrapper, Direct2D and DirectWrite are really set up to get you the most benefit if you’re all in. Hopefully general libraries like Pango and Cairo will get updated backends for it.

DirectWrite has other benefits too, like sub-pixel rendering. When you render text in GDI, glyphs will always get snapped to pixels. If you have two letters side by side, it will choose to always start the next letter 1 or 2 pixels away from the last—but what if the current font size says it should actually be a 1.5 pixel distance? In GDI, this will be rounded to 1 or 2. This is also noticeable with kerning, which tries to remove excessive space between specific glyphs such as “Vo”. Because of this, most of the text you see in GDI is very slightly warped. It’s much more apparent when animating, where it causes the text to have a wobbling effect as it constantly snaps from one pixel to the next instead of smoothly transitioning between the two.

DirectWrite’s sub-pixel rendering helps to alleviate this by doing exactly that: glyphs can now start rendering at that 1.5 pixel distance, or any other point in between. Here you can see the differing space between the ‘V’ and ‘o’, as well as a slight difference between the ‘o’s with DirectWrite on the right side, because they are being rendered on sub-pixel offsets:

"Volcano" close-up comparison with GDI and DirectWrite

The difference between animating with sub-pixel rendering and without is staggering when we view it in motion:

"Volcano" animation comparison with GDI and DirectWrite

Prior to DirectWrite the normal way to animate like this was to render to a texture with monochrome anti-aliasing (that is, without ClearType), and transform the texture while rendering. The problem with that is the transform will introduce a lot of imperfections without expensive super-sampling, and of course it won’t be able to use ClearType. With DirectWrite you get pixel-perfect ClearType rendering every time.
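The difference between pixel-snapped and sub-pixel glyph placement described above can be modeled in a few lines. This is only a simplified sketch of the arithmetic, not how GDI or DirectWrite are actually implemented; the function names are made up for illustration, and rounding each advance to the nearest pixel is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pixel-snapped layout: each advance is rounded to a whole pixel before
// it is added, so the rounding error accumulates across the run.
inline std::vector<float> snapped_positions(const std::vector<float> &advances)
{
  std::vector<float> x;
  float pen = 0.0f;
  for(std::size_t i = 0; i < advances.size(); ++i)
  {
    x.push_back(pen);
    pen += std::floor(advances[i] + 0.5f);
  }
  return x;
}

// Sub-pixel layout: the pen keeps its fractional position.
inline std::vector<float> subpixel_positions(const std::vector<float> &advances)
{
  std::vector<float> x;
  float pen = 0.0f;
  for(std::size_t i = 0; i < advances.size(); ++i)
  {
    x.push_back(pen);
    pen += advances[i];
  }
  return x;
}

inline void positioning_example()
{
  std::vector<float> advances(4, 1.5f); // four glyphs, 1.5 pixels apart
  std::vector<float> snapped = snapped_positions(advances);
  std::vector<float> exact = subpixel_positions(advances);

  assert(snapped[3] == 6.0f); // 0, 2, 4, 6
  assert(exact[3] == 4.5f);   // 0, 1.5, 3, 4.5
}
```

After only four glyphs the snapped run is already a pixel and a half wider than the font asked for, which is the warping and wobble described earlier.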

Apparently WPF 4 is already using Direct2D and DirectWrite to some degree, and hopefully there will be high-quality text in Flash’s future. Firefox has also been looking at adding DirectWrite support, but I haven’t seen any news of WebKit (Chrome/Safari) or Opera doing the same. It looks like Firefox might actually get it in before Internet Explorer. Edit: looks like Internet Explorer 9 will use DirectWrite. I wonder which will go gold with the feature first?

Direct2D and DirectWrite are included in Windows 7, but Microsoft has backported them in the Platform Update for Windows Server 2008 and Windows Vista, so there’s no reason people who are sticking with Vista should be left out. Are there people sticking with Vista?

Efficient stream parsing in C++

A while ago I wrote about creating a good parser, and while the non-blocking idea was spot-on, the rest of it really isn’t very good in C++, where we have the power of templates around to help us.

I’m currently finishing up an HTTP library and have been revising my views on stream parsing because of it. I’m still not entirely set on my overall implementation, but I’m nearing completion and am ready to share my ideas. First, I’ll list my requirements:

To accomplish this I broke this out into three layers: a core parser, a buffer, and a buffer parser.

The core parser

Designing the core parser was simple. I believe I already have a solid C++ parser design in my XML library, so I’m reusing that concept. This is a fully in-situ pull parser that operates on a range of bidirectional iterators and returns a sub-range of those iterators. The pull function returns ok when it parses a new element, done when it has reached a point that could be considered an end of the stream, and need_more when an element can’t be extracted from the passed-in iterator range. Using this parser is pretty simple:

typedef std::deque<char> buffer_type;
typedef http::parser<buffer_type::iterator> parser_type;

buffer_type buffer;

parser_type p;
parser_type::node_type n;
parser_type::result_type r;

do
{
  push_data(buffer); // add data to buffer from whatever I/O source.

  std::deque<char>::iterator first = buffer.begin();

  while((r = p.parse(first, buffer.end(), n)) == http::result_types::ok)
  {
    switch(n.type)
    {
      case http::node_types::method:
      case http::node_types::uri:
      case http::node_types::version:
        // handle the parsed element here.
        break;
    }
  }

  buffer.erase(buffer.begin(), first); // remove all the used
                                       // data from the buffer.
} while(r == http::result_types::need_more);

By letting the user pass in a new range of iterators to parse each time, we have the option of updating the stream with more data when need_more is returned. The parse() function also updates the first iterator so that we can pop any data prior to it from the data stream.
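To make the shape of such a pull parser concrete, here is a minimal sketch. It is not the http::parser from my library: line_parser, pull_result and the "one line per element" grammar are hypothetical stand-ins, and unlike a real parser it keeps no state between calls, so it simply rescans after need_more.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <string>
#include <utility>

enum class pull_result { ok, done, need_more };

// Hypothetical in-situ pull parser: each element is one '\n'-terminated
// line, and an empty line marks the end of the stream.
template<typename Iterator>
struct line_parser
{
  typedef std::pair<Iterator, Iterator> node_type; // sub-range of the input

  pull_result parse(Iterator &first, Iterator last, node_type &n) const
  {
    Iterator eol = std::find(first, last, '\n');
    if(eol == last)
      return pull_result::need_more; // first is left untouched

    bool empty = (first == eol);
    n = node_type(first, eol);       // no copy, just iterators
    first = eol;
    ++first;                         // everything before first is now used

    return empty ? pull_result::done : pull_result::ok;
  }
};

inline void pull_example()
{
  typedef std::deque<char> buffer_type;
  typedef line_parser<buffer_type::iterator> parser_type;

  buffer_type buffer;
  parser_type p;
  parser_type::node_type n;

  const std::string part1 = "GET /", part2 = " HTTP/1.1\n\n";

  buffer.insert(buffer.end(), part1.begin(), part1.end());
  buffer_type::iterator first = buffer.begin();
  assert(p.parse(first, buffer.end(), n) == pull_result::need_more);

  buffer.insert(buffer.end(), part2.begin(), part2.end());
  first = buffer.begin(); // inserting into a deque invalidates iterators
  assert(p.parse(first, buffer.end(), n) == pull_result::ok);
  assert(std::string(n.first, n.second) == "GET / HTTP/1.1");
  assert(p.parse(first, buffer.end(), n) == pull_result::done);
}
```

Because the parser is a template on the iterator type, the same code works over a deque, a vector, or a plain const char* range.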

By default the parser will throw an exception when it encounters an error. This can be changed by calling an overload and handling the error result type:

typedef std::deque<char> buffer_type;
typedef http::parser<buffer_type::iterator> parser_type;

buffer_type buffer;

parser_type p;
parser_type::node_type n;
parser_type::error_type err;
parser_type::result_type r;

do
{
  push_data(buffer); // add data to buffer from whatever I/O source.

  std::deque<char>::iterator first = buffer.begin();

  while((r = p.parse(first, buffer.end(), n, err)) == http::result_types::ok)
  {
    switch(n.type)
    {
      case http::node_types::method:
      case http::node_types::uri:
      case http::node_types::version:
        // handle the parsed element here.
        break;
    }
  }

  buffer.erase(buffer.begin(), first); // remove all the used
                                       // data from the buffer.
} while(r == http::result_types::need_more);

if(r == http::result_types::error)
{
  std::cerr
    << "an error occurred at "
    << std::distance(buffer.begin(), err.position())
    << ": "
    << err.what()
    << std::endl;
}
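The throwing and non-throwing overloads can share one implementation by layering the throwing one on top of the other. The following is a generic sketch of that pattern only; parse_digit and parse_error are made-up names, and I am not claiming this is how the library's parse() overloads are written.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct parse_error
{
  std::string message;
};

// Non-throwing overload: reports failure through err and the return value.
inline bool parse_digit(char c, int &out, parse_error &err)
{
  if(c < '0' || c > '9')
  {
    err.message = "not a digit";
    return false;
  }

  out = c - '0';
  return true;
}

// Throwing overload, implemented in terms of the non-throwing one.
inline int parse_digit(char c)
{
  int out = 0;
  parse_error err;

  if(!parse_digit(c, out, err))
    throw std::runtime_error(err.message);

  return out;
}

inline void overload_example()
{
  assert(parse_digit('7') == 7);

  bool threw = false;
  try { parse_digit('x'); }
  catch(const std::runtime_error&) { threw = true; }
  assert(threw);
}
```

Callers that can't use exceptions only ever touch the first overload, so nothing in their code path can throw.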

The buffer

Initially I was testing my parser with a deque<char> like above. This let me test the iterator-based parser very easily by incrementally pushing data on, parsing some of it, and popping off what was used. Unfortunately, using a deque means we always have an extra copy, from an I/O buffer into the deque. Iterating a deque is also a lot slower than iterating a range of pointers because of the way deque is usually implemented. This inefficiency is acceptable for testing, but just won't work in a live app.

My buffer class is I/O- and parsing-optimized, operating on pages of data. It allows pages to be inserted directly from I/O without copying. Ones that weren't filled entirely can still be filled later, allowing the user to commit more bytes of a page as readable. One can use scatter/gather I/O to make operations span multiple pages contained in a buffer.

The buffer exposes two types of iterators. The first type is what we are used to in deque: just a general byte stream iterator. But this incurs the same cost as deque: each increment to the iterator must check if it's at the end of the current page and move to the next. A protocol like HTTP can fit a lot of elements into a single 4KiB page, so it doesn't make sense to have this cost. This is where the second iterator comes in: the page iterator. A page can be thought of as a Range representing a subset of the data in the full buffer. Overall the buffer class looks something like this:

struct page
{
  const char *first;    // the first byte of the page.
  const char *last;     // one past the last byte of the page.
  const char *readpos;  // the first readable byte of the page.
  const char *writepos; // the first writable byte of the page,
                        // one past the last readable byte.
};

class buffer
{
public:
  typedef ... size_type;
  typedef ... iterator;
  typedef ... page_iterator;

  void push(page *p); // pushes a page into the buffer.  might
                      // be empty, semi-full, or full.

  page* pop(); // pops the first fully read page from the buffer.

  void commit_write(size_type numbytes); // merely moves writepos
                                         // by some number of bytes.

  void commit_read(size_type numbytes); // moves readpos by
                                        // some number of bytes.

  iterator begin() const;
  iterator end() const;

  page_iterator pages_begin() const;
  page_iterator pages_end() const;
};
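To show where the per-increment cost of the byte stream iterator comes from, here is a rough sketch of one next to the page-at-a-time alternative. page_view, byte_iterator and the two read functions are hypothetical names for illustration, and std::vector stands in for however the real buffer tracks its pages.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the readable part of one page.
struct page_view
{
  const char *readpos;  // first readable byte
  const char *writepos; // one past the last readable byte
};

// Byte iterator over a list of pages. Every increment pays for an
// end-of-page check, which a plain pointer avoids.
class byte_iterator
{
public:
  byte_iterator(const std::vector<page_view> &pages, std::size_t index)
    : pages_(&pages), index_(index), pos_(0) { settle(); }

  char operator*() const { return *pos_; }

  byte_iterator& operator++()
  {
    if(++pos_ == (*pages_)[index_].writepos)
    {
      ++index_;
      settle();
    }
    return *this;
  }

  bool operator!=(const byte_iterator &o) const
  {
    return index_ != o.index_ || pos_ != o.pos_;
  }

private:
  // Skips empty pages and points pos_ at the next readable byte,
  // or at null once every page has been consumed.
  void settle()
  {
    while(index_ < pages_->size() &&
          (*pages_)[index_].readpos == (*pages_)[index_].writepos)
      ++index_;

    pos_ = index_ < pages_->size() ? (*pages_)[index_].readpos : 0;
  }

  const std::vector<page_view> *pages_;
  std::size_t index_;
  const char *pos_;
};

// Slow path: one branch per byte.
inline std::string read_bytewise(const std::vector<page_view> &pages)
{
  std::string out;
  byte_iterator last(pages, pages.size());
  for(byte_iterator it(pages, 0); it != last; ++it)
    out += *it;
  return out;
}

// Fast path: one branch per page, then raw pointer ranges.
inline std::string read_pagewise(const std::vector<page_view> &pages)
{
  std::string out;
  for(std::size_t i = 0; i < pages.size(); ++i)
    out.append(pages[i].readpos, pages[i].writepos);
  return out;
}

inline void paged_read_example()
{
  const char a[] = "GET ";
  const char b[] = "/ HTTP/1.1";
  page_view pa = { a, a + 4 };
  page_view pb = { b, b + 10 };

  std::vector<page_view> pages;
  pages.push_back(pa);
  pages.push_back(pb);

  assert(read_bytewise(pages) == "GET / HTTP/1.1");
  assert(read_pagewise(pages) == read_bytewise(pages));
}
```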

One thing you may notice is it expects you to push() and pop() pages directly onto it, instead of allocating its own. I really hate classes that allocate memory – in terms of scalability the fewer places that allocate memory, the easier it will be to optimize. Because of this I always try to design my code to – if it makes sense – have the next layer up do allocations. When it doesn't make sense, I document it. Hidden allocations are the root of evil.
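One way to keep a container like this allocation-free is to make the pages an intrusive linked list, so even the bookkeeping lives in caller-owned memory. This is a sketch under my own assumptions: linked_page and page_queue are hypothetical names, and "fully read" is taken to mean readpos has reached the end of the page.

```cpp
#include <cassert>
#include <cstddef>

struct linked_page
{
  char *first;       // the first byte of the page
  char *last;        // one past the last byte of the page
  char *readpos;     // the first readable byte
  char *writepos;    // the first writable byte
  linked_page *next; // intrusive link, so the queue allocates no nodes
};

class page_queue
{
public:
  page_queue() : head_(nullptr), tail_(nullptr) {}

  // Takes a caller-owned page. No memory is allocated here.
  void push(linked_page *p)
  {
    p->next = nullptr;
    if(tail_) tail_->next = p; else head_ = p;
    tail_ = p;
  }

  // Hands the first page back to the caller once it has been
  // fully read, otherwise returns null.
  linked_page* pop()
  {
    linked_page *p = head_;
    if(!p || p->readpos != p->last)
      return nullptr;

    head_ = p->next;
    if(!head_) tail_ = nullptr;
    return p;
  }

private:
  linked_page *head_;
  linked_page *tail_;
};

inline void page_queue_example()
{
  char storage[8]; // caller-owned memory; could just as well come from a pool
  linked_page p = { storage, storage + 8, storage, storage, nullptr };

  page_queue q;
  q.push(&p);
  assert(q.pop() == nullptr); // nothing has been read yet

  p.writepos = p.last;        // pretend I/O filled the page
  p.readpos = p.last;         // and the parser consumed all of it
  assert(q.pop() == &p);
  assert(q.pop() == nullptr);
}
```

The layer above decides whether pages come from the stack, a pool, or the heap, and gets each one back from pop() to reuse or free.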

The buffer parser

Unlike the core parser, the buffer parser isn't a template class. The buffer parser exposes the same functionality as a core parser, but using a buffer instead of iterator ranges.

This is where C++ gives me a big advantage. The buffer parser is actually implemented with two core parsers. The first is a very fast http::parser<const char*>. It uses this to parse as much of a single page as possible, stopping when it encounters need_more and no more data can be added to the page. The second is an http::parser<buffer::iterator>. This gets used when the first parser stops, which will happen very infrequently – only when an HTTP element spans multiple pages.

This is fairly easy to implement, but required a small change to my core parser concept. Because each has separate internal state, I needed to make it so I could move the state between two parsers that use different iterators. The amount of state is actually very small, making this a fast operation.

The buffer parser works with two different iterator types internally, so I chose to always return a buffer::iterator range. The choice was either that or silently copy elements spanning multiple pages, and this way lets the user of the code decide how they want to handle it.
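Moving state between parsers that use different iterator types is easy if the state itself doesn't depend on the iterator. Here is a small sketch of that idea: line_counter and parser_state are hypothetical, std::deque stands in for buffer::iterator, and copying the partial line into the deque is a shortcut the real buffer parser would not take.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Iterator-independent state, so any instantiation can pick it up.
struct parser_state
{
  std::size_t lines_seen;
  parser_state() : lines_seen(0) {}
};

template<typename Iterator>
class line_counter
{
public:
  line_counter() {}
  explicit line_counter(const parser_state &s) : state_(s) {}

  const parser_state& state() const { return state_; }

  // Consumes every complete '\n'-terminated line in [first, last) and
  // leaves first at the start of the trailing partial line, if any.
  void parse(Iterator &first, Iterator last)
  {
    for(;;)
    {
      Iterator eol = std::find(first, last, '\n');
      if(eol == last) return;

      ++state_.lines_seen;
      first = eol;
      ++first;
    }
  }

private:
  parser_state state_;
};

// The fast parser handles one contiguous page, then hands its state to
// a parser over a slower, non-contiguous iterator.
inline std::size_t count_lines(const char *page, std::size_t n,
                               std::deque<char> &tail)
{
  line_counter<const char*> fast;
  const char *first = page;
  fast.parse(first, page + n);

  tail.insert(tail.begin(), first, page + n); // carry the partial line over

  line_counter<std::deque<char>::iterator> slow(fast.state());
  std::deque<char>::iterator it = tail.begin();
  slow.parse(it, tail.end());

  return slow.state().lines_seen;
}

inline void handoff_example()
{
  const char page[] = "a\nb\nc";
  std::deque<char> tail;
  tail.push_back('\n');
  tail.push_back('d');
  tail.push_back('\n');

  // "a" and "b" on the fast path, "c" and "d" on the slow path.
  assert(count_lines(page, 5, tail) == 4);
}
```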

Using the buffer parser is just as easy as before:

http::buffer buffer;
http::buffer_parser p;
http::buffer_parser::node_type n;
http::buffer_parser::result_type r;

do
{
  push_data(buffer); // add data to buffer from whatever I/O source.

  while((r = p.parse(buffer, n)) == http::result_types::ok)
  {
    switch(n.type)
    {
      case http::node_types::method:
      case http::node_types::uri:
      case http::node_types::version:
        // handle the parsed element here.
        break;
    }
  }

  pop_used(buffer); // remove all the used
                    // data from the buffer.
} while(r == http::result_types::need_more);

The I/O layer

I'm leaving out an I/O layer for now. I will probably write several small I/O systems for it once I'm satisfied with the parser. Perhaps one using asio, one using I/O completion ports, and one using epoll. I've designed this from the start to be I/O agnostic but with optimizations that facilitate efficient forms of all I/O, so I think it could be a good benchmark of the various I/O subsystems that different platforms provide.

One idea I've got is to use Winsock Kernel to implement a kernel-mode HTTPd. Not a very good idea from a security standpoint, but would still be interesting to see the effects on performance. Because the parser performs no allocation, no I/O calls, and doesn't force the use of exceptions, it should actually be very simple to use in kernel-mode.

C++1x loses Concepts

Bjarne Stroustrup and Herb Sutter have both reported on the ISO C++ meeting in Frankfurt a week ago, in which the much-heralded feature "concepts" was removed from C++1x.

Concepts are a powerful feature aimed at improving overloading (basically, removing the extra work in using things like iterator categories) and moving type checking up the ladder so that more reasonable error messages can be produced when a developer passes in the wrong type (think a single error line instead of multiple pages of template crap). Apparently the feature was a lot less solid than most of us thought, with a huge amount of internal arguing within the committee on a lot of the fundamental features of it. It seems that while most agreed concepts were a good idea, nobody could agree on how to implement them.
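For anyone who hasn't had to write it, the "extra work" with iterator categories looks roughly like the tag dispatching below. This is a sketch of the existing technique that concepts were meant to replace, not of the concepts proposal itself; advance_by and its helpers are made-up names, and returning a string is only there to show which overload ran.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

namespace sketch
{
  // Fallback for any iterator that can at least be incremented.
  template<typename It>
  const char* advance_impl(It &it, std::ptrdiff_t n, std::input_iterator_tag)
  {
    while(n-- > 0) ++it;
    return "linear";
  }

  // Better overload for iterators that can jump.
  template<typename It>
  const char* advance_impl(It &it, std::ptrdiff_t n,
                           std::random_access_iterator_tag)
  {
    it += n;
    return "constant";
  }

  // The dispatcher: builds a tag object just to steer overload resolution.
  template<typename It>
  const char* advance_by(It &it, std::ptrdiff_t n)
  {
    typedef typename std::iterator_traits<It>::iterator_category category;
    return advance_impl(it, n, category());
  }
}

inline void advance_example()
{
  std::list<int> l;
  for(int i = 0; i < 5; ++i) l.push_back(i);
  std::vector<int> v(l.begin(), l.end());

  std::list<int>::iterator li = l.begin();
  assert(std::string(sketch::advance_by(li, 3)) == "linear");
  assert(*li == 3);

  std::vector<int>::iterator vi = v.begin();
  assert(std::string(sketch::advance_by(vi, 3)) == "constant");
  assert(*vi == 3);
}
```

Every overloaded algorithm needs its own dispatcher and helper set like this, and passing a type that fits neither overload produces an error from deep inside advance_impl rather than at the call site.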

I'm definitely disappointed by this, but I'm also glad they chose to remove concepts instead of further delaying the standard or, worse, putting out a poorly designed one. Instead, it seems like there is hope for a smaller C++ update to come out in 4-5 years that adds a more well thought out concepts feature. There are plenty of other C++1x language features to be happy about though, like variadic templates, rvalue references, and lambda functions!

You may notice I've been saying C++1x here instead of C++0x—that's because it's pretty obvious to everyone now that we won't be getting the next C++ standard in 2009, but more likely 2011 or 2012. Just in time for the end of the world!

C++ ORM framework for SQLite

Over the past week I’ve been rewriting my rather dated SQLite wrapper to have an efficient, modern C++ feel. The basic wrapper is there, but I was looking for something a little more this time.

While looking at the problem I decided I was spending too much time writing boilerplate SQL for all my types so I decided to look at existing ORM frameworks. I’m pretty picky about my C++ though, and couldn’t find anything I liked so I started writing my own. Instead of creating a tool to generate C++, I wanted to take a pure approach using native C++ types and template metaprogramming.

What I ended up with is not a full ORM framework, and I’m not particularly interested in making it one. All I’m aiming for is removing boilerplate code while leaving it easy to extend it for more complex queries. Here’s what I’ve got so far:

struct my_object
{
  int id;
  std::string value;
  boost::posix_time::ptime time;
};

typedef boost::mpl::vector<
  sqlite3x::column<
    my_object, int, &my_object::id,
    sqlite3x::primary_key, sqlite3x::auto_increment
  >,
  sqlite3x::column<
    my_object, std::string, &my_object::value,
    sqlite3x::unique_key
  >,
  sqlite3x::column<
    my_object, boost::posix_time::ptime, &my_object::time
  >
> my_object_columns;

typedef sqlite3x::table<
  my_object,
  my_object_columns
> my_object_table;

Using it is pretty simple. It uses the primary key as expected, generating the proper WHERE conditions and even extracting the type to let find() and others specify only the primary key:

sqlite3x::connection con("test.db3");

my_object_table my_objects(con, "t_objects");

my_objects.add(my_object());
my_objects.edit(my_object());
my_objects.remove(int());
my_objects.exists(int());
my_objects.find(int());
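To give a feel for how a list of column types can replace hand-written SQL, here is a stripped-down sketch. It is not the sqlite3x code: it uses C++11 variadic templates in place of boost::mpl::vector, the names sql_type, column, table and item are all hypothetical, it only covers a couple of types, and it assumes at least one column.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Maps a C++ type to an SQLite column type name.
template<typename T> struct sql_type;
template<> struct sql_type<int>
{ static const char* name() { return "INTEGER"; } };
template<> struct sql_type<double>
{ static const char* name() { return "REAL"; } };
template<> struct sql_type<std::string>
{ static const char* name() { return "TEXT"; } };

// Describes one column: owning class, member type, and member pointer.
template<typename Class, typename T, T Class::*Member>
struct column
{
  typedef T value_type;
  static const T& get(const Class &obj) { return obj.*Member; }
};

template<typename Class, typename... Columns>
struct table
{
  // One placeholder per column, known at compile time.
  static std::string insert_sql(const std::string &name)
  {
    std::string sql = "INSERT INTO " + name + " VALUES (";
    for(std::size_t i = 0; i < sizeof...(Columns); ++i)
      sql += (i ? ", ?" : "?");
    return sql + ")";
  }

  // Column types in declaration order.
  static std::string column_types()
  {
    const char *types[] =
      { sql_type<typename Columns::value_type>::name()... };

    std::string out;
    for(std::size_t i = 0; i < sizeof...(Columns); ++i)
    {
      if(i) out += ", ";
      out += types[i];
    }
    return out;
  }
};

// Example type and table, purely for illustration.
struct item
{
  int id;
  std::string value;
};

typedef column<item, int, &item::id> item_id_column;
typedef column<item, std::string, &item::value> item_value_column;
typedef table<item, item_id_column, item_value_column> item_table;

inline void table_example()
{
  assert(item_table::insert_sql("t_items") ==
         "INSERT INTO t_items VALUES (?, ?)");
  assert(item_table::column_types() == "INTEGER, TEXT");

  item x;
  x.id = 7;
  assert(item_id_column::get(x) == 7); // what a bind step would read
}
```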

One benefit of the approach taken is it makes working with single- and multiple-inheritance just as easy:

struct my_derived :
  my_object
{
  float extra;
};

typedef boost::mpl::copy<
  boost::mpl::vector<
    sqlite3x::column<my_derived, float, &my_derived::extra>
  >,
  boost::mpl::back_inserter<my_object_columns>
>::type my_derived_columns;

typedef sqlite3x::table<
  my_derived,
  my_derived_columns
> my_derived_table;

The next thing on the list was supporting types not known natively to sqlite3x. I did not want to have the headache of sub-tables, so I took the easy route and implemented basic serialization support:

struct my_derived :
  my_object
{
  std::vector<boost::uuid> uuids;
};

struct uuids_serializer
{
  static void serialize(std::vector<boost::uint8_t> &buffer,
     const std::vector<boost::uuid> &uuids);

  template<typename Iterator>
  static Iterator deserialize(std::vector<boost::uuid> &uuids,
     Iterator first, Iterator last);
};

typedef boost::mpl::copy<
  boost::mpl::vector<
    sqlite3x::column<
      my_derived, std::vector<boost::uuid>, &my_derived::uuids,
      sqlite3x::serializer<uuids_serializer>
    >
  >,
  boost::mpl::back_inserter<my_object_columns>
>::type my_derived_columns;

A few things aren’t finished, like specifying indexes and support for multi-column primary keys.

Overall though, I’m pretty happy with it. The majority of what I use SQLite for doesn’t require many complex queries, so this should greatly help lower the amount of code I have to manage.

Best of all, this ORM code is in an entirely isolated header file: if you don’t want it, just don’t include it and you’ll still have access to all the basic SQLite functions. Even with it included I kept to the C++ mantra of “don’t pay for what you don’t use”: as it is entirely template-driven, code will only be generated if you actually use it.

Once I’m finished, the code will replace what I have up on the SQLite wrapper page, but until then it will exist in the Subversion repository only.

Qt 4.5 released, still using three year old GCC

Qt 4.5 is out, along with Qt Creator. It’s still using GCC 3.4.5, from a January 2006 codebase. Sigh.

First thoughts on Qt

I’ve been doing so much C# and XAML coding for work lately, I felt compelled to get back to the place I thrive—real C++.

I’ve always been wary of cross-platform C++ GUI coding. The options just never seemed very good to me: GTK, which doesn’t act anything close to native in Windows. Qt, which seemed good but was under GPL. wxWidgets, which feels like a thin wrapper around Win32 (and was therefore quite easy for me to learn) but has lots of little issues like the inability to scale with DPI. Given the announcement of Qt going LGPL, I figured it’s a good time to start learning Qt.

If you’re like me you might be thinking – “Qt, but that’s not real C++! What happened to using the standard library, templates, and not paying for what you don’t use!?”. Do I wish there was a more modern Boost-quality library? Absolutely. But that doesn’t exist. Perhaps because GUI work is rather boring, and coders who could make a better quality library would rather spend their time on more interesting things. Qt is still the most modern GUI lib I’ve seen for C++ yet. But I digress.

Hunting around the Qt website, first thing I find out: it’s going to be a pain in the ass to compile my Qt-based project with VC++. I’m sure it’s possible with a little elbow grease, but I wanted to get started quickly so I downloaded Qt Creator instead. Creator has a bundle that includes MinGW, Qt, and the Creator IDE. Perfect for a quick start!

Creator turns out to be a pretty good IDE. It is very close to knocking VC++ out of my favorite position. With a few bugs and usability issues fixed, it’s possible I’ll be using it even for pure Win32 apps.

I create a GUI project, hop into the designer and lay out a simple window. I haven’t even read any documentation or tutorials for using Qt at this point, so I get a little stuck. There are no Creator tutorials out there yet, so I skimmed through some other Qt stuff and quickly found my way to the layout model – exactly what I was looking for. The best thing I’ve found in WPF is the ability to have a window laid out automatically based on the size of controls in it, and I’m very pleased to see Qt has something similar. Tie in some events, and I have a simple app created.

Compile the project and oops, some errors pop up. After a little hair pulling, I found out the Creator bundle comes with MinGW GCC 3.4 — very old! It was not compiling some of my standard C++ correctly. I’ll see about integrating TDM’s GCC 4.x builds soon, but fear it will mean recompiling Qt. For now I’ve begrudgingly dumbed down my C++ to the subset that GCC 3.4 works with.

In one day I’ve learned how to create a functioning GUI program with Qt. I’ve also backed away from the designer and learned how to do things manually – I’ll definitely use the designer for a serious project, but learning how things work behind the scenes is important too.

All-in-all I’m impressed with Qt. It feels native on Windows, and has a relatively clean API. It is more powerful and productive than straight Win32, but doesn’t seem nearly as powerful as WPF. Then again, it took me several months to wrap my head around WPF enough to build anything of substance.

Nokia to release Qt under LGPL

This is fantastic news for anyone developing GPL-incompatible software. Nokia will be releasing Qt under the LGPL.

Visual Studio 2010 CTP now available

Coinciding with the 2008 PDC, the first Visual Studio 2010 CTP is now available for download. At first glance, it includes a few interesting things for C++:

I’ll be posting more as I take a closer look at these and other features.

C++0x is now feature complete

Herb Sutter posts to tell us C++0x is now feature complete. There will now be about a year of bugfixing and clarification, but that’s it: all the features are now known, and their interfaces are solid barring any bugs being found. This means compilers can finally start implementing C++0x at full speed without too much worry of surprises.

The Committee Draft is not yet available, but it is about the same as the September 2008 Working Draft.