Windows Vista
I/O Improvements in Windows Vista
My tips for efficient I/O are relevant all the way back to coding for Windows 2000. A lot of time has passed since then though, and for all the criticism it got, Windows Vista actually brought in a few new ways to make I/O even more performant than before.
This will probably be my last post on user-mode I/O until something new and interesting comes along, completing what started a couple weeks ago with High Performance I/O on Windows.
Synchronous completion
In the past, non-blocking I/O was a great way to reduce the stress on a completion port. An unfortunate side effect was an increased number of syscalls: the last non-blocking call you make does nothing useful and only returns WSAEWOULDBLOCK. You still need to issue an asynchronous call to wait for more data.
Windows Vista solved this elegantly with SetFileCompletionNotificationModes. This function lets you tell a file or socket that you don't want a completion packet queued up when an operation completes synchronously (that is, a function returned success immediately and not ERROR_IO_PENDING). Using this, the last I/O call will always be of some use -- either it completes immediately and you can continue processing, or it starts an asynchronous operation and a completion packet will be queued up when it finishes.
Like the non-blocking I/O trick, continually processing synchronous completions on one handle can starve other operations in a completion port if a socket or file feeds data faster than you can process it. Care should be taken to limit the number of times in a row you continue processing synchronous completions.
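To make the "limit the number of times" advice concrete, here is a minimal sketch of a budgeted loop. It is a plain Python simulation, not Win32 code: FakeSocket, start_read, pump and max_inline are all made-up names, and a read that returns PENDING stands in for a call that returned ERROR_IO_PENDING. In a real program the "yielded" case would need some way to resume the handle later, for example by posting a custom packet with PostQueuedCompletionStatus.

```python
from collections import deque

PENDING = None  # stand-in for a call that returned ERROR_IO_PENDING


class FakeSocket:
    """Hypothetical stand-in for an overlapped socket.

    Each read either completes inline (returns bytes) or goes
    asynchronous (returns PENDING) once the queued data runs out.
    """

    def __init__(self, chunks):
        self._chunks = deque(chunks)

    def start_read(self):
        return self._chunks.popleft() if self._chunks else PENDING


def pump(sock, handle, max_inline=4):
    """Process reads that complete synchronously, at most max_inline in a row.

    Returns "pending" when a read went asynchronous (a completion packet
    would arrive later), or "yielded" when the budget ran out and the
    caller should requeue this socket so other handles get a turn.
    """
    for _ in range(max_inline):
        data = sock.start_read()
        if data is PENDING:
            return "pending"
        handle(data)
    return "yielded"
```

With six chunks queued and the default budget of four, the first pump call handles four chunks and yields; the second handles the remaining two and then goes pending. The right budget depends on your workload, so treat the default here as a placeholder.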
Reuse memory with file handles
If you want to optimize even more for throughput, you can associate a range of memory with an unbuffered file handle using SetFileIoOverlappedRange. This tells the OS that a block of memory, the one holding the OVERLAPPED structures for the handle's asynchronous requests, will be re-used and should be kept locked in memory until the file handle is closed. Of course, if you won't be performing I/O with a handle very often, it might just waste memory.
Dequeue multiple completion packets at once
A new feature to further reduce the stress on a completion port is GetQueuedCompletionStatusEx, which lets you dequeue multiple completion packets in one call.
If you read the docs for it, you'll eventually realize it only returns error information if the function itself fails, not if an async operation fails. Unfortunately, how to retrieve the error for a failed operation is missing from all the official docs I could find, and searching gives nothing. So how do you get error information out of GetQueuedCompletionStatusEx? Well, after playing around a bit I discovered that you can call GetOverlappedResult or WSAGetOverlappedResult on each dequeued operation to get it, so not a problem.
This function should only be used if your application has a single thread or handles a large number of concurrent I/O operations, or you might end up defeating the multithreading baked into completion ports by not letting them spread completion notifications around. If you can use it, it's a nice and easy way to boost the performance of your code by lowering contention on a completion port.
Bandwidth reservation
One large change in Windows Vista was I/O scheduling and prioritization. If you have I/O that is dependent on steady streaming, like audio or video, you can now use SetFileBandwidthReservation to help ensure it will never be interrupted by something else hogging a disk.
Cancel specific I/O requests
A big pain pre-Vista was the inability to cancel individual I/O operations. The only option was to cancel all operations for a handle, and only from the thread which initiated them.
If it turns out some I/O operation is no longer required, it is now possible to cancel individual I/Os using CancelIoEx. This much needed function replaces the almost useless CancelIo, and opens the doors to sharing file handles between separate operations.
My Windows Vista/7/8 Wishlist
These are some changes I’ve been trying to get made since Vista entered beta. Now 7’s beta has begun and still chances look bleak. Maybe I’ll have more luck in 8?
- Remove TransmitFile/TransmitPackets limitations. Added back in Windows NT 3.51, the TransmitFile function lets you transfer a file’s contents entirely in kernel-mode, directly out of Windows’ internal file cache. This requires significantly fewer resources, is much more scalable, and is simpler to code for. Later on we got the even more functional TransmitPackets function. So what’s the problem? Microsoft wanted to guard against people using their desktops as servers: they locked desktops down to handling two concurrent TransmitFile calls. With increasingly faster internet connections and P2P’s popularity still rising, this just won’t fly anymore. For what would probably take less than five minutes to change, Microsoft could make Windows seem faster for so many people.
- Give me asynchronous DNS! Vista teased me with the GetAddrInfoEx function, which has unimplemented placeholders for async functionality. I wonder how much faster browsing the web would be if a browser submitted several DNS requests at once, instead of one at a time? Think of all those nasty Web-2.0 sites that load things from 10 different hostnames, or forum sites that let users display external images.
- Implement Linux’s TCP_CORK. It forces TCP to send out full frames only, no partials. Think of it like Nagle without the timeout. In some situations this can result in higher throughput, so I’m all for it.
- Allow me to bind sockets and files to a thread, for I/O completion ports. It could be very nice to set a preferred thread for I/O packets to arrive on, with work stealing settings. This could improve scalability by helping applications better specify their usage patterns.
- Let me boot from software RAID. I have yet to see a quasi-hardware RAID solution (you know, the ones that come with your $80 desktop motherboard) that doesn’t suck. These things do most, if not all, of the work in software drivers, and every single one I’ve seen has brought performance and stability issues along with it. Windows has its own built-in software RAID which should alleviate the need for all this cheap unstable crap. Unfortunately the one big gotcha of full software RAID is that you can’t boot from it. Come on guys, Linux has been letting me keep the bulk of my system in software RAID for a long time now. Time to play catch-up.
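On the asynchronous DNS wish: the "several DNS requests at once" idea can be approximated today by fanning blocking lookups out over a thread pool. The sketch below does that in portable Python with the standard socket.getaddrinfo call. It is a workaround, not real asynchronous resolution, and resolve, resolve_all and max_workers are illustrative names of my own.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(host):
    """Blocking lookup: returns a sorted list of addresses, or the error."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        return exc
    # info[4] is the sockaddr tuple; its first element is the address string.
    return sorted({info[4][0] for info in infos})


def resolve_all(hosts, max_workers=8):
    """Resolve every hostname concurrently; returns {host: result}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs hosts with results.
        return dict(zip(hosts, pool.map(resolve, hosts)))
```

Each lookup still ties up a thread while it waits, which is exactly the cost a native async API would avoid.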
Enabling IPv6 and PNRP in Windows Vista
Windows Vista is the first version of Windows to support IPv6 out of the box. Even those of us with an IPv4 connection can make use of this, using a technology called Teredo to get IPv6 connectivity over IPv4. With Google finally getting IPv6, now seems like a good time for others to start too.
The steps to enable IPv6 are simple:
- Open up a command prompt with administrator privileges. Start->All Programs->Accessories, right click on Command Prompt and select Run as administrator.
- If you aren’t on a router, or if your router supports UPnP, enter: netsh interface teredo set state client
- If you want to manually forward a port or your router doesn’t support UPnP, pick a port, forward UDP on it to your computer, and enter (substituting your port for 12345): netsh interface teredo set state client clientport=12345
- Now wait for a minute or so and run: netsh interface teredo show state
- It should show “qualified” under State.
- Now if you run ipconfig, it should come up with a Tunnel adapter Local Area Connection with an IPv6 address starting with 2001:0.
- You can test if it’s working by visiting Google IPv6, or the KAME project’s famous dancing kame.
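If you are curious what that 2001:0 address actually contains, a Teredo address embeds the Teredo server's IPv4 address plus an obfuscated copy of your externally mapped IPv4 address and UDP port. Below is a small sketch that decodes one using Python's standard ipaddress module. The function name decode_teredo is my own, and the bit layout used for the port is my reading of RFC 4380, so treat it as illustrative.

```python
import ipaddress


def decode_teredo(text):
    """Return the pieces embedded in a Teredo address, or None if it isn't one."""
    addr = ipaddress.IPv6Address(text)
    pair = addr.teredo  # (server, client) for 2001:0::/32 addresses, else None
    if pair is None:
        return None
    server, client = pair
    # The mapped UDP port sits in bits 32..47 with every bit inverted.
    port = ((int(addr) >> 32) & 0xFFFF) ^ 0xFFFF
    return {"server": str(server), "client": str(client), "port": port}
```

Feed it the address ipconfig shows for the tunnel adapter. If it returns None, the address is not a Teredo one and the tunnel probably never qualified.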
Now for the second part of the post. PNRP (Peer Name Resolution Protocol) version 4.0 was also introduced in Windows Vista. With PNRP, every computer can have a hostname pointing to it that allows any XP SP2, Vista, and Server 2008 computer to connect to it via the internet. This can be incredibly useful if you’re on the go and wish to remote in to your computer. Another use I’ve found for it is to enable it on relatives’ PCs for those inevitable tech support calls that we geeks despise so much.
PNRP functions solely over IPv6, so you will need to have a valid IPv6 address to make it work. The above Teredo instructions should work fine if you don’t. Here’s how you enable it:
- Open up a command prompt with administrator privileges.
- Run the command: netsh p2p pnrp peer set machinename publish=start autopublish=enable
- Now run: netsh p2p pnrp peer show machinename
- It should show you a hostname to use in the format p.<random hex here>.pnrp.net. Record this name, and you can use it to talk to your machine remotely just like any other hostname.
Developers aren’t left out either: Windows comes with an extensive P2P framework, and PNRP is only one of the things built on it. WCF for instance has full integration with P2P.
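For simple cases you may not need the P2P framework at all: since a published name behaves like any other hostname, ordinary resolver calls should be enough on a machine with PNRP available. Here is a minimal sketch using Python's standard socket module. The helper names are made up, the pattern only mirrors the p.<hex>.pnrp.net format described above, and port 3389 (Remote Desktop) is just an assumed default for the "remote in" use case.

```python
import re
import socket

# Mirrors the p.<hex>.pnrp.net format shown by netsh; an assumption, not a spec.
PNRP_HOST = re.compile(r"p\.[0-9a-f]+\.pnrp\.net", re.IGNORECASE)


def looks_like_pnrp_host(name):
    """Cheap sanity check before handing the name to the resolver."""
    return PNRP_HOST.fullmatch(name) is not None


def resolve_ipv6(name, port=3389):
    """Return the IPv6 addresses the system resolver finds, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})
```

On a machine without PNRP or IPv6 connectivity, resolve_ipv6 will simply return an empty list for a pnrp.net name.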