04.13.09
Posted in Uncategorized at 6:29 pm by david
It has been a long time since I updated this blog. I intend to fix that. But for now, I’d like to point to two articles I posted at CodeProject:
- Tracing Events Raised by Any C# Object—in which I describe a technique for tracing the events of any C# object using a very simple helper class, using .NET Reflection to get the event handlers of an arbitrary object.
- Password Field Unhider (and some C++ utility classes) - first, I present a small utility that lives in the Windows notification area and stands ready at any time to unhide (that is, unmask) any password field on the screen, so you can see what you’re typing. Second, I describe some very simple yet useful C++ utility classes: a general message pump, an IPC mechanism using WM_COPYDATA, and a work item dispatcher.
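The article has the real C++ classes. Purely as an illustration of the last idea, a work item dispatcher can be as small as a queue drained by one worker thread. The sketch below is in Python for brevity, and its names (WorkItemDispatcher, post, shutdown) are invented for this example, not taken from the article.

```python
import queue
import threading


class WorkItemDispatcher:
    """Runs posted callables one at a time, in FIFO order, on one worker thread."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self):
        self._items = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def post(self, func, *args):
        """Queue func(*args) for execution on the worker thread."""
        self._items.put((func, args))

    def shutdown(self):
        """Let everything already queued finish, then stop the worker."""
        self._items.put(self._STOP)
        self._worker.join()

    def _run(self):
        while True:
            item = self._items.get()
            if item is self._STOP:
                break
            func, args = item
            func(*args)


log = []
dispatcher = WorkItemDispatcher()
for n in range(5):
    dispatcher.post(log.append, n)
dispatcher.shutdown()
print(log)  # [0, 1, 2, 3, 4]
```

Because a single thread drains the queue, work items never run concurrently with each other, which is usually the point of such a class.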
I intend to post more articles at CodeProject: useful tips, tutorials, and explanations, with source code, that are longer than the typical blog post.
11.11.07
Posted in Uncategorized at 5:43 pm by david
You can use Word 2007 features to generate very nice-looking mathematical notation. This feature is, for all practical purposes, completely undocumented by Microsoft. However, some information has been published by Microsoft employees and others on the web. This post is meant to serve as a convenient directory of that information. (This post will be updated as I learn more about equations in Word 2007.)
- To enter equation mode: Insert|Equation or Alt+= shortcut. Note that the Insert ribbon doesn’t have the Equation item when in Blog mode. Why not? (Question: How do I get it to appear in Blog mode?)
- Dataninja: Undocumented Word 2007 Equation Shortcuts - John Gardner created this very nice reference card with some useful equation formatting tips.
- Word Team Blog: Equations in 2007 - an introductory post, with links to two screenshot videos on how to use the linear method. Unfortunately, the presenter doesn’t explain, while typing, how to enter equation mode or how to use the keyboard to move the cursor from one insertion field to the next. (This is currently the only post on the Word Team blog with the tag “equations”.)
- Word 2007 Help: Math AutoCorrect Symbols - I copied and pasted this from the Word 2007 Help because it is more easily used in this format.
- TechRepublic: Microsoft Office Word 2007 Inside and Out sample chapter on Building Blocks—TechRepublic offers this sample chapter of Microsoft Office Word 2007 Inside and Out—the chapter is all about Building Blocks, which is what the equations gallery is made of. It explains a bit about Math Autocorrect mode, linear equation entry, and how to add your own equations to the gallery. This chapter is pretty good—the book might be worth getting.
- Word Team Blog: Equation Numbering—a post on how to number the equations in your document. There is a video here too. (This is currently the only post on the Word Team blog with the tag “equations video”.)
- Murray Sargent: Math in Office: Using Math Italic and Bold in Word 2007—How using the ribbon’s italic and bold formatting buttons provides the proper math italic and bold characters for variables.
- UTN 28: Unicode Nearly Plain-Text Encoding of Mathematics - this document, a Unicode Consortium Technical Note written by the Microsoft developer who implemented the feature, is a complete description of the linear entry method.
- Murray Sargent: Math Selection - a brief note on how selection works inside an equation, and the related post Murray Sargent: Using Left/Right Arrow Keys in Mathematical Text on how the insertion point works inside an equation.
- Murray Sargent: Breaking Equations Into Multiple Lines—A nice description of how to break equations onto multiple lines, and also how to align multiple equations on a specific character.
- David Carlisle: XHTML and MathML from Office 2007 - David Carlisle provides instructions and an XSL stylesheet so you can take the HTML output of Word 2007 and run it through his process to get an XHTML document that has the math equations in MathML format (normally Word 2007 saves equations in “ECMA Math” format, OMML, apparently a Microsoft invention). Note that Word allows you to cut/paste MathML to/from the Clipboard (so you can get equations into or out of Mathematica, for example).
- Murray Sargent: User Spaces in Math Zones—On typing spaces into equations: Just don’t do it!
Interesting, but not as practical:
General places to look for information:
11.08.07
Posted in Concurrency, Race Conditions at 6:05 pm by david
In a series of posts, of which this is the first, I’m going to describe race condition detection and why it is useful to detect data races when you’re trying to debug multithreaded programs.
But before I get started, here is a useful paper that characterizes race conditions formally: What Are Race Conditions? Some Issues and Formalizations (Robert H. B. Netzer, Barton P. Miller) [ACM Letters on Programming Languages and Systems v1n1, March 1992].
Netzer (who wrote his PhD thesis on this subject) classifies races into two categories: general races, which pertain to programs which are meant to be deterministic, and data races, which pertain to programs which are non-deterministic. He also presents an orthogonal classification of races: feasible races, which “capture the intuitive notions desired for debugging” but which are hard to compute completely and accurately, and apparent races, which “capture less accurate notions” and can be detected in practice, but which are sufficiently less accurate that they tend to swamp the user with false positives.
So: general races cause non-deterministic execution in programs intended to be deterministic, and data races cause non-atomic execution of critical sections in non-deterministic programs. Thus both kinds of races can cause failures in programs.
In fact, in my most recent job, I dealt with both kinds of races in the application program we were building.
Both general races and data races matter in practice. Your typical Windows application program is non-deterministic: execution depends on the precise timing and ordering of input events (keystrokes, mouse movements, and system messages). But it also contains large sections that are intended to operate deterministically (e.g., if in Photoshop you load a certain image file, execute a specific filter with specific parameters on it, and then save the resulting image in another file, then the result should be the same each time you do it, even if the timing of your mouse movements differs from run to run).
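To make the general race concrete, here is a small Python sketch (the function names are invented for illustration). Neither function has a data race, but only the second is deterministic. In the first, every append is protected by a lock, yet the order of the results depends on which thread reaches the lock first. In the second, each thread writes only its own slot, so timing cannot affect the result.

```python
import threading


def squares_in_arrival_order(n):
    results = []
    lock = threading.Lock()

    def worker(i):
        value = i * i
        with lock:                 # appends are mutually exclusive ...
            results.append(value)  # ... but their order is up to the scheduler

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def squares_in_fixed_order(n):
    results = [None] * n

    def worker(i):
        results[i] = i * i  # each thread writes only its own slot

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(squares_in_arrival_order(8))  # some ordering of the squares; may vary between runs
print(squares_in_fixed_order(8))    # [0, 1, 4, 9, 16, 25, 36, 49]
```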
(Note that the “critical sections” that are violated by data races need not be the operating system-provided primitive, like CRITICAL_SECTION objects in Windows. It just means any bit of code that implements—or is supposed to implement—mutual exclusion.)
(By the way, I found this article somewhat difficult to understand: the differences between feasible races and apparent races, which are the key to Netzer’s classification, were hard to grasp. Also, it wasn’t really clear what he meant by a feasible execution. Finally, the writing was repetitive in places.)
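A data race in this sense can be sketched in a few lines of Python (the class and function names are invented for illustration). The unsafe method is a read-modify-write with no mutual exclusion, so two threads can read the same old value and one update is lost. The safe method wraps the same steps in a lock, which is a critical section in the broad sense used above. How often the unsafe version actually loses updates depends on the interpreter and the scheduling, so treat its result as unpredictable rather than reliably wrong.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment_unsafe(self):
        current = self.value       # another thread may run between this read ...
        self.value = current + 1   # ... and this write, and its update is then lost

    def increment_safe(self):
        with self._lock:           # the read and the write now execute atomically
            current = self.value
            self.value = current + 1


def run(method_name, n_threads=4, n_increments=10_000):
    counter = Counter()
    increment = getattr(counter, method_name)

    def worker():
        for _ in range(n_increments):
            increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run("increment_safe"))    # 40000
print(run("increment_unsafe"))  # at most 40000; possibly less
```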
10.04.07
Posted in Uncategorized at 7:35 pm by david
A bunch of items I ordered arrived today - hard disks to relieve my chronic space shortage, and USB 2.0 enclosures to put them in. I built everything at once and commenced transferring data. It didn’t take long before I remembered Jim Gray’s early warnings that as we moved to terabyte disks programmers would need to think of disks as sequential devices - that is, like tapes. (I read this in a presentation of his a long time ago, but right now I can only find later references, like in this interview, and in this paper.) Consolidating 500GB of files from multiple smaller hard drives onto one larger drive takes a long time.
But the real point of this post is this: Can you help me figure out if I’m using one of my new USB 2.0 hard disk enclosures correctly? I thought installing the disk into the thing was obvious, but I checked the instructions anyway, which is my usual habit. The instructions seemed clear: place the hard disk in the USB enclosure, plug in the cables, secure the metal case with four screws. All OK so far. But here is the last paragraph:
Is good with machine plank according to the right method conjunction the hard dish, lock the right and HDD, can immediately trust the usage.
Say what? A finer example of Engrish I have never seen. Thank you CP Technologies for your CP-U2S-3G Platinum Series USB 2.0 to SATA hard disk case instructions!
09.07.07
Posted in Bakin's Bits at 3:36 pm by david
I was configuring a new computer to be used for testing concurrent software, and was using my standard self-guidelines: second-fastest processor available (a nod to economy), as much DRAM as I can jam on a motherboard, and the latest dual-graphics card technology. Whohoo! But then I found this site on building a very economical cluster system and I realized my guidelines were old-fashioned. I’m now in the mood to build my own micro-Beowulf, so I can experiment with parallel clusters as well as multicore concurrency.
Check it out: The system described produces 26Gflops at a cost (August 2007) of $1256! It consists of 4 microATX motherboards, each with a dual core CPU and 2GB RAM, 4 power supplies, 1 hard disk, and 1 8-port gigabit switch. The “structure” is scrap plexiglass and threaded rods – definitely minimal! – and the whole thing is 11″ x 12″ x 17″! Kudos to Professor Joel Adams and his student Tim Brom for designing, building, configuring, and benchmarking this small Beowulf.
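A quick back-of-the-envelope check of those figures, as a Python snippet. The core count assumes 4 boards with one dual-core CPU each, as listed above, and the per-node cost is just the total divided evenly, even though the disk and switch are shared.

```python
cost_usd = 1256      # total system cost, August 2007
gflops = 26          # performance figure quoted above
nodes = 4            # microATX motherboards
cores = nodes * 2    # one dual-core CPU per board
ram_gb = nodes * 2   # 2GB per board

print(f"${cost_usd / gflops:.2f} per Gflops")                 # $48.31 per Gflops
print(f"{gflops / cores:.2f} Gflops per core")                # 3.25 Gflops per core
print(f"${cost_usd / nodes:.0f} per node, {ram_gb}GB RAM in total")  # $314 per node, 8GB RAM in total
```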
Here’s another such system – LittleFe. And here is a homebrew 10-node system from 2000, with the same idea w.r.t. minimal packaging.
(My main conclusion about my self-guidelines: I don’t need even the second-fastest processor anymore. Nearly any current processor is fast enough for development purposes, compiler and system bloat notwithstanding. This system uses cheap multicore processors, a reasonable amount of memory for each node, and doesn’t need anything more than the built-in motherboard graphics. I would still like a system with a hot new graphics card however, so I can experiment with GPGPU.)
Update Sept 18 2007: Lots of people are doing work in this area, which will make it easy to get started! Here are some more links:
ParallelKnoppix - A LiveCD that lets you boot up an MPI cluster in 5 minutes!
And on the ParallelKnoppix site, some users have sent in pictures of their clusters, showing lots of different (and primitive, yet working) building techniques!
And this page from Dec 2005 describes how some guy built a “mobile wireless linux cluster” (2 nodes) in order to have access to “big computer resources” while exploring a cave, climbing a mountain, taking a weekend trip to the mountains, or who knows what else.
Posted in Book Review at 3:21 pm by david
The typical data structures most programmers know and use require imperative programming: they fundamentally depend on replacing the values of fields with assignment statements, especially pointer fields. A particular data structure represents the state of something at that particular moment in time, and that moment only. If you want to know what the state was in the past you needed to have made a copy of the entire data structure back then, and kept it around until you needed it. (Alternatively, you could keep a log of changes made to the data structure that you could play in reverse until you get the previous state - and then play it back forwards to get back to where you are now. Both these techniques are typically used to implement undo/redo, for example.)
Or you could use a persistent data structure. A persistent data structure allows you to access previous versions at any time without having to do any copying. All you needed to do at the time was to save a pointer to the data structure. If you have a persistent data structure, your undo/redo implementation is simply a stack of pointers that you push a pointer onto after you make any change to the data structure.
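Here is a minimal sketch of that idea in Python, chosen for brevity and because it is garbage collected. The names (Node, Document, add, undo, redo) are invented for this example. The data structure is an immutable singly linked list. Adding an item builds one new cell that points at the old list, so every earlier version stays valid and shares its cells with later ones. Undo and redo are then just stacks of references to versions.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One immutable cell of a singly linked list; None is the empty list."""
    value: object
    rest: Optional["Node"]


def to_list(node):
    """Copy a linked list into a Python list, newest item first (for display only)."""
    out = []
    while node is not None:
        out.append(node.value)
        node = node.rest
    return out


class Document:
    def __init__(self):
        self.current = None  # the current version
        self._undo = []      # references to earlier versions
        self._redo = []      # references to undone versions

    def add(self, value):
        self._undo.append(self.current)           # saving a version copies nothing
        self._redo.clear()
        self.current = Node(value, self.current)  # new cell shares all the old cells

    def undo(self):
        if self._undo:
            self._redo.append(self.current)
            self.current = self._undo.pop()

    def redo(self):
        if self._redo:
            self._undo.append(self.current)
            self.current = self._redo.pop()


doc = Document()
doc.add("a")
doc.add("b")
snapshot = doc.current       # keep a reference to this version
doc.add("c")
doc.undo()
print(to_list(doc.current))  # ['b', 'a']
print(doc.current is snapshot)  # True: undo restored the very same object
```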
This can be quite useful, but it is typically very hard to implement a persistent data structure in an imperative language, especially if you have to worry about memory management. If you’re using a functional programming language, especially a language with lazy semantics like Haskell, then all your data structures are automatically persistent, and your only problem is efficiency (and of course, in functional languages, the language system takes care of memory management). But as a hardcore C++ programmer by profession, I was, for practical purposes, locked out of the world of persistent data structures.
Now, however, with C# and C++/CLI in use (and garbage collection coming to C++ any time now …) I can at last contemplate the use of persistent data structures in my designs. And that’s great, because it gave me an excuse to take one of my favorite computer science books off the shelf and give it another read.
The book is Purely Functional Data Structures, by Chris Okasaki. I find it to be a very well written and easy to understand introduction to the design and analysis of persistent data structures or, equivalently, of any data structure you’d want to use in a functional language.
There are two key themes of the book: First, to describe the use and implementation of several persistent data structures, such as different kinds of heaps, queues, and random-access lists, and second, to describe how to create your own efficient persistent data structures.
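To give a flavor of the first theme, here is a sketch of a simple persistent queue built from two immutable lists, in the style of the book's batched queue. The code is Python rather than the book's Standard ML, and the names (Cons, Queue, snoc, head, tail) are my own choices for this example. Items are added to the rear list and removed from the front list. When the front runs out, the rear is reversed to become the new front. No operation modifies an existing queue, so every old version remains usable.

```python
from typing import NamedTuple, Optional


class Cons(NamedTuple):
    """One immutable list cell; None is the empty list."""
    head: object
    tail: Optional["Cons"]


def reverse(lst):
    out = None
    while lst is not None:
        out = Cons(lst.head, out)
        lst = lst.tail
    return out


class Queue(NamedTuple):
    front: Optional[Cons]  # items in removal order
    rear: Optional[Cons]   # newly added items, newest first


EMPTY = Queue(None, None)


def _check(front, rear):
    # Invariant: front is empty only when the whole queue is empty.
    if front is None:
        return Queue(reverse(rear), None)
    return Queue(front, rear)


def snoc(q, x):
    """Return a new queue with x added at the back."""
    return _check(q.front, Cons(x, q.rear))


def head(q):
    if q.front is None:
        raise IndexError("head of empty queue")
    return q.front.head


def tail(q):
    """Return a new queue without the first item."""
    if q.front is None:
        raise IndexError("tail of empty queue")
    return _check(q.front.tail, q.rear)


q1 = snoc(snoc(EMPTY, 1), 2)
q2 = tail(q1)     # q1 is untouched
q3 = snoc(q1, 3)  # so it can be extended again
print(head(q1), head(q2), head(q3))  # 1 2 1
```

The occasional reversal makes each operation cheap on average only if each version is used once. Keeping the good bounds when old versions are reused is where the book's lazy evaluation techniques come in.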
08.21.07
Posted in Bakin's Bits at 3:22 pm by david
I, David Bakin, am an experienced software developer with over 25 years of experience developing system and application software.
My aim with Bakin’s Bits is to produce programming tools to help software developers write better programs, and enjoy writing them more. The initial tools I am working on will help diagnose and debug concurrency issues in multithreaded programs much, much better than current debugging tools do. I’m launching Bakin’s Bits with the help of an angel investor (who is also providing a lot of non-financial support and encouragement).
I will provide programming services on a consulting basis until my products get off the ground.
And I will provide short articles on debugging concurrent programs, developing concurrent programs, and other programming topics that interest me, on this site.
My main interests are:
- issues in concurrency, especially correctness, debugging, and appropriate design
- functional programming—I know it is fun, I know it is powerful, but can I construct a business case that will convince a development manager to let me put it into a product?
- GPGPU programming and other forms of stream programming, including massively scalable systems that fit in Google’s map-reduce paradigm or another similar functional pipeline
- fun in programming—not just fun as in functional, but also fun problems, fun algorithms, and having a fun time while keeping my brain active.
Thanks for checking in! — Dave