Some neat libraries have appeared recently; none of them by me.
I found CSVScan on the RAA only a few days after I wrote my own fast CSV parser. CSVScan itself had been released only a few days before. It’s Ragel-based, and implements the most common CSV “spec”, unlike my Ccsv, which only supports a constrained format.
Ccsv is faster, but not enough to care about. Here’s a benchmark yielding 1,000,000 rows from a file:
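The original numbers aren't reproduced here, but the harness looks roughly like this, with the stdlib CSV parser standing in for the C extensions (swap in Ccsv or CSVScan to compare; row count scaled down from 1,000,000):

```ruby
require 'benchmark'
require 'csv'
require 'tempfile'

# Build a sample file to parse.
file = Tempfile.new('bench')
10_000.times { file.puts "name,quantity,price,1,2,3" }
file.flush

# Time a full pass over every row. Replace CSV.foreach with the
# Ccsv or CSVScan equivalent to benchmark the C parsers.
rows = 0
time = Benchmark.realtime do
  CSV.foreach(file.path) { |row| rows += 1 }
end
puts "parsed #{rows} rows in #{'%.3f' % time}s"
```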
I’m surprised it took until Sep 2007 for someone to write a C CSV parser.
Ccsv development is halted; if I had known about CSVScan, I wouldn’t have written it. It would be nice, though, if CSVScan had a foreach method and a gem version.
If you’re learning, Ccsv still makes an excellent example of a plain C extension. CSVScan makes an excellent example of a Ragel extension.
Ara wrote a leak detector called Dike, which bears investigation. Using the object finalizer is a good idea. Probably the ideal leak detector will be lifecycle-based (sort of like Dike) instead of snapshot-based (like BleakHouse), but C-implemented (like BleakHouse) so that we can guarantee we aren’t introducing leaks in the attempt to track them, and so that it’s fast enough to use in live production environments.
If we do lifecycle tracking we should be able to identify the exact line of app code that spawns the leaks.
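A sketch of the lifecycle idea in plain Ruby (not Dike’s or BleakHouse’s actual code): record each tracked object’s birth site, and forget it when its finalizer fires. Whatever remains long after a GC is a leak candidate, already labeled with the line that created it.

```ruby
# Birth registry: object_id => the call site that created the object.
BIRTHS = {}

def track(obj, site = caller.first)
  BIRTHS[obj.object_id] = site
  # The finalizer must not capture obj itself, or obj can never be freed.
  ObjectSpace.define_finalizer(obj, proc { |id| BIRTHS.delete(id) })
end

str = String.new('possible leak')
track(str)
puts BIRTHS[str.object_id] # file:line of the track call
```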
I was at the Ruby East conference today. It was good. Mike Mangino gave an excellent talk on mocking. Also, I met Gregory Brown in real life. Good guy; hair longer than mine.
Despite Mike’s talk, I’ve added an integration suite to Ultrasphinx. No mocks here. I actually removed some of the unit tests; for a local service consumer like Ultrasphinx, the integration suite is definitely the way to go.
Now I can approach full coverage in the plugin itself instead of relying on CHOW as the integration test, which was definitely not optimal. Also I can spawn mongrels in setup and kill them in teardown, which lets me test awkward situations like development environment class reloading.
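The setup/teardown pattern looks roughly like this; a throwaway child process stands in for a real mongrel, and minitest stands in for the test framework (the real suite boots actual server instances):

```ruby
require 'minitest/autorun'
require 'rbconfig'

class ServerIntegrationTest < Minitest::Test
  def setup
    # Stand-in for spawning a mongrel: any long-lived child process.
    @pid = Process.spawn(RbConfig.ruby, '-e', 'sleep 60')
  end

  def teardown
    Process.kill('TERM', @pid)
    Process.wait(@pid)
  end

  def test_server_is_alive
    assert Process.kill(0, @pid) # signal 0 only checks the process exists
  end
end
```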
Mark Lane contributed the sample app, so it’s still NIH even though I wrote the helper and initial tests. Good times.
...
58%: core rails (783 births, 1241 deaths, ratio -0.23, impact 1.66)
66%: recipe/new/GET (15182 births, 14593 deaths, ratio 0.02, impact 1.77)
75%: core rails (766 births, 1168 deaths, ratio -0.21, impact 1.60)
83%: recipe/list/GET (16423 births, 15991 deaths, ratio 0.01, impact 1.64)

65992 births, 66458 deaths.

Tags sorted by immortal leaks:

recipe/show/GET leaks, averaged over 4 requests:
  5599 String
  80 Array
  2 Regexp
  2 MatchData
  2 Hash
  1 Symbol
...
core rails leaks, averaged over 4 requests:
  238 String
  10 Array

Tags sorted by impact * ratio:
  0.0739: recipe/show/GET
  0.0350: recipe/new/GET
  0.0218: recipe/list/GET
  -0.6686: core rails
That’s a Symbol up there; the new BleakHouse walks the sym_tbl as well as the regular heap. We now track the history of every individual object instead of just class counts. This means we can accurately (fingers crossed) identify where lingering objects were spawned.
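Modern Rubies can answer “where was this object born?” out of the box via objspace allocation tracing; this is an illustration of the same idea, not BleakHouse’s C implementation:

```ruby
require 'objspace'

# Record the allocation site of an object created while tracing is on.
ObjectSpace.trace_object_allocations do
  leaky = String.new('a lingering string')
  $born_at = [ObjectSpace.allocation_sourcefile(leaky),
              ObjectSpace.allocation_sourceline(leaky)]
end
puts $born_at.inspect # [file, line] where the string was allocated
```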
On the flipside, analyzing the log file is slow (a decent-sized logfile will have hundreds of millions of rows). I wrote a pure-C CSV parser, which helps, and there’s always the “better hardware” answer. I’ve been mainly running it on my Mac Mini; if I use the Opteron 2210 it goes much faster, since the analyzer is CPU-bound.
It doesn’t make pretty graphs anymore but I’m not sure exactly how they would help. It would be easy enough to add them back.
go go go
A gem, not a plugin, because it needs to compile a C extension. First, uninstall the old versions to prevent version problems:
sudo gem uninstall bleak_house -a -i -x
sudo gem install bleak_house
Also, you need to rebuild your ruby-bleak-house binary, even if you already have one. Just run:
The RDoc has updated usage instructions.
In the interests of there being less business all up in here, I have created a crazy blog system.
Related in spirit to: e, Hobix, Blosxom, Yurt.
Ok, so now a ridiculous benchmark. On the server, a dynamic request from Typo, complete with MySQL climbing painfully out of swap:
$ time curl --head localhost:4001
real 0m11.270s

$ time curl --head localhost:4001
real 0m2.825s
Now with page caching:
$ time curl --head localhost:4001
real 0m0.015s
But, how about a dynamic request from the all-new Bax?
$ time curl --head localhost:4040
real 0m0.017s
Yep. Now to get that feed to validate.
Notice: this article is extremely out of date. If you want to learn modern Subversion best practices, please look elsewhere.
You want to make a Subversion branch, and merge it later. You read the branching section in the official book, but are still confused. What to do?
creating the branch
1. Note the current head revision:
svn info svn://server.com/svn/repository/trunk | grep Revision
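If you script this, the revision number can be parsed out directly instead of copied by hand; a sketch, where the sample line stands in for real svn info output:

```shell
# Stand-in for: svn info svn://server.com/svn/repository/trunk
svn_info='Revision: 4123'

# Extract the number after "Revision:".
HEAD_REVISION=$(echo "$svn_info" | awk '/^Revision/ {print $2}')
echo "$HEAD_REVISION"   # ready to interpolate into the commit message
```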
2. Make a clean, remote copy of trunk into the branches folder. Name it something. We’ll call it your_branch:

svn cp svn://server.com/svn/repository/trunk \
  svn://server.com/svn/repository/branches/your_branch \
  -m "Branching from trunk to your_branch at HEAD_REVISION"

Replace HEAD_REVISION with the revision number you noted in step 1.
Note that a backslash (\) means that the command continues onto the next line.
3. Switch your local checkout to point to the new branch (this will not overwrite your changes):
svn switch --relocate \
  svn://server.com/svn/repository/trunk \
  svn://server.com/svn/repository/branches/your_branch
You don’t really need the --relocate svn://server.com/svn/repository/trunk bit, but I’m in the habit of being explicit about it.
4. Check that your local checkout is definitely now your_branch, and that you can update ok:
svn info | grep URL
svn up
5. Commit your new changes.
(These steps will work even if you had already made local changes on trunk, but decided you wanted them on your_branch instead. If your trunk checkout was unmodified, just skip step 5.)
updating the branch
You’ve been developing for a while on your_branch, and so have other people on trunk, and now you have to add their changes to your_branch.

1. Update your branch checkout and commit any outstanding changes.
2. Search the Subversion log to see at what revision number you last merged the changes (or when the original branch was made, if you’ve never merged). This is critical for making a successful merge:
svn log --limit 500 | grep -B 3 your_branch
3. Also note the current head revision:
svn info svn://server.com/svn/repository/trunk | grep Revision
4. Merge the difference of the last merged revision on trunk and the head revision on trunk into the your_branch working copy:
svn merge -r LAST_MERGED_REVISION:HEAD_REVISION \
  svn://server.com/svn/repository/trunk .
Replace LAST_MERGED_REVISION with the revision number you noted in step 2, and HEAD_REVISION with the revision number you noted in step 3.
Now look for errors in the output. Could all files be found? Did things get deleted that shouldn’t have been? Maybe you did it wrong. If you need to revert, run svn revert -R *.
5. Otherwise, if things seem ok, check for conflicts:
svn status | egrep '^C|^.C'
Resolve any conflicts. Make sure the application starts and the tests pass.
6. Commit your merge:
svn ci -m "Merged changes from trunk to your_branch: COMMAND"
Replace COMMAND with the exact command contents from step 4.
folding the branch back into trunk
your_branch is done! Now it has to become trunk, so everyone will use it and see how awesome it is.
This only happens once per branch.
1. First, follow every step in the previous section (“updating the branch”) so that your_branch is in sync with any recent changes on trunk.

2. Delete the old trunk (remote operations like this need a log message):

svn del svn://server.com/svn/repository/trunk -m "Removing old trunk"
3. Move your_branch onto the old trunk location:

svn mv svn://server.com/svn/repository/branches/your_branch \
  svn://server.com/svn/repository/trunk \
  -m "Moving your_branch to trunk"
4. Relocate your working copy back to trunk:

svn switch --relocate \
  svn://server.com/svn/repository/branches/your_branch \
  svn://server.com/svn/repository/trunk
Subversion 1.5 is scheduled to bring automatic merge tracking (notice the ticket comment that says “tip of the iceberg”). Until that fine day, if you want to automate this, the svnmerge.py tool is supposed to be pretty nice.
Pratik Naik posted an introductory tutorial to has_many_polymorphs the other day. Looks good, and worth checking out if you’re just getting started.
Adobe CS3 gives you this pleasant and fatal alert if you try to install it on a case-sensitive Mac filesystem:
1. Get a firewire hard drive the same size (or larger) as your boot drive. I do a full weekly backup so I already had such a drive.
2. Download and install Carbon Copy Cloner 3 Beta.
3. Start Carbon Copy Cloner and make a full copy of your boot drive to your backup. Make sure you choose both “copy everything from source to target” and “erase the target volume”. This will give you a bootable backup.
4. Restart and hold down the Option key. Choose your backup drive and press enter.
5. Start the Disk Utility application and erase your regular boot drive. Make sure the volume format is set to “Mac OS Extended (Journaled)”.
6. Start Carbon Copy Cloner again. Make a full copy of your backup drive to your regular boot drive. This time, don’t choose “erase the target volume”, or it will reformat it to be case-sensitive all over again.
7. Now we need to patch up some things. Start a terminal and run:
cd "/Volumes/Your Boot Drive/"
sudo cp -R /usr .
sudo cp -R /private .
sudo cp -R /sbin .
sudo ln -s private/tmp tmp
sudo ln -s private/var var
sudo bless --folder . --bootinfo --bootefi
8. Reboot. Hold down Option again, but this time choose the regular boot drive.
9. Download Applejack. Install it.
10. Reboot and hold down Command-S. At the single-user prompt, type:
applejack auto restart
Let everything finish. Now you’re clear.
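To verify which kind of volume you ended up with, a quick shell check (this works on any Unix filesystem, not just HFS+):

```shell
# Create two names differing only by case; on a case-insensitive volume
# they collide into a single file.
dir=$(mktemp -d)
touch "$dir/readme" "$dir/README"
count=$(ls "$dir" | wc -l | tr -d ' ')
rm -rf "$dir"
echo "$count"   # 2 means case-sensitive, 1 means case-insensitive
```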
I was surprised to see that Fauna has been near the top of the RubyForge most-active list recently:
Based on a clue from Tom Copeland, I pieced together the formula for project activity. For any given week:
activity = log(0.3 * number of file downloads)
         + log(number of repository commits)
         + log(3 * number of bugs filed)
         + log(3 * number of forum messages posted)
         + log(4 * number of ended tasks)
         + log(5 * number of files released)
         + log(5 * number of support requests made)
         + log(10 * number of patches filed)
This is then converted into a percentile of all active projects.
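The formula translates directly to Ruby. One assumption in the sketch below: categories with a zero count contribute nothing, since log(0) is undefined (RubyForge presumably handles this somehow):

```ruby
# Per-category weights, as reverse-engineered above.
WEIGHTS = {
  file_downloads: 0.3,
  commits: 1,
  bugs_filed: 3,
  forum_messages: 3,
  tasks_ended: 4,
  files_released: 5,
  support_requests: 5,
  patches_filed: 10
}

def activity(counts)
  WEIGHTS.sum do |category, weight|
    n = weight * counts.fetch(category, 0)
    n > 0 ? Math.log(n) : 0 # assumption: empty categories contribute 0
  end
end

puts activity(file_downloads: 1000, commits: 20, patches_filed: 2).round(3)
```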
It’s interesting that a project can have significant activity even if no developer is working on it.
Fauna is taking over.
The Allison and has_many_polymorphs projects on RubyForge are now part of the Fauna project. The new repository URLs are:
Their forums have also moved. Eventually the old contents will follow.
Finally, the IRC channel has moved to irc.freenode.net, since it’s not just about has_many_polymorphs any more.
Every Snax project now has complete RDoc documentation. The code page has up-to-date links.
The best place to make a bug report is usually the IRC channel. A post on the appropriate forum is also good.