Category Archives: Technical

Linking to Paragraphs in HTML

HTML provides the ability to link not just to other pages on the web — like so: Norman Walsh’s website — but also to paragraphs or elements within those pages — like so: the paragraph on the same page beginning with the words ‘Just about everything changed’.

On that same page you’ll notice that anytime you hover over a paragraph a pilcrow symbol (¶) appears at the end of that paragraph. That symbol is actually an HTML link to the paragraph. (It appears in Firefox 1.5; in IE 6.0 you’ll have to hover just beyond the last word of the paragraph to have the symbol appear.) Copy that link if you want to point directly to that paragraph from your own website.

Mr. Walsh uses (maybe devised?) a mixture of JavaScript and CSS to implement this feature. A JavaScript function finds all <p> elements and inserts a link at the end of each one. A set of CSS rules ensures that the pilcrow symbol is the same color as the page background until you hover over it, and then the color of the link changes to make it visible.

This link-inserting only occurs in paragraphs that have an id attribute. I haven’t figured out if these paragraph IDs are inserted by hand or automatically by JS or CSS. Cool effect though. And perhaps even useful.
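
To make the idea concrete, here is a rough sketch of the link-inserting step. It is not Mr. Walsh's script, which runs as JavaScript in the browser. It is the same idea applied to an HTML string in Python, and the function name and the "pilcrow" class name are my own assumptions.

```python
import re

PILCROW = "\u00b6"  # the pilcrow character

# Matches <p ...id="something"...> ... </p>. Paragraphs without an id are
# skipped, mirroring the behaviour described above. A regex is enough for a
# sketch; a real page would call for a proper HTML parser.
_P_WITH_ID = re.compile(
    r'(<p\b[^>]*\sid="([^"]+)"[^>]*>)(.*?)(</p>)',
    re.IGNORECASE | re.DOTALL,
)


def add_paragraph_links(html, css_class="pilcrow"):
    """Append a self-link to every paragraph that has an id attribute.

    css_class is a hypothetical class name that a stylesheet could use to
    hide the symbol until the reader hovers over it.
    """
    def _insert(match):
        open_tag, para_id, body, close_tag = match.groups()
        link = f' <a class="{css_class}" href="#{para_id}">{PILCROW}</a>'
        return f"{open_tag}{body}{link}{close_tag}"

    return _P_WITH_ID.sub(_insert, html)


page = '<p id="p1">First.</p><p>Second.</p>'
print(add_paragraph_links(page))
```

Only the first paragraph gets a link, because only it has an id. The hover effect would still come from CSS rules aimed at the assumed class.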

Close to the Boundaries

Like many other things in life, it’s safer not to be too close to any boundaries (like a fish at the edge of a school in shark infested waters, or someone walking on a ledge 200 meters high with no safety net).

This is from a website article about the dangers of using the DOS program FDISK. Simple typos in FDISK can cause it to begin formatting your hard drive, and disk recovery tools can only recover files where the directory information nodes pointing to those files are still intact. Since the FDISK format starts at the root directory and destroys the root directory information node first, you generally cannot recover files from the root directory once you’ve accidentally started formatting the drive. This is bad because the files in the root directory are what allow your computer to do things like boot up when you turn it on.

I like that the author pulls a general principle out of this technical mess. If you like safety, a boundary is a bad place to find yourself. That holds for political constituency, religious practice, medical diagnosis, restaurant seating, cellular phone coverage, etc. You’ll have a less eventful ride when you’re located smack in the middle of your group.

In software engineering this boundary principle is called the “special case” — the bump in the data that cannot be processed by the simple, elegant algorithm that works fine for the bulk of the data. Oh, special cases can still be processed, but only after attaching bells and whistles — inelegance — to your elegant algorithm. Something like 70% of the work of programming goes into dealing with special cases.
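
A tiny illustration of my own, not from the article: averaging a list of numbers. The elegant algorithm is one line, and it works for the bulk of the data. The empty list sits right on the boundary and needs its own handling. Returning None there is just one choice I'm assuming for the sketch.

```python
def mean(values):
    """Average a list of numbers, with the boundary case handled separately."""
    # Special case: an empty list would make the one-liner below divide by
    # zero, so it gets its own branch (the "bells and whistles").
    if not values:
        return None
    # The simple, elegant algorithm that covers everything else.
    return sum(values) / len(values)


print(mean([1, 2, 3]))
print(mean([]))
```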

Textpattern vs. WordPress

Matthew 6:24 says:

No one can serve two masters. Either he will hate the one and love the other, or he will be devoted to the one and despise the other. You cannot serve both…

For over a year now I’ve been bouncing back and forth between WordPress and Textpattern as my preferred publishing platform for weblogs.

Actually, it really can’t be described as going “back and forth.” I’ve been using WordPress while dreaming of Textpattern.

See, WordPress is full of features, and everything about it is easy to use. Publishing a blog entry is easy and the online editor makes simple HTML formatting easy too; changing the theme of your site is as easy as adding a new folder to the themes directory and then clicking a button in the admin interface; adding a plugin follows the same procedure in the plugin directory. There’s even a plugin that integrates the Textile markup language into the WordPress editing interface. Textile is one of Textpattern’s biggest selling points! So why am I torn between WordPress and Textpattern if WordPress makes all of these things so easy?

The answer is execution speed. Or maybe — the more fundamental reason behind the difference in execution speed — design.

Robots.txt: a Bad Idea?

Here’s an interesting article about robots.txt files and perhaps an even more interesting discussion in the ensuing comments.

No Fishing – or – Why ‘robots.txt’ and ‘favicon.ico’ are bad ideas and shouldn’t be emulated. | 2003-10-14 | BitWorking

The article takes issue with what the robots.txt file does and where it is placed. It raised some interesting points that I enjoyed thinking about in that part of my brain that spins off and thinks about things while the rest of my brain tries to stay focused. The author says that, since the Robots Exclusion Protocol requires robots.txt to reside in a hard-coded location with respect to your domain name, it basically requires all legitimate robots to fish for information from your site: before they request a single page from any site they must first request a robots.txt file that may or may not exist. They don’t follow a link to the file, the way you get to all other files on the WWW. They simply reach out there to see if a particular file exists on your domain without any real reason to suspect that it does. That’s fishing. And it uses bandwidth even for those sites that have no robots.txt, because the server still has to return a 404 error page.
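
Here is a small sketch of what that fishing looks like from the robot's side, using Python's standard urllib modules. The helper names, the example.com URLs and the "ExampleBot" agent are made up for illustration, and nothing here touches the network. The robots.txt text is passed in as a string.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Build the hard-coded robots.txt location for whatever site page_url is on.

    No link is followed to get here: the path is fixed by the protocol.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def may_fetch(robots_txt, user_agent, page_url):
    """Check a page against robots.txt rules supplied as plain text.

    An empty string stands in for a site that has no rules at all.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


rules = "User-agent: *\nDisallow: /private/\n"
print(robots_url("http://example.com/blog/2006/post.html?x=1"))
print(may_fetch(rules, "ExampleBot", "http://example.com/private/notes.html"))
print(may_fetch(rules, "ExampleBot", "http://example.com/index.html"))
```

The point is in robots_url: the robot guesses the location from the domain name alone, before it knows whether the file exists.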

Now that in itself doesn’t strike me as a compelling reason to insist that the protocol specify a link-based system for robots to discover your robots.txt file if it exists, but it raises the question of how many files may eventually be placed in a hard-coded location and therefore require fishing to find them. Once we have 100 such files, will we be tired of such bandwidth-draining requests for files specified by protocols that we don’t support on our site, and begin wishing for a link-y method for a robot to discover whether we have that file on our site?
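
For what it's worth, a link-y method might look something like the sketch below. This is purely hypothetical. As far as I know there is no standard rel="robots" link, and the rel value, class name and function name are all my own inventions. A robot reads a page it already has and only requests the policy file if the page links to it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PolicyLinkFinder(HTMLParser):
    """Collect href values from <link> tags with a given rel value."""

    def __init__(self, rel_name):
        super().__init__()
        self.rel_name = rel_name
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values.
        rels = (attr.get("rel") or "").lower().split()
        if self.rel_name in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover_policy(page_url, html, rel_name="robots"):
    """Return the absolute URL of the linked policy file, or None.

    rel_name="robots" is a hypothetical convention, not an existing standard.
    """
    finder = PolicyLinkFinder(rel_name)
    finder.feed(html)
    if not finder.hrefs:
        return None  # nothing advertised, so nothing to request
    return urljoin(page_url, finder.hrefs[0])


page = '<html><head><link rel="robots" href="/meta/robots.txt"></head></html>'
print(discover_policy("http://example.com/blog/post.html", page))
print(discover_policy("http://example.com/blog/post.html", "<html></html>"))
```

A site with no such link costs the robot nothing extra: no request, no 404.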