Archive for December, 2004

PubSub

Another great tool I’ve discovered is PubSub. In simple terms, it’s Google for dynamic web sites (those whose content is updated frequently), such as news sites or blogs. As these dynamic sites are updated with content that matches your search, you are notified either through your aggregator or through XMPP, a protocol invented for instant messaging.

Aggregators operate on what’s known as a "pull" model: it’s the responsibility of the client to periodically request updates from the server. In contrast, XMPP uses a "push" model: when there are updates, the server immediately notifies all clients. There are pros and cons to both techniques, but I have no real need for "push" and just subscribe to searches via my aggregator (Bloglines).
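
As a rough illustration of the difference, here is a minimal Python sketch. The class names (FeedServer, PullClient, PushClient) are made up for this example and have nothing to do with how PubSub, XMPP, or any real aggregator is implemented; the "server" is just an in-memory list.

```python
class FeedServer:
    """Toy stand-in for a site that publishes items (hypothetical)."""

    def __init__(self):
        self.items = []
        self.subscribers = []  # callbacks, used only by the push model

    def publish(self, item):
        self.items.append(item)
        # Push: the server notifies every subscriber as soon as it publishes.
        for callback in self.subscribers:
            callback(item)

    def items_since(self, index):
        # Pull: the server only answers when a client asks.
        return self.items[index:]


class PullClient:
    """Sees new items only when it decides to poll."""

    def __init__(self, server):
        self.server = server
        self.seen = 0
        self.inbox = []

    def poll(self):
        new = self.server.items_since(self.seen)
        self.seen += len(new)
        self.inbox.extend(new)
        return new


class PushClient:
    """Registers a callback once and is then notified by the server."""

    def __init__(self, server):
        self.inbox = []
        server.subscribers.append(self.inbox.append)
```

With this sketch, a published item lands in a PushClient's inbox right away, while a PullClient sees nothing until its next poll() call. The trade-off is that the pull client decides how often to bother the server, whereas the push server has to keep track of everyone it must notify.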

So, why is it useful to subscribe to searches? Well, I want to read about certain topics, such as .NET and XUL programming (don’t worry if you don’t know what those topics are; it’s not that relevant to this post). In the past, I did Google searches to find blogs that discussed these topics and subscribed to them. If they referenced other blogs, I’d often subscribe to the new ones as well. This works, but there are a few problems. First, many blogs are not focused on a single topic. A blogger might discuss XUL often enough to be worth subscribing to his blog, but still 50% or more of his posts are on subjects I have no interest in. Second, there are lots of good articles I miss out on because I’ve yet to discover the blog they appear on. PubSub solves both of these issues. Instead of subscribing to several sites to get information on a topic, I create a PubSub search on the topic and subscribe to that. Only posts relevant to the topic I’m interested in show up in the subscription (assuming I can create a really good search query for PubSub), and I miss fewer relevant posts.

My only complaint with PubSub so far is a complaint shared with most search engines. The query language is not friendly, at least for the types of searches I make. But the nature of subscribing compounds this issue. Let me expand on this. The first complaint is that the default "operator" is OR. Users not familiar with Boolean algebra think of a search as simply a request for sites that contain all of the terms they type in. This is an AND operation, not OR. I created a few bad queries to begin with by making this assumption (and I’m a developer who certainly does understand Boolean algebra and query languages). For instance, out of vanity I thought I’d search for references to myself and put in the keywords William and Kempf. I got back sites about William Hung, etc. The OR operator didn’t give me what I wanted, and I had to modify the search to use the AND operator I had expected to be the default.
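
To make the AND/OR difference concrete, here is a small Python sketch. It is not PubSub’s query engine; the matches and search functions and the sample post titles are invented for illustration, and matching is a plain whole-word comparison.

```python
def matches(text, terms, mode="OR"):
    """Return True if text satisfies the query under the given default operator."""
    words = set(text.lower().split())
    hits = [term.lower() in words for term in terms]
    # AND requires every term to be present; OR is satisfied by any one of them.
    return all(hits) if mode == "AND" else any(hits)


def search(posts, terms, mode="OR"):
    """Filter a list of post titles with the chosen operator."""
    return [post for post in posts if matches(post, terms, mode)]


# Made-up sample titles, purely for illustration.
posts = [
    "William Hung releases a new album",
    "William Kempf starts a blog",
    "Kempf family reunion photos",
]

or_results = search(posts, ["William", "Kempf"], mode="OR")
and_results = search(posts, ["William", "Kempf"], mode="AND")
```

Under OR, all three sample titles match because each contains at least one of the two words; under AND, only the title containing both words survives. That is the behavior most people expect when they type two words into a search box.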

Even after familiarizing myself with the query syntax, there are still problems. However, they are problems shared with nearly every other online search engine I’ve used. The first, and by far the most annoying for me, though it may not affect everyone, is that the queries can’t handle many of the terms I frequently need. For example, .NET causes more false hits than correct ones, because the search engine ignores the period. Likewise C++ and C#, two programming languages I use a lot, are nearly impossible topics to search on. Then there are other terms I want to use, such as Mono, an open source implementation of the .NET runtime and C# language, that return too many false hits because they are words with multiple meanings. With static searches this is annoying, but you can either try to ignore the false hits or refine your search by adding more terms that narrow the subject down to what you’re really looking for at the moment. But with a search for dynamic content you can’t easily narrow the search, and the false hits keep coming, compounding the annoyance factor and increasing the time wasted on human filtering.
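
Here is a guess at why terms like .NET, C++ and C# behave so badly. The sketch below assumes the search engine splits text into tokens by keeping only letters and digits, which is a common approach but only an assumption about PubSub. The function names naive_tokens and symbol_aware_tokens are made up for this example.

```python
import re


def naive_tokens(text):
    # Assumed behavior: keep runs of letters and digits, drop all punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def symbol_aware_tokens(text):
    # One possible alternative: allow a leading dot and trailing + or #
    # so that ".net", "c++" and "c#" survive as distinct tokens.
    return re.findall(r"\.?[a-z0-9]+[+#]*", text.lower())


sentence = "I use .NET, C++ and C#."
naive = naive_tokens(sentence)
aware = symbol_aware_tokens(sentence)
```

With the naive tokenizer, .NET turns into the ordinary word "net", and C++ and C# both collapse into the single letter "c", so a query for any of them matches far too much. The second tokenizer is only one way the symbols could be preserved, and it has rough edges of its own (a domain such as example.net would also produce a ".net" token).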

If anyone knows of creative ways around these issues when using PubSub, I’d love to hear them. I’m hoping to learn a few tricks to work around the problem, as I did with static search engines, though I’m finding the dynamic nature of PubSub makes the tricks more difficult to discover… if they exist.

Aggregators

I mentioned in my last post that I’m relatively new to the Blog scene. I’ve been reading blogs for only about a year. During that time I tried several aggregators (software used to collect and read articles published on Blogs and news sites). I’d like to share what I’ve experienced here, as it might help newbies as well as developers of aggregators.

The first aggregator I used was SharpReader. A very nice program, which I ran for several months. The interface was fairly nice and it had all of the standard features one would expect in an aggregator. At the time, the only complaint I had was how many resources (a term that’s oversimplified as the amount of memory) it consumed. I was running other applications for work that consumed HUGE amounts of resources, and SharpReader consumed enough to push the machine over the edge after a while. Since this experience I’ve grown to want several other things in an aggregator that SharpReader doesn’t have, but at the time this was my only complaint.

After SharpReader I switched to RSSBandit. Another very nice program, which consumed fewer resources. But the reason I made the switch was that RSSBandit had a feature that was supposed to allow me to share my feeds remotely, via FTP or even WebDAV. This meant that I could read my feeds on any machine which had RSSBandit installed. This became my biggest qualification for a good aggregator, but RSSBandit didn’t live up to it. You had to manually sync the local machine with the server, which I could never remember to do, making the feature useless. If I recall correctly, it also failed to sync the read/unread status of the feed items. At a minimum I’m going to read my feeds at work and at home (like I said, many of the feeds I read are on topics related to my work). So keeping feeds in sync remotely is now one of my must-have features.

Next, I started reading Blogs via Mozilla Firefox’s LiveBookmarks. I did this because I didn’t want to install a full aggregator at my new work, mostly, but I found that I really liked reading feeds inside the browser. One less program I have to have up and running. And the LiveBookmarks worked pretty well. The biggest complaints were that the discovery mechanisms (the ability for the software to automatically discover feeds on a site) were minimal and determining if there were new feed items required more than a casual glance.

That’s when I discovered Sage. This is a Firefox extension for news aggregation. It works in conjunction with LiveBookmarks and completely solved the problem with determining whether or not there were new feed items at a glance. It even improved the auto discovery, though it’s still not as powerful as I was used to with RSSBandit and SharpReader.

None of these solved my desire for remote feed synchronization, however. I didn’t think there was a solution for this until I ran across another blog entry that talked about Bloglines. Bloglines is now my aggregator of choice. It centralizes my feed subscriptions on a server and remotely tracks which items I’ve read. Other benefits go along with this. The load placed on blog sites is reduced, as Bloglines queries each site for new items only once, no matter how many Bloglines users have subscribed. I can read my feeds anywhere, even on computers that are not under my control (i.e., where I can’t install software), so long as there’s an Internet connection and a browser. Nearly Utopia. My only complaint is that the user interface is a little cartoonish, and if there are advanced key handlers for navigating, similar to those found in Gmail for example, they are not documented and I’ve not figured them out. So, way too much mouse usage when reading large numbers of feeds.
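
The two server-side ideas in that paragraph (fetch each feed once for everybody, and remember read/unread state per user) can be sketched in a few lines of Python. This is not Bloglines’ actual design or API; the Hub class, its method names, and the fake fetch function are all hypothetical.

```python
from collections import defaultdict


class Hub:
    """Toy centralized aggregator (hypothetical, for illustration only)."""

    def __init__(self, fetch):
        self.fetch = fetch                     # callable: url -> list of item ids
        self.subscriptions = defaultdict(set)  # url -> users subscribed to it
        self.items = defaultdict(list)         # url -> item ids seen so far
        self.read = defaultdict(set)           # user -> item ids already read
        self.fetch_count = 0

    def subscribe(self, user, url):
        self.subscriptions[url].add(user)

    def refresh(self):
        # One request per feed, no matter how many users subscribe to it.
        for url in self.subscriptions:
            self.fetch_count += 1
            for item in self.fetch(url):
                if item not in self.items[url]:
                    self.items[url].append(item)

    def mark_read(self, user, item):
        self.read[user].add(item)

    def unread(self, user):
        # Read state lives on the server, so it follows the user everywhere.
        return [
            item
            for url, users in self.subscriptions.items()
            if user in users
            for item in self.items[url]
            if item not in self.read[user]
        ]


# Fake fetcher standing in for an HTTP request to the blog.
hub = Hub(fetch=lambda url: ["post-1", "post-2"])
hub.subscribe("alice", "http://example.com/feed")
hub.subscribe("bob", "http://example.com/feed")
hub.refresh()
hub.mark_read("alice", "post-1")
```

After the single refresh() call the feed has been fetched once even though two users subscribe to it, and each user’s unread list depends only on what that user has marked as read, whichever machine they happen to be on.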

Bloglines also provides a programmatic interface, so standalone aggregators or other software can use Bloglines to enhance the user interface. I still prefer reading in my browser, but have installed the Firefox extensions that make discovery possible (though since it’s based on the LiveBookmarks features, it’s not as powerful as I’d like) and indicate new feed items at a glance. Pretty nice, but it would also be interesting to have an extension for Thunderbird, which would entirely replace the cartoony user interface of the Bloglines web site and give me the powerful keyboard navigation I want. But, for now at least, I’m a Bloglines convert.

I do have to mention that I’ve read a few blog entries that indicate Bloglines may have a few bugs which cause it to stop aggregating certain sites, but I’ve never run across this issue. Maybe it’s been fixed, or maybe I’ve just been lucky.

Blog Tools

I’m relatively new to the whole Blog thing. During the last year I started heavily reading Blogs, spurred on by the election, though I read just as many Blogs about computer programming and technology as I did about politics. For publishing a blog, however, this is my first experience and I’ve been at it for less than a month. So I’m very green, to say the least.

Despite that (or maybe because of it), I’ve decided one of the topics I’m going to blog about is the tools I run across related to Blogs in any fashion. Why would anyone care to read this if I’m so green? Well, usually when I get into something computer related I do a lot of research and that research should be worth something to others. In fact, I have a few articles to post today that I think other newbies might find of interest.

Code Quality

Useful blog entry on MS tools for ensuring code quality.

Firefox Ad

The Firefox ad has apparently run. Nice to see.

I used Netscape way back in the early days, and was sorely disappointed with the initial IE releases. Then IE surpassed Netscape in nearly every way I could care about, and I left Netscape for IE. The whole government thing then followed, and frankly, I thought it was useless MS bashing. Even if MS hadn’t bundled it in the OS (which I do think makes sense, even if it created some security concerns later) or given it away free, there was no comparing Netscape and IE, so I believe most users would have still run IE.

Then the Mozilla movement came. I tried several versions along the way… mostly because I had to for doing web development. They all sucked, quite frankly. I even recall a version of Netscape that was released that once installed, you couldn’t uninstall. They rendered slowly, and had non-native looking UIs with bad designs. I figured the wars were over, and IE was the sole survivor.

Years passed. IE started showing its age. New releases added little of note. The web standards left the product in the dust (and considering how slowly standards evolve, there’s little excuse for MS having let this occur). The ActiveX integration in IE proved to be the worst possible security hole ever conceived. Spyware became a much bigger problem than viruses (why do we classify them as something other than a virus, btw?). We desperately needed something new, but I figured we were out of luck, since MS wasn’t doing anything, and the others were producing products that I considered worse than the IE alternative.

Then along came Firefox. I’ve mostly made the switch (the IE integration means I still use several programs that have embedded IE, but I run Firefox as my main browser). It’s a wonderful product. So good in fact, that MS will be hard pressed to get back into the browser wars again. It’s not going to be easy for them to win back those that make the switch, and the numbers making the switch are increasing quickly. MS has certainly dropped the ball in this case. I still want them to put out a new browser, and soon, since IE is embedded in so many other apps, but even when they do, it may not become my main browser again.

BTW, Thunderbird is worth your checking out as well. Very nice little mail application, based on the Mozilla architecture.
