Have you ever been frustrated, angry and tired over broken links on your intranet? Well, me too. This got me thinking about how to solve it and make a link to a document or web page persistent over its lifecycle, or at least lessen the amount of energy needed to keep links in order and up to date. On the internet there are already plenty of good ways to do this. I’m thinking of link shorteners like bitly.com and social bookmarking services like Delicious. But finding something that is a good fit for the intranet is not as easy. Note: I’m using “link” in this text instead of URL or URI.
There are some problems with links
There are web content management (WCM) systems (like the one we use) that ensure there are no broken links within and between the pages created by the WCM. But we have links in other information systems to the pages created by our WCM system, and vice versa. In our WCM, a document might need to be updated with a new link because the linked file gets its name changed, or because the web page gets moved to another part of the website (and its “pretty” link changes). We also link to external websites from the intranet, sites which we of course have zero control over, and whose editors don’t know that they break a lot of links on our intranet if they change a link to one of their web pages or documents. Summary of problems:
- Links can only be controlled within each system, but not between systems
- All links (pointing at the same document/web page) have to be changed, one by one
- Links to a document often change over the document’s lifecycle
- Links are hard to monitor, meaning that it’s quite hard to detect dead and re-directed links
- Links have no context, so it’s hard to automatically add “Related links”
But to any problem there is at least one solution…
There are already solutions on the internet that pretty much do what we want for our intranet, at least if we use a few of them simultaneously, but we cannot use them inside our firewall. We also couldn’t find any open source projects that would work for us without major changes. What we need for our intranet is something that:
- Has an API that could be used by other systems
- Shortens a link
- Gives a document/web page etc a link that is persistent over time
- Makes it easy to change all occurrences of a given link
- Enables monitoring of all links from one place
- Shows aggregated statistics over the lifecycle of a document
- Makes it possible to tag (metadata) a link
- Makes it possible to add a description (metadata) for a link
- Saves a user’s favorite links
- Has an administration GUI
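To make the core of this wish list concrete, here is a minimal in-memory sketch of the idea behind a persistent link: other systems only store a short key, and the key’s target can be changed in one place. It is not the code of our URL-service; the class name `LinkRegistry`, its methods and the example addresses are all made up for illustration.

```python
import itertools
import string

_ALPHABET = string.digits + string.ascii_lowercase  # base 36


def _encode(number):
    """Turn a positive integer into a short base-36 string."""
    digits = ""
    while number:
        number, remainder = divmod(number, len(_ALPHABET))
        digits = _ALPHABET[remainder] + digits
    return digits or _ALPHABET[0]


class LinkRegistry:
    """Hypothetical registry: persistent short keys pointing at changeable targets."""

    def __init__(self):
        self._targets = {}  # short key or slug -> current target link
        self._counter = itertools.count(1)

    def shorten(self, target, slug=None):
        """Register a target and return its persistent key (a slug if one is given)."""
        if slug is not None:
            if slug in self._targets:
                raise ValueError(f"slug {slug!r} is already taken")
            key = slug
        else:
            key = _encode(next(self._counter))
            while key in self._targets:  # skip keys already used as slugs
                key = _encode(next(self._counter))
        self._targets[key] = target
        return key

    def resolve(self, key):
        """Return the current target for a key; raises KeyError if unknown."""
        return self._targets[key]

    def retarget(self, old_target, new_target):
        """Point every key that leads to old_target at new_target instead."""
        changed = [key for key, target in self._targets.items() if target == old_target]
        for key in changed:
            self._targets[key] = new_target
        return len(changed)
```

With something like this in place, moving a page means one call to `retarget`, and every system that stored the short key keeps working without being touched.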
So we have developed and open sourced a URL-service that has all of this functionality: it is compatible with the bit.ly API and has link shortening, slugs (“pretty” short links), monitoring, statistics and of course a GUI for administration of links. An overview of the service is shown below:
Other uses for the URL-service
The shortened URLs are a prerequisite for making it easy to batch-change all similar links at once. Add to this the search engine’s ability to calculate checksums for all indexed documents, and we can identify potential duplicates, triplicates and so on. With this information we can correct all the links pointing to a copy so that they point to the original document instead, and then erase all the copies. This improves search quality because the index gets smaller and the precision gets better, since the search result page no longer shows multiple results for the same information (the actual copies).
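As a rough illustration of the checksum idea, the sketch below groups links whose content is byte-for-byte identical. In our setup the checksums come from the search engine; here I simply assume the content is available as bytes and use SHA-256 from Python’s standard library. The function name `find_duplicates` is made up, and deciding which link in a group is the original is left to a human.

```python
import hashlib
from collections import defaultdict


def find_duplicates(documents):
    """Group links by content checksum.

    documents: dict mapping a link to the raw bytes of the document behind it.
    Returns a list of groups (sorted lists of links) that share identical content.
    """
    links_by_checksum = defaultdict(list)
    for link, content in documents.items():
        checksum = hashlib.sha256(content).hexdigest()
        links_by_checksum[checksum].append(link)
    # Only groups with more than one link are potential duplicates.
    groups = [sorted(links) for links in links_by_checksum.values() if len(links) > 1]
    return sorted(groups)
```

Note that this only catches exact copies. Two documents that differ by a single byte get different checksums and would need some other kind of similarity check.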
One of the most useful things about the URL-service is that it can be used by the search engine to rank the links that are favorited most higher than other similar results. This adds what nowadays seems to be all the rage, a social dimension, to the search results: if a document or web page is favorited a lot, then it must be useful. We could also analyze the relation between a user’s profile and the links that user has favorited, and then suggest the same links to another user with a similar profile.
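One way to picture the ranking part is a small re-ranking step on top of the search engine’s own relevance scores. This is only a sketch under my own assumptions: the function `rerank`, the `weight` parameter and the logarithmic boost are illustrative choices, not how any particular search engine does it.

```python
import math


def rerank(results, favorite_counts, weight=0.1):
    """Re-order search results so that often-favorited links rank higher.

    results: list of (link, relevance) pairs from the search engine.
    favorite_counts: dict mapping a link to how many users have favorited it.
    weight: how strongly favorites influence the final order (assumed value).
    """
    def boosted(item):
        link, relevance = item
        favorites = favorite_counts.get(link, 0)
        # log1p dampens the effect so a few very popular links don't drown out relevance.
        return relevance * (1.0 + weight * math.log1p(favorites))

    return [link for link, _ in sorted(results, key=boosted, reverse=True)]
```

The boost is multiplicative, so a link with no favorites keeps its original score and a barely relevant link cannot jump to the top on popularity alone.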
Links tagged by users are also an important building block for making our intranet semantic. When we have enough tagged links, we will be able to give semi- or fully automated suggestions for links to related information.
The tags and descriptions of the links will also be usable by the search engine. This can of course be used in combination with favorited links for even more powerful suggestions.
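The related-link suggestions could, for example, be built on tag overlap. The sketch below scores two links by the Jaccard similarity of their tag sets (shared tags divided by all tags on either link). The function name `related_links` and the example tags are hypothetical, and a real implementation would probably also weigh in descriptions and favorites.

```python
def related_links(link, tags_by_link, limit=5):
    """Suggest links that share tags with the given link.

    tags_by_link: dict mapping a link to the set of tags users have given it.
    Returns up to `limit` other links, most similar first.
    """
    own_tags = tags_by_link.get(link, set())
    scored = []
    for other, other_tags in tags_by_link.items():
        if other == link:
            continue
        union = own_tags | other_tags
        if not union:
            continue
        similarity = len(own_tags & other_tags) / len(union)
        if similarity > 0:
            scored.append((similarity, other))
    # Highest similarity first; ties are broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:limit]]
```

For a link tagged “hr” and “policy”, a link tagged “hr”, “policy” and “travel” scores 2/3 and a link tagged only “hr” scores 1/2, so they would be suggested in that order.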
Nothing I have described here is new, of course. Still, I hope that someone else will find the concept interesting or useful. The solution is, after all, open sourced for you to use or build further upon. As always, I would really appreciate any comments about this blog post.