Information flow part 4: Search statistics for our enterprise search

For content owners and content editors it is important to know whether their content is findable, meaning that users can find it by navigating, searching, etc. Giving them access to basic search statistics lets them check at least whether the search part is working. Navigation can be measured as well, but that is not the subject of this post.

Note: This blog post presents some basic statistics for our enterprise search, along with the simple search statistics we have made accessible to all users on our intranet.

We collect statistics for both our www and intranet websites. We do not use Google Analytics for this, primarily because zero-result queries are very important to us (and GA doesn’t provide those easily). The statistics can be viewed by everyone on our intranet and cover all scopes in the index (many sub-sites have their own scope). We started using our new search implementation (Solr/Lucene) on the intranet in March/April, and at the end of August we switched our www sites to the new search as well. Overall stats for our search implementation as of today:

  • 10 different sources (all intranet sites count as one)
  • ≈280 000 indexed documents
  • an average of 1.5 search terms per search query
  • 1 search per 20 visits
  • ≈530 000 queries this year
  • ≈25 000 queries per week
  • ≈4000-5000 queries per weekday
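
Figures such as the average number of search terms per query can be derived from a plain query log. The following is a minimal Python sketch under stated assumptions, not our actual implementation: the `QueryLogEntry` record, its field names, and the sample rows (including their dates) are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class QueryLogEntry:
    """One logged search (hypothetical record layout)."""
    day: date
    scope: str   # e.g. "intranet" or "www"
    query: str   # the raw search string typed by the user
    hits: int    # number of results returned


def average_terms_per_query(entries):
    """Mean number of whitespace-separated terms per logged query."""
    if not entries:
        return 0.0
    total_terms = sum(len(e.query.split()) for e in entries)
    return total_terms / len(entries)


def zero_result_share(entries):
    """Fraction of logged queries that returned no results."""
    if not entries:
        return 0.0
    return sum(1 for e in entries if e.hits == 0) / len(entries)


# Made-up sample data, only to show the calculation.
sample_log = [
    QueryLogEntry(date(2010, 11, 1), "intranet", "lunch menu", 12),
    QueryLogEntry(date(2010, 11, 1), "intranet", "vacation", 0),
    QueryLogEntry(date(2010, 11, 2), "www", "opening hours", 3),
    QueryLogEntry(date(2010, 11, 2), "www", "jobs", 7),
]

print(average_terms_per_query(sample_log))
print(zero_result_share(sample_log))
```

Splitting on whitespace is the simplest possible definition of a "term"; quoted phrases or operators would need a smarter tokenizer.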

Very soon we are going to add a new source with about 400 000 documents to the index and switch several one-off search user interfaces (one example and another one) to our unified search user interface, which will add another 10 000-15 000 queries per week. We have many more sources to index, but also many documents to archive or delete before adding more sources to the index.

The search statistics

As I mentioned earlier, all users on our intranet have access to the search statistics directly from the search interface.

Search form
Unified search user interface

When the user clicks on the search statistics link, they are presented with the search statistics form:

Search statistics form

The user chooses a scope and a date interval, and the statistics are presented (I have blurred some of the results). Scopes that have a zero-result query are shown in context with the search query (on mouse-over/on focus). The example below shows the statistics for the month of November for all scopes (intranet and www).

Search statistics for all scopes, month of November

The individual search terms are linked and perform a search when clicked. The search is run in the scope for which the statistics are shown.
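
The report described above (pick a scope and a date interval, then list the queries and call out the zero-result ones together with their scope) could be sketched roughly as follows. This is an illustration only: `LoggedQuery`, `scope_report`, and the sample rows are hypothetical names and data, not taken from our search implementation.

```python
from collections import Counter
from datetime import date
from typing import NamedTuple


class LoggedQuery(NamedTuple):
    """One logged search (hypothetical record layout)."""
    day: date
    scope: str
    query: str
    hits: int


def scope_report(log, start, end, scope=None, top=10):
    """Summarise queries between start and end (inclusive).

    scope=None means "all scopes". Queries are normalised to lower case
    so that "Payroll" and "payroll" are counted together.
    """
    selected = [
        q for q in log
        if start <= q.day <= end and (scope is None or q.scope == scope)
    ]
    counts = Counter(q.query.strip().lower() for q in selected)
    # Keep the scope next to each zero-result query so it can be shown in context.
    zero = Counter(
        (q.query.strip().lower(), q.scope) for q in selected if q.hits == 0
    )
    return {
        "total": len(selected),
        "top_queries": counts.most_common(top),
        "zero_result_queries": zero.most_common(top),
    }


# Made-up sample data.
log = [
    LoggedQuery(date(2010, 11, 3), "intranet", "Payroll", 5),
    LoggedQuery(date(2010, 11, 4), "intranet", "payroll", 5),
    LoggedQuery(date(2010, 11, 9), "www", "parking permit", 0),
    LoggedQuery(date(2010, 11, 9), "www", "parking permit", 0),
    LoggedQuery(date(2010, 12, 1), "www", "jobs", 4),
]

november = scope_report(log, date(2010, 11, 1), date(2010, 11, 30))
print(november["zero_result_queries"])
```

Keeping the zero-result list separate from the top-queries list makes it easy for an editor to go straight to the searches that found nothing.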

How we use search stats and actions based on the stats

The statistics are of course used to improve findability: editors can add keywords, change the titles of their documents/web pages, etc. to make their content easier to find. More about the importance of metadata in a previous blog post.

We can also add what we call “key matches”, which follow the same principle as Google's “sponsored links” but applied to our enterprise search (read more about the problems and opportunities with key matches, called best-bets in this article). We have implemented this as a self-service: any user can fill out the key match form (everyone has access to it directly from the search interface), and our search admin confirms or denies the request. We have initially capped the number of key matches at 200. Why? First of all, key matches are not the right solution; improving the content's findability is. Second, if the number of key matches gets too big, the quality of the key matches themselves will decline and the whole idea of using them will eventually become pointless.

A list of all key matches is available to any user on our intranet. A key match can be applied to a specific scope or to a specific range of scopes.

key-matches
List of key-matches
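
To make the key match workflow concrete (a user requests a key match, the search admin confirms or denies it, the total is capped, and each key match applies to one or more scopes), here is a minimal Python sketch. The class and method names and the example URLs are hypothetical; only the cap of 200 and the request/confirm/deny flow come from the description above.

```python
from dataclasses import dataclass

MAX_KEY_MATCHES = 200  # the initial cap mentioned in this post


@dataclass
class KeyMatch:
    term: str          # normalised search term that triggers the key match
    url: str           # page to promote for that term
    scopes: frozenset  # scopes in which the key match applies
    approved: bool = False


class KeyMatchRegistry:
    """Hypothetical in-memory store for key match requests."""

    def __init__(self, limit=MAX_KEY_MATCHES):
        self.limit = limit
        self.requests = []

    def request(self, term, url, scopes):
        """Any user can file a request; it stays inactive until approved."""
        key_match = KeyMatch(term.strip().lower(), url, frozenset(scopes))
        self.requests.append(key_match)
        return key_match

    def approve(self, key_match):
        """Search admin confirms a request, unless the cap has been reached."""
        if key_match.approved:
            return
        approved_count = sum(1 for k in self.requests if k.approved)
        if approved_count >= self.limit:
            raise ValueError("key match limit reached")
        key_match.approved = True

    def deny(self, key_match):
        """Search admin rejects a request; it is dropped from the list."""
        self.requests.remove(key_match)

    def lookup(self, query, scope):
        """URLs of approved key matches for this query within this scope."""
        term = query.strip().lower()
        return [
            k.url for k in self.requests
            if k.approved and k.term == term and scope in k.scopes
        ]


# Made-up example with placeholder URLs.
registry = KeyMatchRegistry()
payroll = registry.request("Payroll", "https://intranet.example.org/payroll", {"intranet"})
registry.approve(payroll)
print(registry.lookup("payroll", "intranet"))
```

Enforcing the cap at approval time rather than at request time means users can always file requests, while the admin decides which ones deserve one of the limited slots.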

The search statistics can also be used in the governance of web content. Look at this presentation for more about it (read the speaker notes). The single most important thing to do based on the search stats is to archive or delete obsolete and outdated content in order to improve findability. Adding relevant keywords and metadata is also important. The search stats should be used only as an indicator of what is not easily findable on our intranet/internet websites. But remember the saying “Lies, damned lies, and statistics” before using the statistics to prove a point.

As always I really appreciate feedback, comments, tweets.

Comments

  1. Reply

    Great blog post as always!

  2. Reply

    Spot on Kristian, I like the pragmatic approach of distributing findability governance from the central search admin to the content provision actors in either web or intranet spaces. If you work with provision, whether your audience can find your content should always be one of the key metrics, as for any content editor, regardless of whether the provision is covered through editorial processes and professional communicators or through end-user content in free form (blog posts or what have you). One step further down the long road to findability nirvana! What I really like is that the stats are part of the search interface options = easy to find. Many intranet/web managers get their daily stats fed into closed circles, which becomes problematic. From the centre you can’t figure out all the details. What I would still like to see is the intersection with conversational spaces, i.e. wiki/online voting, where the (Portal) Web Governance addresses incoming findability issues through mass collaboration, rather than only informing the search-engine admin ;-) I also like your useful search patterns at http://hitta.vgregion.se. Keep up the good work and enlighten us….

    • Reply

      Fredric, thanks. I agree with you that we do not know what happens at the edge of the network. That is why data/information transparency is so important. It is the guiding principle for us as we try to give users the tools they need to make their work easier.

      I think the way to findability nirvana is a long and narrow one, but we’ll have to try staying on this chosen path! :-)
