2010
01.22

Security in Search Applications

When a search engine is applied to content with restricted access, security and preserving the integrity of permissions quickly become an important and often thorny issue.

The problem

Consider the case when enterprise search is applied to a document management system. Each and every document might well end up in the same index, meaning that when searches are conducted it is imperative that confidentiality is maintained and users are only shown results they’ve got access to. This is typically known as ‘document-level’ or mapped security, where each piece of content is accompanied by the associated security metadata (known as an Access Control List, or ACL).

The problem is further compounded by the fact that search engines are increasingly used to break down silos of data, making disparate sources searchable even though they don’t share a single access control system.

Under the hood

This security metadata, also known as entitlements, is stored in the index and so forms part of the query evaluation process carried out by the search engine. What that means is that the set of documents found to correspond to the user’s query will be further whittled down by matching this metadata with the user’s credentials (authorisation). Given how intrinsic this is to the overall process of matching, this piece of the puzzle will generally fall under the remit of the underlying search engine. The other constituent part is authentication, the process of establishing who the user is, so that their credentials (such as the groups they belong to) can be provided along with the query. However, many providers of search technology have seen authorisation as their only remit in the security life cycle. To be fair, this lack of vendor support for the entire security life cycle has been somewhat understandable, given the plethora of authentication mechanisms out there. But the problem’s not going away.
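To make the idea of document-level security concrete, here is a minimal sketch in Python. It is not how any particular search engine implements it: the `Document` class, the `search` function and the group names are all made up for illustration, and the "query" is a plain substring match. It only shows the principle that a hit must both match the query and carry an ACL that overlaps with the user’s groups.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # ACL stored alongside the content: the groups allowed to see this document.
    acl: frozenset = field(default_factory=frozenset)


def search(index, query, user_groups):
    """Return ids of documents that match the query AND that the user may see."""
    groups = frozenset(user_groups)
    hits = []
    for doc in index:
        # Toy relevance check: case-insensitive substring match.
        matches_query = query.lower() in doc.text.lower()
        # Authorisation: the user needs at least one group listed in the ACL.
        is_entitled = bool(doc.acl & groups)
        if matches_query and is_entitled:
            hits.append(doc.doc_id)
    return hits


# Hypothetical index contents and group names.
index = [
    Document("d1", "Quarterly sales report", frozenset({"SALES", "BOARD"})),
    Document("d2", "Sales team holiday rota", frozenset({"EVERYONE"})),
    Document("d3", "Board minutes on sales strategy", frozenset({"BOARD"})),
]

print(search(index, "sales", {"EVERYONE", "SALES"}))  # ['d1', 'd2']
```

All three documents match the query, but the third is dropped because the user belongs to none of the groups in its ACL. A real engine would evaluate the ACL as part of the query itself rather than looping over documents.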

How do we deal with it?

Given this, and the fact that almost every internal search project faces this dilemma, we often have to provide or facilitate a solution. We therefore chose to provide a TwigKit Security Module which offers integration with leading authentication providers such as Active Directory, LDAP servers, databases, OpenID, and proprietary systems, via a variety of methods and protocols such as form-based, basic, NTLM, Kerberos delegation, X.509 certificate exchange, and ‘container managed’ authentication. What this means is that if an organisation already has a single sign-on infrastructure we can easily tap into that, but equally importantly, in greenfield scenarios, we can quickly implement an effective security solution. Thankfully, there was no need to reinvent the wheel since the TwigKit Security Module builds on the industry-leading Spring Security framework.

The TwigKit Security Module essentially fills three roles. First, it initiates the authentication of the user by using any one or combination of the supported methods. Second, once the user is authenticated, it will pass the identity of the user to search engines that support security using each platform’s specific methodology (e.g. the FAST Security Access Module). Finally, the TwigKit Security Module provides simple methods for restricting access to individual components or aspects of the search user interface. This allows us to in effect go a level deeper than document-level security, restricting access to individual fields (somewhat analogous to column level security in a database).
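As a rough illustration of field-level restriction, the Python sketch below trims a single search result down to the fields a user is entitled to see. This is an assumption-laden example and not the TwigKit implementation: `FIELD_ACL`, `visible_fields`, the field names and the group names are all hypothetical.

```python
# Hypothetical mapping of restricted fields to the groups allowed to read them.
# Fields that are not listed here are treated as visible to everyone.
FIELD_ACL = {
    "profitMargin": {"PRODUCT_MANAGERS"},
}


def visible_fields(result, user_groups):
    """Return a copy of the result containing only the fields the user may see."""
    groups = set(user_groups)
    return {
        name: value
        for name, value in result.items()
        if name not in FIELD_ACL or FIELD_ACL[name] & groups
    }


# A made-up search result for a product catalogue.
product = {
    "productFeatures": "Waterproof, 2 year warranty",
    "priceGBP": 49.99,
    "profitMargin": 0.35,
}

print(sorted(visible_fields(product, {"STAFF"})))
# ['priceGBP', 'productFeatures']
```

The document itself was returned by the search engine in both cases; only the restricted field is withheld, much like column-level security hides a column without hiding the row.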

Some code

Let’s say our search user interface is built using the TwigKit Tag Library and provides access to some kind of product catalogue. In this case we might want certain parts of the product information to be available only to a group of ‘Product Managers’. This could be information stored in the search engine, such as profit margins, or functional aspects of the interface like the ability to manipulate discount levels. In the following code sample you can see how you would go about displaying product attributes such as features and price, whilst restricting other parts to members of the group “PRODUCT_MANAGERS”:

...

<search:field fieldName="productFeatures" label="Features" />
<search:field fieldName="priceGBP" label="Price" prefix="£" />

<!-- Only show margins and discount level link to 'Product Managers' -->
<security:conditional allowedGroups="PRODUCT_MANAGERS" deniedGroups="">
	<search:field fieldName="profitMargin" label="Profit Margin" />
	<a href="/EditDiscount/1234/">Change Discount</a>
</security:conditional>

...

Any information or components placed within the ‘conditional tags’ from the security library will only be visible to members of the groups mentioned or, with an alternative configuration, visible to everyone unless they belong to the groups being explicitly denied access. This example shows you how to intervene at the user interface level, but TwigKit provides other hooks to apply similar logic using Query and Response Processors, for instance to protect against potential injection attacks or even do last-minute checks and filtering. We will be covering processors in a separate post.
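The allow/deny behaviour of the conditional tag can be sketched as a small predicate. The Python below is only an approximation for illustration: `is_visible` is a made-up function, and the rule that an explicit deny wins when a user is in both lists is our assumption here, not documented behaviour of the tag.

```python
def is_visible(user_groups, allowed_groups=(), denied_groups=()):
    """Decide whether content inside a conditional block should be rendered."""
    groups = set(user_groups)
    # Assumption: an explicit deny always takes precedence.
    if groups & set(denied_groups):
        return False
    allowed = set(allowed_groups)
    # No allow-list configured: visible to everyone who is not denied.
    if not allowed:
        return True
    # Allow-list configured: the user must be in at least one allowed group.
    return bool(groups & allowed)


# Allow-list configuration, as in the sample above.
print(is_visible({"PRODUCT_MANAGERS"}, allowed_groups={"PRODUCT_MANAGERS"}))  # True
print(is_visible({"SALES"}, allowed_groups={"PRODUCT_MANAGERS"}))             # False

# Deny-list configuration: everyone except the denied groups ("CONTRACTORS" is hypothetical).
print(is_visible({"SALES"}, denied_groups={"CONTRACTORS"}))        # True
print(is_visible({"CONTRACTORS"}, denied_groups={"CONTRACTORS"}))  # False
```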

To wrap up, other tags from the TwigKit Tag Library provide more innocuous facilities such as providing particular attributes and features for authenticated users:

<security:userDetails authenticated="true">
	Hello <b>${user.name}</b> (<a href="/logout">sign out</a>)
</security:userDetails>

If you’ve come across other frameworks that tackle this, or just have some thoughts on the matter, then we’d really like to hear from you, so put it in a comment or drop us a line.

Final thoughts

Unfortunately it’s hard to do such a complex topic proper justice in a short post like this. Admittedly it covers the problem at a very high level, ignoring other aspects of securing search such as ensuring the integrity of the search index when permissions change, and the overarching concerns addressed in security policies (such as physical and network security).

Despite these caveats we hope you found it helpful from a functional perspective. Last, but certainly not least, I want to thank my good friend and search guru Christian Moen from Atilika for all his invaluable thoughts on the subject.

Hjortur Stefan Olafsson

Stefan is an architect and developer at TwigKit. He has spent the last few years working with search technologies in various blue-chip organisations and governments around the world. You can keep tabs on him on Twitter.
