<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Dào of Search &#187; Development</title>
	<atom:link href="http://blog.twigkit.com/category/development/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.twigkit.com</link>
	<description>A blog about search, user experience, and development.</description>
	<lastBuildDate>Wed, 01 Sep 2010 21:26:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Security in Search Applications</title>
		<link>http://blog.twigkit.com/security-in-search-applications/</link>
		<comments>http://blog.twigkit.com/security-in-search-applications/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 11:33:46 +0000</pubDate>
		<dc:creator>Hjortur Stefan Olafsson</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://blog.twigkit.com/?p=223</guid>
		<description><![CDATA[<img class="alignright size-full wp-image-255" title="Permissions" src="http://blog.twigkit.com/wp-content/uploads/2010/01/Permissions.png" alt="" width="251" height="185" />When a search engine is brought to bear on content with restricted access, it becomes evident that security and preserving the integrity of permissions can be an important and often thorny issue. Consider the case when enterprise search is applied to a document management system. Each and every document might well end up in the same index, meaning that when searches are conducted it is imperative that confidentiality is maintained and users only see results to which they have access.]]></description>
			<content:encoded><![CDATA[<p>When a search engine is brought to bear on content with restricted access, it becomes evident that security and preserving the integrity of permissions can be an important and often thorny issue.</p>
<h2>The problem</h2>
<p><a href="http://blog.twigkit.com/wp-content/uploads/2010/01/Permissions.png"><img class="alignright size-full wp-image-255" title="Permissions" src="http://blog.twigkit.com/wp-content/uploads/2010/01/Permissions.png" alt="" width="251" height="185" /></a>Consider the case when enterprise search is applied to a document management system. Each and every document might well end up in the same index, meaning that when searches are conducted it is imperative that <a href="http://en.wikipedia.org/wiki/CIA_triad#Key_concepts" target="_blank">confidentiality</a> is maintained and users are only shown results they&#8217;ve got access to. This is typically known as &#8216;document-level&#8217; or mapped security, where each piece of content is accompanied by the associated security metadata (known as an <a href="http://en.wikipedia.org/wiki/Access_control_list" target="_blank">Access Control List</a>, or ACL).</p>
<p>The problem is further compounded by the fact that search engines are increasingly used to break down silos of data, making disparate sources searchable even though they don&#8217;t share a single access control system.</p>
<h2>Under the hood</h2>
<p>This security metadata, or set of entitlements, is stored in the index, thus forming a part of the query evaluation process carried out by the search engine. What that means is that the set of documents found to correspond to the user&#8217;s query will be further whittled down by matching this metadata with the user&#8217;s credentials (<strong><a href="http://en.wikipedia.org/wiki/Authorization" target="_blank">authorisation</a></strong>). Given how intrinsic this is to the overall process of matching, this piece of the puzzle will generally fall under the remit of the underlying search engine. The other constituent part is the <strong><a href="http://en.wikipedia.org/wiki/Authentication#Authentication_vs._authorization" target="_blank">authentication</a></strong>, or the process of establishing who the user is, so that his credentials (such as the groups he belongs to) can be provided along with the query. However, many providers of search technology have seen the former as their only remit in the security life cycle. To be fair, this lack of vendor support for the entire security life cycle has been somewhat understandable, given the plethora of authentication mechanisms out there. But the problem&#8217;s not going away.</p>
<h2>How do we deal with it?</h2>
<p>Given this, and the fact that almost every internal search project faces this dilemma, we often have to provide or facilitate a solution. We therefore chose to provide a TwigKit Security Module which offers integration with leading authentication providers such as <a href="http://en.wikipedia.org/wiki/Active_Directory" target="_blank">Active Directory</a>, <a href="http://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol" target="_blank">LDAP</a> servers, databases, <a href="http://en.wikipedia.org/wiki/OpenID" target="_blank">OpenID</a>, and proprietary systems, via a variety of methods and protocols such as <a href="http://en.wikipedia.org/wiki/Form_based_authentication" target="_blank">form-based</a>, <a href="http://en.wikipedia.org/wiki/Basic_access_authentication" target="_blank">basic</a>, <a href="http://en.wikipedia.org/wiki/NTLM" target="_blank">NTLM</a>, <a href="http://en.wikipedia.org/wiki/Kerberos_(protocol)" target="_blank">Kerberos</a> delegation, <a href="http://en.wikipedia.org/wiki/X.509" target="_blank">X.509</a> certificate exchange, and &#8216;container managed&#8217; authentication. What this means is that if an organisation already has a <a href="http://en.wikipedia.org/wiki/Single_sign-on" target="_blank">single sign-on</a> infrastructure, we can easily tap into that; equally importantly, in greenfield scenarios, we can quickly implement an effective security solution. Thankfully, there was no need to reinvent the wheel since the TwigKit Security Module builds on the industry-leading <a href="http://static.springsource.org/spring-security/site/" target="_blank">Spring Security</a> framework.</p>
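<p>To give a flavour of the foundation this builds on, here is a minimal sketch of plain Spring Security configuration (3.0 namespace style) that puts every page behind a login form and authenticates users against an LDAP directory. It is not the TwigKit Security Module&#8217;s own configuration: the server URL and the directory layout (&#8216;ou=people&#8217;, &#8216;ou=groups&#8217;) are placeholders, and a real deployment would need to adapt them.</p>
<pre class="brush: xml;">
&lt;beans:beans xmlns=&quot;http://www.springframework.org/schema/security&quot;
	xmlns:beans=&quot;http://www.springframework.org/schema/beans&quot;
	xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot;
	xsi:schemaLocation=&quot;http://www.springframework.org/schema/beans
		http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
		http://www.springframework.org/schema/security
		http://www.springframework.org/schema/security/spring-security-3.0.xsd&quot;&gt;

	&lt;!-- Require a fully authenticated user for every URL, using a login form --&gt;
	&lt;http&gt;
		&lt;intercept-url pattern=&quot;/**&quot; access=&quot;IS_AUTHENTICATED_FULLY&quot; /&gt;
		&lt;form-login /&gt;
	&lt;/http&gt;

	&lt;!-- Placeholder directory server; replace with your own --&gt;
	&lt;ldap-server url=&quot;ldap://ldap.example.com:389/dc=example,dc=com&quot; /&gt;

	&lt;!-- Bind as the user and load group memberships as authorities --&gt;
	&lt;authentication-manager&gt;
		&lt;ldap-authentication-provider
			user-dn-pattern=&quot;uid={0},ou=people&quot;
			group-search-base=&quot;ou=groups&quot; /&gt;
	&lt;/authentication-manager&gt;
&lt;/beans:beans&gt;
</pre>
<p>With Spring Security&#8217;s defaults, each LDAP group the user belongs to becomes an authority named after the group (upper-cased and prefixed with &#8216;ROLE_&#8217;), which is the kind of group information that can then be passed along with the query.</p>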
<p>The TwigKit Security Module essentially fills three roles. First, it initiates the authentication of the user by using any one or combination of the supported methods. Second, once the user is authenticated, it will pass the identity of the user to search engines that support security using each platform&#8217;s specific methodology (e.g. the FAST Security Access Module). Finally, the TwigKit Security Module provides simple methods for restricting access to individual components or aspects of the search user interface. This allows us to in effect go a level deeper than document-level security, restricting access to individual fields (somewhat analogous to column level security in a database).</p>
<h2>Some code</h2>
<p>Let&#8217;s say our search user interface is built using the TwigKit Tag Library and provides access to some kind of product catalogue. In this case we might want certain parts of the product information to be only available to a group of &#8216;Product Managers&#8217;. This could be information stored in the search engine, such as profit margins, or functional aspects of the interface like the ability to manipulate discount levels. In the following code sample you can see how you would go about displaying product attributes such as features and price, whilst restricting other parts to members of the group &#8220;PRODUCT_MANAGERS&#8221;:</p>
<pre class="brush: xml; highlight: [7,10];">
...

&lt;search:field fieldName=&quot;productFeatures&quot; label=&quot;Features&quot; /&gt;
&lt;search:field fieldName=&quot;priceGBP&quot; label=&quot;Price&quot; prefix=&quot;£&quot; /&gt;

&lt;!-- Only show margins and discount level link to 'Product Managers' --&gt;
&lt;security:conditional allowedGroups=&quot;PRODUCT_MANAGERS&quot; deniedGroups=&quot;&quot;&gt;
	&lt;search:field fieldName=&quot;profitMargin&quot; label=&quot;Profit Margin&quot; /&gt;
	&lt;a href=&quot;/EditDiscount/1234/&quot;&gt;Change Discount&lt;/a&gt;
&lt;/security:conditional&gt;

...
</pre>
<p>Any information or components placed within the &#8216;conditional tags&#8217; from the security library (highlighted) will only be visible to members of the groups mentioned or, with an alternative configuration, visible to everyone except members of the groups being explicitly denied access. This example shows you how to interject at a user interface level, but TwigKit provides other hooks to apply similar logic using Query and Response Processors, to protect against potential <a href="http://en.wikipedia.org/wiki/Code_injection" target="_blank">injection attacks</a> or even to do last-minute checks and filtering. We will be covering processors in a separate post.</p>
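<p>As a rough sketch of that alternative configuration, the tag could be inverted by leaving &#8216;allowedGroups&#8217; empty and naming the groups to exclude. Two things here are assumptions made for illustration rather than documented behaviour: that an empty &#8216;allowedGroups&#8217; is treated as &#8216;everyone&#8217;, and the group name &#8220;CONTRACTORS&#8221;, which is made up.</p>
<pre class="brush: xml;">
&lt;!-- Assumption: empty allowedGroups means 'everyone'; CONTRACTORS is a hypothetical group --&gt;
&lt;security:conditional allowedGroups=&quot;&quot; deniedGroups=&quot;CONTRACTORS&quot;&gt;
	&lt;search:field fieldName=&quot;priceGBP&quot; label=&quot;Price&quot; prefix=&quot;£&quot; /&gt;
&lt;/security:conditional&gt;
</pre>
<p>In this form the price would be shown to every user except members of the denied group.</p>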
<p>To wrap up, other tags from the TwigKit Tag Library provide more innocuous facilities such as providing particular attributes and features for authenticated users:</p>
<pre class="brush: xml;">
&lt;security:userDetails authenticated=&quot;true&quot;&gt;
	Hello &lt;b&gt;${user.name}&lt;/b&gt; (&lt;a href=&quot;/logout&quot;&gt;sign out&lt;/a&gt;)
&lt;/security:userDetails&gt;
</pre>
<p>If you&#8217;ve come across other frameworks that tackle this, or just have some thoughts on the matter, then we&#8217;d really like to hear from you &#8211; so put it in a comment or drop us a line.</p>
<h2>Final thoughts</h2>
<p>Unfortunately it&#8217;s hard to do such a complex topic proper justice in a short post like this. Admittedly it covers the problem at a very high level, ignoring other aspects of securing search such as ensuring the integrity of the search index when permissions change, and the overarching concerns addressed in <a href="http://en.wikipedia.org/wiki/Security_policy" target="_blank">security policies</a> (such as physical and network security).</p>
<p>Despite these caveats we hope you found it helpful from a functional perspective. Last, but certainly not least, I want to thank my good friend and search guru <a href="http://www.christianmoen.com/" target="_blank">Christian Moen</a> from <a href="http://atilika.com/" target="_blank">Atilika</a> for all his invaluable thoughts on the subject.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.twigkit.com/security-in-search-applications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making Data Meaningful</title>
		<link>http://blog.twigkit.com/making-data-meaningful/</link>
		<comments>http://blog.twigkit.com/making-data-meaningful/#comments</comments>
		<pubDate>Sun, 17 Jan 2010 19:11:24 +0000</pubDate>
		<dc:creator>Hjortur Stefan Olafsson</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[User Experience]]></category>
		<category><![CDATA[data visualisation]]></category>

		<guid isPermaLink="false">http://blog.twigkit.com/?p=178</guid>
		<description><![CDATA[<div id="attachment_204" class="wp-caption alignright" style="width: 354px"><a href="http://blog.twigkit.com/wp-content/uploads/2010/01/TwigKitVisualisationsPie.png"><img class="size-full wp-image-204" title="TwigKitVisualisationsPie" src="http://blog.twigkit.com/wp-content/uploads/2010/01/TwigKitVisualisationsPie.png" alt="" width="344" height="182" /></a><p class="wp-caption-text">Facet information displayed with TwigKit as 3D pie chart</p></div>
Most modern enterprise search platforms provide some inherent capability to illustrate the shape and nature of the data within. Take for example faceted search. Facets will quickly break down the dimensions in all the data we're storing or even just the stuff that meets our search criteria. In either case we can get some form of statistical feedback e.g. on which top-level categories exist, their names and how many documents each represents. This will not only give the user insight into what information is available, but also guide them in their search, allowing them to slice and dice the data to get to the information they're after. The question is, how do we best represent this information and make it useful (and meaningful) to us?]]></description>
			<content:encoded><![CDATA[<p>Most modern enterprise search platforms provide some inherent capability to illustrate the shape and nature of the data within. Take for example <a href="http://en.wikipedia.org/wiki/Faceted_search" target="_blank">faceted search</a>.</p>
<div id="attachment_216" class="wp-caption alignright" style="width: 260px"><a href="http://blog.twigkit.com/wp-content/uploads/2010/01/CareerBuilderExample.png"><img class="size-full wp-image-216 " title="CareerBuilderExample" src="http://blog.twigkit.com/wp-content/uploads/2010/01/CareerBuilderExample.png" alt="" width="250" height="213" /></a><p class="wp-caption-text">Job adverts for &#39;project managers&#39; broken down by category on a popular recruitment site</p></div>
<p>Facets will quickly break down the dimensions in all the data we&#8217;re storing or even just the stuff that meets our search criteria. In either case we can get some form of statistical feedback e.g. on which top-level categories exist, their names and how many documents each represents. Take this search for positions as a &#8216;project manager&#8217; as an example. Using faceted search, we can quickly see that some of these are in the &#8216;Engineering&#8217; field, with still more for IT professionals.</p>
<p>Not only does this give the user insight into what information is available, but it also guides them in their search, allowing them to slice and dice the data to get precisely to the information they&#8217;re after. The question is, how do we best represent this information and make it useful (and meaningful) to us?</p>
<p>As you saw in Tyler&#8217;s <a href="/pagination-common-problems/" target="_blank">previous</a> <a href="/data-visualisations-in-search/" target="_blank">posts</a>, in most cases there might be sufficient utility in just getting the broad strokes, preferably in a manner that minimises the cognitive burden of taking it in. In some cases proportions may give us the visual cues we&#8217;re after. For example, it may be useful enough for us to see that there are 1) almost no orders pending shipment this week (phew), 2) a bunch in transit, with 3) the vast majority already delivered. And, thanks to faceted search, all the detail on each group or dimension is a mere click away.</p>
<div id="attachment_204" class="wp-caption alignright" style="width: 354px"><a href="http://blog.twigkit.com/wp-content/uploads/2010/01/TwigKitVisualisationsPie.png"><img class="size-full wp-image-204" title="TwigKitVisualisationsPie" src="http://blog.twigkit.com/wp-content/uploads/2010/01/TwigKitVisualisationsPie.png" alt="" width="344" height="182" /></a><p class="wp-caption-text">Facet information displayed with TwigKit as a 3D pie chart</p></div>
<p>To achieve this, the TwigKit UI libraries provide widgets that will turn facet information from the search platform into pretty pictures, charts and graphs. Traditionally, a developer would have written some code to extract the necessary information from the facet, integrated a visualisation library, and displayed the result on a web page. But we&#8217;ve done all that for you.</p>
<p>In the code snippet below you can see how to create visualisations using the TwigKit JSP Tag Library. All you&#8217;d need to do is specify which facet to display, the format (such as column, line or pie chart) and the result is an interactive visualisation &#8211; where clicking a particular aspect will further refine your search. Easy as pie :)</p>
<pre class="brush: xml;">
&lt;widget:facetChart
	type=&quot;Column3D&quot;
	facet=&quot;${response.facets.manufacturer}&quot;
	numberOfFilters=&quot;6&quot;
	color=&quot;ffbb33&quot;
	backgroundColor=&quot;fbfbfb&quot;
	query=&quot;${query}&quot;
	width=&quot;700&quot;
	height=&quot;250&quot;
	title=&quot;Top Manufacturers&quot;
	subTitle=&quot;Number of products per manufacturer&quot;
	showAverage=&quot;true&quot; /&gt;
</pre>
<div id="attachment_202" class="wp-caption aligncenter" style="width: 610px"><a href="http://blog.twigkit.com/wp-content/uploads/2010/01/TwigKitVisualisationColumns.png"><img class="size-full wp-image-202" title="TwigKitVisualisationColumns" src="http://blog.twigkit.com/wp-content/uploads/2010/01/TwigKitVisualisationColumns.png" alt="" width="600" height="259" /></a><p class="wp-caption-text">Simple example of Facet information on products, broken down by manufacturer and represented as a column chart.</p></div>
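<p>Switching to a different kind of chart should, in principle, only be a matter of changing the &#8216;type&#8217; attribute. The sketch below reuses the attributes from the snippet above unchanged; the value &#8220;Pie3D&#8221; is an assumption made by analogy with &#8220;Column3D&#8221; rather than a confirmed option.</p>
<pre class="brush: xml;">
&lt;!-- Assumption: "Pie3D" is an accepted type value, by analogy with "Column3D" --&gt;
&lt;widget:facetChart
	type=&quot;Pie3D&quot;
	facet=&quot;${response.facets.manufacturer}&quot;
	numberOfFilters=&quot;6&quot;
	color=&quot;ffbb33&quot;
	backgroundColor=&quot;fbfbfb&quot;
	query=&quot;${query}&quot;
	width=&quot;700&quot;
	height=&quot;250&quot;
	title=&quot;Top Manufacturers&quot;
	subTitle=&quot;Number of products per manufacturer&quot;
	showAverage=&quot;true&quot; /&gt;
</pre>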
<p>The important thing here is that search engines have a myriad of ways to efficiently mine vast volumes of data, providing insights that simply weren&#8217;t achievable in the traditional relational paradigm. However, it is often the little things that transform that analysis into meaningful, everyday tools that truly alter the way we consume information.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.twigkit.com/making-data-meaningful/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
