TemaSearch Technical Documentation

Introduction to TemaSearch

TemaSearch is a system for finding search term aliases. It aims to improve the effectiveness of your existing search system by finding additional words that are related to the words given in a search query. Out of the box, it comes with inflection dictionaries, lists of spelling variations, close and general synonyms, and translation lists, available in both Bokmål and Nynorsk. The built-in applications allow you to add selected types of words to the search query, with a choice of interaction mode: fully automatic addition of new words, fully manual selection by the user, or a combination of the two.

Please note that this documentation discusses all features available in TemaSearch. Some features may not be available for certain types of licences.

 

General Features

§     Database of a variety of types of word alternatives

§     Includes related words automatically in searches

§     Manual selection of words to include in search.

§     Features are configurable by the administrator, who can also select which features may be configured by the user.

§     Customizable user-interface style, layout and content.

§     Implemented as HTTP services following the REST pattern, suitable for caching.

§     More than 50 queries per second on a single CPU core.

§     Lower-level service interfaces for building custom applications.

Implementation Steps

1     Install the service (internal service users only).

2     Determine which applications you plan to use (direct search, association search) and which features you require.

3     Apply the required changes to your website.

4     Test and fine-tune the service.

The documentation is organised around these steps. It first explains how to install the service, then gives a high-level discussion of the applications available. Each application is then described in detail, and a reference section concludes the documentation.

 


Installation

This section describes the steps needed to get TemaSearch up and running for each of the two access types.

TemaSearch is made available either as a hosted service or as an internal service. The hosted service is managed externally by a third party. The internal service is installed, deployed and managed by your organisation.

Hosted Service

No installation or deployment is required, as this has already been done for you. To access the service, you need your account name and the URL of the TemaSearch service. For some services you will also need account group and account password.

Internal Service

Your organisation deals with setting up and running the internal service, and has full control over how the service is configured, deployed, and maintained. This section describes how to install and configure the service, and how to verify that the service is running.

Hardware and Software Requirements

TemaSearch is a web application built using Servlet technology. It requires modest resources and for many sites can be deployed in a shared application server running your existing web applications.

A typical server configuration is:

§     1GHz CPU or faster.

§     256MB RAM (or enough for the servlet container plus 100MB).

§     30MB disk space for the installed application.

§     A servlet container compatible with the Servlet 2.3 specification.

Under this configuration the query processing throughput is around 10 to 100 queries per second depending upon the features used. The hardware requirements needed for your site may of course vary from this according to the load you expect to place on the service and on the request throughput required. Sites requiring more than a sustained 50qps may need to use a dedicated application server, or a cluster of servers for even heavier loads.

The service is entirely self-contained and does not require additional external resources or databases.

Deployment

The internal service is delivered as a compressed web application (WAR) file. To work correctly, it should be deployed to your servlet container in uncompressed form. How this is done varies from one servlet container to another, although the procedure is usually something like this:

1     Using unzip or a similar tool, extract the files from the WAR file to a directory on the server

2     Configure the container to load the web application in that directory.

Each web application requires a unique context path within the container. TemaSearch can be installed to any valid context. In the URL examples given, we assume the context path is /ts.

Configuring the Service

Now the service has been deployed to the servlet container, it needs to be configured before first use. Configuration comprises:

1     creating deployment directories

2     creating configuration files

3     installing the licence file

Creating Deployment Directories

All files specific to your own deployment of the service are stored under \WEB-INF\deploy. You will need to create this directory after the first installation. When the application is upgraded or reinstalled, the files in this directory are not overwritten, so your configuration settings are preserved.

Logging information is stored under \WEB-INF\deploy\logs. This directory should also be created.

Configuration Files

Create the configuration file \WEB-INF\deploy\deploy-conf.properties and add this line

ps.install.url=<URL of installation context>

For example,

ps.install.url=http://www.myserver.com/apps/ts

for an HTTP server at domain www.myserver.com, where the servlet container is under the apps path and TemaSearch was installed under ts. Note that the URL has no final slash.
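The directory and configuration steps above can be scripted. The following is a minimal sketch in Python, assuming you have write access to the unpacked web application directory; the function name prepare_deployment and the directory passed to it are illustrative and not part of TemaSearch.

```python
from pathlib import Path


def prepare_deployment(webapp_root: str, install_url: str) -> Path:
    """Create WEB-INF/deploy and WEB-INF/deploy/logs, then write deploy-conf.properties.

    Note: this overwrites an existing deploy-conf.properties file.
    """
    deploy_dir = Path(webapp_root) / "WEB-INF" / "deploy"
    # parents=True also creates WEB-INF/deploy itself; exist_ok keeps reruns harmless.
    (deploy_dir / "logs").mkdir(parents=True, exist_ok=True)

    conf_file = deploy_dir / "deploy-conf.properties"
    # The install URL is written without a final slash.
    conf_file.write_text(f"ps.install.url={install_url.rstrip('/')}\n", encoding="utf-8")
    return conf_file
```

For example, prepare_deployment("/opt/webapps/ts", "http://www.myserver.com/apps/ts") would write the example line shown above, where /opt/webapps/ts is a hypothetical location for the unpacked application.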

Install Licence File

As part of the internal service distribution, you will have received a licence file named licence.bin. This file should be copied to \WEB-INF\deploy\a. That is, the licence file is saved in the deploy directory and renamed to the filename 'a'. The unusual naming is a security feature to protect the licensing system.

The distribution also includes a file called suac.xml. This should be copied to the \WEB-INF\deploy directory.

Authentication

The internal service is configured with a single account that is used to authenticate access to the service. The account details are stored in the suac.xml file. The user, realm and password supplied with each request (the puser, prealm and ppwd parameters) should be set to the account values that appear in the suac.xml file.

Verifying Installation

After the service has been deployed and configured, you are ready to verify that the service is operational. You can check that the service is running by browsing to /version.xml under the deployment URL of the application (e.g. http://www.myserver.com/apps/ts/version.xml), which displays the version number of the software.


TemaSearch Applications

TemaSearch provides two main applications, direct search and association search. Both applications provide alternative words for search queries but have different ways of interacting with the user:

§     Direct Search is mostly transparent: words are added to the search without involving the user, and no changes to the user interface are required. Optionally, form controls can be added to enable the user to configure direct search features.

§     Association Search is user-oriented: words are presented for the user to select for inclusion in the search. Association search features a rich, customizable user interface for displaying and selecting alternative words.

Both applications can be statically configured by the administrator, and/or configured by the site visitor using FORM elements added to the existing search form.

Technical Differences

Apart from the main difference of automatic vs. manual selection of words, the search applications have these main technical differences, which may influence which application you implement.

| Aspect | Direct Search | Association Search |
| --- | --- | --- |
| Implementation | Web form or server-side scripts | Server-side scripts only |
| Requires Boolean search | Yes | No |
| User-interface changes | None required, optional form controls for user configuration. Optionally, the alternative words used in the search can be shown to the user. | Optional search form controls for user configuration. Alternative words shown with the search results for user selection. User-interface is fully customizable. |
| Performance per CPU core | Ca. 50 queries per sec. | Ca. 20 queries per sec. |

About Service Parameters

Service parameters are used to supply configuration details and search query data to the service. TemaSearch is implemented as an HTTP service, so these parameters are form-encoded and sent either in the URL query string using the GET method, or as POST data.

Parameters to the service fall into two main categories: search engine parameters, and TemaSearch parameters.

Search engine parameters: These parameters are eventually destined for your search engine and are not consumed by the service. They are passed into the service, possibly modified and are then passed out again in the search results.

TemaSearch parameters, by contrast, are consumed by the service and are not present in the results, and thus not passed on to your search engine.

To tell the service which parameters are TemaSearch parameters, a parameter list is used (the px parameter). This lists all TemaSearch parameters, and only parameters in this list are treated as such. Parameters not in the list are treated as search engine parameters by default.

Thus, when adding a new parameter to a call to the service, remember to update the parameter list. If you forget, the service will not recognise the parameter as a TemaSearch parameter, and so it will not be used. If the parameter is mandatory, this produces an error; an optional parameter takes its default value rather than the value passed in.
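The effect of the parameter list can be illustrated with a short Python sketch. It is not the service's own code; it only mimics the rule described above, assuming that px holds a space-separated list of names as in the form examples in this document.

```python
def split_parameters(params: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split request parameters into (TemaSearch parameters, search engine parameters)."""
    # Only names listed in px count as TemaSearch parameters.
    listed = set(params.get("px", "").split())
    service = {name: value for name, value in params.items() if name in listed}
    engine = {name: value for name, value in params.items() if name not in listed}
    return service, engine


request = {
    "px": "px pq pqsntx",
    "pq": "q",
    "pqsntx": "standard",
    "psynn": "0",  # not listed in px, so it is treated as a search engine parameter
    "q": "bil",
}
service_params, engine_params = split_parameters(request)
```

Here psynn ends up among the search engine parameters because it is missing from px, which is exactly the mistake the paragraph above warns about.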

Query Syntax

A core function of the service is input query rewriting, where words are added or replaced to produce a new query. For the new query to be understood by your search engine, any modifications must of course be made using the syntax expected by your search engine. TemaSearch has support for a number of popular query syntaxes, described in the query syntax reference. If the syntax used by your search engine is not supported, contact Nynodata to discuss possible implementation choices.

Boolean search

The standard applications work best when the search engine supports Boolean search. TemaSearch does not use all Boolean functions, but requires that the search engine has a syntax to indicate “OR”: that is, for two words A and B the engine should retrieve documents containing word A or word B or both. Some search engines support this type of query by actually writing “OR” between the words, or by writing the words one after the other.
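As an illustration of the two styles of "OR" syntax mentioned above, the following Python sketch joins a word with its alternatives. The function name or_group and the example words are assumptions made for illustration; the actual output of TemaSearch depends on the configured query syntax.

```python
def or_group(word: str, alternatives: list[str], operator: str = " OR ") -> str:
    """Join a word and its alternatives so that a document matching any one of them is found."""
    terms = [word]
    for alternative in alternatives:
        if alternative not in terms:  # skip duplicates
            terms.append(alternative)
    return operator.join(terms)


explicit = or_group("bil", ["biler", "bilen"])                # engines that write OR between words
implicit = or_group("bil", ["biler", "bilen"], operator=" ")  # engines where adjacent words mean OR
```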

Once you have determined the syntax expected by your search engine, you should check that it is one that is supported by TemaSearch. The available query syntaxes are described in the reference section.

Performance

On our test system, a 2.4GHz Intel box running Windows, we obtained these average query times:

§     Direct Search: less than 20ms per query (50 qps)

§     Association Search: 50ms per query (20 qps)

Based on these figures, the direct search application is at least 2.5 times faster than association search. This is mostly due to the XSLT transforms used to produce the association search user interface. Whether this performance difference is significant depends upon your desired throughput and how the service is hosted.

All HTTP interfaces to the service are RESTful, and thus are good candidates for caching. Caching can improve performance considerably when common search queries are frequently requested by visitors.
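One possible way to take advantage of this from a server-side script is a small in-memory cache keyed by the invocation URL. The Python sketch below is an assumption about how such a cache might look, not a feature of TemaSearch; the class name RewriteCache, the five-minute lifetime and the fetch callable (any function that returns the response body for a URL) are all illustrative.

```python
import time
from typing import Callable


class RewriteCache:
    """Cache service responses by invocation URL for a limited time.

    Entries are never evicted, so a production cache would also need a size limit.
    """

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300.0):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no request is made to the service
        body = self._fetch(url)
        self._entries[url] = (now, body)
        return body
```

A shared HTTP cache or reverse proxy placed in front of the service would achieve a similar effect without any code changes.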

Search Submit Method

Direct search (web form integration) and association search require that search requests can be submitted using the GET method. Even if your current search form uses POST, the GET method may still be supported, as search engines that use POST often support GET as well. To see if your search engine supports GET, change the form's method to GET, and try it out. If the search works as normal, then GET is supported. Note that changing the form method type is just for testing - you are not expected to change the form method when implementing TemaSearch.

Direct search implemented via server-side scripting does not submit new searches to the HTTP interface, and so is free from this requirement.

 


Direct Search

The direct search application adds additional words to a search query, and can do this without needing to involve the user. When configured to be fully automatic, the service functions entirely transparently to the user - the only change the user should be aware of is improved search results.

Prerequisites

Direct search can only be implemented with search engines that support a syntax for Boolean “OR”. (See Boolean search.)  If your search engine does not support syntax for expressing “OR” then you are not able to use direct search. Instead, you can use association search in word-replace mode.

Integration Overview

Direct search can be implemented in two ways:

§     Web Form: the HTML code for the search form the user submits is changed. This is a simple integration method requiring simple HTML changes.

§     Server-side scripting: The server-side script or program that serves the page produced from the search form is altered to include calls to direct search. This requires more programming than web form integration, although it offers more flexibility and the possibility to introduce graceful degradation for the best reliability. Server scripting also opens up the possibility to move to combination search.

Web Form Integration

Adding direct search to an existing search form requires some changes to be made to the form's HTML code. Before you begin, you will need access to edit the existing HTML page where the search field resides. If you are unable or do not wish to alter the HTML on your site, you can still test direct search using a standalone test page, as described in the Frequently Asked Questions.

Web form integration requires that a search query can be submitted using the GET method (see Search Submit Method). If this is not possible, direct search can be implemented using server-side scripts.

How It Works

The action URL for the existing search form is changed to point to the direct search service. Additional INPUT elements are added containing configuration values used by the service. When the form is submitted, the service rewrites the query using hidden or user-selectable configuration values, and then redirects the browser to the original search page to run the search with the rewritten query.

Simple Setup

The code below shows a simple search form:

<FORM method=GET action="/search">
<INPUT type="text" name="q" size="30" value="">
<INPUT type="submit" name="submit" VALUE="Søk">
</FORM>

To activate TemaSearch, you add a handful of hidden input elements to the form. The INPUT elements correspond to service parameters. For example, the code below adds the service parameter "pq" with the value "q", the name of the search field in the form above.

<INPUT type="hidden" name="pq" value="q">

You will need to add at least the parameters shown in the table below.

| Parameter | Description |
| --- | --- |
| puser | Your username associated with your TemaSearch service account. |
| pu | The search engine URL. This is usually taken from the action attribute, fully qualified if necessary. |
| pq | The name of the input element that contains the search expression. TemaSearch modifies the field to include new search words. |
| px | Lists the names of all the elements that were added for TemaSearch. The list is used so that TemaSearch parameters are not forwarded to the target search engine. |
| pqsntx | Describes how queries are written for your search engine. |

If we assume that the example form is hosted at http://www.myhost.com/search, that our username is "jsmith", and that the "standard" query syntax is used, then we will add these parameters:

| Parameter | Value |
| --- | --- |
| puser | jsmith |
| pu | http://www.myhost.com/search |
| px | px pu puser pq pqsntx |
| pqsntx | standard |
| pq | q |

After adding these parameters as hidden input elements, the example form looks like this:

<FORM method=GET action="/search">
<INPUT type="hidden" name="puser" value="jsmith">
<INPUT type="hidden" name="pu" value="http://www.myhost.com/search">
<INPUT type="hidden" name="px" value="pu px pq pqsntx puser">
<INPUT type="hidden" name="pq" value="q">
<INPUT type="hidden" name="pqsntx" value="standard">
<INPUT type="text" name="q" size="30" value="">
<INPUT type="submit" name="submit" VALUE="Søk">
</FORM>

Note:

§     The order in which the parameters appear in the form is not important. All that is required is that they appear between the <form> and </form> tags.

§     The pu value is the full URL of the search results page. Normally, the action attribute will indicate the URL of the search results, although this may need to be fully qualified by adding "http://" and then your server's domain name, if not already present. In the example the action attribute was /search, which appears fully qualified as http://www.myhost.com/search in the pu input field.

§     If your form already includes an input element with the same name as one of the new elements, see the section on renaming conflicting parameters.
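Fully qualifying the action attribute follows the standard rules for resolving relative URLs. As a sketch, Python's urljoin performs this resolution; the page address http://www.myhost.com/index.html is an assumed location for the page containing the search form.

```python
from urllib.parse import urljoin


def qualify_action(page_url: str, action: str) -> str:
    """Resolve a form's action attribute against the URL of the page that contains the form."""
    return urljoin(page_url, action)


# The page URL is a hypothetical location for the example search form.
pu_value = qualify_action("http://www.myhost.com/index.html", "/search")
```

An action that is already fully qualified is returned unchanged, so the same helper can be applied to any form.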

Now that the parameters have been added, the final step is to change the action attribute to the URL of the TemaSearch service. For example, change

<FORM method=GET action="/search">

to

<FORM method=GET action="http://www.temasearch.no/form">

Once you have saved the page to your site, direct search will be enabled. If your search results page shows the submitted query, you will see that it may include additional words. If you have other search fields on your site, you can add direct search to them by repeating these steps for the other fields. If these fields use a different target URL for the search results, the URL must be registered with your account. See the section on authentication for details.
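To see what the modified form sends to the service, the following Python sketch builds the URL a browser would request when the example form is submitted. The query text "bil" is an assumed user input, and the sketch assumes the page is encoded as UTF-8; a browser encodes non-ASCII values using the character set of the page.

```python
from urllib.parse import urlencode

# Fields of the example form in document order; "bil" stands in for what the user typed.
EXAMPLE_FIELDS = [
    ("puser", "jsmith"),
    ("pu", "http://www.myhost.com/search"),
    ("px", "pu px pq pqsntx puser"),
    ("pq", "q"),
    ("pqsntx", "standard"),
    ("q", "bil"),
    ("submit", "Søk"),
]


def submitted_url(action: str, fields: list[tuple[str, str]]) -> str:
    """Return the GET URL produced by submitting the form: action, '?', form-encoded fields."""
    return action + "?" + urlencode(fields)


url = submitted_url("http://www.temasearch.no/form", EXAMPLE_FIELDS)
```

Because q and submit are not listed in px, they are the parameters that are passed on to the search engine.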

Configuration Options

TemaSearch offers various configuration options that you can use to indicate the types of alternative words to add to the query:

| Parameter | Description |
| --- | --- |
| povs | Enables use of Bokmål/Nynorsk translations. |
| psynn | Enables use of near synonyms. |
| pifl | Enables equivalent inflection endings. |
| penab | Master switch to turn TemaSearch on or off. |

By default, all these choices are active. Not all accounts include access to the various configuration options. Check your account details to ensure you have access.

In the simplest case, the service is configured by providing fixed values for service parameters. As usual, the configuration parameters are coded as hidden input elements. For example, to exclude synonyms and inflections, the psynn and pifl parameters are added to the form, set to the value “0”:

<INPUT type="hidden" name="psynn" value="0">
<INPUT type="hidden" name="pifl" value="0">

Although not shown here, you need to add "psynn pifl" to the px parameter already added to the form. This is needed to be sure all parameters are recognised by TemaSearch.

When you search, you'll see that TemaSearch adds translations, but not synonyms or alternative inflections.
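Because every added parameter must also appear in px, it can help to derive the px value from the parameters themselves rather than maintain it by hand. The Python sketch below shows one way to do this when the form or URL is generated by a script; the helper name with_parameter_list is illustrative.

```python
def with_parameter_list(service_params: dict[str, str]) -> dict[str, str]:
    """Return a copy of the TemaSearch parameters with px listing every one of them."""
    result = {name: value for name, value in service_params.items() if name != "px"}
    # px lists itself first, followed by all other TemaSearch parameter names.
    result["px"] = " ".join(["px", *result])
    return result


params = with_parameter_list({
    "puser": "jsmith",
    "pu": "http://www.myhost.com/search",
    "pq": "q",
    "pqsntx": "standard",
    "psynn": "0",
    "pifl": "0",
})
```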

Authentication

The TemaSearch account used to complete the request is determined from the URL given in the pu parameter. At present, no other parameters are required. A future version of the service will require the prealm and puser parameters, so you can add these if desired.

Frequently Asked Questions

General

Q.

I do not have access/do not want to change the current form on our webserver, but I would still like to try out direct search. Is this possible?

A.

Yes. You can still evaluate TemaSearch even if you do not have permission or do not wish to change the search page on your current site.

Create a local copy of the web page containing the search form, by saving the page in your browser to your local drive (Usually done using the menu item "File | Save As...")

Make the changes to the page on your local drive, as described in this document. Save the changes to disk.

Open the saved page in your browser. When you type in a search query, you will see the search results from your main web site.

Note that because the page is saved to a new location, relative links and resources will not display, although this should not affect your ability to test the service.

 

Q.

Will using direct search affect the performance of my site?

A.

Our experience shows that adding direct search to a site does introduce a delay of around 100 milliseconds, or 1/10 of a second. In practice, when taken with the other delays present in the web, this delay is virtually unnoticeable.

Details

After integrating direct search, the additional time required to produce the search results can be attributed to:

§     Network delay contacting the direct search server.

§     Finding alternative words.

§     Additional time required by your search engine to process the modified query.

To ensure network delay is minimal, we aim to provide you access to a TemaSearch server that is geographically close to your own server, resulting in faster access time. Typical network delay is around 20-50 milliseconds.

The time required by TemaSearch to process a request and rewrite a query is in the order of 1-10ms.

The additional words added to the search query may increase the time required by your search engine. The exact increase will depend on the efficiency of the search engine you use. Most "world-class" search engines use highly efficient implementations resulting in no noticeable increase in processing time (typically a few tens of milliseconds.)

Adding all of these delays together gives a total delay of around 100 ms.

 

Q.

For words that are inflected, TemaSearch adds alternatives that are also inflected. How are these corresponding inflected words produced?

A.

TemaSearch incorporates full morphological analysis and synthesis for both Bokmål and Nynorsk. That is, inflection details are maintained for every single word in the system. Inflections are produced from the grammatical category describing the inflection, which ensures that the correct corresponding inflection is produced even when the original word and alternative words are from different languages, or when spelling changes are required, such as doubling a final consonant or umlaut (omlyd).

This is in contrast to simpler systems that copy the inflection ending from the original word to the alternative word, or use generalized rules to best-guess the correct inflection. Both these systems typically produce more incorrect inflections than the approach used by TemaSearch.

Security

Q.

The TemaSearch parameters include my account name (username), which is visible to anyone who looks at the source of the HTML for the search form. What is there to stop someone else using my account?

A.

Each use of TemaSearch involves directing the browser to a search results page to display the actual search results. TemaSearch will only direct to pages registered with your account. This prevents your account being used to show results from someone else's website. Additionally, the page the user visited to launch TemaSearch is also checked against pages registered in the account. If the pages do not match any registered pages on your account, TemaSearch will not be activated, and no usage is recorded against your account.

 

Q.

Despite these security measures, what should I do if I think my account is being used without permission?

A.

Contact us as soon as possible and we will investigate the issue immediately. All account accesses are logged, along with the user's IP address, which can help track down unauthorized usage.

Troubleshooting

Q.

Why do I get the error message "parameter 'XX' not defined", even though I have included it in the form?

A.

This happens when the parameter is not listed in the px parameter list. Make sure the px parameter is defined and that it lists the missing parameter.

 

Q.

Why do I get strange characters appearing in the converted query?

A.

This is usually due to a difference in the character encodings used by the page on your site and the search engine. Check that the pcharset parameter is set to the encoding expected by your search engine.
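The following Python sketch reproduces the effect, using the word "Søk" as an example. It shows how the same text is form-encoded differently under UTF-8 and ISO-8859-1, and what appears when UTF-8 data is decoded as ISO-8859-1.

```python
from urllib.parse import quote, unquote

query = "Søk"

# The same text is form-encoded differently depending on the character encoding.
utf8_form = quote(query, encoding="utf-8")      # two bytes for "ø"
latin1_form = quote(query, encoding="latin-1")  # one byte for "ø"

# Decoding UTF-8 data as ISO-8859-1 produces the typical "strange characters".
garbled = unquote(utf8_form, encoding="latin-1")
```

If your converted queries look like the garbled value, the page is most likely sending UTF-8 while the receiving side expects ISO-8859-1.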

 

Q.

I am trying to turn off configuration options, but it is not working: options that I want to disable are still enabled. What's going on?

A.

Check that the option parameters have been listed in the px parameter.

 

Q.

I've made all the necessary changes, but it still isn't working.

A.

If you are using the pnoerr parameter, set this to 0 to display errors rather than silently ignore them. When you enter a new search you should see an error message indicating the likely cause of the problem.

 


Server-side Scripting

The basic approach to integrating direct search server-side is to add a call to the direct search service before the query is passed to your search engine. The service amends the query with additional words, and the amended query is then passed on to your search engine to run the search as usual. The search thus runs with the new words suggested by TemaSearch automatically included.

The integration steps are:

§     Construct the service invocation URL. Direct search results are accessed via a URL that locates the direct search service and provides configuration parameters to the service.

§     Extract the rewritten query from the content fetched from the invocation URL.

§     Use the rewritten query to invoke your search engine.

Construct the Service Invocation URL

To retrieve the rewritten query, you construct a URL to invoke direct search. The basic URL is

http://<temasearch-service-location>/ts/svc/rewrite?<configuration-params>&<search-params>

The <configuration-params> are form-encoded parameters for the direct search service. The <search-params> are the search parameters submitted by the search form, including the query itself. The query parameter can be named anything; the name you use is given to the service in the pq configuration parameter.

If your query parameter is called query, then you might build a URL that looks like this:

http://<temasearch-service-location>/ts/svc/rewrite?pq=query&pu=&pqsntx=standard&
pcharset=UTF-8&px=px+pq+pqsntx+pcharset+prealm+puser+ppwd+pu&
query=somequerytext

(The prealm, puser and ppwd parameters are omitted for brevity. The values for these parameters appear with your account details.) Note that the pu parameter must also be specified to avoid an error, although its value is unused.

Note that the URLs are shown with GET-style form parameters for simplicity. If you prefer, you can use the POST method to invoke the service. POST should be used if the entire URL is likely to exceed 2K in length.
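The URL construction can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference client: the function name build_rewrite_request and the host ts.example.com are hypothetical, "2K" is interpreted as 2048 characters, and the account argument stands for the prealm, puser and ppwd values from your account details.

```python
from typing import Optional
from urllib.parse import urlencode

MAX_GET_LENGTH = 2048  # the "2K" limit, interpreted here as 2048 characters


def build_rewrite_request(
    service_location: str,
    query: str,
    query_param: str = "query",
    syntax: str = "standard",
    charset: str = "UTF-8",
    account: Optional[dict[str, str]] = None,
) -> tuple[str, str, Optional[bytes]]:
    """Return (method, url, body) for invoking the direct search rewrite service."""
    service = {"pq": query_param, "pu": "", "pqsntx": syntax, "pcharset": charset}
    service.update(account or {})  # prealm, puser and ppwd, when supplied
    # px lists itself plus every TemaSearch parameter that is actually sent.
    service["px"] = " ".join(["px", *service])
    params = {**service, query_param: query}

    endpoint = f"http://{service_location}/ts/svc/rewrite"
    encoded = urlencode(params, encoding=charset)
    url = f"{endpoint}?{encoded}"
    if len(url) > MAX_GET_LENGTH:
        # Too long for a GET URL: send the same form data as the POST body instead.
        return "POST", endpoint, encoded.encode("ascii")
    return "GET", url, None
```

Building px from the parameters that are actually sent avoids the situation where a parameter is passed but not listed.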

Extract data from the result

The body of the content returned by the URL contains the result, which is a list of parameters and values, one parameter and value on each line. For example:

queryParamName: word or alternative1 or alternative2 ….
otherParam: value

In the simplest case, the result includes just one line with the rewritten query parameter for your search engine. However, if you pass in other search engine parameters (parameters not listed in the px parameter), then they will also be included in the result. Finally, if you have enabled the history parameter, this will be added as the last parameter in the list.

The parameters appear in the same order they were in the URL used to invoke the service.
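Parsing this format takes only a few lines. The Python sketch below assumes that the name is separated from the value by the first colon on the line and that surrounding whitespace is not significant; neither detail is stated explicitly above, so check them against real responses from your service.

```python
def parse_rewrite_result(body: str) -> dict[str, str]:
    """Parse 'name: value' lines into a dict that preserves the order of the lines."""
    result: dict[str, str] = {}
    for line in body.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"unexpected line in result: {line!r}")
        result[name.strip()] = value.strip()
    return result


parsed = parse_rewrite_result("query: bil or biler or automobil\nsubmit: Søk\n")
rewritten_query = parsed["query"]
```

The parameter names query and submit and the example words are illustrative.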

Invoke the Search Engine

This proceeds mostly as it would without direct search. The only difference is that the search engine is invoked using the new query retrieved from the direct search results, instead of the query submitted by the search form.
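Server-side integration also makes graceful degradation possible: if the service cannot be reached, the search can still run with the query the user typed. A minimal Python sketch of this idea follows, where rewrite is any function of your own that calls the service and returns the rewritten query; the names are illustrative.

```python
from typing import Callable


def query_for_search_engine(original_query: str, rewrite: Callable[[str], str]) -> str:
    """Return the rewritten query, falling back to the original if rewriting fails."""
    try:
        rewritten = rewrite(original_query)
    except Exception:
        # Any failure (timeout, HTTP error, unparsable response) degrades to the
        # original query so that the search itself still works.
        return original_query
    # An empty result is treated as a failure as well.
    return rewritten.strip() or original_query
```

In a real deployment you would probably also log the failure and apply a short timeout to the service call, so that a slow service does not delay the search.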

Which query to show the user?

If your results page includes a search form for resubmitting a search, it is common practice to show the query the user typed as the default value for the query text. However, after integrating direct search, the query shown to the user will be the revised version produced by direct search.

If this is not what you want, you may be able to show the original query when generating the results page. If you have access to the script that produces the form in the results page you can then change the value of the query text box from the rewritten query (the one submitted to the search engine) to the original query submitted by the user.

Authentication

The TemaSearch account used to authorize the request is determined from the authentication parameters: prealm, puser and ppwd. These parameters must be included with the request.

End User Configuration

The integration steps described so far set up direct search with a static configuration: the configuration is hard-wired into the form or URL used to invoke the service.

Visible form controls can be used to allow the website visitor to control how direct search functions. Practical examples include:

§     Allow the user to choose the types of words that direct search will add to the query, or even whether direct search is enabled or not.

§     Allow the user to indicate the intended query language or desired target languages

With Webform Integration

User-configuration under webform integration is done by adding new INPUT elements to the form to provide visible controls such as checkboxes and list boxes for the user. The names of these new input controls are set to the corresponding direct search parameter that they control.

For example, to allow the user to enable or disable use of translations, you add

<INPUT type="checkbox" name="povs" value="1" CHECKED>

(Remember to make sure povs is listed in the px parameter value, defined elsewhere in the form.) When displayed in a browser, the form includes a checkbox that controls whether translations are added to the query. The CHECKED attribute checks the box by default, so translations are active until the user specifically turns them off by clearing the checkbox. The other on/off-style parameters, such as psynn, pifl and penab, can also be controlled in this manner.

Other parameters, such as plangin, plangout, pmax1, pmax2 can be controlled using radio buttons, or a SELECT box.

When the user submits the form, these parameters are sent to the service just as if they were hidden parameters statically configured by the administrator.

The configuration changes made by the user are not maintained from one search to the next. With each search, the configuration is reset to the default values specified in the form. To have the changes remembered from one search to the next, you can use server-side script integration.

With Server-side script

Adding user-configurable features for direct search via server-side scripting is more involved than editing the basic web form, although it offers more flexibility. Here’s an overview of the process:

1     Add new INPUT elements to the search form for each item the user can configure.

2     In your server-side script that handles the form, ensure these parameters are included in the URL you construct to invoke direct search.

3     If you wish to show the settings again in the results page, write out a new set of INPUT elements on the results page, with default values taken from the submitted search form. Typically, the search query is remembered from one search to the next, and remembering the user settings will follow the same pattern.
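The three steps above might be sketched in Python as follows for the on/off parameters. The sketch assumes that checkboxes use the value "1" and that "0" disables a feature, as in the form examples in this document; the function names are illustrative, and form stands for the submitted form values as a dictionary.

```python
SWITCHES = ("povs", "psynn", "pifl", "penab")


def user_switches(form: dict[str, str]) -> dict[str, str]:
    """Read the on/off settings from a submitted search form.

    Browsers omit unchecked checkboxes from the submission, so a missing
    switch is sent to the service explicitly as "0".
    """
    return {name: "1" if form.get(name) == "1" else "0" for name in SWITCHES}


def checkbox_html(name: str, settings: dict[str, str]) -> str:
    """Write the checkbox for the results page, preserving the user's choice."""
    checked = " CHECKED" if settings.get(name) == "1" else ""
    return f'<INPUT type="checkbox" name="{name}" value="1"{checked}>'


settings = user_switches({"q": "bil", "povs": "1", "penab": "1"})
povs_box = checkbox_html("povs", settings)
psynn_box = checkbox_html("psynn", settings)
```

The returned settings would be added to the parameters in the invocation URL, with their names listed in px. Apply user_switches only when a search form has actually been submitted; on a first visit, use your default values instead.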


Association Search

Overview

The association search application retrieves words related to the search query and presents these to the user, allowing them to select which additional words are included in the search.

Association search is implemented like this:

1     After the user has submitted a search query, association search suggests additional words, which are presented along with the search results.

2     If the user felt that the search failed to find what they were looking for, they can click on one of the words suggested by association search to add it to the query.

3     The word is added to the query and new search results including the new word are shown.

Selection of a word or group of words requires a single click, and with each click, a new set of search results is generated. The suggested words can be added one by one, or in groups. For more precise control, a two-stage approach is used where all desired words are first selected, and then a submit button clicked to submit the new query to the search engine.

The words suggested can be displayed with additional information, such as the language and the type of alternative (synonym, spelling variation, translation, etc.). Furthermore, this information can be used to group and sort the words according to configured criteria.

Finally, to fine-tune the user interface of association search to match the style of your site, a number of customizations are available, ranging from choosing one of a number of ready-made stylesheets, through to complete control over the user interface via custom XSL transforms.

Association search requires that a search query can be submitted using the GET method (see Search Submit Method). If this is not possible, you cannot implement association search. Instead, consider implementing direct search using server-side scripts.

Integration Overview

To integrate Association Search, you make changes to the server-side script (or program) that handles the search form and produces the results page the user sees after submitting a search query. Here's an overview of the changes required:

§     Add a LINK tag in the HEAD of the HTML to link a CSS file for the association search user interface.  (Optional, though recommended.)

§     Add a dynamic server-side include that places the Association Search suggestions in the results page. The URL of the include is dynamically constructed from the submitted search form parameters.

These steps are described in more detail below. If you want a quick start, add the following items to your search results page, replacing server.com with the name of the server where TemaSearch is installed:

In the head of the search results page, add

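<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/base.css">
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/color.css">
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/links1.css">
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/css3.css">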

In the body, add code that implements this pseudo-code

service = "http://server.com/ts/svc/querytemafrag.html?"
tsparams = <configuration parameters for service>
searchparams = <params and values passed from the search form>
url = service + tsparams + "&" + searchparams
include-content-from(url)

Or you can modify this template URL and

TODO: sample association search URL

The pseudo-code above constructs a URL that combines static configuration with runtime information (such as the search query), and then includes the content returned from that URL in the search results page.

The parameters used here are just to get you started, and some may require changes to work with your search engine. (See the tips below.) At the very least, you should be able to see the association search box on the results page.

Hints and Tips 

§     The parameters that you will most likely need to change to get association search working properly are: pq, pqsyntx, and pcharset.

§     If your search engine does not support Boolean search, set the pqbld parameter to “r”, which activates word-replace mode.

§     Once you have the service operational, there are a number of parameters that fine-tune the service. The parameters above are given with general default values, which you can adjust to suit your own needs. All parameters are listed in the reference section.

§     You may find it helpful to wrap the included content in a container element (such as a DIV or TABLE) for easier positioning and styling.

Once you have the basic service working, you may wish to refer to the sections below for a more in-depth look at integration.

Linking CSS Files

The HTML returned by the Association Search has very little markup for adding color, layout and other presentation details. To style and layout the suggested words to fit with the style of your site, you can include a link on the results page to one or more CSS files. You can use one of the CSS files provided with the service or you are free to author your own. The table below describes the ready-made CSS files, all located under the /css folder under the TemaSearch service URL.

When using the ready-made CSS files, you typically include base.css, followed by color.css and links1.css, followed by one of the cssX.css files.

File          Description

base.css      Layout and CSS fixes common to all layouts.
color.css     Default color and styling for the results box.
links1.css    Color and styling for the groups and links in the results.
css1.css      Very compact layout. All headings and text appear inline.
css2.css      Compact, with some line breaks for easier reading.
css3.css      Sparse layout. Each heading is on a new line. Alternatives for a word are arranged on one line.
css4.css      Vertical layout. Each heading and suggested word is on a separate line.

Placement of Suggested Words

The suggested words produced by association search are added to the search results page via a server side include. You may want to place the include so that the word suggestions appear in a place that fits with the existing content on the results page and with typical user search patterns. Here are some suggested placements:

§     Above or alongside the search results: This is useful when users will quickly scan the first few results and decide if they found what they wanted. If not, the alternative words are in the same field of vision.

§     Below the search results: This is useful when users often read through the entire results, as may be the case in a large site containing diverse information.

§     Below the repeat search query text box: By being close to the search entry box, users are reminded they can click on suggested words rather than having to type in new words manually.

When you have decided where the association search words should appear, locate the corresponding place in the script that produces the HTML for the page. You then script the server side include at that point so that the association search results are textually included at the right place.

Now that you have decided where on the page the included content will be, you need to construct the URL for the content to include. The URL used to retrieve the included content comprises the association search service URL and a number of parameters:

http://<temasearch-server-domain>/<service-path>?<search-engine-params>&<temasearch-params>

§     Parameters for your search engine. Search engine parameters include the search query parameter, and may also include other configuration parameters that are expected by your search engine and passed in from the search form. These parameters should be forwarded to association search so that it can include them when starting a new search. (The URLs that invoke a new search will then include the original search parameters, keeping successive searches consistent.)

§     Association Search parameters: these provide details needed by the association search service and configuration values to customize the results. These parameters are all listed within the proxy parameter list (px) to denote that these are service parameters and not regular search engine parameters. These parameters are consumed by the service and not subsequently passed back to your search engine when the user selects additional words.

In pseudo-code, you might construct the include URL like this

searchEngineParams = ...; // extract params from the current URL, e.g. everything after the '?' in the URL being served
temaSearchParams = "px=px pu pq&pq=<queryParamName>&pu=<search-submit-uri>"
temaSearchLocation = "http://temasearch-server.com/ts/svc/querytemafrag.html"
temaSearchURL = encodeURL(temaSearchLocation + "?" + searchEngineParams + "&" + temaSearchParams);

Once the URL is built, you then script the inclusion of content returned from this URL.

The pu parameter is set to the URL of the search query handler (i.e. the current page), as this is the target URL for new requests to the search engine. It is also used by association search to authenticate requests to the service.
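The pseudo-code above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name build_include_url is hypothetical, the server name is the placeholder used above, and the request is assumed to be UTF-8 encoded. Unlike the pseudo-code, the sketch escapes each parameter name and value individually rather than encoding the finished URL as a whole.

```python
from urllib.parse import parse_qsl, urlencode

# Placeholder location of the fragment service; replace with your own server.
TEMASEARCH_LOCATION = "http://temasearch-server.com/ts/svc/querytemafrag.html"


def build_include_url(raw_query_string, query_param_name, search_submit_uri):
    """Build the association search include URL for the current request."""
    # Forward every search engine parameter as it was submitted.
    search_engine_params = parse_qsl(raw_query_string, keep_blank_values=True)

    # TemaSearch service parameters; px names every service parameter present.
    temasearch_params = [
        ("px", "px pu pq"),
        ("pq", query_param_name),
        ("pu", search_submit_uri),
    ]

    # urlencode escapes each name and value separately, so the '?', '&'
    # and '=' separators of the final URL stay intact.
    query = urlencode(search_engine_params + temasearch_params)
    return TEMASEARCH_LOCATION + "?" + query
```

For a results page served as /search?q=bil, you would call build_include_url("q=bil", "q", "http://example.com/search") and include the content returned from the resulting URL.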

Retrieving the Original Query

By default, as association search adds words to the query, the query shown to the user is the full query containing all the word alternatives added.

If you wish to show the original query, you can extract the original query from the history parameter (ptshist). This parameter is passed in to your search engine by association search. The parameter has this format:

Version ":" SequenceNum "/" Keywords list "/" Original query "/" Query language "/"

Extracting the original query is done by extracting the substring between the second and third forward slashes.

The full format of the history parameter is given in the reference section.
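As a minimal sketch, the extraction described above might look like this in Python. It assumes the ptshist value has already been URL-decoded by your web framework and that the individual fields contain no unescaped forward slashes; the function name and the sample value are made up for illustration.

```python
def original_query_from_history(ptshist):
    """Return the original query field of a ptshist value, or None."""
    # Version ":" SequenceNum "/" keywords "/" original query "/" language "/"
    fields = ptshist.split("/")
    if len(fields) < 4:
        return None  # fewer than three slashes: not in the expected format
    # The text between the second and third slash is the original query.
    return fields[2]


# Hypothetical history value, for illustration only.
print(original_query_from_history("1:2/bil auto/bil/no/"))  # bil
```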

Word Replace Mode

Association search works best when the search engine supports Boolean search. (See XXX.) However, association search can also be used with engines that do not support Boolean search. Rather than adding words to the query (which requires the Boolean OR function) association search can be configured to replace the original word in the query with the suggested word selected by the user. The main benefit is that replacement does not require any additional syntax or significant change to the query, which makes it suitable for all search engines, particularly those without Boolean search, and engines that search for one word at a time.

To activate this feature, set the pqbld parameter to “r”.

Query Selection Types

Association search provides two user-interface types for selecting words to add to the query, immediate selection and delayed selection.

Immediate selection, also called one-step selection: words and queries are represented as hyperlinks. With this representation a word is selected and added to the search immediately once the link is clicked.

Delayed selection, also called two-step selection: words are selectable items (e.g. checkboxes, radio buttons or single/multi-select list elements) where selection of a word does not automatically submit the search. The user must use a button or other control to start a search with the words selected.

The query selection type is given by the psm parameter. See the parameter reference for more details.

End User Configuration

With the integration steps described above, the configuration of association search is statically wired into the request used to invoke the service. The configuration parameters can also be produced dynamically, for example from the search form.

Items the user can configure are provided by adding INPUT elements to the search form. The parameters submitted by these INPUT elements are included in the service URL in the same way as the search query parameter and other search engine parameters.

§     Add new INPUT elements to the search form for each item the user can configure.

§     In your server-side script that handles the form, ensure these parameters are included in the URL you construct to invoke association search. Additionally, add the pp parameter with a value listing all TemaSearch parameters that should be output. The service configuration parameters will then be included in URLs to the search engine, which ensures that new searches started by association search use the same configuration parameters as the original search.

§     If you wish to show the user-selected settings again in the results page, write out a new set of INPUT elements on the results page, with default values taken from the parameters passed to the script. Typically, the search query is preserved from one search to the next, and preserving the user settings will follow the same pattern.

Advanced Integration Strategies

Alternative Services

In the service paths given so far, querytemafrag.html has been used to produce an HTML fragment for inclusion directly in your search results pages. For building custom applications, association search provides a number of other service end-points, selected using a path.

The XML service points authenticate using the parameters: prealm, puser and ppwd. The HTML service points use the URL of the target search engine as authentication.

querytemapres0.xml

Returns unprocessed XML data from the association search service. No XSLT transforms are applied, so this is the service point with the best performance.

querytemapres.xml

Uses the data from querytemapres0.xml and adds several levels of transformation. The level used is selected by the plevel parameter. The output is always valid XML regardless of the transformation used. If level 5 or above is used, an HTML interface is returned as XHTML. The functions of the various levels are:

§     Level 0: no transformation – produces association search XML without a namespace

§     Level 1: Adds namespace declarations to the raw data.

§     Level 2: Moves qalt nodes to the same level as the corresponding alt node. This places the word alternatives and the corresponding search query at the same level.

§     Level 3: Converts alternative word data into an abstract user interface description. The description includes messages denoted by message ID.

§     Level 4: Resolves message IDs by looking up the associated message for the current request locale.

§     Level 5: Generates an XHTML page from the abstract user interface description.

§     Level 6: Cleans up the XHTML output by removing any unnecessary namespaces, which can confuse some browsers.

§     Level -1: Apply all transformations. This is for future compatibility should additional transformations be added.

This service interface can be used to create custom applications, such as customizing the default user interface. The steps to create a custom interface are:

§     Retrieve the XML at the level that best suits your needs. Minor changes may use output from levels 5/6. More substantial changes will start with XML from lower levels.

§     Transform the XML, for example using XSLT. The results will be in whatever format you are using to describe the user interface. This is typically XHTML, although you can use any format, e.g. WML, VML, SVG etc.

§     Include the transformed content where it is needed (e.g. the search results page).

If the data is being consumed by other programs, for example when building TemaSearch into the indexer of a search engine, levels 0-3 are typically most useful.

querytemapres.html

Produces the whole-page HTML user-interface for association search. This is an HTML serialization of the full XML (level -1) produced by querytemapres.xml. The HTML is produced as a self-standing page, and is suitable for including via embedded IFRAME or similar.

querytemafrag.html

Produces the association search user-interface as an HTML fragment for including directly in web pages. Essentially an HTML serialization of the level -1 XML produced by querytemapres.xml, with the HTML document structure elements removed.

Scalability and Reliability

When availability and performance are critical, you may want to consider these strategies as an extension to the basic server-side include.

§     Error handling: trapping/catching any errors from the server-side include so that searching is uninterrupted should the service be unavailable or fail for some reason.

§     Timeouts: adding a timeout to the URL include so that search results are not held up if the service is heavily loaded.

§     Load Management: if a certain number of errors or timeouts occur, temporarily stop sending requests to the service. Run checks on a background thread and enable the service when it becomes available/more responsive.
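The three strategies above can be combined in a small wrapper around the include. The following Python sketch is one possible shape, not a prescribed implementation: the class name, the default timeout and the thresholds are all assumptions, and instead of probing the service on a background thread it simply waits for a fixed period before trying again.

```python
import time
import urllib.request


class SuggestionFetcher:
    """Fetch association search content with a timeout and a simple cut-off."""

    def __init__(self, timeout=0.5, max_failures=3, retry_after=30.0,
                 opener=urllib.request.urlopen, clock=time.monotonic):
        self.timeout = timeout            # seconds to wait for the service
        self.max_failures = max_failures  # consecutive failures before pausing
        self.retry_after = retry_after    # seconds to pause after too many failures
        self._open = opener
        self._clock = clock
        self._failures = 0
        self._disabled_until = float("-inf")

    def fetch(self, url):
        """Return the included HTML, or "" if the service is unavailable."""
        if self._clock() < self._disabled_until:
            return ""  # load management: requests are temporarily switched off
        try:
            with self._open(url, timeout=self.timeout) as response:
                body = response.read().decode("utf-8")
        except (OSError, ValueError):
            # URLError and socket timeouts are OSError subclasses;
            # a malformed URL or undecodable body raises ValueError.
            self._failures += 1
            if self._failures >= self.max_failures:
                self._disabled_until = self._clock() + self.retry_after
                self._failures = 0
            return ""
        self._failures = 0
        return body
```

Returning an empty string on failure means the results page is rendered without suggestions rather than failing, which keeps searching uninterrupted.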

Customizing the User-Interface

The user-interface is constructed using HTML+CSS, produced via a series of XSL transformations. This allows the user interface to be changed in the following ways:

§     Make minor changes to the existing color scheme and layout: use an override CSS file to override selectors for specific features and styles in the standard CSS files

§     More radical color and layout changes: create a new custom CSS file.

§     Customize the interface text messages: create new text files.

§     Redefine the structure and operation of the user interface: using custom XSLT to produce the markup for the interface from an intermediate representation.

§     Entirely define user interface by using XSLT to transform the raw data from the association search service.

Working with CSS, you can change many presentation details, but the structure of the UI remains much the same. Using XSLT, the UI can be completely re-structured and re-designed, or targeted to a different presentation format, such as WML for mobile devices, SWF for a highly interactive user interface, or DHTML+JavaScript for a richer HTML interface.

How to show no message when there are no results for the given word

By default, when there are no suggested words for a query, the association search results include a message like "No alternatives for <query>".

You may prefer not to show any message if no suggestions are available. This is done by defining the style

.tsresultsnone { display:none; }

With this style defined, when there are no results, no message is displayed.

See CSS Styles for more details.

Customizing Interface Messages

[Internal Service Only]

The messages for the user interface can be extended in the following ways:

§     Translations for new languages can be added. For example, if your site is multilingual, you may wish the association search interface to be available in all the languages supported by your site.

§     Altering existing messages

New files containing custom text messages are placed in /WEB-INF/deploy/i18n. (This directory may need to be created.) The new resources are named temasearch_<locale>.properties.

The existing messages are found in the /WEB-INF/lib/parasearch-core-impl-<version>.jar file, named messages_<locale>.properties. These should be used as a template for creating new languages or customizing existing messages.

Add support for a new language by:

1     Copying an existing message file into the message resources directory, renaming it to temasearch_<newlocale>.properties.

2     Translating all the message strings in the new file to the new language.

For example, to translate the English text to German, you would:

1     Copy messages_en.properties from inside parasearch-core-impl.jar to /WEB-INF/deploy/i18n/temasearch_de.properties.

2     Translate all the text in temasearch_de.properties, taking care to keep all of the message names intact (the message name is everything before the first '=' on each line).

You can customize an existing message as follows:

1     For the locale you want to customize the message in, create the /WEB-INF/deploy/i18n/temasearch_<locale>.properties file if it doesn't already exist.

2     Locate the message name in the existing resources and copy it to the temasearch_<locale>.properties file from the previous step. Add an '=' after the message name, followed by your new text for the message.

3     If you wish to customize the message for multiple locales, repeat these steps for each locale.
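The copy step in the German example above can be scripted. The sketch below is one way to do it in Python, relying on the fact that a jar file is a zip archive. The function name is hypothetical, and because the location of the message files inside the jar is not given here, the sketch matches on the file name alone.

```python
import zipfile
from pathlib import Path


def copy_message_template(jar_path, source_locale, target_locale, deploy_dir):
    """Copy messages_<source>.properties out of the jar as a new template."""
    wanted = f"messages_{source_locale}.properties"
    target_dir = Path(deploy_dir)
    target_dir.mkdir(parents=True, exist_ok=True)  # the directory may not exist yet
    target = target_dir / f"temasearch_{target_locale}.properties"

    with zipfile.ZipFile(jar_path) as jar:
        # Match on the file name only, whatever folder it sits in.
        matches = [n for n in jar.namelist() if n.split("/")[-1] == wanted]
        if not matches:
            raise FileNotFoundError(f"{wanted} not found in {jar_path}")
        target.write_bytes(jar.read(matches[0]))
    return target
```

After running it for "en" and "de", the new temasearch_de.properties file still contains the English text and is ready to be translated.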

Testing and Troubleshooting

In most cases, it is immediately obvious if the service is not set up correctly, as it either does not function or returns an error message. There are, however, a few problems that can easily go unnoticed. The following sections describe what these problems are and how to identify and fix them.

Character encoding

A character encoding mismatch can result in some characters being displayed incorrectly. Perform these tests to check character encoding is correctly set:

§     Type in queries that contain characters not part of the normal ASCII character set, e.g. accented letters, Scandinavian vowels etc. Verify these are recognised correctly by the search engine and that subsequent searches from association search also function correctly.

§     Type in a word containing only ASCII characters that produces results containing non-ASCII characters. Again, verify these are recognised correctly by the search engine and that subsequent searches from association search also function correctly.

§     See the pcharset parameter for details about the correct setting for character encoding.
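To see what a mismatch looks like in practice, the short Python illustration below encodes the same query under two common character sets and then decodes it with the wrong one. It only demonstrates the symptom; the values that pcharset itself accepts are described in the parameter reference, not here.

```python
from urllib.parse import quote, unquote

query = "blåbær"  # a query containing Scandinavian vowels

as_utf8 = quote(query, encoding="utf-8")
as_latin1 = quote(query, encoding="iso-8859-1")
print(as_utf8)    # bl%C3%A5b%C3%A6r
print(as_latin1)  # bl%E5b%E6r

# Reading UTF-8 bytes as ISO-8859-1 is the classic mismatch symptom:
# each non-ASCII letter turns into two unrelated characters.
print(unquote(as_utf8, encoding="iso-8859-1"))  # blÃ¥bÃ¦r
```

If suggested words or follow-up queries show up with this kind of doubled character, the character set used by the page and the one configured for the service probably differ.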

The History Parameter

When generating URLs for new search requests, the service adds a history parameter to maintain state between requests, such as the initial query and input language. This parameter is passed to the search results page script, and should be included in the invocation URL of the association search service. Of course, if you have arranged so that all request parameters are passed to the service, then this will happen automatically. In other situations, you should pass on the parameter explicitly.

The history parameter is treated as a service parameter, and should be added to the service parameter list (px parameter) as usual.

To check the history parameter is being passed through your search handler, type in a query to produce some suggested words. Inspect the URLs linked to the suggested words: they will include a ptshist parameter, with a value starting with 1,1. The first number is a version identifier, and can be ignored for now. The second number is the invocation count. As additional suggested words are added to a query, you should see later URLs include an ever-increasing invocation count.

If this is not happening, check that

§     The invocation URL that you construct includes the ptshist parameter. This parameter will be passed in to the search engine when the service performs a new search, such as when the user clicks on words suggested by the service.

§     The parameter ptshiston is not set to 0. Doing so will turn off use of the history parameter.

If the search history parameter is not passed through, as the user selects words from the list of suggestions, the next list of suggestions can actually grow, as it now includes suggestions for the words most recently added to the search. In the worst case, when using language auto-detection (the default, see plangin) the detected language can change. This will often result in a completely new set of word alternatives being produced, and the resulting inconsistency can be confusing to the user.


Combination Search

Overview

Combination search uses both direct search and association search together for the same search page. For example, you might configure direct search to automatically add words to the query that have the same meaning, such as near synonyms, spelling variations and inflection variations, while using association search to allow the user to select from words that may be more distantly related, such as general synonyms and translations.

Implementation

Combination Search is implemented by applying direct search and association search to the same search results. It is implemented as changes to server side scripts for the search results page. (The webform-style integration for direct search should not be used when implementing combination search.)

Most of the implementation details for combination search are exactly the same as for implementing direct search and association search. The main difference is that some handling of the history parameter is needed; the history parameter is passed from direct search to association search.

When handling an original query from the user, invoke direct search on the query to rewrite it before using it to run a search. You can tell that a query has come directly from the user when no ptshist parameter is passed to the page. (Re-queries from the user clicking additional words always include a ptshist parameter.) Direct search is invoked on an original query like this:

1     Determine what types of words you want added automatically to the query, and use this to configure direct search. The configuration will indicate what types of words to include, input/output language etc. You need to include changes to the default configuration, as the default includes all word types, and thus would add all words automatically, leaving none available for the user to select from, which defeats the aim of combination search.

2     Based on your determined configuration, build a URL to invoke direct search. This is done exactly as for regular direct search, but you also include the ptshiston parameter, set to the value “1”. This instructs direct search to generate a ptshist parameter, which is needed by the later call to association search.

3     Extract the new query from the result and use this to invoke your search engine and produce the search results page. Also extract the ptshist parameter from the result and save for use when invoking association search later.

At some point when producing the results page, invoke association search (following much the same pattern as for regular association search):

1     Determine the configuration of association search. This will typically include less restrictive settings compared to those used by direct search, such as more types of alternatives enabled, or more words allowed.

2     Build a URL to invoke association search. Be sure to include the ptshist history parameter. If direct search was invoked, the ptshist parameter value can be taken from the direct search results. Otherwise, the parameter will have been passed in to the script as a page parameter, and the value can be taken from that.

3     Fetch the content from the association search URL and include the content in the page.
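The control flow of the two step lists above might be sketched like this in Python. Several things here are assumptions made for illustration: run_direct_search is a hypothetical callable that invokes direct search with ptshiston set to “1” and returns the rewritten query together with the ptshist value; the endpoint URL is a placeholder; and the sketch passes the rewritten query on to association search, which this section does not spell out.

```python
from urllib.parse import urlencode

# Placeholder endpoint; replace with the server where TemaSearch is installed.
ASSOCIATION_URL = "http://server.com/ts/svc/querytemafrag.html"


def plan_combination_search(page_params, query_param, submit_uri, run_direct_search):
    """Return (query to search with, association search include URL)."""
    query = page_params[query_param]
    ptshist = page_params.get("ptshist")

    if ptshist is None:
        # No history parameter: an original query from the user, so rewrite
        # it with direct search and take the history value from the result.
        query, ptshist = run_direct_search(query)
    # Otherwise this is a re-query from association search: use it as it is.

    params = dict(page_params)
    params[query_param] = query
    params.update({
        "px": "px pu pq ptshist",  # the history parameter is a service parameter
        "pu": submit_uri,
        "pq": query_param,
        "ptshist": ptshist,
    })
    return query, ASSOCIATION_URL + "?" + urlencode(params)
```

The returned query is the one to send to your search engine, and the returned URL is the one to fetch and include in the results page.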

Because combination search is essentially a marriage of direct and association search, all configuration details available for direct and association search (e.g. end user configuration, customizing the user interface) are also available when implementing combination search.


Custom Applications

Both direct search and association search applications provide a lower-level service interface which can be used to integrate TemaSearch into your own applications. Which service you choose is governed by how much detail you require:

§     Direct Search provides a search query containing the alternatives, or a simple list of word alternatives for a given word.

§     Association Search provides varying levels of categorization of the alternatives, and has facilities to support construction of a user-interface.

Some examples:

§     Adding word alternatives to search index (basic): the Direct Search service is used to retrieve alternatives for the words in the index. These are added to the index as aliases of the original word.

§     Adding word alternatives to search index (advanced): Using the Association Search service, alternative words can be added to the index as above, and included is information describing the type of alternative. This might allow per-language or otherwise more refined selection of word alternatives from the index.

§     Database Search: a SQL query processor that alters the SQL query (e.g. the WHERE clause) to include additional words.
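As an illustration of the database search idea, the Python sketch below widens a WHERE clause so that it matches any word in a list of alternatives. The table, the column and the word list are invented for the example; in a real integration the alternatives would come from the Direct Search service rather than being hard-coded.

```python
import sqlite3


def where_clause_with_alternatives(column, words):
    """Return (sql_fragment, params) matching any of the given words."""
    # One placeholder per word keeps the values out of the SQL text.
    # The column name is inserted as-is, so it must come from trusted code.
    conditions = " OR ".join(f"{column} LIKE ?" for _ in words)
    return f"({conditions})", [f"%{word}%" for word in words]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [("en rød bil",), ("et gammelt kjøretøy",), ("en grønn sykkel",)],
)

# 'kjøretøy' stands in for an alternative returned for the word 'bil'.
clause, params = where_clause_with_alternatives("body", ["bil", "kjøretøy"])
rows = conn.execute(f"SELECT id FROM docs WHERE {clause} ORDER BY id", params).fetchall()
print(rows)  # [(1,), (2,)]
```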

Reference

 

Configuration of Word Alternatives

TemaSearch provides these types of word alternatives for both association and direct search:

 

Type of word alternative                Configuration parameter

Base forms                              pbase
Inflected forms                         plemifl
Alternative inflection endings          pifl
Alternative spellings, near synonyms    psynn
General synonyms                        psyng
Translations                            povs

 

Further control over the alternatives is available with these parameters:

Stem alternative words                            pstem
Limit number of alternatives per original word    pmax2
Limit maximum words per query                     pmax1

 

 

Service Parameters

TemaSearch parameter list                px
Search engine/authentication URL         pu
Service authentication                   prealm, puser, ppwd
Error handling control                   pnoerr
Enable/Disable service                   penab
Licence control                          paltlic
Input/Output character set               pcharset
Select profile for parameter defaults    pprof
TemaSearch output parameter list         pp

Query Parameters

Search engine query parameter name    pq
Query syntax                          pqsntx

Language Parameters

Define query language, or enable auto-detection    plangin
Restrict language of words in the result query     plangout
Set preferred language for auto-detection          plangpref

Association Search Parameters

Sort order for suggested alternatives         psort
Grouping levels for suggested alternatives    pgroup
Type of new search queries                    pqbld
Select words in same group level              pqgrpidx

 

 

User Interface Parameters

One or two step selection mode         psm
Presentation language                  plocale
Show truncated input query             pqtrunc
XSLT transformation pipeline length    plevel

 

 

Supported Query Syntaxes

The table below lists the query syntaxes supported. These syntaxes are used by the service to determine the meaning of the query and for adding new elements to the query using the correct syntax.

Not all elements of your search engine syntax need to be supported by the service. Select the syntax that most closely matches that of your search engine. If your chosen application requires Boolean search syntax, the syntax of the OR operator should match that expected by your search engine. Support for other operators is optional, as any unknown query syntax elements are ignored and left unchanged by the service.

Resolving Parameter Name Conflicts

The HTTP interface to TemaSearch services expects configuration parameters to be passed as query parameters in the URL or as POST parameters. In many cases, these parameters are specified alongside parameters for your search engine. Although TemaSearch parameter names have been chosen to be fairly unusual so as not to clash with search engine parameter names, it is still possible that a TemaSearch parameter has the same name as a search engine parameter.

If this happens, you can rename the TemaSearch parameter to avoid having two parameters with the same name. Renaming is done by adding an underscore '_' at the front of the parameter name.

For example, if the parameter puser was already used by your search engine, then the TemaSearch parameter should be renamed to _puser. The px parameter lists all the TemaSearch parameters in the request, and includes each parameter using the name as it appears in the form. For example, after renaming puser to _puser, the px parameter would then be defined as “px=px … _puser”. (The ellipsis stands for other parameters not shown here.)
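The renaming rule can be applied mechanically when the service URL is built. The Python sketch below is a hypothetical helper that prefixes any clashing TemaSearch parameter with an underscore and builds a matching px value; it assumes that px itself does not clash with a search engine parameter.

```python
def resolve_conflicts(temasearch_params, search_engine_param_names):
    """Prefix clashing TemaSearch parameter names with '_' and build px."""
    renamed = {}
    for name, value in temasearch_params.items():
        # Rename only the parameters that clash with a search engine name.
        key = "_" + name if name in search_engine_param_names else name
        renamed[key] = value
    # px lists every TemaSearch parameter under the name actually used.
    px = " ".join(["px"] + list(renamed))
    return {"px": px, **renamed}


params = resolve_conflicts(
    {"pu": "http://example.com/search", "pq": "q", "puser": "demo"},
    {"q", "puser"},  # names already used by the search engine
)
print(params["px"])  # px pu pq _puser
```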

 

Service Parameters

Common Parameters

pnoerr – instructs the service to degrade gracefully when there are errors. Ordinarily, if an error occurs (such as an authentication failure or an incorrect parameter), the server returns a 500 error code and an error message. Setting pnoerr to “1” causes the service to degrade gracefully by returning the original query unmodified.

This is principally intended for use with direct search webform integration, as the search would not proceed if the service failed to send back the HTTP redirect. For server-side integration, errors can be caught and dealt with, so this parameter is less useful in that case.

ptshiston – enables/disables the history parameter. When set to 1, a history parameter is added to the result. When set to “0” no history parameter is produced. Defaults to “0” for direct search, and “1” for association search. You typically only change this parameter when implementing combination search (see XXX.).

The history parameter (ptshist) is made up of the following fields:

Version: The version identifier for the remainder of the history parameter. The current version is version 1.

SequenceNum: Indicates how many times association search has been used to add to the query.

Keywords list: the original keywords extracted from the query. This is the data used by association search when rebuilding a query.

Original query: the original query text in full.

Query language: the language used to generate alternatives for the query.

 

plangin – a list of languages that the query may be written in. If not set, the language is detected automatically.

plangpref – if automatic language identification is active (the plangin parameter is not defined), preference is given to the languages listed here. The first language in the list gets the highest preference, followed by the second, and so on.

pqt – the query text. This is used to specify the query text in direct service calls, when the service returns the result directly, rather than as a rewritten search URL.

pprof – selects a ready-made profile. This can avoid specifying many different parameters.

padjlic – controls what happens when features are requested that are not valid for the current licence. The default is “1”, which simply deactivates the features that are not licenced, without producing any error. When set to “0”, use of an unlicenced feature causes the call to the service to fail with an error message.

pp – propagate parameters. By default, the service parameters listed in the px parameter are consumed by the service and are not passed through to the results. If TemaSearch service parameters are also used externally (by your search engine for presenting configuration controls, for logging, or for some other purpose), they should also be included in the URLs generated by the service.

For services that return a direct result, such as direct search /rewrite, authentication is performed by inspecting the account credentials provided in the parameters prealm, puser and ppwd. These details are used to authenticate use of the service and to determine which features are licensed.

psort – for association search, this parameter specifies the sorting order of the alternative words. Words can be sorted at multiple levels, so the parameter is a comma-separated list of sort keys. For example, “l,w,t” sorts by the language of the alternative word, then by the index of the original word, and then by the word text itself.

The default value is “”, which indicates the words are not sorted, and will appear in no particular order. The order is not guaranteed to be the same each time the service is called, so it is recommended you apply a sort order.

The available sort keys are given in the table below.

t

Orders alternative words alphabetically.

l

Orders words by their language. The language of a word is <ISO-language-code>_<ISO-country-code>_<variant>. For example, Nynorsk is no_NO_NY. See http://java.sun.com/j2se/1.4.2/docs/api/java/util/Locale.html for more details.

This sorting is most useful when accessing the data programmatically. For user interfaces, the l2 sorting order is typically more useful.

s

Sorts alternative words by their score, placing words with higher (better) score first. The score reflects how close a word alternative is to the original word in the query. Please note that the score feature is only partially implemented and available only for certain types of word alternatives.

w

Sorts alternative words by the corresponding original word they relate to. Using this sort key, all alternatives for the first original word appear before alternatives for the second original word and so on.

l2

Sorts alternative words by how close their language is to the language used in the original query. Alternative words that are in the same language appear first followed by those that are in a closely related language. More distant languages come later.

mc

Sorts alternative words by their category. Categories are sorted in this order:

§     Same-language alternatives

§     Translations

At present, this performs essentially the same function as the l2 sort order. However, future versions will include additional categories, such as word correction.

mt

Sorts alternative words by what type of alternative they are. The types are sorted in the following order, which places closely related types before more distant ones:

§     Base forms (pbase)

§     Inflections of a word (plemifl)

§     Alternative inflection spellings (pifl)

§     Near synonyms (psynn)

§     General synonyms (psyng)

§     Translations (povs)

Putting an exclamation mark in front of a sort key reverses its order. For example, “l,!t” sorts words by language and then in reverse alphabetical order.
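A multi-level sort with reversible keys can be sketched as follows. This is not the service's implementation: the word records and their field names (text, lang, word_index, score) are hypothetical, and only the t, l, w and s keys are modelled, because l2, mc and mt depend on data internal to the service.

```python
# Hypothetical mapping from sort key to a function extracting the sort value.
SORT_KEYS = {
    "t": lambda w: w["text"],        # alphabetical
    "l": lambda w: w["lang"],        # language code
    "w": lambda w: w["word_index"],  # index of the original word
    "s": lambda w: -w["score"],      # higher (better) score first
}


def sort_words(words, psort):
    """Sort word records by a comma-separated psort value such as 'l,!t'."""
    result = list(words)
    if not psort:
        return result  # empty psort: the order is left as it is
    # Python's sort is stable, so applying the keys from last to first
    # yields a multi-level sort with the first key as the primary one.
    for key in reversed(psort.split(",")):
        reverse = key.startswith("!")
        result.sort(key=SORT_KEYS[key.lstrip("!")], reverse=reverse)
    return result


words = [
    {"text": "biler", "lang": "no_NO_B", "word_index": 0, "score": 0.9},
    {"text": "bilar", "lang": "no_NO_NY", "word_index": 0, "score": 0.8},
    {"text": "automobil", "lang": "no_NO_B", "word_index": 0, "score": 0.5},
]
print([w["text"] for w in sort_words(words, "l,!t")])
# ['biler', 'automobil', 'bilar']
```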

pgroup – defines how alternative words are grouped. When displaying alternative words, they can be shown as a simple list of words (the order of this list is specified by the psort parameter). Alternatively, the words can be grouped under various headings. For example, specifying the grouping as “l” (language) displays a language heading for the alternatives.

Grouping works by gathering together words that have the same value for the grouping key (language, word text, etc.). Grouping and sort order are therefore related: if you are grouping on an attribute (say, language), you will also use a sort key for that attribute. (If the sort order is different from the grouping order, some headings may appear more than once, which can be confusing.)

Sorting and Grouping

The alternative words produced by association search can be sorted and grouped for easier reading and structured presentation. Sorting the words is done by comparing various attributes of each word with the others in the list. For example, sorting by word text will sort the words alphabetically.

Grouping takes all words that share the same value for the grouping attribute (word text, language, score, etc.) and places them under a common group heading.

In Detail

As you might expect, the sort order and grouping structure are closely related. For example, if you wish to group the alternatives first by language and then by original word, then the sort key should include at least sorting by language and then original word.

It is possible to have grouping and sorting not follow the same pattern, but the results may not be what you would expect.

It may seem redundant to specify sorting and grouping separately. The reasons for doing this are:

§     the number of sorting levels can be different from the number of grouping levels. The additional sorting keys can be used to control the order of groups using a non-grouped attribute, or to sort words within a group.

§     the grouping key only requires knowledge of the attribute, whereas sorting also requires knowledge of the specific sort order. Some attributes can be sorted by different criteria (for example, language has two different sort keys, l and l2).

An example. We wish to show the alternatives grouped by language and original word. We want the language to be sorted by relevance to the original query. Additionally, the list of words for each original word should be sorted alphabetically. This is done by setting:

§     psort=l2,w,t – sort by language relevance then original word number and then word text.

§     pgroup=l,w – group by language and then original word.
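The relationship between sort order and grouping can be illustrated with a short Python sketch. The word records and the group_words function are hypothetical; the sketch only assumes that grouping merges adjacent words with the same key value, which is consistent with headings repeating when the sort order does not match.

```python
from itertools import groupby

# Hypothetical mapping from grouping key to the attribute it groups on.
GROUP_KEYS = {
    "l": lambda w: w["lang"],
    "w": lambda w: w["word_index"],
}


def group_words(words, pgroup):
    """Nest already-sorted word records under one heading per group level."""
    levels = pgroup.split(",") if pgroup else []
    if not levels:
        return [w["text"] for w in words]
    key = GROUP_KEYS[levels[0]]
    rest = ",".join(levels[1:])
    # groupby merges only adjacent records with equal keys, which is why
    # the sort order has to follow the grouping order.
    return [(heading, group_words(list(members), rest))
            for heading, members in groupby(words, key=key)]


# Records already sorted by language and then by original word.
words = [
    {"text": "bil", "lang": "no_NO_B", "word_index": 0},
    {"text": "vei", "lang": "no_NO_B", "word_index": 1},
    {"text": "veg", "lang": "no_NO_NY", "word_index": 1},
]
print(group_words(words, "l,w"))
# [('no_NO_B', [(0, ['bil']), (1, ['vei'])]), ('no_NO_NY', [(1, ['veg'])])]
```

If the same records are passed in an order that is not sorted by language, the no_NO_B heading appears twice, which is the confusing result described earlier.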

Query Builder

When association search has determined the alternative keywords available, it uses these to produce a number of search queries based on those keywords. For example, when a user clicks on a word to produce a new search, the new query has been produced by a query builder. The following query types are available:

s

single word

A new query is produced for each word alternative, that comprises the original query plus a single word alternative.

This is the type of query that adds words one by one to the search.

m

multiple words

A new query is produced for a number of word alternatives taken from a group.

ms

single and multiple words

Combines single and multiple. For a group of words, one multiple word query is produced containing all alternative words in the group, and a number of single word queries are produced, one for each alternative word.

This is used to produce single selectable words, and an [all] selection for selecting groups of words in one hit.

r

replace words

A new query is produced for each word alternative. The word alternative is included in the query, but the original word is not. This effectively replaces the original word in the query.

This type is suitable for search engines that have no support for Boolean search.
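The four query types above can be sketched as follows. The sketch is an illustration under stated assumptions: the function build_queries is hypothetical, each query is represented as a plain list of words, and the search engine syntax used to join the words is deliberately left out.

```python
def build_queries(original, index, alternatives, qtype):
    """Build queries for the alternatives of one original word.

    original: list of words in the original query.
    index: position in `original` of the word the alternatives belong to.
    alternatives: list of alternative words (one group).
    qtype: 's', 'm', 'ms' or 'r'.
    """
    queries = []
    if qtype in ("m", "ms"):
        # One query containing every alternative in the group.
        queries.append(original + alternatives)
    if qtype in ("s", "ms"):
        # One query per alternative, each added to the original query.
        queries.extend(original + [alt] for alt in alternatives)
    if qtype == "r":
        # One query per alternative, replacing the original word.
        for alt in alternatives:
            queries.append(original[:index] + [alt] + original[index + 1:])
    return queries


print(build_queries(["rask", "bil"], 1, ["biler", "automobil"], "r"))
# [['rask', 'biler'], ['rask', 'automobil']]
```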

The m and ms query types produce queries containing multiple words. The set of alternative words that is put into a multiple word query is determined by the pqgrpidx parameter. By default, all word alternatives produced are placed in a single multiple word query. By setting the pqgrpidx parameter, a multiple word query can be created for each group at a given grouping level.

 

pqgrpidx – Set the group level for turning words into queries.

NB: If the parameter is set to a value greater than the number of grouping levels, no warning is given and no alternatives are produced. This is because the query builder never finds the specified group level, and so no queries are produced.

psm – Word Selection Mode

§     Delayed (two-step) – The user can click to select or deselect words. A new search is run only when the user submits the selection.

§     Immediate (single-step, default) – When the user selects a word or group of words, the selection takes effect immediately and a new search is started.

If you are using delayed selection, be sure that the association search results are not placed inside an existing FORM. The user interface for delayed (two-step) selection is itself implemented using a FORM, so including it inside an existing FORM results in nested forms. Nested forms are not allowed by the W3C HTML standards and are best avoided, as support and behaviour vary from browser to browser.

plevel – sets the pipeline level.

Controls the length of the transformation pipeline that is applied to the XML output from association search.

plocale – sets the language for the user interface.

Association search includes various messages as part of the standard user interface. The language used for these messages is specified by this parameter. The messages are available in these locales as standard:

no_NO_B

Bokmål

no_NO_NY

Nynorsk

If you are using the locally installed service, you can also add support for additional languages. (See XXX.)

If no value is specified for the parameter, the default value is determined like this:

§     If the language of the original query was given, or determined automatically, the interface language is set to that.

§     Otherwise, the language set in the host operating system is used.

pqtrunc – Sets the maximum number of characters to use for the query in the user interface. When the user enters a long query, this can cause the association search title box to become considerably larger than the rest of the interface. As the query often appears elsewhere on the page, the query can be truncated to save space.

The default value is 30 characters.

 

Mandatory parameters are parameters that do not have a default value. Such parameters must be defined, or the request will fail with an error message indicating the missing parameter.

'paltlic' parameter

Summary

Alters options to comply with licenced features.

Details

If an option is selected for a feature that is not licenced, normally an error is produced. (The result of the error depends upon the pnoerr parameter, but at the very least, no alternatives will be added to the query.) Setting this parameter allows the search to continue using those features that are licenced, ignoring the request for unlicenced features.

In a production environment, you would typically set this value to true so that attempted use of unlicenced features does not stop TemaSearch from being used for those features that are licenced. In development, setting this value to false can help uncover that some features are not working because they are unlicenced.

Values

false

Don't adjust features to comply with the licence. If an unlicenced feature is used, the query is not rewritten at all and an error is returned.

true

Adjust features to comply with the licence. If an unlicenced feature is requested, the request is ignored and the search continues using those features that are licenced.

Default

true

'pbase' parameter

Summary

Adds the baseform of an inflected form.

Details

When the query includes an inflected word, the non-inflected form of that word is added to the query.

If stemming is active, this parameter has no effect.

Values

1

Baseforms of inflected words are added.

0

Baseforms of inflected words are not added.

Default

0

'pcharset' parameter

Summary

Indicates the character set expected by the search engine.

Details

This is usually the same as the character set defined by the META tag. For example:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

This defines the character set to be "utf-8", so the pcharset should be set to "UTF-8".

Values

ISO-8859-1

Selects ISO-8859-1 as the character set.

UTF-8

Selects UTF-8 as the character set.

Default

ISO-8859-1
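The character set matters because the same query text is encoded differently in a URL depending on it. The sketch below uses Python's standard urllib.parse.quote to show the difference; the function name encode_query is hypothetical, and the sketch does not claim to reproduce how the service itself encodes the rewritten query.

```python
from urllib.parse import quote


def encode_query(query, pcharset="ISO-8859-1"):
    """Percent-encode query text using the given character set."""
    return quote(query, encoding=pcharset)


print(encode_query("bokmål", "ISO-8859-1"))  # bokm%E5l
print(encode_query("bokmål", "UTF-8"))       # bokm%C3%A5l
```

A search engine expecting UTF-8 would not decode the first form correctly, and vice versa.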

'penab' parameter

Summary

Enables or disables TemaSearch.

Values

0

TemaSearch is disabled and no changes are made to the search query.

1

TemaSearch is enabled and alternative words are added to the query.

Default

1

'pifl' parameter

Summary

Includes alternative inflections.

Values

0

alternative inflections are not included

1

alternative inflections are included

Default

1

'plangin' parameter

Summary

Provides a hint as to the language the user is using to enter the search query.

Details

When the search query could be both bokmål and nynorsk, the language TemaSearch assumes it to be is given by this parameter. Note that if the language of the input query is not ambiguous, then this parameter has no effect.

This parameter is useful if the user base is strongly biased towards one language or the other.

Values

no_NO_B

When ambiguous, the search text is treated as bokmål.

no_NO

When ambiguous, the search text is treated as nynorsk.

(Empty string)

The search text is treated as both bokmål and nynorsk.

Default

(Empty string)

'plangout' parameter

Summary

Selects the language for new words added to the query.

Details

This parameter is usually set if the documents being searched are all written in the same language. Setting this parameter to that language ensures that the rewritten query only contains words in that language. If the documents being searched comprise a mix of languages, this parameter should not be set, so all relevant languages are included.

While similar, this is not the same as disabling translation. For example, assume the output language is set to bokmål. A nynorsk user will still require translation (to bokmål) while a bokmål user will not. So, translation is used if the text the user types needs translating to the output language.

Values

no_NO_B

Only bokmål words are added to the query. If they were translated, any nynorsk words in the original query are removed.

no_NO

Only nynorsk words are added to the query. If they were translated, bokmål words are removed.

(Empty string)

Both bokmål and nynorsk are added to the query and no words are removed.

Default

(Empty string)
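The effect of plangout on newly added words can be sketched as a simple filter. The function name and the (word, language) pairs are hypothetical, and the sketch covers only the filtering of added words, not the removal of translated original words described in the values above.

```python
def filter_by_output_language(alternatives, plangout=""):
    """Keep alternatives whose language matches plangout ('' keeps all).

    alternatives: list of (word, language code) pairs.
    """
    if not plangout:
        return list(alternatives)
    return [(word, lang) for word, lang in alternatives if lang == plangout]


alternatives = [("vei", "no_NO_B"), ("veg", "no_NO")]
print(filter_by_output_language(alternatives, "no_NO_B"))
# [('vei', 'no_NO_B')]
```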

'plemifl' parameter

Summary

Adds all inflections of a non-inflected word.

Details

If the query includes a non-inflected word, then inflections of that word are added to the query.

Inflected forms are output even if stemming is active (pstem=1), though you typically will not enable this parameter if stemming is active.

Values

1

Baseform inflections are enabled.

0

Baseform inflections are disabled.

Default

0

'pmax1' parameter

Summary

Limits the total number of words that are included in the query.

Details

This is useful for keeping the overall number of words to a reasonable level, especially when the search engine accepts only up to a specific maximum number of words. Often, if the number of words goes over this limit, the additional words are ignored, which may cause important words in the query to be dropped. When TemaSearch can provide more words than there is room for, it discards the least relevant words to ensure that all the original words and the 'best' alternatives are included in the query, without going over the total-word limit.

Values

-1

The number of words allowed in the query is not limited.

N

Limits the number of words in the query to N.

Default

-1

'pmax2' parameter

Summary

Limits the number of words that TemaSearch will add to each word in the query.

Details

If many options are selected, such as both translations and synonyms, some words may have a large number of possible alternatives. In some cases, a large number of alternatives can make the search less precise. Setting this value to a lower number can help maintain a high search precision. When there are more alternatives for a given word than are allowed by this parameter, the least relevant words are discarded until the number of alternative words is within this limit.

Values

-1

The number of alternatives added for each word in the query is not limited.

N

Limits the number of alternatives for each word to N.

Default

4
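The interaction of pmax1 and pmax2 can be sketched as follows. This is a minimal illustration, not the service's algorithm: relevance is modelled by a hypothetical numeric score, and the function name apply_limits is an assumption. The sketch counts the original words towards the pmax1 total, since they are always kept.

```python
def apply_limits(originals, alternatives, pmax1=-1, pmax2=4):
    """Trim alternatives per word (pmax2), then overall (pmax1).

    originals: list of original query words (always kept).
    alternatives: dict mapping original word -> list of (word, score).
    Returns the alternative words that survive both limits.
    """
    kept = []
    for word in originals:
        ranked = sorted(alternatives.get(word, []), key=lambda a: -a[1])
        if pmax2 >= 0:
            ranked = ranked[:pmax2]  # drop the least relevant per word
        kept.extend(ranked)
    if pmax1 >= 0:
        # Original words take up part of the total-word limit.
        room = max(pmax1 - len(originals), 0)
        kept = sorted(kept, key=lambda a: -a[1])[:room]
    return [word for word, _ in kept]


alts = {"bil": [("automobil", 0.5), ("biler", 0.9), ("kjøretøy", 0.3)]}
print(apply_limits(["bil"], alts, pmax1=3, pmax2=4))
# ['biler', 'automobil']
```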

'pnoerr' parameter

Summary

Controls error reporting.

Details

This parameter is intended as a development aid. When setting up TemaSearch or making changes to your TemaSearch configuration, it is a good idea to enable error reporting by either removing this parameter or setting it to "0". If there are problems with any of the parameters, these problems are shown instead of the search results so that you can fix the problem. Eventually, when you have tested that TemaSearch functions as you want, you can disable error reporting. This will ensure that your users are not disrupted should errors occur when using TemaSearch.

Values

0

errors are reported

1

errors are not reported

Default

0

'povs' parameter

Summary

Includes translations (nynorsk or bokmål).

Values

0

translations are not included

1

translations are included

Default

1

'pq' parameter

Summary

The name of the parameter in the original form that holds the search query.

Details

This usually corresponds to an input field of type "text".

'pqsntx' parameter

Summary

Indicates how the search engine expects the query to be written.

Details

This is used by TemaSearch to ensure new words can be added in a way that is compatible with the search engine. Choose the syntax that most closely matches the features of your search engine. The most important features are how the search engine expects mandatory and alternative words to be written. It is not a problem if a syntax includes features that are not available with your search engine: listing a feature only means that TemaSearch would understand it should a user use it in a query. TemaSearch only changes the query using the boolean OR operator, in the form in which it is written for your search engine.

All syntaxes allow the use of the word prefixes + and - to specifically include or exclude words. TemaSearch does not expand these words.

Values

standard

Adjacent words are assumed to be mandatory (boolean AND). Optional words are separated with 'OR'.

standard-lower

Adjacent words are assumed to be mandatory (boolean AND). Optional words are separated with 'or'.

standard-lower+paren

Adjacent words are assumed to be mandatory (boolean AND). Optional words are separated with 'or'. Alternatives added by TemaSearch are enclosed in parentheses, which can help group the terms correctly in complex queries.

standard-or

Words are assumed to be alternatives (boolean OR) unless otherwise indicated with a boolean operator.

bool-paren-and

Words not enclosed in parentheses are assumed to be mandatory (boolean AND). Words enclosed in parentheses are alternatives (boolean OR).

bool-plus-comma

'+' for boolean AND, ',' for boolean OR. Words separated by whitespace are taken as phrases.

norsk-bool-paren-and

All words are assumed to be mandatory (boolean AND). Words enclosed in parentheses are alternatives (boolean OR). The aliases OG, ELLER, IKKE are available.
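How the chosen syntax might affect the rewritten query can be sketched for three of the syntaxes. This is an illustration only: the exact text the service produces is not specified here, the function expand_word is hypothetical, and placing the original word inside the parentheses is an assumption.

```python
# Hypothetical rendering rules: (OR separator, wrap in parentheses).
OR_STYLES = {
    "standard": (" OR ", False),
    "standard-lower": (" or ", False),
    "standard-lower+paren": (" or ", True),
}


def expand_word(word, alternatives, pqsntx="standard"):
    """Render one query word together with its alternatives."""
    if word.startswith(("+", "-")) or not alternatives:
        return word  # prefixed words are not expanded
    separator, use_parens = OR_STYLES[pqsntx]
    text = separator.join([word] + alternatives)
    return "(" + text + ")" if use_parens else text


print(expand_word("bil", ["biler"], "standard"))              # bil OR biler
print(expand_word("bil", ["biler"], "standard-lower+paren"))  # (bil or biler)
print(expand_word("+bil", ["biler"], "standard"))             # +bil
```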

'pstem' parameter

Summary

Controls stemming.

Details

An inflected word (such as a plural noun, or past-tense verb) produces alternative words that are similarly inflected. Some search engines automatically search for inflections of a word, and so there is little need to include alternative words as inflections, as the non-inflected form is sufficient. Use of stemming with search engines that do not search for inflections, that is, engines that perform an exact match, can help reduce the number of search terms for a more focused result.

Values

0

No stemming is applied. Alternative words added to the query are inflected according to the input word.

1

The input words are stemmed. Alternative words added to the query are not inflected, even when the input words are inflected.

Default

0

'psyng' parameter

Summary

Includes general synonyms. General synonyms are words that are close but not always identical in meaning for all senses of the original word. This is typically used to expand the search into related areas.

Values

0

general synonyms are not included

1

general synonyms are included

Default

0

'psynn' parameter

Summary

Includes near synonyms. Near synonyms are words that are identical, or almost identical in meaning to the original word. These mostly include spelling variations for a given word.

Values

0

near synonyms are not included

1

near synonyms are included

Default

1

'pu' parameter

Summary

The fully qualified URL of the search results page.

Details

This is usually the same as the action attribute, though this parameter must be fully-qualified, so protocol and server name will need to be included in the parameter if not present in the action.

'puser' parameter

Summary

The name of the TemaSearch account to use to gain access to TemaSearch services.

'px' parameter

Summary

Lists all TemaSearch parameters added to the form.

Details

The FORM element contains parameters for your search engine and parameters for TemaSearch. So that TemaSearch can locate the information it needs, all TemaSearch parameters must be listed in the px parameter, separated by spaces.

For example, if the parameters "pq" and "puser" were added, then the "px" parameter should be defined as "px pq puser" (note that the list includes the px parameter itself). If any of the TemaSearch parameters have to be renamed to avoid clashes with existing parameters, the renamed parameter should appear in the list. (See resolving parameter conflicts for details.)
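Generating the form fields, including a matching px value, could be sketched as below. Using hidden INPUT elements is an assumption made for the illustration, as are the function name hidden_fields and the sample values.

```python
from html import escape


def hidden_fields(ts_params):
    """Render TemaSearch parameters, plus the px list, as hidden inputs."""
    fields = dict(ts_params)
    # px names every TemaSearch parameter in the form, itself included.
    fields["px"] = " ".join(["px"] + list(ts_params))
    return "\n".join(
        '<input type="hidden" name="{}" value="{}">'.format(
            escape(name), escape(value))
        for name, value in fields.items()
    )


print(hidden_fields({"pq": "q", "puser": "demo"}))
# <input type="hidden" name="pq" value="q">
# <input type="hidden" name="puser" value="demo">
# <input type="hidden" name="px" value="px pq puser">
```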

 

CSS Selectors

The CSS selectors used by association search are described below. The selectors and structure used in the user interface are more complex than would be necessary if all browsers supported the CSS 2 selectors.

 

Selector

Description

Example Uses

#tsresult

Root DIV containing all association search content.

Can apply TemaSearch-wide formatting at this level.

.tsresultsnone

Class indicating no results were available. Set on the outermost element.

Style UI differently when there are no results. (e.g. hide it completely.)

.tsresultsnomore

Class indicating that the user has selected all available words. Set on the outermost element.

 

.tsresults

Alternative words are available. Set on the outermost element.

 

DIV.tsheading1

Heading text, contains overall information for the words returned.

 

SPAN.query

User query text

Can highlight query as it is variable data

.tsbody

Class for main bulk of the user interface, apart from the headings and footer.

 

DIV.immediate-select

Container for immediate selection results

 

#groupc-<class>-<instance>0

Container for a grouping instance.

 

DIV.groupN

Group container class for level N

Apply formatting to all groups at a given level.

#group-<class>-<instance>

Group for specific class and instance.

 

A.tswordlink

Link to directly modify query.

Custom hover, highlight, etc.

DIV.tsactionGroup

Container for command buttons associated with word alternatives

 

SPAN.grouptitle

Outer SPAN enclosing group titles

Apply styles to all group titles.

SPAN.grouptitleN

Inner SPAN enclosing group title for a specific group.

Apply styles to specific group titles.

.tsimsel

Immediate selection container. Can contain groups and selection words.

                                   

.tsdelsel

Delayed selection (using input elements and forms to build query)

 

.tsdelsel FORM

Input form for delayed selection.

 

SPAN.product

Class assigned to text describing the product.

Used to highlight product name.