TemaSearch Technical Documentation
Introduction to TemaSearch
TemaSearch is a
system for finding search term aliases. It aims to improve the effectiveness of
your existing search system by finding additional words that are related to the
words given in a search query. Out of the box, it comes with inflection
dictionaries, lists of spelling variations, close and general synonyms, and
translation lists – available in both Bokmål and Nynorsk. The built-in
applications allow you to add selected types of words to the search query, with
a choice of interaction mode, from fully automatic addition of new words,
through to full manual selection by the user, or a combination of these two.
Please note that
this documentation discusses all features available in TemaSearch. Some
features may not be available for certain types of licences.
General Features
§ Database of a variety of types of word alternatives
§ Includes related words automatically in searches
§ Manual selection of words to include in search.
§ Features are configurable by the administrator, who can also select which features the user may configure.
§ Customizable user-interface style, layout and content.
§ Implemented as HTTP services following the REST pattern, suitable for caching.
§ More than 50 queries per second on a single CPU core.
§ Lower-level service interfaces for building custom applications.
Implementation Steps
1 Install the service (internal service users only.)
2 Determine which applications you plan to use (direct search, association search) and which features you require.
3 Apply the required changes to your website.
4 Test and fine-tune the service
The documentation is organised around these steps. First, how to install the service, followed by a high-level discussion of the applications available. Each application is then described in detail, and finally there is a reference section.
Installation
This section
describes the steps needed to get TemaSearch up and running for each access
type.
TemaSearch is made
available either as a hosted service or as an internal service. The hosted
service is managed externally by a third party. The internal service is
installed, deployed and managed by your organisation.
For the hosted service, no installation or
deployment is required, as this has already been done for you. To access the
service, you need your account name and the URL of the TemaSearch service. For
some services you will also need an account group and
account password.
For the internal service, your organisation
deals with setting up and running the service, and has full control
over how it is configured, deployed, and maintained. This section
describes how to install and configure the service, and how to verify that the
service is running.
TemaSearch is a web
application built using Servlet technology. It requires modest resources and for
many sites can be deployed in a shared application server running your existing
web applications.
A typical server configuration is:
§ 1GHz CPU or faster
§ 256MB RAM (or enough for the servlet container plus 100MB.)
§ 30MB disk space for the installed app.
§ A servlet container compatible with the servlet 2.3 specification.
Under this
configuration the query processing throughput is around 10 to 100 queries per
second depending upon the features used. The hardware requirements needed for
your site may of course vary from this according to the load you expect to
place on the service and on the request throughput required. Sites requiring
more than a sustained 50qps may need to use a dedicated application server, or
a cluster of servers for even heavier loads.
The service is
entirely self-contained and does not require additional external resources or
databases.
The internal
service is delivered as a compressed web application (WAR) file. In order to
work correctly, it should be deployed to your servlet container in an
uncompressed form. How this is done varies from one servlet container to
another, although the procedure is usually something like this:
1 Using unzip or a similar tool, extract the files from the WAR file to a directory on the server.
2 Configure the container to load the web application in that directory.
Each web
application requires a unique context path within the container. TemaSearch can
be installed to any valid context. In the URL examples given, we assume the
context path is /ts.
Now that the service has
been deployed to the servlet container, it needs to be configured before first
use. Configuration comprises:
1 creating deployment directories
2 creating configuration files
3 installing the licence file
All deployment
files specific to your own deployment of the service are stored under \WEB-INF\deploy. You will need to create this directory after the
first installation. When the application is upgraded or reinstalled, the files
in this directory will not be overwritten, so your configuration settings will be
preserved.
Logging information
is stored under \WEB-INF\deploy\logs. This directory should also be
created.
Create the
configuration file \WEB-INF\deploy\deploy-conf.properties and add this line:
ps.install.url=<URL of installation context>
For example,
ps.install.url=http://www.myserver.com/apps/ts
for an HTTP server at domain www.myserver.com, where the servlet container is under the apps path and TemaSearch was installed under ts. Notice that no final slash is required.
As part of the
internal service distribution, you will have received a licence file named
licence.bin. This file should be copied to \WEB-INF\deploy\a. That is, the licence file is saved
in the deploy directory and renamed to the filename 'a'. The unusual naming is
a security feature to protect the licensing system.
The distribution also includes a file called suac.xml. This should be copied to the \WEB-INF\deploy directory.
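The deployment steps above can be scripted. The following is a minimal sketch in Python, assuming the WAR file has already been extracted to a directory; the function name prepare_deploy_dir and its arguments are illustrative and not part of TemaSearch.

```python
from pathlib import Path
import shutil


def prepare_deploy_dir(webapp_root, install_url, licence_src, suac_src):
    """Create WEB-INF/deploy with its logs, config, licence and account files."""
    deploy = Path(webapp_root) / "WEB-INF" / "deploy"
    # Creates deploy/ and deploy/logs/ in one call; safe to run again later.
    (deploy / "logs").mkdir(parents=True, exist_ok=True)

    # One-line properties file; the URL is written without a final slash.
    conf = deploy / "deploy-conf.properties"
    conf.write_text("ps.install.url=" + install_url.rstrip("/") + "\n",
                    encoding="utf-8")

    # The licence file is stored in the deploy directory under the name 'a'.
    shutil.copyfile(licence_src, deploy / "a")
    shutil.copyfile(suac_src, deploy / "suac.xml")
    return deploy
```

Because the directories are created with exist_ok=True, the script can be re-run after an upgrade, although it rewrites deploy-conf.properties each time.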
The internal
service is configured with a single account that is used to authenticate access
to the service. The account details are stored in the suac.xml file. The user, realm and password
sent with each request (the puser, prealm and ppwd
parameters) should match the account details that appear in the suac.xml file.
After the service
has been deployed and configured, you are ready to verify that the service is
operational. You can check that the service is running by browsing to /version.xml under the deployment URL of the application (e.g. http://www.myserver.com/apps/ts/version.xml), which displays the version number of
the software.
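The same check can be automated, for example from a monitoring script. This is a sketch using only the Python standard library; the helper names are illustrative, and it assumes that a successful response (HTTP status 200) means the service is running.

```python
from urllib.request import urlopen


def version_url(install_url):
    """Return the URL of version.xml under the deployment URL."""
    return install_url.rstrip("/") + "/version.xml"


def service_is_up(install_url, timeout=5.0):
    """Return True if version.xml can be fetched with HTTP status 200."""
    try:
        with urlopen(version_url(install_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

For the example deployment, service_is_up("http://www.myserver.com/apps/ts") requests http://www.myserver.com/apps/ts/version.xml.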
TemaSearch provides
two main applications, direct search and association search. Both applications
provide alternative words for search queries but have different ways of
interacting with the user:
§ Direct Search is mostly transparent: words are added to the search without involving the user, and no changes to the user interface are required. Optionally, form controls can be added to enable the user to configure direct search features.
§ Association Search is user-oriented - words are presented for the user to select for including in the search. Association search features a rich, customizable user interface for displaying and selecting alternative words.
Both applications can be statically configured by the administrator, and/or configured by the site visitor using FORM elements added to the existing search form.
Apart from the main
difference of automatic vs. manual selection of words, the search applications
have these main technical differences, which may influence which application
you implement.
Aspect | Direct Search | Association Search |
Implementation | Web form or server-side scripts | Server-side scripts only |
Requires Boolean search | Yes | No |
User-interface changes | None required, optional form controls for user configuration. Optionally, the alternative words used in the search can be shown to the user. | Optional search form controls for user configuration. Alternative words shown with the search results for user selection. User-interface is fully customizable. |
Performance per CPU core | Ca. 50 queries per sec. | Ca. 20 queries per sec. |
Service parameters are used to supply configuration details and search query data to the service. TemaSearch is implemented as an HTTP service, so these parameters are form-encoded and sent either in the URL query string of a GET request or as POST data.
Parameters to the service fall into two main categories: search engine parameters, and TemaSearch parameters.
Search engine parameters: These parameters are eventually destined for your search engine and are not consumed by the service. They are passed into the service, possibly modified and are then passed out again in the search results.
TemaSearch parameters, by contrast, are consumed by the service and are not present in the results, and thus not passed on to your search engine.
To inform the service which parameters are service parameters, a parameter list is used (the px parameter). This lists all service parameters, and only parameters in this list are taken to be service parameters. Parameters not in the list are search engine parameters by default.
Thus, when adding new parameters to a call to the service, be sure to update the parameter list. If you forget to do this, the service will not recognise the parameter as a service parameter, and so it will not be used. If the parameter is mandatory, this will give an error, while optional parameters will assume their default value rather than the value passed in.
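The effect of the parameter list can be illustrated with a short Python sketch. It models the rule described above and is not the service's actual implementation; the function name split_params and the sample query word are made up for the example.

```python
def split_params(params):
    """Split request parameters into (service, search_engine) dicts using px."""
    listed = set(params.get("px", "").split())
    service = {k: v for k, v in params.items() if k in listed}
    engine = {k: v for k, v in params.items() if k not in listed}
    return service, engine


request = {
    "px": "px pq pqsntx",  # psynn has been forgotten in the list
    "pq": "q",
    "pqsntx": "standard",
    "psynn": "0",
    "q": "bil",
}
service, engine = split_params(request)
print(sorted(service))  # ['pq', 'pqsntx', 'px']
print(sorted(engine))   # ['psynn', 'q']
```

Because psynn is missing from px, it is classified as a search engine parameter and would be passed through rather than applied.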
A core function of
the service is input query rewriting, where words are added or replaced to
produce a new query. For the new query to be understood by your search engine,
any modifications must of course be made using the syntax expected by your
search engine. TemaSearch has support for a number of popular query syntaxes,
described in the query syntax reference. If the syntax
used by your search engine is not supported, contact Nynodata to discuss
possible implementation choices.
The standard
applications work best when the search engine supports Boolean search.
TemaSearch does not use all Boolean functions, but requires that the search
engine has a syntax to indicate “OR”: that
is, for two words A and B the engine should retrieve documents containing word
A or word B or both. Some search engines support this type of query by actually
writing “OR” between the words, or by writing the words one after
the other.
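As an illustration of the two "OR" styles mentioned above, the following Python sketch joins a word with its alternatives. The function name and the sample words are hypothetical; the syntaxes TemaSearch actually produces are those listed in the query syntax reference.

```python
def or_query(word, alternatives, operator="OR"):
    """Join a word and its alternatives; operator=None means plain juxtaposition."""
    separator = " " + operator + " " if operator else " "
    return separator.join([word] + list(alternatives))


print(or_query("bil", ["biler", "kjøretøy"]))        # bil OR biler OR kjøretøy
print(or_query("bil", ["biler", "kjøretøy"], None))  # bil biler kjøretøy
```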
Once you have
determined the syntax expected by your search engine, you should check that it
is one that is supported by TemaSearch. The available query syntaxes are
described in the reference section.
On our test system, a 2.4GHz Intel box running Windows, we obtained these average query times:
§ Direct Search: less than 20ms per query (50 qps)
§ Association Search: 50ms per query (20 qps)
The direct search
application is approximately 2.5 times faster than association search (20ms against 50ms per query). This is
mostly due to the XSLT transforms used to produce the association search user
interface. Whether this performance difference is significant or not depends
upon your desired throughput and how the service is hosted.
All HTTP interfaces to the service are RESTful, and thus are good candidates for caching. Caching can improve performance considerably when common search queries are frequently requested by visitors.
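If you call the service from server-side code, a small in-process cache keyed by the request URL is one way to exploit this. The sketch below is an assumption about how such a cache might look, not a TemaSearch feature; the class name TtlCache and the five-minute lifetime are arbitrary.

```python
import time


class TtlCache:
    """Tiny in-process cache keyed by request URL."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # function that takes a URL and returns the body
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # url -> (expiry time, body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no request is made
        body = self._fetch(url)
        self._entries[url] = (now + self._ttl, body)
        return body
```

A shared HTTP cache or reverse proxy in front of the service achieves the same effect without code changes.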
Direct search (web form integration) and association search require that search requests can be submitted using the GET method. Even if your current search form uses POST, the GET method may still be supported, as search engines that use POST often support GET as well. To see if your search engine supports GET, change the form's method to GET, and try it out. If the search works as normal, then GET is supported. Note that changing the form method type is just for testing - you are not expected to change the form method when implementing TemaSearch.
Direct search implemented via server-side scripting does not submit new searches to the HTTP interface, and so is free from this requirement.
The direct search
application adds additional words to a search query, and can do this without
needing to involve the user. When configured to be fully automatic, the service
functions entirely transparently to the user - the only change the user should
be aware of is improved search results.
Direct search can
only be implemented with search engines that support a syntax for Boolean
“OR”. (See Boolean search.) If your search engine does not support
syntax for expressing “OR” then you are not able to use direct
search. Instead, you can use association search in word-replace
mode.
Direct search can
be implemented in two ways:
§ Web Form: the HTML code for the search form the user submits is changed. This is a simple integration method requiring simple HTML changes.
§ Server-side scripting: The server-side script or program that serves the page produced from the search form is altered to include calls to direct search. This requires more programming than web form integration, although it offers more flexibility and the possibility to introduce graceful degradation for the best reliability. Server scripting also opens up the possibility to move to combination search.
Adding direct search to an existing search form requires some changes to be made to the form's HTML code. Before you begin, you will need access to edit the existing HTML page where the search field resides. If you are unable or do not wish to alter the HTML on your site, you can still test direct search using a standalone test page, as described here.
Web form integration requires that a search query can be submitted using the GET method (see Search Submit Method). If this is not possible, direct search can be implemented using server-side scripts.
The action URL for the existing search form is changed to point to the direct search service. Additional INPUT elements are added containing configuration values used by the service. When the form is submitted, the service rewrites the query using hidden or user-selectable configuration values, and then redirects the browser to the original search page to run the search with the rewritten query.
The code below shows a simple search form:
<FORM method=GET action="/search">
<INPUT type="text" name="q" size="30" value="">
<INPUT type="submit" name="submit" VALUE="Søk">
</FORM>
To activate TemaSearch, you add a handful of hidden input elements to the form. The INPUT elements correspond to service parameters. For example, the code below adds the service parameter "pq" with the value "boolean".
<INPUT type="hidden" name="pq" value="boolean">
You will need to add at least the parameters shown in the table below.
puser | Your username associated with your TemaSearch service account. |
pu | The search engine URL. This is usually taken from the action attribute, fully qualified if necessary. |
pq | The name of the input element that contains the search expression. TemaSearch modifies the field to include new search words. |
px | Lists the names of all the elements that were added for TemaSearch. The list is used so that TemaSearch parameters are not forwarded to the target search engine. |
pqsntx | Describes how queries are written for your search engine. |
If we assume that the example form is hosted at http://www.myhost.com/search,
that our username is "jsmith", and that the
"standard" query syntax is used, then we will add these parameters:
puser | jsmith |
pu | http://www.myhost.com/search |
px | px pu puser pq pqsntx |
pqsntx | standard |
pq | q |
After adding these parameters as hidden input elements, the example form looks like this (the five hidden INPUT elements are new):
<FORM method=GET action="/search">
<INPUT type="hidden" name="puser" value="jsmith">
<INPUT type="hidden" name="pu" value="http://www.myhost.com/search">
<INPUT type="hidden" name="px" value="pu px pq pqsntx puser">
<INPUT type="hidden" name="pq" value="q">
<INPUT type="hidden" name="pqsntx" value="standard">
<INPUT type="text" name="q" size="30" value="">
<INPUT type="submit" name="submit" VALUE="Søk">
</FORM>
Note:
§ The order in which the parameters appear in the form is not important. All that is required is that they appear between the <FORM> and </FORM> tags.
§ The pu value is the full URL of the search results page. Normally, the action attribute will indicate the URL of the search results, although this may need to be fully qualified by adding "http://" and then your server's domain name, if not already present. In the example the action attribute was /search, which appears fully qualified as http://www.myhost.com/search in the pu input field.
§ If your form already includes an input element with the same name as one of the new elements, see the section on renaming conflicting parameters.
Now that the parameters have been added, the final step is to change the action attribute to the URL of the TemaSearch service. For example, change
<FORM method=GET action="/search">
to
<FORM method=GET action="http://www.temasearch.no/form">
Once you have saved the page to your site, direct search will be enabled. If your search results page shows the submitted query, you will see that it may include additional words. If you have other search fields on your site, you can add direct search to them by repeating these steps for the other fields. If these fields use a different target URL for the search results, the URL must be registered with your account. See the section on authentication for details.
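If the search form is generated by a script rather than edited by hand, the hidden elements can be produced programmatically, which also keeps the px list in step with the parameters actually added. The following Python sketch is one way to do this; the function name hidden_inputs is illustrative.

```python
from html import escape


def hidden_inputs(service_params):
    """Render hidden INPUT elements, deriving px from the parameter names."""
    params = dict(service_params)
    # px always lists itself plus every other service parameter in the form.
    params["px"] = " ".join(["px"] + [n for n in params if n != "px"])
    return "\n".join(
        '<INPUT type="hidden" name="%s" value="%s">' % (escape(n), escape(v))
        for n, v in params.items()
    )


print(hidden_inputs({
    "puser": "jsmith",
    "pu": "http://www.myhost.com/search",
    "pq": "q",
    "pqsntx": "standard",
}))
```

Deriving px from the dictionary keys means a newly added parameter cannot be left out of the list by mistake.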
TemaSearch offers various configuration options that you can use to indicate the types of alternative words to add to the query:
povs | Enables use of bokmål/nynorsk translations |
psynn | Enables use of near synonyms |
pifl | Enables equivalent inflection endings |
penab | Master switch to turn on or off TemaSearch. |
By default, all these choices are active. Not all accounts include access to the various configuration options. Check your account details to ensure you have access.
In the simplest case, the service is configured by providing fixed values for service parameters. As usual, the configuration parameters are coded as hidden input elements. For example, to exclude synonyms and inflections, psynn and pifl parameters are added to the form, set to the value “0”:
<INPUT type="hidden" name="psynn" value="0">
<INPUT type="hidden" name="pinf" value="0">
Although not shown here, you need to add "psynn pinf" to the px parameter already added to the form. This is needed to be sure all parameters are recognised by TemaSearch.
When you search, you'll see that TemaSearch adds translations, but not synonyms or alternative inflections.
The TemaSearch account used to complete the request is determined from the URL given in the pu parameter. At present, no other parameters are required. A future version of the service will require the prealm and puser parameters, so you can add these if desired.
A. Create a local copy of the web page containing the search form, by saving the page in your browser to your local drive (usually done using the menu item "File | Save As..."). Make the changes to the page on your local drive, as described in this document. Save the changes to disk. Open the saved page in your browser. When you type in a search query, you will see the search results from your main web site. Note that because the page is saved to a new location, relative links and resources will not display, although this should not affect your ability to test the service.
A. After integrating direct search, the additional time required to produce the search results can be attributed to:
§ Network delay contacting the direct search server
§ Finding alternative words
§ Additional time required by your search engine to process the modified query
To ensure network delay is minimal, we aim to provide you access to a TemaSearch server that is geographically close to your own server, resulting in faster access time. Typical network delay is around 20-50 milliseconds. The time required by TemaSearch to process a request and rewrite a query is in the order of 1-10ms. The additional words added to the search query may increase the time required by your search engine. The exact increase will depend on the efficiency of the search engine you use. Most "world-class" search engines use highly efficient implementations resulting in no noticeable increase in processing time (typically a few tens of milliseconds). Adding all of these delays together gives a total delay of around 100 ms.
A. This is in contrast to simpler systems that copy the inflection ending from the original word to the alternative word, or use generalized rules to best-guess the correct inflection. Both these systems typically produce more incorrect inflections than the approach used by TemaSearch.
The basic approach
to integrating direct search server-side is to add a call to the direct search service
before the query is passed to your search engine. The service amends the query
with additional words, and the amended query is then passed on to your search engine to run the
search as usual. The search thus runs with the new words suggested by TemaSearch
automatically included.
The integration steps are:
§ Construct the service invocation URL. Direct search results are accessed via a URL that locates the direct search service and provides configuration parameters to the service.
§ Extract the rewritten query from the content fetched from the invocation URL.
§ Use the rewritten query to invoke your search engine.
To retrieve the
rewritten query, you construct a URL to invoke direct search. The basic URL is
http://<temasearch-service-location>/ts/svc/rewrite?<configuration-params>&<search-params>
The <configuration-params> is a list of form-encoded parameters for the direct search service. The <search-params> are the search parameters submitted by the search form. The query parameter can be named anything – the name you use is given to the service using the pq configuration parameter.
If your query parameter is called query, then you might build a URL that looks like this:
http://<temasearch-service-location>/ts/svc/rewrite?pq=query&pu=&pqsntx=standard&
pcharset=UTF-8&px=px+pq+pqsntx+pcharset+prealm+puser+ppwd+pu&
query=somequerytext
(The prealm, puser and ppwd parameters are omitted for brevity. The values for these parameters appear with your account details.) Note that the pu parameter should also be specified to avoid an error, even though its value is unused.
Note that the URLs are shown with GET-style form parameters for simplicity. If you prefer, you can use the POST method to invoke the service. POST should be used if the entire URL is likely to exceed 2K in length.
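The URL construction can be sketched in Python as follows. The function name rewrite_url and the host temasearch.example are placeholders; the authentication parameters (prealm, puser, ppwd) would be passed in service_params along with the rest. The standard urlencode function form-encodes values as UTF-8 and writes spaces as "+", which matches the px value in the example above.

```python
from urllib.parse import urlencode


def rewrite_url(service_base, service_params, search_params):
    """Build the direct search invocation URL."""
    config = dict(service_params)
    config.setdefault("pu", "")  # must be present, although the value is unused
    # List every service parameter (including px itself) in px.
    config["px"] = " ".join(["px"] + [n for n in config if n != "px"])
    parts = [urlencode(config), urlencode(search_params)]
    return service_base + "?" + "&".join(p for p in parts if p)


url = rewrite_url(
    "http://temasearch.example/ts/svc/rewrite",  # placeholder service location
    {"pq": "query", "pqsntx": "standard", "pcharset": "UTF-8"},
    {"query": "some text"},
)
print(url)
```

For a POST request, the same encoded string can be sent as the request body instead of being appended to the URL.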
The body of the
content returned by the URL contains the result, which is a list of parameters
and values, one parameter and value on each line. For example:
queryParamName: word or alternative1 or alternative2
….
otherParam: value
In the simplest
case, the result includes just one line with the rewritten query parameter for
your search engine. However, if you pass in other search engine parameters
(parameters not listed in the px parameter), then they will also be
included in the result. Finally, if you have enabled the history parameter,
this will be added as the last parameter in the list.
The parameters appear in the same order they were in the URL used to invoke the service.
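A parser for this format might look like the following Python sketch. It assumes that each line has the form "name: value" and that parameter names contain no colon; lines without a colon are skipped. The function name is illustrative.

```python
def parse_rewrite_result(body):
    """Parse 'name: value' lines into a dict, preserving their order."""
    result = {}
    for line in body.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            result[name.strip()] = value.strip()
    return result


body = "query: word or alternative1 or alternative2\notherParam: value\n"
print(parse_rewrite_result(body)["query"])  # word or alternative1 or alternative2
```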
Invoking your search engine proceeds
mostly as it would without direct search. The only difference is that the search engine is
invoked using the new query retrieved from the direct search results, instead of
the query submitted by the search form.
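Putting the steps together, a server-side script can fall back to the original query whenever the service cannot be reached, which gives the graceful degradation mentioned earlier. This Python sketch is one possible arrangement; the function names, the two-second timeout and the assumption that the response is UTF-8 text are choices made for the example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_text(url, timeout=2.0):
    """Fetch a URL and return the body as text (assumes UTF-8)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def rewritten_query(rewrite_base, config, query_name, query, fetch=fetch_text):
    """Return the rewritten query, or the original one if the service fails."""
    url = rewrite_base + "?" + urlencode({**config, query_name: query})
    try:
        body = fetch(url)
    except (OSError, ValueError):
        return query  # degrade gracefully: search with the original query
    for line in body.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == query_name:
            return value.strip() or query
    return query
```

The value returned by rewritten_query is then passed to your search engine in place of the submitted query. The config argument is expected to carry the service parameters, including pq, px and the authentication parameters.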
If your results
page includes a search form for resubmitting a search, it is common practice to
show the query the user typed as the default value for the query text. However,
after integrating direct search, the query shown to the user will be the
revised version produced by direct search.
If this is not what
you want, you may be able to show the original query when generating the
results page. If you have access to the script that produces the form in the
results page you can then change the value of the query text box from the
rewritten query (the one submitted to the search engine) to the original query
submitted by the user.
The TemaSearch account used to authorize the request is determined from the authentication parameters: prealm, puser and ppwd. These parameters must be included with the request.
The integration
steps so far described set up direct search with a static configuration –
the configuration was hard-wired into the form or URL used to invoke the
service.
Visible form controls can be used to allow the
website visitor to
control how direct search functions. Practical examples include:
§ Allow the user to choose the types of words that direct search will add to the query, or even whether direct search is enabled or not.
§ Allow the user to indicate the intended query language or desired target languages.
User-configuration under webform integration is done by adding new INPUT elements to the form to provide visible controls such as checkboxes and list boxes for the user. The names of these new input controls are set to the corresponding direct search parameter that they control.
For example, to allow the user to enable or disable use of translations, you add
<INPUT type="checkbox" name="povs" value="1" CHECKED>
(Not forgetting of course to make sure povs is listed in the px parameter value, defined elsewhere in the form.) When displayed in a browser, the form includes a checkbox that controls whether translations are added to the query. The CHECKED attribute checks the box by default, so translations would be active until the user specifically turns them off by clearing the checkbox. The other on/off-style parameters, such as psynn, pifl and penab can also be controlled in this manner.
Other parameters, such as plangin, plangout, pmax1, pmax2 can be controlled using radio buttons, or a SELECT box.
When the user submits the form,
these parameters are sent to the service just as if they were hidden parameters
statically configured by the administrator.
The configuration changes made by the user are
not maintained from one search to the next. With each search, the configuration
is reset to the default values specified in the form. To have the changes remembered
from one search to the next, you can use server-side script integration.
Adding
user-configurable features for direct search via server-side scripting is more involved
than editing the basic web form, although it offers more flexibility. Here’s
an overview of the process:
1 Add new INPUT elements to the search form for each item the user can configure.
2 In your server-side script that handles the form, ensure these parameters are included in the URL you construct to invoke direct search.
3 If you wish to show the settings again in the results page, write out a new set of INPUT elements on the results page, with default values taken from the submitted search form. Typically, the search query is remembered from one search to the next, and remembering the user settings will follow the same pattern.
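The steps above can be sketched in Python as follows. Browsers do not submit unchecked checkboxes, so the script treats a missing on/off parameter as "0". The function names are illustrative, and the sample value for pmax1 is hypothetical; consult the reference section for the values each parameter accepts.

```python
def settings_from_form(form, switches, defaults):
    """Map a submitted form to direct search parameters."""
    settings = {}
    for name in switches:
        # Unchecked checkboxes are absent from the submission, so absence means off.
        settings[name] = "1" if form.get(name) == "1" else "0"
    for name, default in defaults.items():
        settings[name] = form.get(name) or default
    return settings


def checkbox_html(name, settings):
    """Re-create a checkbox on the results page with the user's last choice."""
    checked = " CHECKED" if settings.get(name) == "1" else ""
    return '<INPUT type="checkbox" name="%s" value="1"%s>' % (name, checked)


submitted = {"q": "bil", "povs": "1"}  # the user left psynn unchecked
settings = settings_from_form(submitted, ("povs", "psynn"), {"pmax1": "5"})
print(settings)                        # {'povs': '1', 'psynn': '0', 'pmax1': '5'}
print(checkbox_html("povs", settings)) # <INPUT type="checkbox" name="povs" value="1" CHECKED>
```

The resulting settings are merged into the parameters used to build the direct search URL, and their names must also appear in the px list.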
The association
search application retrieves words related to the search query, and presents
these to the user, allowing them to select which additional words are included
in the search.
Association search
is implemented like this:
1 After the user has submitted a search query, association search
suggests additional words, which are presented along with the search results.
2 If the user felt that the search failed to find what they were looking for, they can click on one of the words suggested by association search to add it to the query.
3 The word is added to the query and new search results including the new word are shown.
Selection of a word
or group of words requires a single click, and with each click, a new set of
search results is generated. The suggested words can be added one by one, or in
groups. For more precise control, a two-stage approach is used where all
desired words are first selected, and then a submit button clicked to submit
the new query to the search engine.
The words suggested
can be displayed with additional information, such as the language, type of
alternative (synonym, spelling variation, translation etc...) Furthermore, this
information can be used to group and sort the words according to configured
criteria.
Finally, to
fine-tune the user-interface of association search to match the style of your
site, a number of customizations are available, ranging from selecting from a
number of ready-made stylesheets, through to complete
control over the user interface via custom XSL
transforms.
Association search requires that a search query can be submitted using the GET method (see Search Submit Method). If this is not possible, you cannot implement association search. Instead, consider implementing direct search using server-side scripts.
To integrate
Association Search, you make changes to the server-side script (or program)
that handles the search form and produces the results page the user sees after
submitting a search query. Here's an overview of the changes required:
§ Add a LINK tag in the HEAD of the HTML to link a CSS file for the association search user interface. (Optional, though recommended.)
§ Add a dynamic server-side include that includes the Association Search suggestions in the results page. The URL of the include is dynamically constructed from the submitted search form parameters.
These steps are
described in more detail below. If you want a quick start, add the following
items to your search results page, replacing server.com with the server name where TemaSearch is installed:
In the head of the search results page, add
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/base.css">
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/color.css">
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/links1.css">
<LINK rel="stylesheet" type="text/css" href="http://server.com/ts/css/css3.css">
In the body, add code that implements this pseudo-code:
service = "http://server.com/ts/svc/querypresfrag.html?"
tsparams = <configuration parameters for service>
searchparams = <params and values passed from the search form>
url = service + tsparams + "&" + searchparams
include-content-from(url)
Or you can modify
this template URL and
TODO: sample association search URL
What the pseudo-code
above does is to construct a URL, combining static configuration and including
runtime information (such as the search query), and include the content provided
by the URL in the search results page.
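One possible rendering of the pseudo-code in Python is shown below. The function names are illustrative, the sample parameter values are placeholders, and the sketch assumes the fragment is returned as UTF-8 text. On any failure it returns an empty string so that the results page still renders without the suggestions.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def association_url(server, ts_params, search_params):
    """Combine static configuration and submitted form values into the include URL."""
    service = "http://%s/ts/svc/querypresfrag.html" % server
    return service + "?" + urlencode(ts_params) + "&" + urlencode(search_params)


def association_fragment(url, timeout=2.0):
    """Fetch the HTML fragment to include; return '' if the service is unavailable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (OSError, ValueError):
        return ""


url = association_url("server.com", {"pq": "q", "pqsntx": "standard"}, {"q": "bil"})
print(url)  # http://server.com/ts/svc/querypresfrag.html?pq=q&pqsntx=standard&q=bil
```

The string returned by association_fragment(url) is written into the results page at the point where the suggestions should appear.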
The set of
parameters used here are just to get you started, and some parameters may
require changes to work with your search engine. (See tips below.) At the very least,
you should be able to see the association search box on the results page.
Hints and Tips
§ The parameters that you will most likely need to change to get association search working properly are: pq, pqsntx, and pcharset.
§ If your search engine does not support Boolean search, set the pqbld parameter to “r”, which activates word-replace mode.
§ Once you have the service operational, there are a number of parameters that fine-tune the service. The parameters above are given with general default values, which you can adjust to suit your own needs. All parameters are listed in the reference section.
§ You may find it helpful to wrap the included content in a container element (such as a DIV or TABLE) for easier positioning and styling.
Once you have the
basic service working you may wish to refer to the sections below for a more
in-depth look at integration.
The HTML returned by association search has very little markup for color, layout and other presentation details. To style and lay out the suggested words so that they fit the style of your site, you can include a link on the results page to one or more CSS files. You can use the CSS files provided with the service, or author your own. The table below describes the ready-made CSS files, all located in the /css folder under the TemaSearch service URL.
When using the ready-made CSS files, you typically include base.css, followed by color.css and link1.css, followed by one of the cssX.css files.
base.css | Layout and CSS fixes common to all layouts. |
color.css | Default color and styling for the results box. |
Links1.css | Color and styling for the groups and links in the results. |
css1.css | Very compact layout. All headings and text appear inline. |
css2.css | Compact, with some line breaks for easier reading. |
css3.css | Sparse layout. Each heading is on a new line. Alternatives for a word are arranged on one line. |
css4.css | Vertical layout. Each heading and suggested word is on a separate line. |
The suggested words produced by association search are added to the search results page via a server-side include. Place the include so that the word suggestions appear where they fit with the existing content on the results page and with typical user search patterns. Here are some suggested placements:
§ Above or alongside the search results: This is useful when users quickly scan the first few results and decide whether they found what they wanted. If not, the alternative words are in the same field of vision.
§ Below the search results: This is useful when users often read through the entire results, as may be the case in a large site containing diverse information.
§ Below the repeat search query text box: By being close to the search entry box, users are reminded they can click on suggested words rather than having to type in new words manually.
When you have decided where the association search words should appear, locate the corresponding place in the script that produces the HTML for the page. You then script the server-side include at that point so that the association search results are textually included at the right place.
Next, you need to construct the URL for the content to include. The URL used to retrieve the included content comprises the association search service URL and two sets of parameters:
http://<temasearch-server-domain>/<service-path>?<search-engine-params>&<temasearch-params>
§ Parameters for your search engine. Search engine parameters include the search query parameter, and may also include other configuration parameters that are expected by your search engine and passed in from the search form. These parameters should be forwarded to association search so that it can include them when starting a new search. (The URLs that invoke a new search will then include the original search parameters, keeping successive searches consistent.)
§ Association Search parameters: these provide details needed by the association search service and configuration values to customize the results. These parameters are all listed within the proxy parameter list (px) to denote that these are service parameters and not regular search engine parameters. These parameters are consumed by the service and not subsequently passed back to your search engine when the user selects additional words.
In pseudo-code, you might construct the include URL like this:
searchEngineParams = ...; // extract params from the current URL, e.g. everything after the '?' in the URL being served
temaSearchParams = "px=px pu pq&pq=<queryParamName>&pu=<search-submit-uri>"
temaSearchLocation = "http://temasearch-server.com/ts/svc/querytemafrag.html"
temaSearchURL = encodeURL(temaSearchLocation + "?" + searchEngineParams + "&" + temaSearchParams);
Once the URL is built, you then script the inclusion of content returned from this URL.
The pu parameter is set to the URL of the search query handler (i.e. the current page), as this is the target URL for new requests to the search engine. It is also used by association search to authenticate requests to the service.
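As an illustration, the URL construction described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name build_include_url, the example.com search handler and the sample query string are hypothetical, and it assumes the service accepts standard form encoding (spaces in the px list encoded as '+').

```python
from urllib.parse import urlencode


def build_include_url(service_url, search_engine_query_string,
                      query_param_name, search_submit_uri):
    """Build the URL used to fetch the association search fragment.

    search_engine_query_string is the raw query string of the page being
    served (everything after the '?'), forwarded unchanged.
    """
    # TemaSearch parameters. px lists every service parameter in the request,
    # including px itself, as in the pseudo-code above.
    tema_search_params = urlencode({
        "px": "px pu pq",
        "pq": query_param_name,    # name of the search engine's query parameter
        "pu": search_submit_uri,   # URL of the search query handler
    })
    # Skip an empty search engine query string to avoid a stray '&'.
    parts = [p for p in (search_engine_query_string, tema_search_params) if p]
    return service_url + "?" + "&".join(parts)


# Hypothetical values, for illustration only.
include_url = build_include_url(
    "http://temasearch-server.com/ts/svc/querytemafrag.html",
    "q=bil&lang=no",
    "q",
    "http://www.example.com/search",
)
```

Encoding only the TemaSearch values, rather than the whole URL, leaves the already-encoded search engine parameters untouched.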
By default, as association search adds words to the query, the query shown to the user is the full query containing all the word alternatives added.
If you wish to show the original query, you can extract the original query from the history parameter (ptshist). This parameter is passed in to your search engine by association search. The parameter has this format:
Version ":" SequenceNum "/" keywords list "/" original query "/" query language "/"
Extracting the original query is done by extracting the substring between the second and third forward slashes.
The full format of the history parameter is given in the reference section.
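The extraction rule can be sketched in Python as below. The function name and the sample value are hypothetical. The sketch assumes the ptshist value has already been URL-decoded by your web framework and that any forward slashes inside the fields are escaped by the service. Check the full format in the reference section before relying on it.

```python
def original_query_from_history(ptshist):
    """Return the text between the second and third '/' of a ptshist value.

    Returns None when the value has fewer than three forward slashes.
    """
    fields = ptshist.split("/")
    # fields[0] is 'Version:SequenceNum', fields[1] the keywords list,
    # fields[2] the original query, fields[3] the query language.
    if len(fields) < 4:
        return None
    return fields[2]


# Hypothetical history value, for illustration only.
shown_query = original_query_from_history("1:2/bil hus/bil hus/no/")
```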
Association search works best when the search engine supports Boolean search. (See XXX.) However, association search can also be used with engines that do not support Boolean search. Rather than adding words to the query (which requires the Boolean OR function), association search can be configured to replace the original word in the query with the suggested word selected by the user. The main benefit is that replacement does not require any additional syntax or significant change to the query, which makes it suitable for all search engines, particularly those without Boolean search, and engines that search for one word at a time.
To activate this feature, set the pqbld parameter to “r”.
Association search provides two user-interface types for selecting words to add to the query: immediate selection and delayed selection.
Immediate selection, also called one-step selection: words and queries are represented as hyperlinks. With this representation a word is selected and added to the search immediately once the link is clicked.
Delayed selection, also called two-step selection: words are selectable items (e.g. checkboxes, radio buttons or single/multi-select list elements) where selection of a word does not automatically submit the search. The user must use a button or other control to start a search with the words selected.
The query selection type is given by the psm parameter. See the parameter reference for more details.
Using the integration steps described above, the configuration of association search is statically wired in to the request used to invoke the service. The configuration parameters can also be produced dynamically, for example from the search form.
Items the user can configure are provided by adding INPUT elements to the search form. The parameters submitted by these INPUT elements are included in the service URL in the same way as the search query parameter and other search engine parameters.
§ Add new INPUT elements to the search form for each item the user can configure.
§ In your server-side script that handles the form, ensure these parameters are included in the URL you construct to invoke association search. Additionally, add the pp parameter with a value listing all TemaSearch parameters that should be output. The service configuration parameters will be included in URLs to the search engine, which ensures new searches started by association search are consistent with the original search by using the same configuration parameters.
§ If you wish to show the user-selected settings again in the results page, write out a new set of INPUT elements on the results page, with default values taken from the parameters passed to the script. Typically, the search query is preserved from one search to the next, and preserving the user settings will follow the same pattern.
In the service paths given so far, querytemafrag.html has been used to produce an HTML fragment for inclusion directly in your search results pages. For building custom applications, association search provides a number of other service end-points, selected using a path.
The XML service points authenticate using the parameters: prealm, puser and ppwd. The HTML service points use the URL of the target search engine as authentication.
Returns unprocessed XML data from the association search service. No XSLT transforms are applied, so this is the service point with the best performance.
Uses the data from querytemapres0.xml and adds several levels of transformation. The level used is selected by the plevel parameter. The output is always valid XML regardless of the transformation used. If levels 5 and above are used, an HTML interface is returned as XHTML. These are the functions of the various levels:
§ Level 0: No transformation. Produces association search XML without a namespace.
§ Level 1: Adds namespace declarations to the raw data.
§ Level 2: Moves qalt nodes to the same level as the corresponding alt node. This places the word alternatives and the corresponding search query at the same level.
§ Level 3: Converts alternative word data into an abstract user interface description. The description includes messages denoted by message ID.
§ Level 4: Resolves message IDs by looking up the associated message for the current request locale.
§ Level 5: Generates an XHTML page from the abstract user interface description.
§ Level 6: Cleans up the XHTML output by removing any unnecessary namespaces, which can confuse some browsers.
§ Level -1: Applies all transformations. This is for future compatibility should additional transformations be added.
This service interface can be used to create custom applications, such as customizing the default user interface. The steps to create a custom interface are:
§ Retrieve the XML at the level that best suits your needs. Minor changes may use output from levels 5/6. More substantial changes will start with XML from lower levels.
§ Transform the XML, for example using XSLT. The results will be in whatever format you are using to describe the user interface. This is typically XHTML, although you can use any format, e.g. WML, VML, SVG etc.
§ Include the transformed content where it is needed (e.g. the search results page).
If the data is being consumed by other programs, for example when building TemaSearch into the indexer of a search engine, levels 0-3 are typically most useful.
Produces the whole-page HTML user-interface for association search. This is an HTML serialization of the full XML (level -1) produced by querytemapres.xml. The HTML is produced as a self-standing page, and is suitable for including via an embedded IFRAME or similar.
Produces the association search user-interface as an HTML fragment for including directly in web pages. Essentially an HTML serialization of the level -1 XML produced by querytemapres.xml, with the HTML document structure elements removed.
When availability and performance are critical, you may want to consider these strategies as an extension to the basic server-side include.
§ Error handling: trapping/catching any errors from the server-side include so that searching is uninterrupted should the service be unavailable or fail for some reason.
§ Timeouts: adding a timeout to the URL include so that search results are not held up if the service is heavily loaded.
§ Load Management: if a certain number of errors or timeouts occur, temporarily stop sending requests to the service. Run checks on a background thread and enable the service when it becomes available/more responsive.
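The three strategies above might be combined as in the following Python sketch. It is an illustration under stated assumptions rather than part of the service: the class and function names are hypothetical, the fragment is assumed to be UTF-8 (use the character set you configured with pcharset), and a simple cool-down period is used in place of the background-thread check described above.

```python
import time
import urllib.request


def http_fetch(url, timeout_seconds):
    """Fetch the fragment over HTTP, raising on errors or timeouts."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")  # assumes a UTF-8 fragment


class GuardedInclude:
    """Server-side include that never blocks or breaks the results page."""

    def __init__(self, fetch=http_fetch, timeout_seconds=0.5,
                 max_failures=3, cooldown_seconds=30.0, clock=time.monotonic):
        self.fetch = fetch
        self.timeout_seconds = timeout_seconds
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = 0
        self.disabled_until = 0.0

    def fetch_fragment(self, url):
        now = self.clock()
        # Load management: skip the service entirely while it is cooling down.
        if now < self.disabled_until:
            return ""
        try:
            # Timeout: do not hold up the search results for a slow service.
            content = self.fetch(url, self.timeout_seconds)
        except Exception:
            # Error handling: show the results page without suggestions.
            self.failures += 1
            if self.failures >= self.max_failures:
                self.disabled_until = now + self.cooldown_seconds
                self.failures = 0
            return ""
        self.failures = 0
        return content
```

An empty string is returned on failure so that the results page renders without the suggestion box.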
The user-interface is constructed using HTML+CSS, produced via a series of XSL transformations. This allows the user interface to be changed in the following ways:
§ Make minor changes to the existing color scheme and layout: use an override CSS file to override selectors for specific features and styles in the standard CSS files.
§ Make more radical color and layout changes: create a new custom CSS file.
§ Customize the interface text messages: create new text files.
§ Redefine the structure and operation of the user interface: use custom XSLT to produce the markup for the interface from an intermediate representation.
§ Define the user interface entirely: use XSLT to transform the raw data from the association search service.
Working with CSS, you can change many presentation details, but the structure of the UI remains much the same. Using XSLT, the UI can be completely re-structured and re-designed, or targeted to a different presentation format, such as WML for mobile devices.
By default, when there are no suggested words for a query, the association search results include a message like "No alternatives for <query>".
You may prefer not to show any message if no suggestions are available. This is done by defining the style
.tsresultsnone { display:none; }
With this style defined, when there are no results, no message is displayed.
See CSS Styles for more details.
[Internal Service Only]
The messages for the user interface can be extended in the following ways:
§ Translations for new languages can be added. For example, if your site is multilingual, you may wish the association search interface to be available in all the languages supported by your site.
§ Existing messages can be altered.
New files containing custom text messages are placed in /WEB-INF/deploy/i18n. (This directory may need to be created.) The new resources are named temasearch_<locale>.properties.
The existing messages are found in the \WEB-INF\lib\parasearch-core-impl-<version>.jar file, named messages_<locale>.properties. These should be used as a template for creating new languages or customizing existing messages.
Add support for a new language by
1 Copying an existing message file into the message resources directory, renaming it to temasearch_<newlocale>.properties.
2 Translating all the message strings in the new file to the new language.
For example, to translate the English text to German, you would
1 Copy messages_en.properties from inside parasearch-core-impl.jar to /WEB-INF/deploy/i18n/temasearch_de.properties.
2 Translate all the text in temasearch_de.properties, taking care to keep all of the message names intact. (The message name is everything before the first '=' on each line.)
You can customize an existing message as follows:
1 For the locale you want to customize the message in, create the /WEB-INF/deploy/i18n/temasearch_<locale>.properties file if it doesn't already exist.
2 Locate the message name in the existing resources and copy it to the temasearch_<locale>.properties file from the previous step. Add an '=' after the message name, followed by your new text for the message.
3 If you wish to customize the message for multiple locales, repeat for each locale.
In most cases, it is immediately obvious if the service is not set up correctly, as it does not function or you get an error message. There are, however, a few problems that can easily go unnoticed. The following sections describe what these problems are and how to identify and fix them.
A character encoding mismatch can result in some characters being displayed incorrectly. Perform these tests to check that character encoding is set correctly:
§ Type in queries that contain characters outside the ASCII character set, e.g. accented letters, Scandinavian vowels etc. Verify these are recognised correctly by the search engine and that subsequent searches from association search also function correctly.
§ Type in a word containing only ASCII characters that produces results containing non-ASCII characters. Again, verify these are recognised correctly by the search engine and that subsequent searches from association search also function correctly.
§ See the pcharset parameter for details about the correct setting for character encoding.
When generating URLs for new search requests, the service adds a history parameter to maintain state between requests, such as the initial query and input language. This parameter is passed to the search results page script, and should be included in the invocation URL of the association search service. If you have arranged for all request parameters to be passed to the service, this will happen automatically. In other situations, you should pass on the parameter explicitly.
The history parameter is treated as a service parameter, and should be added to the service parameter list (px parameter) as usual.
To check that the history parameter is being passed through your search handler, type in a query that produces some suggested words. Inspect the URLs linked to the suggested words: they will include a ptshist parameter, with a value starting with 1,1. The first number is a version identifier, and can be ignored for now. The second number is the invocation count. As additional suggested words are added to a query, you should see later URLs include an ever-increasing invocation count.
If this is not happening, check that
§ The invocation URL that you construct includes the ptshist parameter. This parameter will be passed in to the search engine when the service performs a new search, such as when the user clicks on words suggested by the service.
§ The parameter ptshiston is not set to 0. Doing so will turn off use of the history parameter.
If the search history parameter is not passed through, as the user selects words from the list of suggestions, the next list of suggestions can actually grow, as it now includes suggestions for the words most recently added to the search. In the worst case, when using language auto-detection (the default, see plangin), the detected language can change. This will often result in a completely new set of word alternatives being produced, and the resulting inconsistency can be confusing to the user.
Combination search uses both direct search and association search together for the same search page. For example, you might configure direct search to automatically add words to the query that have the same meaning, such as near synonyms, spelling variations and inflection variations, while using association search to allow the user to select from words that may be more distantly related, such as general synonyms and translations.
Combination search is implemented by applying direct search and association search to the same search results. It is implemented as changes to server-side scripts for the search results page. (The webform-style integration for direct search should not be used when implementing combination search.)
Most of the implementation details for combination search are exactly the same as for implementing direct search and association search. The main difference is that some handling of the history parameter is needed; the history parameter is passed from direct search to association search.
When handling an original query from the user, invoke direct search on the query to rewrite it before using it to run a search. You can tell that a query has come from the user when there is no ptshist parameter passed to the page. (Re-queries from the user clicking additional words always include a ptshist parameter.) Direct search is invoked on an original query like this:
1 Determine what types of words you want added automatically to the query, and use this to configure direct search. The configuration will indicate what types of words to include, the input/output language, etc. You need to change the default configuration, as the default includes all word types and would therefore add all words automatically, leaving none available for the user to select from, which defeats the aim of combination search.
2 Based on your determined configuration, build a URL to invoke direct search. This is done exactly as for regular direct search, but you also include the ptshiston parameter, set to the value “
3 Extract the new query from the result and use this to invoke your search engine and produce the search results page. Also extract the ptshist parameter from the result and save it for use when invoking association search later.
At some point when producing the results page, invoke association search (following much the same pattern as for regular association search):
1 Determine the configuration of association search. This will typically include less restrictive settings compared to those used by direct search, such as more types of alternatives enabled, or more words allowed.
2 Build a URL to invoke association search. Be sure to include the ptshist history parameter. If direct search was invoked, the ptshist parameter value can be taken from the direct search results. Otherwise, the parameter will have been passed in to the script as a page parameter and the value can be taken from that.
3 Fetch the content from the association search URL and include the content in the page.
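The control flow of the two procedures above can be sketched in Python as follows. This is only an outline under stated assumptions: handle_results_page and its three callables (direct_search, search_engine, association_search) are hypothetical stand-ins for your own code that builds the service URLs and calls your search engine, and passing the query that was actually searched on to association search is likewise an assumption.

```python
def handle_results_page(page_params, query_param_name,
                        direct_search, search_engine, association_search):
    """Produce search results plus the association search fragment.

    direct_search(query) is assumed to return (rewritten_query, ptshist).
    search_engine(query) is assumed to return the search results.
    association_search(query, ptshist) is assumed to return an HTML fragment.
    """
    query = page_params[query_param_name]
    history = page_params.get("ptshist")

    if history is None:
        # No ptshist: an original query typed by the user, so let direct
        # search rewrite it and keep the history value it returns.
        query, history = direct_search(query)
    # Otherwise this is a re-query from a suggested word: the query is used
    # as given and ptshist comes from the page parameters.

    results = search_engine(query)
    fragment = association_search(query, history)
    return results, fragment
```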
Because combination search is essentially a marriage of direct and association search, all configuration details available for direct and association search (e.g. end user configuration, customizing the user interface) are also available when implementing combination search.
Both direct search and association search applications provide a lower-level service interface which can be used to integrate TemaSearch into your own applications. Which service you choose is governed by how much detail you require:
§ Direct Search provides a search query containing the alternatives, or a simple list of word alternatives for a given word.
§ Association Search provides varying levels of categorization of the alternatives, and has facilities to support construction of a user-interface.
Some examples:
§ Adding word alternatives to a search index (basic): the Direct Search service is used to retrieve alternatives for the words in the index. These are added to the index as aliases of the original word.
§ Adding word alternatives to a search index (advanced): using the Association Search service, alternative words can be added to the index as above, together with information describing the type of alternative. This might allow per-language or otherwise more refined selection of word alternatives from the index.
§ Database Search: a SQL query processor that alters the SQL query (e.g. the WHERE clause) to include additional words.
TemaSearch provides these types of word alternatives for both association and direct search:
Type of word alternative | Configuration Parameter |
Base forms | pbase |
Inflected forms | plemifl |
Alternative inflection endings | pifl |
Alternative spellings, near synonyms | psynn |
General synonyms | psyng |
Translations | povs |
Further control over the alternatives is available with these parameters:
Stem alternative words | pstem |
Limit number of alternatives per original word | pmax2 |
Limit maximum words per query | pmax1 |
| |
TemaSearch parameter list | px |
Search engine/authentication URL | pu |
Service authentication | prealm, puser, ppwd |
Error handling control | pnoerr |
Enable/Disable service | penab |
Licence control | paltlic |
Input/Output character set | pcharset |
Select profile for parameter defaults | pprof |
TemaSearch output parameter list | pp |
Search engine query parameter name | pq |
Query syntax | pqsntx |
Define query language, or enable auto-detection | plangin |
Restrict language of words in the result query | plangout |
Set preferred language for auto-detection | plangpref |
Sort order for suggested alternatives | psort |
Grouping levels for suggested alternatives | pgroup |
Type of new search queries | pqbld |
Select words in same group level | pqgrpidx |
| |
One or two step selection mode | psm |
Presentation language | plocale |
Show truncated input query | pqtrunc |
XSLT transformation pipeline length | plevel |
The table below lists the query syntaxes supported. These syntaxes are used by the service to determine the meaning of the query and to add new elements to the query using the correct syntax.
Not all elements of your search engine syntax need to be supported by the service. Select the syntax that most closely matches that of your search engine. If your chosen application requires Boolean search syntax, the syntax of the OR operator should match that expected by your search engine. Support for other operators is optional, as any unknown query syntax elements will be ignored and left unchanged by the service.
The HTTP interface to TemaSearch services expects configuration parameters to be passed as query parameters in the URL or as POST parameters. In many cases, these parameters are specified alongside parameters for your search engine. Although TemaSearch parameter names have been chosen to be fairly unusual so as not to clash with search engine parameter names, it is still possible that a TemaSearch parameter has the same name as a search engine parameter.
If this happens, you can rename the TemaSearch parameter to avoid having two parameters with the same name. Renaming is done by adding an underscore '_' at the front of the parameter name.
For example, if the parameter puser was already used by your search engine, then the TemaSearch parameter should be renamed to _puser. The px parameter lists all the TemaSearch parameters in the request, and includes each parameter using the name as it appears in the form. For example, after renaming puser to _puser, the px parameter would then be defined as “px=px … _puser …”. (Ellipsis is used for other parameters not shown here.)
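The renaming rule can be sketched in Python as follows. The helper name rename_clashing_params and the sample values are hypothetical; the sketch assumes px is a space-separated list that begins with px itself, as in the example above.

```python
def rename_clashing_params(tema_params, search_engine_names):
    """Prefix clashing TemaSearch parameter names with '_' and build px.

    tema_params maps TemaSearch parameter names to values (without px).
    search_engine_names is the set of names your search engine already uses.
    """
    renamed = {}
    for name, value in tema_params.items():
        # Only parameters that clash with the search engine are renamed.
        new_name = "_" + name if name in search_engine_names else name
        renamed[new_name] = value
    # px lists every TemaSearch parameter under the name used in the request.
    result = {"px": " ".join(["px"] + list(renamed))}
    result.update(renamed)
    return result


# Hypothetical values: the search engine already uses a 'puser' parameter.
params = rename_clashing_params({"puser": "demo", "pq": "q"}, {"puser", "q"})
```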
pnoerr – instructs the service to degrade gracefully when there are errors. Ordinarily, if an error occurs (such as an authentication failure or an incorrect parameter), the server returns a 500 error code and an error message. Setting pnoerr to 1 causes the service to degrade gracefully by returning the original query unmodified.
This is principally intended for use with direct search webform integration, as the search would not proceed if the service failed to send back the HTTP redirect. For server-side integration, errors can be caught and dealt with, so this parameter is less useful there.
ptshiston – enables/disables the history
parameter. When set to
Version: Describes the version identifier for the remainder of the history parameter. Current version is version 1.
SequenceNum: Indicates how many times association search has been used to add to the query.
Keywords list: the original keywords extracted from the query. This is the data used by association search when rebuilding a query.
Original query: the original query text in full.
Query language: the language used to generate alternatives for the query.
plangin – a list of languages that the query is written in. If not set, defaults to auto-detection of language.
plangpref – if automatic language identification is active (plangin parameter not defined), preference is given to the languages defined here. The first language in the list gets the highest preference, followed by the second, and so on.
pqt – the query text. This is used to specify the query text in direct service calls, when the service returns the result directly, rather than as a rewritten search URL.
pprof – selects a ready-made profile. This can avoid specifying many different parameters.
padjlic – controls what happens when features that are used are not valid for the current licence. The default is “
pp – propagate parameters. By default, service parameters denoted by the px parameter are consumed by the service, and are not passed through to the results. If TemaSearch service parameters are also used externally (by your search engine for presenting configuration controls, logging or some other use) then they should also be included in the URLs generated by the service. To do so, list them in the pp parameter.
For services that return a direct result, such as direct search /rewrite, authentication is performed by inspecting the account credentials provided in the parameters prealm, puser and ppwd. These details are used to authenticate use of the service, and to determine what features are licensed.
psort – for association search, this parameter specifies the sorting order of the alternative words. Sorting of words can happen at multiple levels: the parameter is a comma-separated list of sort keys. For example, “l,w,t” sorts by the language of the alternative word, then by the index of the original word, and then by the word text itself.
The default value is “”, which indicates the words are not sorted, and will appear in no particular order. The order is not guaranteed to be the same each time the service is called, so it is recommended that you apply a sort order.
The available sort keys are given in the table below.
t | Orders alternative words alphabetically. |
l | Orders words by their language. The language of a word is <ISO-language-code>_<ISO-country-code>_<variant>. For example, Nynorsk is no_NO_NY. See http://java.sun.com/j2se/1.4.2/docs/api/java/util/Locale.html for more details. This sorting is most useful when accessing the data programmatically. For user interfaces, the l2 sorting order is typically more useful. |
s | Sorts alternative words by their score, placing words with a higher (better) score first. The score reflects how close a word alternative is to the original word in the query. Please note that the score feature is only partially implemented and available only for certain types of word alternatives. |
w | Sorts alternative words by the corresponding original word they relate to. Using this sort key, all alternatives for the first original word appear before alternatives for the second original word and so on. |
l2 | Sorts alternative words by how close their language is to the language used in the original query. Alternative words that are in the same language appear first, followed by those that are in a closely related language. More distant languages come later. |
mc | Sorts alternative words by their category. Categories are sorted in this order: § Same-language alternatives § Translations. At present, this performs essentially the same function as the l2 sort order. However, future versions will include additional categories, such as word correction. |
mt | Sorts alternative words by what type of alternative they are. The types are sorted in this order, which has a bias towards putting closely related types before more distant types: § Base forms (pbase) § Inflections of a word (plemifl) § Alternative inflection spellings (pifl) § Near synonyms (psynn) § General synonyms (psyng) § Translations (povs) |
Putting an exclamation mark in front of the sort key reverses the order. For example, “l,!t” sorts words by language and then in reverse alphabetical order.
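To make the multi-level and reversed ordering concrete, the following Python sketch applies a psort-style key list to a small in-memory list. It only illustrates the semantics described above and is not the service's implementation: the function name, the dictionary representation of a word and the sample data are assumptions, and only keys that map directly to a stored attribute (here t and l) are handled.

```python
def sort_alternatives(words, psort):
    """Sort word dictionaries by a comma-separated list of sort keys.

    A leading '!' on a key reverses the order for that key. An empty psort
    leaves the words in their given order.
    """
    result = list(words)
    keys = [key for key in psort.split(",") if key]
    # Apply the least significant key first; Python's sort is stable, so
    # earlier keys in the list end up taking precedence.
    for key in reversed(keys):
        descending = key.startswith("!")
        attribute = key.lstrip("!")
        result.sort(key=lambda word: word[attribute], reverse=descending)
    return result


# Hypothetical alternatives: 't' is the word text, 'l' the language.
alternatives = [
    {"t": "bil", "l": "no_NO_NY"},
    {"t": "car", "l": "en"},
    {"t": "auto", "l": "en"},
]
ordered = [w["t"] for w in sort_alternatives(alternatives, "l,!t")]
```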
pgroup – defines how alternative words are grouped. When displaying alternative words, they can be shown as a simple list of words (the order of this list is specified by the psort parameter). Alternatively, the words can be grouped under various headings. For example, specifying grouping as “l” (language) will display the language for the alternatives.
Grouping works by gathering together words that have the same value for the grouping key (language, word text etc.). Thus grouping and sort order are related: if you are grouping on an attribute (say language) then you will also use a sort key for that attribute. (If the sort order is different from the grouping order, then some headings may appear more than once, which can be confusing.)
The alternative words produced by association search can be sorted and grouped for easier reading and structured presentation. Sorting the words is done by comparing various attributes of each word with the others in the list. For example, sorting by word text will sort the words alphabetically.
Grouping takes all words with the same value of the grouping attribute (word text, language, score etc.) and places these under a group heading.
As you might expect, the sort order and grouping structure are closely related. For example, if you wish to group the alternatives first by language and then by original word, then the sort key should include at least sorting by language and then original word.
It is possible to have grouping and sorting not follow the same pattern, but the results may not be what you would expect.
It may seem redundant to specify sorting and grouping separately. The reasons for doing this are:
§ The number of levels of sorting keys can be different from the number of groups. The additional sorting keys can be used to control the order of groups using a non-grouped attribute, or can be used to sort words within a group.
§ The grouping key only requires knowledge of the attribute, whereas sorting also requires knowledge of the specific sort order. Some attributes can be sorted by different criteria (for example, language has two different sort keys, l and l2).
An example: we wish to show the alternatives grouped by language and original word. We want the languages to be sorted by relevance to the original query. Additionally, the list of words for each original word should be sorted alphabetically. This is done by setting:
§ psort=l2,w,t – sort by language relevance, then original word number, and then word text.
§ pgroup=l,w – group by language and then original word.
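The relationship between sorting and grouping can be illustrated with a minimal Python sketch. It is not TemaSearch code: the record fields (text, language, original) and the function names are hypothetical, and plain attribute values stand in for the real sort keys such as l2. It assumes grouping simply collects adjacent words that share a value, which is why a sort order that does not match the grouping repeats headings.

```python
from itertools import groupby


def group_words(words, sort_keys, group_keys):
    """Sort words by sort_keys, then nest them under headings for group_keys."""
    ordered = sorted(words, key=lambda w: tuple(w[k] for k in sort_keys))
    return _nest(ordered, group_keys)


def _nest(words, group_keys):
    if not group_keys:
        return [w["text"] for w in words]
    key, rest = group_keys[0], group_keys[1:]
    # groupby only merges adjacent items, so a list that is not sorted
    # on the grouping attribute yields the same heading more than once.
    return [(heading, _nest(list(items), rest))
            for heading, items in groupby(words, key=lambda w: w[key])]


# Hypothetical alternatives; "original" is the number of the original word.
words = [
    {"text": "biler", "language": "nb", "original": 1},
    {"text": "bilar", "language": "nn", "original": 1},
    {"text": "auto", "language": "nb", "original": 1},
    {"text": "hus", "language": "nb", "original": 2},
]

# Sort keys cover the grouping keys: each heading appears once.
matched = group_words(words, ["language", "original", "text"],
                      ["language", "original"])

# Sorted by text only but grouped by language: the "nb" heading repeats.
mismatched = group_words(words, ["text"], ["language"])
```

In the second call the words are in alphabetical order, so the language changes back and forth along the list and each change starts a new heading.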
When association search has determined the alternative keywords available, it uses these to produce a number of search queries based on those keywords. For example, when a user clicks on a word to produce a new search, the new search uses a query produced by a query builder.
s | single word | A new query is produced for each word alternative, comprising the original query plus a single word alternative. This is the type of query that adds words one by one to the search. |
m | multiple words | A new query is produced for a number of word alternatives taken from a group. |
ms | single and multiple words | Combines single and multiple. For a group of words, one multiple word query is produced containing all alternative words in the group, and a number of single word queries are produced, one for each alternative word. This is used to produce single selectable words, and an [all] selection for selecting groups of words in one hit. |
r | replace words | A new query is produced for each word alternative. The word alternative is included in the query, but the original word is not. This effectively replaces the original word in the query. This type is suitable for search engines that have no support for Boolean search. |
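The four query types can be sketched in Python as follows. This is an illustrative model only, not the actual query builder: the function name, its arguments, and the representation of a query as a list of words are assumptions, and the way words are joined for a particular search engine is not modelled.

```python
def build_queries(original_words, target, alternatives, qtype):
    """Return candidate queries (lists of words) for one group of alternatives.

    target is the original word that the alternatives belong to.
    """
    # One query per alternative: the original query plus that alternative.
    singles = [original_words + [alt] for alt in alternatives]
    # One query holding the original query plus every alternative in the group.
    multiple = [original_words + list(alternatives)]
    if qtype == "s":
        return singles
    if qtype == "m":
        return multiple
    if qtype == "ms":
        return multiple + singles
    if qtype == "r":
        # Replace the original word with the alternative instead of adding it.
        return [[alt if word == target else word for word in original_words]
                for alt in alternatives]
    raise ValueError("unknown query type: " + qtype)


single = build_queries(["rask", "bil"], "bil", ["biler", "auto"], "s")
replaced = build_queries(["rask", "bil"], "bil", ["biler", "auto"], "r")
combined = build_queries(["rask", "bil"], "bil", ["biler", "auto"], "ms")
```

Here the ms type yields three queries: one containing both alternatives (the [all] selection) and one for each individual alternative.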
The m and ms query types produce queries containing multiple words. The set of alternative words that is put into a multiple word query is determined by the pqgrpidx parameter. By default, all word alternatives produced are placed in a single multiple word query. By setting the pqgrpidx parameter, a multiple word query can be created for each group at a given grouping level.
pqgrpidx – Sets the group level for turning words into queries.
NB: If the parameter is set to a value greater than the number of grouping levels, no warning is given and no alternatives are produced. This is because the query builder never finds the specified group level, and so no queries are produced.
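The effect of the group level can be sketched in Python. The nested dictionary below is a hypothetical stand-in for the grouped alternatives (language, then original word), and the numbering, with level 0 meaning "all words in one query", is an assumption made for illustration rather than the documented value range of pqgrpidx.

```python
def words_in(node):
    """Flatten a group (nested dict) or a leaf (list of words) into one list."""
    if isinstance(node, list):
        return list(node)
    return [word for child in node.values() for word in words_in(child)]


def word_sets_at_level(node, level):
    """Return one word set per group found at the given grouping level."""
    if level == 0:
        return [words_in(node)]
    if isinstance(node, list):
        # The grouping levels ran out before the requested level was reached.
        return []
    return [word_set
            for child in node.values()
            for word_set in word_sets_at_level(child, level - 1)]


# Two grouping levels: language, then original word.
tree = {
    "nb": {"bil": ["auto", "biler"], "hus": ["bolig"]},
    "nn": {"bil": ["bilar"]},
}

all_in_one = word_sets_at_level(tree, 0)
per_language = word_sets_at_level(tree, 1)
too_deep = word_sets_at_level(tree, 3)
```

With only two grouping levels, asking for level 3 finds no groups, so the result is an empty list and no queries would be built from it.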
psm – Word Selection Mode
§ Delayed (two-step) – The user can click to select or deselect words. A new search is run only when the user submits the selection.
§ Immediate (single-step, default) – When the user selects a word or group of words, the selection is immediate and a new search is started.
If you are using delayed select, be sure that the association search results are not placed inside an existing FORM. The selection user interface for two-step select is itself implemented using a FORM. Including this inside an existing FORM will result in nested forms. This is not allowed by the W3C HTML standards and is best avoided, as support and behaviour vary from browser to browser.
plevel – Sets the pipeline level
Controls the length of the transformation pipeline that transforms the XML output from association search.
plocale – Sets the language for the user interface.
Association search includes various messages as part of the standard user interface. The language used for these messages is specified by this parameter. The messages are available in these locales as standard:
no_NO_B | Bokmål |
no_NO_NY | Nynorsk |
If you are using the locally installed service, you can also add support for additional languages. (See XXX.)
If no value is specified for the parameter, the default value is determined as follows:
§ If the language of the original query was given, or determined automatically, the interface language is set to that.
§ Otherwise, the language set in the host operating system is used.
pqtrunc – Sets the maximum number of characters
to use for the query in the user interface. When the user enters a long query,
this can cause the association search title box to become considerably larger
than the rest of the interface. As the query often appears elsewhere on the
page, the query can be truncated to save space.
The default value
is 30 characters.
Mandatory parameters are parameters that do not have a default value. Such parameters must be defined, or the request will fail with an error message indicating the missing parameter.
Summary |
Alters options to comply with licenced features. |
||||
Details |
If an option is selected for a feature that is not licenced, normally an error is produced. (The result of the error depends upon the pnoerr parameter, but at the very least, no alternatives will be added to the query.) Setting this parameter allows the search to continue using those features that are licenced, ignoring the request for unlicenced features. In a production environment, you would typically set this value to true so that attempted use of unlicenced features does not stop TemaSearch from being used for those features that are licenced. |
||||
Values |
|
||||
Default |
true |
Summary |
Adds the base form of an inflected form. |
||||
Details |
When the query includes an inflected word, the non-inflected form of that word is added to the query. If stemming is active, this parameter has no effect. |
||||
Values |
|
||||
Default |
0 |
Summary |
Indicates the character set expected by the search engine. |
||||
Details |
This is usually the same as the character set declared in the page's meta element. For example, <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> defines the character set to be "utf-8", so pcharset should be set to "UTF-8". |
||||
Values |
|
||||
Default |
ISO-8859-1 |
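To see why the character set matters, the following Python sketch shows how the same query is percent-encoded differently under UTF-8 and ISO-8859-1. It uses only the standard library and makes no claim about how TemaSearch encodes queries internally. The sample word is arbitrary.

```python
from urllib.parse import quote

query = "blåbær"

# The same text produces different bytes, and so different URL encodings,
# depending on the character set used.
utf8_form = quote(query, encoding="utf-8")
latin1_form = quote(query, encoding="iso-8859-1")

print(utf8_form)    # bl%C3%A5b%C3%A6r
print(latin1_form)  # bl%E5b%E6r
```

A search engine expecting one of these forms will not recognise the non-ASCII letters in the other, which is why the character set given to TemaSearch should match the one the search page declares.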
Summary |
Enables or disables TemaSearch. |
||||
Values |
|
||||
Default |
1 |
Summary |
Includes alternative inflections. |
||||
Values |
|
||||
Default |
1 |
Summary |
Provides a hint as to the language the user is using to enter the search query. |
||||||
Details |
When the search query could be either Bokmål or Nynorsk, TemaSearch assumes the language given by this parameter. Note that if the language of the input query is not ambiguous, this parameter has no effect. This parameter is useful if the user base is strongly biased towards one language or the other. |
||||||
Values |
|
||||||
Default |
(Empty string) |
Summary |
Selects the language for new words added to the query. |
||||||
Details |
This parameter is usually set if the documents being searched are all written in the same language. Setting this parameter to that language ensures that the rewritten query only contains words in that language. If the documents being searched comprise a mix of languages, this parameter should not be set, so all relevant languages are included. While similar, this is not the same as disabling translation. For example, assume the output language is set to Bokmål. A Nynorsk user will still require translation (to Bokmål) while a Bokmål user will not. So, translation is used if the text the user types needs translating to the output language. |
||||||
Values |
|
||||||
Default |
(Empty string) |
Summary |
Adds all inflections of a non-inflected word. |
||||
Details |
If the query includes a non-inflected word, then inflections of that word are added to the query. Inflected forms are output even if stemming is active (pstem=1), though you typically will not enable this parameter if stemming is active. |
||||
Values |
|
||||
Default |
0 |
Summary |
Limits the total number of words that are included in the query. |
||||
Details |
This is useful for keeping the overall number of words to a reasonable level, especially when the search engine accepts only up to a specific maximum number of words. Often, if the number of words goes over this limit, the additional words are ignored, which may cause important words in the query to be dropped. When TemaSearch can provide more words than there is room for, it discards the least relevant words to ensure that all the original words and the 'best' alternatives are included in the query, without going over the total-word limit. |
||||
Values |
|
||||
Default |
-1 |
Summary |
Limits the number of words that TemaSearch will add to each word in the query. |
||||
Details |
If many options are selected, such as both translations and synonyms, some words may have a large number of possible alternatives. In some cases, a large number of alternatives can make the search less precise. Setting this value to a lower number can help maintain a high search precision. When there are more alternatives for a given word than are allowed by this parameter, the least relevant words are discarded until the number of alternative words is within this limit. |
||||
Values |
|
||||
Default |
4 |
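The two limits described above (alternatives per word, and total words in the query) can be sketched together in Python. This is a simplified model, not the TemaSearch implementation: the function name, the numeric relevance scores, and the reading of -1 as "no limit" are assumptions made for illustration.

```python
def trim_alternatives(originals, alternatives, per_word=4, total=-1):
    """Apply the per-word limit, then the total-word limit.

    alternatives maps each original word to a list of (word, score) pairs,
    where a higher score means a more relevant alternative.
    """
    kept = []
    for word in originals:
        ranked = sorted(alternatives.get(word, []),
                        key=lambda pair: pair[1], reverse=True)
        # Per-word limit: discard the least relevant alternatives.
        kept.extend(ranked[:per_word])
    if total >= 0:
        # Total limit: the original words always stay, and the remaining
        # room is filled with the best alternatives overall.
        room = max(total - len(originals), 0)
        kept = sorted(kept, key=lambda pair: pair[1], reverse=True)[:room]
    return list(originals) + [word for word, _ in kept]


# Hypothetical alternatives and scores for a one-word query.
candidates = {"bil": [("auto", 0.9), ("biler", 0.8), ("vogn", 0.3),
                      ("kjerre", 0.2), ("doning", 0.1)]}

per_word_only = trim_alternatives(["bil"], candidates, per_word=4)
with_total = trim_alternatives(["bil"], candidates, per_word=4, total=3)
```

With a total limit of three words, the original word takes one place and only the two highest-scoring alternatives remain.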
Summary |
Controls error reporting. |
||||
Details |
This parameter is intended as a |
||||
Values |
|
||||
Default |
0 |
Summary |
Includes translations (Nynorsk or Bokmål). |
||||
Values |
|
||||
Default |
1 |
Summary |
The name of the parameter in the original form that holds the search query. |
Details |
This usually corresponds to an input field of type "text". |
Summary |
Indicates how the search engine expects the query to be written. |
||||||||||||||
Details |
This is used by TemaSearch to ensure new words can be added in a way that is compatible with the search engine. Choose the syntax that most closely matches the features of your search engine. The most important features are how the search engine expects mandatory and alternative words to be written. If a syntax includes features that are not available with your search engine, that is not a problem: the features only indicate what TemaSearch would understand should a user use them in a query. TemaSearch only changes the query using the Boolean OR operator, in the form in which it is written for your search engine. All syntaxes allow the use of the word prefixes + and - to specifically include or exclude words. TemaSearch does not expand these words. |
||||||||||||||
Values |
|
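The expansion rule can be sketched in Python as follows. The parenthesised "word OR alternative" output is a generic syntax assumed for illustration. The real output depends on the syntax selected for your search engine, and the function name and arguments are hypothetical.

```python
def expand_query(query, alternatives):
    """Add alternatives with OR, leaving +word and -word terms untouched."""
    parts = []
    for word in query.split():
        alts = alternatives.get(word, [])
        if word[0] in "+-" or not alts:
            # Explicitly included or excluded words are never expanded.
            parts.append(word)
        else:
            parts.append("(" + " OR ".join([word] + alts) + ")")
    return " ".join(parts)


expanded = expand_query("bil +hus -tog",
                        {"bil": ["biler", "auto"], "hus": ["bolig"]})
print(expanded)  # (bil OR biler OR auto) +hus -tog
```

Only the unprefixed word receives alternatives. The +hus and -tog terms pass through unchanged, even though alternatives exist for hus.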
Summary |
Controls stemming. |
||||
Details |
An inflected word (such as a plural noun, or past-tense verb) produces alternative words that are similarly inflected. Some search engines automatically search for inflections of a word, and so there is little need to include alternative words as inflections, as the non-inflected form is sufficient. Use of stemming with search engines that do not search for inflections, that is, engines that perform an exact match, can help reduce the number of search terms for a more focused result. |
||||
Values |
|
||||
Default |
0 |
Summary |
Includes general synonyms. General synonyms are words that are close but not always identical in meaning for all senses of the original word. This is typically used to expand the search into related areas. |
||||
Values |
|
||||
Default |
0 |
Summary |
Includes near synonyms. Near synonyms are words that are identical, or almost identical in meaning to the original word. These mostly include spelling variations for a given word. |
||||
Values |
|
||||
Default |
1 |
Summary |
The fully qualified URL of the search results page. |
Details |
This is usually the same as the action attribute, though this parameter must be fully-qualified, so protocol and server name will need to be included in the parameter if not present in the action. |
Summary |
The name of the TemaSearch account to use to gain access to TemaSearch services. |
Summary |
Lists all TemaSearch parameters added to the form. |
Details |
The FORM element contains parameters for your search engine and parameters for TemaSearch. In order that TemaSearch can locate the information it needs, all TemaSearch parameters must be listed in the px parameter, separated by spaces. For example, if the parameters "pq" and "puser" were added, then the "px" parameter should be defined as "px pq puser" (note that the list includes the px parameter itself). If any of the TemaSearch parameters have to be renamed to avoid clashes with existing parameters, the renamed parameter should appear in the list. (See resolving parameter conflicts for details.)
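As a small illustration, the value of px could be assembled like this in Python. The helper name and the renaming map are hypothetical, and placing px first simply follows the "px pq puser" example above. Whether the order matters is not stated here.

```python
def build_px(param_names, renamed=None):
    """Build the space-separated px value, with px itself listed first.

    renamed maps a standard parameter name to the name actually used in
    the form, for parameters renamed to avoid clashes.
    """
    renamed = renamed or {}
    names = ["px"] + [name for name in param_names if name != "px"]
    return " ".join(renamed.get(name, name) for name in names)


px_value = build_px(["pq", "puser"])
px_renamed = build_px(["pq", "puser"], renamed={"pq": "tspq"})
```

The first call gives "px pq puser". In the second, the hypothetical renamed parameter tspq appears in the list in place of pq.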
The CSS selectors used by association search are described below. The selectors and structure used in the user interface are more complex than would be necessary if all browsers supported the CSS 2 selectors.
Selector | Description | Example Uses |
#tsresult | Root DIV containing all association search content. | Can apply TemaSearch-wide formatting at this level. |
.tsresultsnone | Class indicating no results were available. Set on the outermost element. | Style the UI differently when there are no results (e.g. hide it completely). |
.tsresultsnomore | Class indicating that the user has selected all available words. Set on the outermost element. | |
.tsresults | Alternative words are available. Set on the outermost element. | |
DIV.tsheading1 | Heading text, contains overall information for the words returned. | |
SPAN.query | User query text | Can highlight the query, as it is variable data. |
.tsbody | Class for the main bulk of the user interface, apart from the headings and footer. | |
DIV.immediate-select | Container for immediate selection results | |
#groupc-<class>-<instance>0 | Container for a grouping instance. | |
DIV.groupN | Group container class for level N | Apply formatting to all groups at a given level. |
#group-<class>-<instance> | Group for specific class and instance. | |
A.tswordlink | Link to directly modify query. | Custom hover, highlight, etc. |
DIV.tsactionGroup | Container for command buttons associated with word alternatives | |
SPAN.grouptitle | Outer SPAN enclosing group titles | Apply styles to all group titles. |
SPAN.grouptitleN | Inner SPAN enclosing group title for a specific group. | Apply styles to specific group titles. |
.tsimsel | Immediate selection container. Can contain groups and selection words. | |
.tsdelsel | Delayed selection (using input elements and forms to build query) | |
.tsdelsel FORM | Input form for delayed selection. | |
SPAN.product | Class assigned to text describing the product. | Used to highlight product name. |