This page last changed 24 April 2008

Boston, Massachusetts, April 28-29, 2008

Program

 

This annual meeting provides a forum and point-of-reference for all those interested in the intricacies of Search and Retrieval. The meeting draws those with a professional interest in search engines -- such as search engine designers and developers -- and those interested in applying search engines in their own professional environments. Search is at the heart of information retrieval, and the Search Engine Meeting provides an annual point of reference as to what is happening in this fast-moving and exciting field.

All presentations are given sequentially; there are no parallel sessions or parallel presentations at this meeting.



  Monday April 28  

09.00. CONFERENCE OPENING

Charles Clarke – Day One Opening Talk
University of Waterloo, Canada
XML Retrieval: Problems and Potential

While XML is not an ideal vehicle for capturing and exploiting document structure in search, it does provide a common ground for addressing a number of related retrieval problems across unrelated document types and collections. Using examples of retrieval over collections of books and journals, this talk outlines methods for focused retrieval: returning the right parts of documents, not just the right documents. In the case of books and journals, these parts may range from paragraphs and pages to entire volumes. The talk also discusses the evaluation of these focused retrieval methods. In particular, the talk describes INEX (the INitiative for the Evaluation of XML Retrieval), an ongoing forum for evaluating focused retrieval. Now in its sixth year, INEX annually brings together an international group of researchers to compare methods using common test collections.
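To make "returning the right parts of documents" concrete, here is a minimal sketch in Python. It is not a method from the talk or from INEX: the sample book, the function name `focused_retrieve`, and the length-normalized scoring rule are all assumptions chosen for illustration. Every XML element is scored as a candidate answer, so a single paragraph can outrank the chapter or book that contains it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from any real collection.
BOOK = """<book><title>Search Engines</title>
<chapter><title>Indexing</title><para>Inverted index construction.</para></chapter>
<chapter><title>Ranking</title><para>Ranking XML elements for focused retrieval.</para>
<para>Evaluation uses test collections.</para></chapter></book>"""

def focused_retrieve(xml_text, query):
    """Return the single XML element that best matches the query, or None."""
    terms = set(query.lower().split())
    root = ET.fromstring(xml_text)
    best_score, best_elem = 0.0, None
    for elem in root.iter():  # every element, from the whole book down to one paragraph
        words = " ".join(elem.itertext()).lower().split()
        if not words:
            continue
        matches = sum(1 for w in words if w.strip(".,") in terms)
        # Assumed heuristic: dividing by sqrt(length) favors short, dense elements.
        score = matches / len(words) ** 0.5
        if score > best_score:
            best_score, best_elem = score, elem
    return best_elem

hit = focused_retrieve(BOOK, "focused retrieval ranking")
print(hit.tag, "->", hit.text)
```

With this scoring rule the matching paragraph wins over its enclosing chapter, which is the behavior focused retrieval aims for. A real system would use a proper ranking model in place of the raw term count.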

Stephen E. Arnold
AIT, Kentucky
Beyond Search: Big Money and User Dissatisfaction as Catalysts for Next-Generation Search Solutions

There are more than 200 companies offering "search" solutions. Some of these are newcomers unknown in the US. Paris-based PolySpot and Budapest-based Tesuji are just two. The $1.2 billion buyout of Fast Search & Transfer, more than Google's advertising billions, has ignited consolidation in search. It is not just money. New research funded by the U.S. government reveals that three out of five enterprise search system users are dissatisfied or very dissatisfied with their present search system. A new study for the Gilbane Group identifies facets of the search business that receive scant attention:

  • Universities are funding search ventures. The goal is not technology transfer. The objective is to create value and, hence, revenue beyond the traditional research grant and licensing models.
  • Newcomers are making inroads against far larger vendors. Companies such as Coveo, Exalead, ISYS and Siderean Software are tallying double digit growth by finding new markets and capturing business from far-larger, higher-profile vendors. Companies like Bitext in Madrid generate revenue by providing established vendors with a way to add natural language processing to their ageing systems, as dtSearch has done with its Bitext relationship.
  • The shift to rich text processing via semantic and statistical techniques is now taking place. Although slow to take off, the "assisted navigation" interface pioneered by Endeca is now giving way to an information dashboard. Endeca's lock on the point-and-click interface and "suggestions" is being challenged by dozens of companies.
  • Search is no longer an option. It is an expected component of other enterprise applications. As a result, search-and-retrieval, findability solutions, and social search are part of the standard enterprise software vendor's product functionality. IBM, Microsoft, Oracle and SAP are pushing downmarket.

The cumulative effect of these trends has significant implications for vendors, procurement teams and users. The two principal changes catalyzed by these trends are that users want increasingly intelligent systems, thus triggering significant opportunities for vendors and an increasing flow of funds into new technologies that will ensure a fast-changing, unstable market for the foreseeable future.

§ Exhibitor Product Highlights: Attivio / Groxis

Steven Forth and Amelia Newbury
Monitor Group, Massachusetts
Search as a Mode of Learning: Requirements for Next Generation Search Systems

In addition to hearing, watching, experiencing and other popular modes of learning, search is an important and fundamental mode for understanding complex subject areas. Learning in complex fields can be understood as the building of concept maps through the exploration of a knowledge space. As learning is often a social act, and learners need to be able to communicate and share their concept maps, there are important social and communication issues at stake here as well. Search provides a compelling way to explore multi-dimensional spaces. This is both a common and growing use of search systems. But the use of search for learning and social learning imposes new requirements on search systems. These needs go beyond the simple optimization of 'findability' of a known piece of information or even a sampling of possibly relevant results. When social and communication aspects are factored in, the current generation of search systems does not provide adequate support for learners. Among other things, search systems need to factor in the "white spaces" in the concept maps and influence the presentation of results to fill these spaces.
This presentation looks at a number of common patterns in search to see how these support learning, and then develops a set of requirements for a search system that provides better support for social learning in complex fields. eMonitor's experiments in the use of semantic constructs to organize learning and performance content to improve searchability will also be discussed.

10:45 Conference Break

Peter Jackson
Thomson Corporation, Minnesota
Blending Retrieval and Categorization Technologies in a Document Recommender System

The task of recommending documents to knowledge workers differs from the task of recommending products to consumers. Variations in search context can undermine the effectiveness of collaborative approaches, while many professionals function in an environment where the open sharing of information may be impossible or undesirable. There is also the 'cold start' problem of how to bootstrap a recommendation capability in the absence of current usage statistics. We describe a fully fielded system called ResultsPlus, which uses a blend of information retrieval and machine learning technologies to recommend secondary materials to attorneys engaged in primary law research based on document metadata. Rankings of recommended material are subsequently enhanced by incorporating both historical user behavior and document usage data.

Terry Clift
ISYS Search Software, Colorado / Australia
Forget “One Size Fits All,” Search is an Iterative Process

Taking a high-level, strategic approach to enterprise search might make sense in specific cases, but it should not be done at the exclusion of tactical deployments that pay immediate dividends while the “big rollout” is still in the configuration stage. Search is an iterative process that, when done correctly, lives and breathes and conforms to the requirements of various user communities over time. Understanding these users, their needs and environments, and the intended goals for search, lays the crucial foundation for any successful implementation. Vendors and customers only set each other up for failure when they assume environments are rigid and that search must have all of the answers from the beginning.
This presentation discusses the iterative approach to search and illustrates the best short-term and long-term strategies for bringing enterprise search into an organization. The process begins with how to identify where search can benefit immediately. The presentation then outlines steps to rolling out search implementations for broader requirements and how to generate lasting, long-term gain without sacrificing the short term.

§ Exhibitor Product Highlights: ISYS

Roger Bradford
Agilex Technologies, Virginia
Semantic Retrieval: Making the Computer do the Heavy Lifting

This presentation covers the range of modern applications of semantic processing to information retrieval. The emphasis is on techniques that reduce the cognitive load on the user. Techniques covered include conceptual retrieval, clustering and categorization, on-the-fly taxonomy generation, and text mining. Examples are taken from applications in industry and government. These applications include patent analysis, legal data discovery, and counter-terrorism analysis. The discussion includes multi-lingual and cross-lingual applications. Although the emphasis is on text, extensions to audio and video data are included. New results are presented that demonstrate the applicability of these techniques to very large document collections.

12:45 Lunch Break

Nigel Hamilton
Trexy, UK
Search Trails - Back to the Future

Each day millions search for the same things and often find themselves repeating their own searches. Would it not be good if we could harness this collective effort and remember the searches and the web pages visited to find information?
This presentation explores how new social search tools impact and assist the online searching community. Trexy.com remembers search trails and shares them anonymously with other searchers. Search trails are the pathways users make when searching on engines such as Google, Yahoo and MSN. But what is the optimal trail for a given search? How can we pass useful trails onto one another? Can search trails help users to pinpoint information? The presentation looks at the technical developments that have led to how we currently view, retrieve, and remember information online.

George Chitouras
Business Objects, California
Using Information Retrieval and NLP techniques to drive Business Intelligence

While traditional Business Intelligence has transformed business using structured information from operational applications and transaction systems, there is a huge source of information that has by and large been ignored: people’s thoughts and opinions, found in communications such as emails, web pages, reports, surveys, customer relationship management note fields, contracts, blogs, and wikis. Whether it is customer complaints, employee feedback, analyst opinions, or competitors' intentions, this potentially valuable information lies hidden in unstructured text sources.
This presentation proposes that the artifacts of text analytics, when used in the aggregate, can drive business intelligence dashboards and measure “sentiment” as it relates to products, companies or marketing initiatives.

§ Exhibitor Product Highlights: Northern Light / Trexy

Sam Chapman
University of Sheffield, UK
Combining Semantics and Keyword Approaches to Enable Flexible Enterprise Search

Keyword search returns are unsuitable for many business uses: reliable quantitative returns are impossible to obtain because the relevance of any query return is uncertain. A bigger issue is that textual information in specialised domains is often repetitive, and the context of information is paramount to its meaning. In such circumstances, standard keyword approaches are not the best way to find and use relevant information.
Semantic approaches offer a method to alleviate this issue by capturing "knowledge" according to a pre-assigned structure (ontology classes and relations). Although these techniques are proven to be helpful in answering precise queries, the complexity of how knowledge is searched and its rigid organisation can sometimes constrain a user, especially considering that not all possible "knowledge" is encoded into a re-usable structured form.
This presentation outlines a flexible approach that combines keyword and semantic search for specialised domains. The user can easily switch between the two approaches, or use both together in a query that mixes structured and unstructured parts to varying degrees, to locate the information needed for quantitative analysis. The presentation focuses on a number of specific examples where this simple patented approach is used in large scale enterprises.

3.45 Conference Break

Jeff Fried
FAST Search & Transfer, Massachusetts
The Next Step in the Confluence of Search and Business Intelligence

Enterprise search (with a heritage in serving ad hoc queries on unstructured data) and BI (traditionally focused on structured inquiry into structured data) have been coming together. A range of capabilities combining search and BI are available and in use. Text mining, search-based “everyday analytics”, and search integrated in BI are examples. New technology that merges traditional database and traditional search cores is coming out of the lab, providing a next step in the search/BI space. This presentation outlines the internal architecture and data management approach of this next-generation core search technology.

Pascal Coupet
TEMIS, Philadelphia
Better Annotations for Text Mining: Using a Knowledge Server

Simple entity recognition is becoming more and more popular as a way to improve user search experiences. We are now used to seeing personal names, places and other entities automatically recognized in texts. These new dimensions can be used for facet navigation, hyperlinking and several types of statistical analysis.
However, quality quickly becomes an issue in production systems because of ambiguities and naming variation. Normalization and disambiguation are a necessity for high quality systems.
The presentation discusses the next generation of entity recognition system which addresses these issues in a customizable way from one customer to another, based on a dedicated knowledge server which stores known entities for a specific project, associated with disambiguation methods and normalization methods. Its contents evolve according to historical annotations and allow customers to correct mistakes that will not be made again by the system. This is a key element in providing strong customization capability for each customer without modification to core annotator products.

5:15. Conference Mixer Cocktail

  Tuesday April 29  

09.00. CONFERENCE OPENING

Jason R. Baron – Day Two Opening Talk
Director of Litigation, National Archives and Records Administration, D.C.
Searching for the Good Lawyer: Emerging Best Practices In The Use of Search and Information Retrieval Methods in E-Discovery

The cost of litigation in the US involving electronically stored information (so-called “E-discovery”) is burgeoning: according to one Forrester study issued in 2006, corporate America is expected to increase its spending on E-discovery from what was $1.4 billion in 2006 to $4.8 billion in 2011. Under the new US Federal Rules of Civil Procedure effective December 1, 2006, both private and public sector litigants increasingly confront a world of preservation orders, injunctions, subpoenas, and other demands for access to exponentially increasing volumes of records and information stored electronically. One set of emerging best practices highlighted by a recent best practices commentary published by The Sedona Conference® involves more serious attention being paid by lawyers and judges to information retrieval issues, and the deficiencies in the way in which documents and ESI routinely are searched for, including by way of “keywords.” This presentation provides a strategic approach to search problems lawyers and their clients face in e-discovery, as well as practical pointers drawn from real cases; it also includes the latest findings from the TREC Legal Track, an international text retrieval project run out of the US National Institute of Standards and Technology.

Marcelline Saunders
Groxis, California
Powering Search Results with Visualization

This presentation discusses trends and benefits of the convergence of search and visualization tools, including brief historical information to provide context. It looks at the latest tools as well as reviews the early adoption and specific use cases in a number of verticals. Finally it discusses how powering search results with visualization will impact the market moving forward, including the impact on today's most popular business models.

Richard Brath
Oculus, Canada
Search, Sense-Making and Visual User Interfaces

Following from research, observations, and interviews we determined that search and sense-making involved many different component tasks and many different workflows through these tasks. Advanced technologies, such as entity extraction, classification, etc., address part of the problem, but significant end-user performance improvement in search and sense-making tasks can also be achieved through innovating the end-user interface to enable better workflows across these different technologies and tasks with a single unified interface. We have created a new interface using visualization techniques called nSpace to implement these ideas including a component called TRIST for interaction with large amounts of results data to help users find the relevant, novel and unexpected; and a component called Sandbox for collecting, organizing and reasoning with pieces of information for sense-making. nSpace uses novel techniques such as linked dimensions for characterization and use of gestures for fluid workflows. The presentation discusses some of the research findings, shows some examples of the nSpace interface, and discusses some of the results and feedback.

10:30 Conference Break

Abe Lederman
Deep Web Technologies, New Mexico
Federated Search: True Enterprise Search

Enterprise Search Software as it is known today, whether from Autonomy, Endeca, FAST or others, cannot provide access to all the information of value at any reasonably sized organization with a single search. Organizational information-content exists in numerous silos accessible through a myriad of individual, incompatible indices-engines. Technical, cost and bureaucratic reasons prevent unifying all these various enterprise silos under one index.
This presentation discusses how state-of-the-art Federated Search software provides actual enterprise (-wide) single point of search-access to most, if not all, of the information repositories of value to an enterprise, including those beyond the firewall.
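The general idea of federated search can be sketched in a few lines of Python: one query is sent to several independent silos, each with its own index, and the ranked result lists are merged into a single list. This is an illustrative sketch only, not Deep Web Technologies' software. The silo names, the `make_silo` helper, and the round-robin merge are hypothetical choices.

```python
from itertools import zip_longest

def make_silo(docs):
    """Build a toy search function over one silo's private document list."""
    def search(query):
        return [d for d in docs if query.lower() in d.lower()]
    return search

def federated_search(query, sources):
    """Send one query to every source and interleave the ranked result lists."""
    per_source = [search(query) for search in sources.values()]
    merged, seen = [], set()
    # Round-robin merge: rank 1 from each source, then rank 2, and so on.
    for tier in zip_longest(*per_source):
        for hit in tier:
            if hit is not None and hit not in seen:  # skip padding and duplicates
                seen.add(hit)
                merged.append(hit)
    return merged

# Hypothetical silos that are never combined into one index.
SOURCES = {
    "intranet": make_silo(["Patent search policy", "Travel policy"]),
    "library": make_silo(["Search engine survey", "Patent search policy"]),
}

results = federated_search("search", SOURCES)
print(results)
```

Round-robin interleaving is the simplest merge that needs no comparable scores across silos. A production system would also have to handle slow or failing sources and per-source access control.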

Spencer Shearer
Exalead, Massachusetts
The Next Big Thing in Search: Hybrid/Vertical Search

Recent trends in search have centered largely on specialized search functions, such as image and video search, signaling the importance of the ability to index specific types of information and different types of content. However, search’s potential does not end there. It will continue to grow, providing users with the ability to combine different forms of content in more effective ways. By connecting sources that were until now considered separate, hybrid/vertical search eclipses traditional text-oriented and directory-based search methods, and is poised to become the search industry’s next big trend.
In this presentation, Bourdoncle offers a technical perspective on the latest technologies to facilitate hybrid/vertical search, which ultimately fosters a simpler, more natural search experience for the user. Bourdoncle also provides insight into his first-hand experience in helping to design Exalead’s hybrid/vertical search solution and will discuss the various opportunities for employing hybrid/vertical search. In addition, the presentation addresses the benefits and challenges of the technologies employed, including entity extraction, real-time indexing, taxonomies and navigation.

Edwin Cooper
InQuira, California
Two Roads Diverged in a Google World

Is Enterprise Search the dull cousin of web-wide search? Is it destined to play follow-the-leader in the innovation game, gradually adapting technologies to the enterprise that were originally created for full web searches by Google, Yahoo, and MSN?
This presentation suggests that Enterprise Search is not just one application of full web search technology, but rather a fundamentally different problem. As such, the technologies of Enterprise Search and full web search are on unavoidably divergent paths; differences in content production, quality, and knowledge of domain will inevitably lead to fundamentally different solutions for the two problems.

12:20 Conference Lunch

The Emergence of Next Generation Systems

Brad Allen
Siderean Software, California
Relational Navigation Brings Social Computing and Semantic Technology to the Enterprise

Enterprise IT managers are increasingly looking for the quickest and most effective ways to bring the best of Web 2.0 and social networking into their organizations. This is a difficult task and many IT managers do not know where to begin and which technologies offer the fastest return on investment in this brave new world. One technology that is showing great promise in terms of bringing together the best of Web 2.0 for the enterprise is relational navigation, because it can bring together the best of the Web, such as tagging, sharing and annotating results to extend the huge investments that have already been made in enterprise content management (ECM), document and database management, and enterprise search.
Fundamentally, relational navigation is about providing a more effective way to find and manage the myriad of content enterprises import, store and export. It improves ECM by leveraging semantic technology and the principles of the social Web to aggregate, organize, manage and navigate an information centric architecture in ways that were never possible before. By focusing on the relationships between people, places and things, relational navigation maintains context and allows participation in the discovery process. In this presentation attendees will hear:

  • The benefits of relational navigation

  • How Web 2.0 is changing the enterprise

  • The future of Web 2.0

Chris Cleveland
Dieselpoint, Illinois
Open Pipeline: An Open Architecture for Document Processing

Open Search, a new XML standard, has simplified the process of standardizing search results from multiple sources. Open Pipeline is a new standard for the indexing side of the equation. It is a simple, common architecture for connectors to data sources, file filters, text analyzers, and modules to distribute documents across a network. Partly an API and partly a feed protocol based on Atom/RSS, it provides an open, non-proprietary way to fetch, parse, analyze, and route documents.
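The four stages named above (fetch, parse, analyze, route) can be pictured as a chain of small, replaceable components. The Python sketch below shows that general shape only. It does not use or reproduce the Open Pipeline API. The `Document` class, the stage functions, and the checksum-based routing rule are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    uri: str
    raw: bytes
    text: str = ""
    tokens: list = field(default_factory=list)

def connector(items):
    """Fetch: wrap (uri, bytes) pairs from a data source as documents."""
    for uri, raw in items:
        yield Document(uri=uri, raw=raw)

def file_filter(docs):
    """Parse: turn raw bytes into text (here, plain UTF-8 decoding)."""
    for doc in docs:
        doc.text = doc.raw.decode("utf-8", errors="replace")
        yield doc

def analyzer(docs):
    """Analyze: lowercase and split the text into tokens."""
    for doc in docs:
        doc.tokens = doc.text.lower().split()
        yield doc

def router(docs, node_count):
    """Route: assign each document to a node using a stable byte checksum."""
    shards = [[] for _ in range(node_count)]
    for doc in docs:
        shards[sum(doc.uri.encode()) % node_count].append(doc)
    return shards

def run_pipeline(items, node_count=2):
    # Each stage consumes the previous stage's output, so stages can be swapped freely.
    return router(analyzer(file_filter(connector(items))), node_count)

shards = run_pipeline([("a.txt", b"Open Search"), ("b.txt", b"Open Pipeline")])
for node, docs in enumerate(shards):
    print(node, [(d.uri, d.tokens) for d in docs])
```

Because each stage is a generator that takes and yields documents, a different connector or analyzer can be substituted without touching the rest of the chain.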

2:45 Break and Final Exhibition

Kelly Stirman
Mark Logic Corporation, California
Classification of XML: Leveraging Semantics and Syntax

In today’s fast moving marketplace, content applications need to quickly adapt to new content sources and new market demands. MarkLogic Server’s XML classifier is a new tool to let content owners:

  • Classify XML at any level of granularity: assign class membership for the whole document, an individual element, or anywhere in between

  • Classify synthetic documents: use the title, abstract, and first paragraph of each section, ignoring the footnotes

  • Classify based on semantics or syntax: leverage indexes for text, structure, or a mixture of both

  • Incorporate classification into the rendering logic: allow classification output to dynamically affect the rendering of content

This presentation describes how MarkLogic exposes its XML classifier through an XQuery interface, and shows a demonstration of the XML classifier at work within a content application built using MarkLogic.


Presentation of the Everett Brenner Award for the Best Paper at the 2008 Search Engine Meeting

Meeting Wrap-up Panel: What we Liked. What we Learned

Two expert industry commentators reflect on what was said during the two days of the 2008 Search Engine Meeting and, with the help of the audience, draw some lessons and conclusions.



Conference Ends at approximately 4.00 pm