Hebrew search: not a trivial task (HebMorph)

Itamar Syn-Hershko

Abstract

Search engines available today become almost useless when they need to process Hebrew text. Even effective open-source solutions like Lucene/Solr give up in despair when handed a Hebrew corpus to index.

HebMorph is an open-source project with the ultimate goal of solving this problem in the best way possible. An in-depth understanding of how search engines work triggers ideas, ideas become software components, and those in turn are tested against various state-of-the-art search engines. HebMorph already has a production-ready solution with several happy users. It is now in a phase of further development and relevance testing.

This talk is all about practical Hebrew search. We will see how search engines work, and learn how we can use their power to make our applications better for us and for our users. This will give us a glimpse of the large array of issues Hebrew poses to search engines, and after seeing some real-world examples we will get to know how HebMorph approaches them.
