Can Infocenter use a search engine that supports Hebrew? [message #475431] Wed, 25 March 2009 18:09
Eli Lato
So far as I can tell, Infocenter's search engine is Lucene, and Lucene
doesn't support Hebrew.

Can I tell Infocenter to use a search engine that can find Hebrew text?
I'd be happy to give up Lucene's sophisticated search in order to be able
to find simple Hebrew text.

Thanks,
Eli
Re: Can Infocenter use a search engine that supports Hebrew? [message #475437 is a reply to message #475431] Mon, 30 March 2009 21:30
Chris Goldthorpe
Currently the help system has analyzers for Brazilian, Chinese, Czech,
German, Greek, French, Dutch, Russian and English. What this means is
that the search understands something about how words are constructed in
those languages and how to recognize endings such as "s" for plural and
"ed" for past participle.

For every other language the search is less sophisticated: it
will require an exact match to be in the text, and will only recognize a
limited number of separator characters.
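
To make the difference concrete, here is a minimal sketch in Python of exact-token matching versus matching after stripping a suffix. It is a toy illustration only and not the Lucene or Eclipse help implementation; the suffix list, the separator set, and the function names are all assumptions made up for this example.

```python
import re

# Toy suffix list for illustration only (assumption); real analyzers
# apply far richer, language-specific rules.
SUFFIXES = ("ed", "s")


def tokenize(text):
    """Split on whitespace and a small, assumed set of separator characters."""
    return [t.lower() for t in re.split(r"[\s.,;:!?]+", text) if t]


def stem(token):
    """Strip one known suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def matches_exact(query, text):
    """Language-neutral fallback: the query must equal a token in the text."""
    return query.lower() in tokenize(text)


def matches_stemmed(query, text):
    """Language-aware search: compare stems instead of raw tokens."""
    return stem(query.lower()) in {stem(t) for t in tokenize(text)}


doc = "The plugins were installed."
doc_he = "שלום עולם"

print(matches_exact("plugin", doc))    # exact matching misses the plural form
print(matches_stemmed("plugin", doc))  # stemming maps "plugins" to "plugin"
print(matches_exact("שלום", doc_he))   # whole Hebrew words still match exactly
```

Under these assumptions, a language without an analyzer would still be searchable by whole words, as long as the words are separated by characters the tokenizer recognizes.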

I have no idea how well or badly this works for Hebrew; since you posted
this, I'm guessing not so well. You may want to check whether
Lucene has a Hebrew analyzer. If so, it would not be difficult to hook
it into the current mechanism, which uses extension points.


Eli Lato wrote:
> So far as I can tell, Infocenter's search engine is Lucene, and Lucene
> doesn't support Hebrew.
>
> Can I tell Infocenter to use a search engine that can find Hebrew text?
> I'd be happy to give up Lucene's sophisticated search in order to be able
> to find simple Hebrew text.
>
> Thanks,
> Eli
>
Re: Can Infocenter use a search engine that supports Hebrew? [message #475442 is a reply to message #475437] Tue, 31 March 2009 11:43
Eli Lato
Chris,
You explained that Lucene can find simple Hebrew text, but my search can't
find any Hebrew text at all. This hints that I'm probably not configuring
the Unicode encodings correctly in the plugin. So my next stop is Lucene
to see how to configure it.
Thanks for a very helpful answer!
Eli
Re: Can Infocenter use a search engine that supports Hebrew? [message #475469 is a reply to message #475442] Mon, 06 April 2009 20:08
Gerardo Laster
Hi,
We have a similar issue with extended character support. The search engine
works on infocenters running on Windows, but does not work on Solaris.
Are there any server settings that need to be modified so the search
engine detects Japanese and extended characters?
Thanks
Gerardo
Re: Can Infocenter use a search engine that supports Hebrew? [message #475471 is a reply to message #475469] Mon, 06 April 2009 22:30
Chris Goldthorpe
You shouldn't need to do anything on the server side to handle non-ASCII
characters as long as the HTML or XHTML document specifies the charset
used. Is the problem only with non-ASCII documents or do you see it in
all languages?
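
As a rough illustration of why the declared charset matters, the Python sketch below decodes the same bytes twice: once using the charset named in the document's meta tag, and once using a fallback encoding. This is not how the Eclipse help indexer works internally and is not a diagnosis of the reported problem; the helper name `declared_charset`, the sample page, and the ISO-8859-1 fallback are assumptions for the example.

```python
import re


def declared_charset(raw_bytes, default="iso-8859-1"):
    """Return the charset named in a <meta> tag, or an assumed fallback."""
    # Only the ASCII part of the head is needed to find the declaration.
    head = raw_bytes[:1024].decode("ascii", errors="ignore")
    match = re.search(r"charset=[\"']?([A-Za-z0-9_\-]+)", head, re.IGNORECASE)
    return match.group(1).lower() if match else default


# A hypothetical help page saved as UTF-8 that declares its charset.
page = (
    '<html><head><meta content="text/html; charset=UTF-8" '
    'http-equiv="Content-Type"/></head><body>検索</body></html>'
).encode("utf-8")

right = page.decode(declared_charset(page))  # honours the declaration
wrong = page.decode("iso-8859-1")            # ignores it (assumed fallback)

print("検索" in right)  # the search term is present in the decoded text
print("検索" in wrong)  # same bytes, wrong charset: the term cannot match
```

If text were ever indexed or queried under the wrong encoding, ASCII-only words would still match while words with extended characters would not.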


Gerardo Laster wrote:
> Hi,
> We have a similar issue with extended character support. The search
> engine works on infocenters running on Windows, but does not work on
> Solaris.
> Are there any server settings that need to be modified so the search
> engine detects Japanese and extended characters?
> Thanks
> Gerardo
>
Re: Can Infocenter use a search engine that supports Hebrew? [message #475473 is a reply to message #475471] Tue, 07 April 2009 16:22
Gerardo Laster
Hi Chris,
The way the search "works" right now is that if you search for a word with
no extended characters it shows results, but if you search for a word
with extended characters it does not show any results. Of course with JA,
KO etc., the problem is critical as all the characters are non-ASCII.
Our content is defined as
<html xmlns="http://www.w3.org/1999/xhtml" lang="ja-jp" xml:lang="ja-jp">
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>

And this only affects the search feature.
Thanks,
I am going to check the info on the other thread.
Re: Can Infocenter use a search engine that supports Hebrew? [message #475478 is a reply to message #475473] Thu, 09 April 2009 21:20
Chris Goldthorpe
Can someone file a bug report on this with a simple test plug-in
containing documentation and the exact steps you took (for example, whether
you searched from the help view or from the help browser)? I have not heard of
this problem before.

Chris

Gerardo Laster wrote:
> Hi Chris,
> The way the search "works" right now is that if you search for a word with
> no extended characters it shows results, but if you search for a
> word with extended characters it does not show any results. Of course
> with JA, KO etc., the problem is critical as all the characters are
> non-ASCII.
> Our content is defined as <html xmlns="http://www.w3.org/1999/xhtml"
> lang="ja-jp" xml:lang="ja-jp">
> <meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>
>
> And this only affects the search feature.
> Thanks,
> I am going to check the info on the other thread.
>
>
Re: Can Infocenter use a search engine that supports Hebrew? [message #475480 is a reply to message #475478] Fri, 10 April 2009 21:21
Gerardo Laster
Hi Chris,
I opened the bug report
https://bugs.eclipse.org/bugs/show_bug.cgi?id=271924 for this issue.
Thanks
Gerardo

Chris Goldthorpe wrote:

> Can someone file a bug report on this with a simple test plug-in
> containing documentation and the exact steps you took (i.e. did you
> search from the help view or from the help browser)? I have not heard of
> this problem before.

> Chris