java - Lucene Search with Unicode Characters


I have indexed a database of texts, and the texts are Unicode-encoded. When I search for an English term with Lucene, everything works fine. But when I use a non-English query such as "تو", I get the following exception:

Exception in thread "main" org.apache.lucene.queryParser.ParseException: Cannot parse '??': '*' or '?' not allowed as first character in WildcardQuery
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
    at Search.main(Search.java:151)
Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse '??': '*' or '?' not allowed as first character in WildcardQuery
    at org.apache.lucene.queryParser.QueryParser.getWildcardQuery(QueryParser.java:923)
    at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1347)
    at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1250)
    at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1178)
    at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)
    ... 1 more

What should I do?

Thank you

Two points here:

  • Make sure the encoding of your source files (*.java) is UTF-8.
  • Java's default encoding may be something other than UTF-8, so make sure you specify the encoding explicitly when reading text, for example:

    new InputStreamReader(new FileInputStream(filename), "UTF-8");
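To illustrate both points, here is a minimal sketch using only the JDK (no Lucene). The class name `Utf8QueryDemo` and its helper methods are hypothetical, made up for this example. It decodes a byte stream explicitly as UTF-8, and it shows how a non-Latin query can turn into `??` when it passes through a charset that cannot represent it. That `??` is most likely what the query parser saw in the exception above, where `?` is treated as a wildcard character. The query string is written with Unicode escapes so the example does not depend on the encoding of the source file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class Utf8QueryDemo {

    // Reads the first line of a stream, decoding the bytes explicitly as UTF-8
    // instead of relying on the platform default encoding.
    public static String readQuery(InputStream in) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    // Simulates the failure mode: encoding text with a charset that cannot
    // represent it replaces every unmappable character with '?'.
    public static String throughAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        String query = "\u062A\u0648"; // the two-letter query from the question

        // Round trip through UTF-8 bytes with an explicit charset: text survives.
        byte[] utf8 = query.getBytes(StandardCharsets.UTF_8);
        String decoded = readQuery(new ByteArrayInputStream(utf8));
        System.out.println(decoded.equals(query)); // prints "true"

        // Round trip through a charset that lacks these letters: text is lost.
        System.out.println(throughAscii(query));   // prints "??"
    }
}
```

If the query string reaches the parser as `??`, the fix belongs wherever the text is read or compiled (source file encoding, console input, file readers), not in the Lucene call itself.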

