How to search for entries?

Started by axin, 08. January 2010, 12:16:43

axin

Dear all,

I'm currently implementing a search-as-you-type / auto-completion function for the Android UI. It works well in simple cases: with the German->Chinese dictionary, entering dampfs suggests Dampfschiff; you click on Dampfschiff, and the corresponding entry (as well as other matches) is shown.

But there are problems with more complicated entries.

Let's take the German-Chinese dictionary again. The user enters gut, and the auto-completion searches the dictionary for gut* and suggests Glück, Gut, segensreich (Adj). The user clicks on the suggestion, the dictionary is searched for Glück, Gut, segensreich (Adj), and no results are found. But when the user just searches for what he entered (gut), Glück, Gut, segensreich (Adj) is of course found.

Now my question: How do I have to format search queries so that the suggested result is found? What parts should I remove? Are there functions ready to do this? (e.g. somewhere hidden in the Normation classes??) I guess I should run the same function on the suggestion that is used to create the search index...?

Please let me know what you think about this.

Thanks,

Achim

Gert

Great that you continue to enhance the Android version of DictionaryForMIDs !!

And thank you for your good postings here in the forum !

Well, about your problem ... I guess I am already too narrow-minded in my views because of the JavaME version ... because I do not really understand what happens in the Android version after "the user clicks on the suggestion":
Quote: The user enters gut, the auto-completion searches the dictionary for gut* and suggests Glück, Gut, segensreich (Adj), the user clicks on the suggestion, the dictionary is searched for Glück, Gut, segensreich (Adj) and no results are found.

Let me try to explain what happens in the JavaME version:

In the JavaME version, when you activate in the Settings dialogue the 'Incremental Search' and the 'Translation List', then after the user enters 'gut', the screenshot 'Search_gut' appears  (step 1).

Then the user can scroll in the 'translation list' and select the entry "Glück, Gut, segensreich (Adj)", see screenshot 'Select_gut'  (step 2).

Finally, when the entry "Glück, Gut, segensreich (Adj)" is selected, the user selects 'Translate' and the translation is shown, see screenshot 'Translation_gut'  (step 3).

The interesting point is that in (step 3) no translation is executed in the translation layer, because all the translations were already retrieved in (step 1). When the 'Translation List' is activated, however, the translations are not shown right away; a translation is shown only after the user selects one entry in the translation list and then selects 'Translate'. So the 'Translate' function in (step 3) actually performs a 'show translation of selected entry'.

Does that explanation on the JavaME version help you ?

Best regards,
Gert

axin

Hi Gert,

thanks for the extensive explanations! If I see it correctly, on JavaME we have a search-as-you-type feature, yes?

That's a cool feature. I was thinking that a complete search of the dictionary might be too time-consuming, so I thought let's make an auto-complete feature when entering search terms, similar to the one on Google.com: you start typing, some suggestions appear in a drop-down box, you click on one of the suggestions, and then the standard search for the suggested term is performed.

Quote from: Gert on 08. January 2010, 20:28:30
Well, about your problem ... I guess I am already too narrow-minded in my views because of the JavaME version ... because I do not really understand what happens in the Android version after "the user clicks on the suggestion":
So after a user clicks on one of the suggested terms, I perform a standard search on the dictionary using the suggested term as the search term. Does this help to answer your question?

To find the suggestions, the search is limited to break after 10 results or 1 second, to quickly show some suggestions. If I do a whole search right away in some cases it takes a while to fill the list...

But back to the formatting question: What happens, if you enter Glück, Gut, segensreich into the JavaME search field? Will it find the corresponding translation? Is this kind of search supported by our underlying data structure?

-Achim

Gert

Achim,

Quote: If I see it correctly, on JavaME we have a search-as-you-type feature, yes?

Yes, the JavaME version has a "search-as-you-type feature", called "incremental search" in the JavaME Settings dialogue.

This incremental search runs a search on the search expression as the user types it, and all hits are shown (either as a 'translation list' or in the normal result display). In general, the search completes rather quickly, depending of course on the search expression and the device. Well, I know that the Android Dalvik VM is rather slow compared to modern Just-in-Time compilers. And in any case, also on JIT-powered JavaME devices, the response time is not satisfactory when there are plenty of hits, for example when a search is done on "a*".

For this reason the Translation layer will break (= cancel) the execution of a search when one of the following is true:
- The number of translation hits is reached as defined in TranslationParameters.maxHitsParam
- The duration of the translation in milliseconds has reached TranslationParameters.durationForCancelSearchParam
- TranslationExecution.cancelLastTranslation is called

When the user updates the search expression, e.g. by adding a character, then the last executing translation is cancelled by calling TranslationExecution.cancelLastTranslation. Note that for incremental searches, the translations are executed by a separate thread, i.e. TranslationParameters.executeInBackgroundParam is set to true.  

The above description concerning cancelling of searches in the Translation layer is not specific to JavaME or any other platform; it is implemented in the Translation layer and therefore available to all platforms.
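
To illustrate the three break conditions listed above, here is a minimal sketch. It is not the actual Translation layer code: the class BoundedSearch, its methods and the plain prefix match over a list are all hypothetical, and they only mimic the roles of maxHitsParam, durationForCancelSearchParam and cancelLastTranslation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; not part of DictionaryForMIDs.
public class BoundedSearch {
    // Set from another thread (e.g. the UI) to abort a running search.
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    public void cancel() {
        cancelled.set(true);
    }

    // Returns the entries starting with 'prefix', stopping early when the
    // hit limit is reached, the time budget is used up, or cancel() was called.
    public List<String> search(List<String> index, String prefix,
                               int maxHits, long maxDurationMillis) {
        long deadline = System.nanoTime() + maxDurationMillis * 1_000_000L;
        String needle = prefix.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String entry : index) {
            if (cancelled.get()) break;                  // explicit cancel
            if (hits.size() >= maxHits) break;           // enough hits
            if (System.nanoTime() - deadline >= 0) break; // time budget exceeded
            if (entry.toLowerCase(Locale.ROOT).startsWith(needle)) {
                hits.add(entry);
            }
        }
        return hits;
    }
}
```

In an incremental search, the UI would call cancel() on the running search whenever the user changes the search expression and then start a new search in a background thread.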


Ok, you knew most of that anyway, I just wanted to elaborate a little there ...  8) now let me try to answer your question.

Quote: What happens, if you enter Glück, Gut, segensreich into the JavaME search field? Will it find the corresponding translation? Is this kind of search supported by our underlying data structure?

Well, a normal search is done on "Glück, Gut, segensreich". That means that the NormationClass is invoked on the search expression etc and the corresponding hits will be found.

However if "Glück, Gut, segensreich (Adj)" is entered, then, as you described, no hits will be found, because the "(Adj)" information is not included in the search index. That is the same also for JavaME.

Hmmmmm, does any of my above writing help you ? Guess not really, maybe I should think a little more about your problem :-\

Best regards,
Gert


axin

Hi Gert,

you are right, most parts I was guessing... now I'm sure, thanks :-)

You are saying that if you search the JavaME version for Glück, Gut, segensreich (without (Adj)), the correct result is found. Can you try that again? On Android, searching for this string already gives no results...

Anyways, this is what I'm guessing:
There is some kind of Excel-Database that includes a row where we have two cells Glück, Gut, segensreich (Adj) and xing4 yun4 (Chinese characters omitted here).
Then we have a function that takes Glück, Gut, segensreich (Adj), that removes strange stuff and splits it into Glück, Gut and segensreich and links those three entries to the original entry which then is linked to the Chinese translation.
So if a user searches for either Glück or Gut or segensreich, that complete entry will be quickly found. But if he searches for the whole thing, the entry cannot be found as only its parts are in the search index.
Is that correct so far?
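
The guess above can be written down as a small sketch. This only models the hypothesis, not the real dictionary generation: the class SplitIndexSketch, the comma as split string and the removal of parenthesised annotations are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of the guessed index structure; not DictionaryForMIDs code.
public class SplitIndexSketch {
    // Maps each normalised part of an entry to the full entries containing it.
    private final Map<String, List<String>> index = new HashMap<>();

    // Removes annotations such as "(Adj)" (assumed to be excluded from the index).
    static String stripAnnotations(String entry) {
        return entry.replaceAll("\\([^)]*\\)", "").trim();
    }

    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // Splits the entry at commas and indexes every part separately.
    public void add(String entry) {
        for (String part : stripAnnotations(entry).split(",")) {
            String key = normalize(part);
            if (!key.isEmpty()) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(entry);
            }
        }
    }

    // Exact lookup of the normalised query; no further splitting is done.
    public List<String> lookup(String query) {
        return index.getOrDefault(normalize(query), new ArrayList<>());
    }
}
```

With this model, after add("Glück, Gut, segensreich (Adj)"), a lookup of Gut or segensreich returns the full entry, while a lookup of the whole string returns nothing, because only the parts were stored as keys.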

BUT now I take a different search term: gut, artig, schöne Frau (Adj). If I enter that complete term (including (Adj)) into the search on Android - surprisingly the translation is found...
I really don't understand what's the difference between those two search terms - why one can be found while the other cannot be found. Is there a problem in my implementation on Android? Or is that the same on JavaME?

Thanks for your patience  ::)
Achim

Gert

Achim,

honestly speaking ... I just was assuming that "Glück, Gut, segensreich" would be found - I now tested it ... and it is not found  ::)

I also can confirm that "gut, artig, schöne Frau (Adj)" is found.

However, I believe the problem is related to the set-up of the German-Chinese dictionary (I spent a _lot_ of time doing all sorts of testing of the generation and search algorithms of DictionaryForMIDs, and I would be very surprised if there were any error in the implementation; and if you really found an error there, then I guarantee you free support for all times  ;)  ).

We would have to look at classes such as DictionaryUpdateHanDeDictGer in order to understand how the index is built. Is there a chance that you could contact Sebastian and clarify these topics with him ?


Quote: So if a user searches for either Glück or Gut or segensreich, that complete entry will be quickly found. But if he searches for the whole thing, the entry cannot be found as only its parts are in the search index.

I'd assume that you are right (but it really depends on how the index is built). This is the behaviour when dictionaryGenerationLanguageXExpressionSplitString is used, or when there is a DictionaryUpdate class which performs handling similar to dictionaryGenerationLanguageXExpressionSplitString.

Hope there is a chance to sort things out with Sebastian ... and I will certainly take responsibility if you convince me that the implementation is buggy ;)

Best greetings !
Gert

P.S.: I am short of time these weeks, so that right now I do not have the time to look in the index files of the German-Chinese dictionary in order to track the problem further down.

axin

Hi Gert,

I tried to investigate this matter further and downloaded the source dictionary file from HanDeDict. But as this file no longer included all the entries (Glück, Gut, segensreich was missing), I could not trace the dictionary generation for this entry. Once I find other, suitable terms I may continue debugging the dictionary generation process (the free support for all times sounds promising  ;) ).

Actually I found another problem that seems to make the auto-completion hard to implement: when trying to auto-complete a term in a dictionary with multiple contents (languageXContentY), it's hard to decide which content to use for auto-completion, as some may be better than others.
For example, consider an entry "[01noun][02nougat]". If the user wants to search for nougat, he starts typing "nou". Now I don't know what to add to the auto-completion drop down field: noun OR nougat (adding both would again make the search for "noun nougat" return no results)
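
To make the ambiguity concrete, here is a small sketch that collects the auto-completion candidates per content. It assumes that contents are marked exactly as in the example above, i.e. a two-digit number directly after an opening square bracket; the class ContentSuggestionSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; the "[NNtext]" format is taken from the example above.
public class ContentSuggestionSketch {
    // Group 1: two-digit content number, group 2: the text of that content.
    private static final Pattern CONTENT =
            Pattern.compile("\\[(\\d{2})([^\\[\\]]*)\\]");

    // Returns the text of every content in 'entry' that starts with 'prefix'.
    public static List<String> candidates(String entry, String prefix) {
        String needle = prefix.toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        Matcher m = CONTENT.matcher(entry);
        while (m.find()) {
            String text = m.group(2).trim();
            if (text.toLowerCase(Locale.ROOT).startsWith(needle)) {
                result.add(text);
            }
        }
        return result;
    }
}
```

For the entry "[01noun][02nougat]" and the prefix "nou", both noun and nougat come back as candidates, which is exactly the situation where the UI cannot tell which one the user meant.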

Because of this and the previous problem I'll remove the auto-completion feature and replace it by the JavaME incremental/search-as-you-type function. Of course, if someone finds a way around those problems I'm still happy to provide both features on the Android platform!

-Achim

Gert

Achim,

thank you for your further investigation on HanDeDict !!

Over the past few days I had some more thoughts on your 'google-like auto completion drop down field'. However, I too could not figure out a solution that does not have confusing effects (just like you described them in your "noun, nougat" example).

I thought maybe the translation layer could return the index entry under which the hit was found. The index entry certainly would be found if a search is done. But the index entry went through the 'Normation' class, which can produce confusing or even incorrect spellings. As an example, someone may decide for the German language that three identical consecutive consonants, such as in "Schifffahrt", are normalized to the index entry "schiffahrt". Then someone who enters "Schiffahrt" will find the term "Schifffahrt", but the index entry "schiffahrt" is incorrect German spelling and as such not suitable for the drop-down list.
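
The Schifffahrt example can be sketched as follows. This is not an existing Normation class of DictionaryForMIDs; the class TripleConsonantNormation and its rule (lower-casing plus collapsing three identical consecutive consonants into two) are made up to match the example.

```java
import java.util.Locale;

// Hypothetical normation rule for illustration; not an existing Normation class.
public class TripleConsonantNormation {
    // Lower-cases the word and collapses three identical consecutive
    // consonants into two, e.g. "fff" becomes "ff".
    public static String normate(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        return lower.replaceAll("([bcdfghjklmnpqrstvwxz])\\1{2}", "$1$1");
    }
}
```

Both normate("Schifffahrt") and normate("Schiffahrt") give "schiffahrt", so either spelling finds the entry. The key itself is a search aid and not a correctly spelled word, which is why it should not be shown to the user as a suggestion.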

So, I could not figure out a solution which would not cause confusion.

Ok, keep us updated please !!

Best regards,
Gert