Messages - waldermort

#1
Welcome.

Looking forward to an updated version. If you require a beta tester, please just drop me an email.

Regards

Owen
#2
Quote: Jeff or Lars would have to check whether CEDICT provides that information.
Unfortunately, it doesn't.

Quote: Hmmm, what exactly would be the lookup table and the algorithm (I admit that I do not really know Chinese/Pinyin)

Welcome.

I would write up a patch myself if I had the time (C/C++ background), but unfortunately my duties call me elsewhere.

Take a look at http://www.studypond.com/pinyin.aspx

Basically, a pinyin syllable is composed of an Initial followed by a Final or, in some cases, only a Final. I would keep an array for each and iterate over the input string, trying to match an Initial/Final pair (followed by an optional tone number) and inserting a space after each match. If the string can't be matched, abort and use the input as-is. I believe this could be incorporated into the existing code quite easily.

NOTE: strings may contain combinations such as "Tiananmen", which expands to "tian an men". A greedy search would be advised.
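To make the idea concrete, here is a rough sketch in Python rather than a patch against the existing code. The table contents, the names `split_pinyin` and `syllable_ends`, treating 'y' and 'w' as initials, and falling back to a shorter match when the longest one leads to a dead end are all my own assumptions. It only checks Initial/Final pairs, so it will accept a few combinations that are not real syllables.

```python
from functools import lru_cache

# Assumption: 'y' and 'w' are treated as initials, a common simplification.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l", "g",
            "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]
# Finals allowed after an initial; 'v' and 'ü' both stand for u-umlaut.
FINALS = ["a", "o", "e", "i", "u", "v", "ü", "ai", "ei", "ao", "ou", "an",
          "en", "ang", "eng", "ong", "ia", "ie", "iao", "iu", "ian", "in",
          "iang", "ing", "iong", "ua", "uo", "uai", "ui", "uan", "un",
          "uang", "ue", "ve", "üe"]
# Finals that can form a syllable on their own (no initial).
STANDALONE_FINALS = ["a", "o", "e", "ai", "ei", "ao", "ou", "an", "en",
                     "ang", "eng", "er"]
TONE_DIGITS = "12345"


def syllable_ends(text, pos):
    """Return the end offsets of all syllables starting at pos, longest first."""
    ends = set()
    heads = [i for i in INITIALS if text.startswith(i, pos)] + [""]
    for initial in heads:
        start = pos + len(initial)
        for final in (FINALS if initial else STANDALONE_FINALS):
            if text.startswith(final, start):
                end = start + len(final)
                # An optional tone number belongs to the syllable before it.
                if end < len(text) and text[end] in TONE_DIGITS:
                    end += 1
                ends.add(end)
    return sorted(ends, reverse=True)


def split_pinyin(text):
    """Insert spaces between syllables; return the input as-is on failure."""
    lowered = text.lower()

    @lru_cache(maxsize=None)
    def solve(pos):
        if pos == len(lowered):
            return ()
        # Greedy: try the longest syllable first, fall back to shorter ones.
        for end in syllable_ends(lowered, pos):
            tail = solve(end)
            if tail is not None:
                return (lowered[pos:end],) + tail
        return None

    parts = solve(0)
    return " ".join(parts) if parts is not None else text
```

With these tables, `split_pinyin("Tiananmen")` gives "tian an men", and a string that cannot be matched, such as "hello", comes back unchanged. Ambiguous strings like "xian" resolve to the longest reading, which may not always be the intended one.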

NOTE: the letter 'ü' (a 'u' with two dots above) in the Final table is often represented by the letter 'v' in plain ASCII. The existing code has a bug here too. For example, the Chinese character '女' (U+5973) is 'nü' in pinyin, which can be typed into any existing IME as 'nv'. In the mid dictionary, searching for 'nv' returns no results, and searching for 'nu' doesn't return '女' either (as expected, though it would be nice if it did, since I often mistype).
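One possible way to handle this is to map every spelling of 'ü' to a single form before comparing the search string with the dictionary entry. The sketch below is only an illustration: the function name `canonical_u` and the `lenient` flag are hypothetical, and I am assuming the dictionary data may also use the CEDICT-style 'u:' spelling.

```python
def canonical_u(text, lenient=False):
    """Map 'v' and 'u:' to 'ü' so that all spellings compare equal.

    With lenient=True, 'ü' is folded further into a plain 'u', so a
    mistyped 'nu' would also find entries spelled 'nü'.
    """
    # Pinyin has no letter 'v', so replacing it is safe for pinyin input.
    result = text.lower().replace("u:", "ü").replace("v", "ü")
    if lenient:
        result = result.replace("ü", "u")
    return result
```

Applying the same function to both the search string and the stored pinyin would let 'nv', 'nü' and 'nu:' find the same entry. The lenient mode trades precision for convenience, since 'nu' and 'nü' are different syllables.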
#3
Great little tool. It has helped me out many times when I'm out and about and need a word. One major problem I have, though, is deciding which word to use. For any given word (I'm mainly talking about English to Simplified Chinese here) there are multiple translations, each with a different meaning and usage.

The feature I would like to see: for each translated word, show among the results what type of word it is, i.e. verb, noun or adjective. I realize this would require rewriting the dictionaries, but no dictionary is complete without this feature.

As for an update to an existing feature: when entering pinyin to begin a search, all words must be written in lowercase and separated by spaces. Most phones today automatically capitalize the first letter of a word, forcing us to switch case manually. Converting the search string to lowercase before performing the search would fix this problem. Pinyin without spaces can be handled by parsing the string first. Since pinyin is simple compared to English, a simple lookup table can speed up the parsing.
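The lowercase fix is tiny. As a sketch in Python (the function name `normalize_query` is made up for illustration, and collapsing repeated spaces is an extra assumption of mine), it could look like this:

```python
def normalize_query(query):
    """Lowercase the search string and collapse runs of whitespace."""
    # split() with no argument drops leading, trailing and repeated whitespace.
    return " ".join(query.lower().split())
```

Run on what a phone keyboard typically produces, `normalize_query("Ni3 hao3")` returns "ni3 hao3", which is the form the search currently expects.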

Regards