Release of 3.1.1 for testing

Started by Gert, 09. May 2007, 20:28:47

dreamingsky

I uploaded the "Thai NIU" in version 3.1.2beta1 to Sourceforge.  It is in the "dictionary ThaEng (NIU), 3.1.2" directory.  The file name is "DictionaryForMIDs_3.1.2beta_ThaEng_NIU_Thai.zip".

I found a few more problems with the bitmap fonts.

I.
I ran the bitmap font generator with font size 12 (only 1 size).  Then I started the program.  I went to "Settings" and turned on the bitmap font setting.  Then I went to the "font size" and selected "12" (it was already selected).  Then I got this error (while on the setting screen):

Thrown de.kugihan.dictionaryformids.general.g:
Incorrect bitmap font size setting: 14
Incorrect font size setting: 14

I think the problem is from an earlier setting I had.  Before I had the bitmap fonts set to size 14.  Then I turned off the bitmap fonts.  Then I made size 12 bitmap fonts and re-ran the program.  This is when I got the above error.

I also got that error in a 2nd way.  I ran the program with the bitmap fonts turned on with size 14.  Then I ran the bitmap font generator with only size 16.  When I started the program again (the bitmap fonts option was still turned on from the previous time), I got the same error about size 14 (I didn't have size 14, only size 16).  I couldn't even get to the search page until I re-ran the font generator with size 14.

This 2nd problem shouldn't actually happen in the real world, because we can only use 1 dictionary at a time now (I found the problem with the Wireless Toolkit).  But later, when we get the dictionary loader working, the problem will arise.

So, maybe we need code that runs when the program is started to check if the "font size" setting currently saved actually exists in the dictionary that is loaded.
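
A minimal sketch of such a startup check, in Java.  The class and method names are hypothetical (they are not taken from the DictionaryForMIDs source), and it assumes the bitmap font sizes shipped with the loaded dictionary are available as an int array.  If the saved size is missing, it falls back to the closest available size instead of throwing:

```java
// Hypothetical helper, not part of the DictionaryForMIDs source.
final class FontSizeSettingCheck {

    // Returns savedSize if the loaded dictionary provides it; otherwise
    // returns the available size closest to it (the first one wins a tie).
    static int resolveFontSize(int savedSize, int[] availableSizes) {
        if (availableSizes == null || availableSizes.length == 0) {
            throw new IllegalArgumentException("dictionary has no bitmap font sizes");
        }
        int best = availableSizes[0];
        for (int i = 0; i < availableSizes.length; i++) {
            int size = availableSizes[i];
            if (size == savedSize) {
                return savedSize; // saved setting is still valid
            }
            if (Math.abs(size - savedSize) < Math.abs(best - savedSize)) {
                best = size;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Saved setting is 14, but only size 12 was generated: prints 12
        System.out.println(resolveFontSize(14, new int[] {12}));
        // Saved setting is 14, but only size 16 was generated: prints 16
        System.out.println(resolveFontSize(14, new int[] {16}));
    }
}
```

The caller would presumably also write the resolved size back to the saved settings, so the stale value doesn't come back on the next start.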

II.
The bitmap fonts don't display correctly with the "Arial Unicode MS" font.  Every entry is shifted right half-way in the screen.

Also, most of the text seems to have disappeared.  The entire 1st search result for "table" disappeared when using the bitmap fonts.  It had a long example sentence in it.  With the bitmap fonts, the search result started at the next search result.

III.
I can't scroll down the screen with the bitmap fonts.  A search may have 10 hits.  A page will show 5.  But I can't scroll down to see the other 5.  It just scrolls down into small empty white boxes.

IV.
I tried to verify error II with another font.  So I ran the bitmap font generator with the "Cordia New" font (this is the default font for Thai.  It is installed by Windows).  This time the font was displayed on the left side of the screen correctly.  So only the bitmaps from "Arial Unicode MS" were making errors.  However, I can't use the "Cordia New" font because of the "complex scripts" limitation I mentioned before.  I must use "Arial Unicode MS".

Error III was also fixed by using the "Cordia New" font.  Scrolling worked fine.

V.
The "Cordia New" fonts looked OK.  But, I couldn't actually change the font size.  I'd start with size 12.  Then I'd go in the settings and select size 16.  However, the screen still showed size 12.

VI.
Then I ran the font generator for "Arial Unicode MS" for the Hindi dictionary.  I searched for "temple".  None of the Hindi, the example sentences, or the grammar tags (in coloured text) showed up.  Only the English in black showed up.

Then I searched for "tree".  Everything looked OK, except the example sentence was on the same line as the search result.  It should be on the next line (there is an "\n" in the source dictionary file).

Then I searched for "house".  Then I got Error II.  The search results were shifted to the right.  And none of the Hindi or example sentences showed up.  Only the English search results were shown.


I uploaded the Hindi source files for the developers.  It is in the "dictionary EngHin (IIIT), 3.1.2" directory.  The file is titled "DfM_Hindi_source_312beta.zip".

You will need Hindi set up on your computer to use it.  For WinXP:
Control Panel -> Regional and Language Options -> Languages -> select "Install files for complex script and right-to-left languages"

Then open the zip file and extract it to "C:\".  The directory structure is already set as:
C:\Temp\Dict\Hindi\

Then run C:\Temp\Dict\Hindi\setup.bat
Then run the bitmap font generator
Then run C:\Temp\Dict\Hindi\jar.bat

The directories are hard-coded in the BAT files.  Feel free to change the environment how you'd like.

The fonts to use for the font generator are "Arial Unicode MS" (an optional install with Microsoft Office 2000 and higher) and "Mangal".  "Mangal" is the default font for Hindi on Windows, but "Arial Unicode MS" looks better.

Jeff

Tomcollins

1) Seems to be a problem with the settings store. Do you have an idea, Gert?

2 & 3) & 4) are all the same. Unfortunately this one really is a bug in the bitmap font feature. I already located it, but it will take some more investigation to work out how to fix it. Strange that it never happened with the Chinese dictionary, since there are many more characters in it...
I always use ArialUnicodeMS...

5) I don't really know. Maybe it is a general bug in DfM; I have it with the settings too. Please try: first change to 10 and then change to 14.
On the W700 I have this problem with the interface language too. I always have to switch back to English before I can choose another language. Do you have this problem too?

6) I think the same as 2,3 & 4.

Thanks for your detailed testing.

Sebastian

Gert

Fantastic to see your work for testing / improvements !!! :)

1)
Jeff, I assume that you did all your tests with Sun's WTK, right ?

That problem with storage of settings in Sun's WTK is well-known (I also included it in the FAQ).
However, this is not only related to the font size settings. It also shows up in other situations. For example:
- you run a dictionary with 3 languages
- select language 3
- re-run with a 2-language dictionary
-> you have an exception because of an illegally selected language (the application will not even start)

These problems do not show up in any real device - only in the WTK development environment.

To avoid this, when you re-build a dictionary with different bitmap font sizes, etc., just delete the WTK storage files (somewhere in the FAQ there is a description how to do this).

Ok, I see, because people keep running into these problems on WTK, I think we should try to make something like an 'erroneous storage settings detection', where we try to detect, for example, invalid bitmap font sizes. Hmmm, we need to think about all possible error situations.

And yes, oops, we need to consider this also for loadable dictionaries !! This is not yet done - thank you for this hint Jeff !!


2 & 3) & 4)
Sebastian, I am glad to read that you already looked at this !
Just one request: if you have a fix, can you also include that fix in the All_3_1_1_branch ? And then ;) re-build and upload everything ... ?

Thanks to you !!
Gert

Tomcollins

Jeff:
I think the problem is with the "byte order mark"/header character, which is not removed in your dictionary (if you open the file with e.g. hexplo, you can see EF BB BF at the start). The Font Generator cannot handle this "character" properly yet.
I'll do an update of the Font Generator these days, which takes care of this "character".
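
For anyone who wants to check a dictionary file without a hex editor, here is a small Java sketch that tests for those three bytes. The class name is made up for illustration and it is not code from the Font Generator; it only looks at the first three bytes of whatever byte array it is given:

```java
// Hypothetical helper, not part of the Font Generator source.
final class BomSniffer {

    // True if the data begins with the UTF-8 encoding of U+FEFF,
    // i.e. the byte sequence EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data != null
                && data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 't', 'r', 'e', 'e'};
        byte[] withoutBom = {'t', 'r', 'e', 'e'};
        System.out.println(startsWithUtf8Bom(withBom));    // prints true
        System.out.println(startsWithUtf8Bom(withoutBom)); // prints false
    }
}
```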

Gert: Maybe we should remove the header automatically during dictionary generation, if it exists, since many people forget to remove it.

Sebastian

Gert

Hmmm, never heard about that character - where does it come from ? Is it a legal Unicode / UTF-character ?

Well, if the right solution is to remove this character, then we could do it in DictionaryGeneration, so BitmapFontGenerator will not have to bother about this. Just that I will not have time to work on source code during the next weeks.

Gert

dreamingsky

The BOM is causing the problem?  Interesting.  The BOM (byte order mark) http://en.wikipedia.org/wiki/Byte_order_mark isn't an illegal character.  Basically it's a code at the start of a file that tells programs the file is encoded as UTF-8.  UTF-16 uses the same character (U+FEFF), but it is stored as different bytes (FE FF or FF FE).

It is a normal character: "zero-width no-break space".  If you don't save a UTF-8 file with a BOM, then the next time the program opens the file it must guess what encoding the file has.  If you save with a BOM, then when a program opens the file then it knows the file is UTF-8.

I think it's a good idea to remove the BOM with the font generator, not with DictionaryGeneration.  I think having the .csv files with the BOM would be a good idea.  Then you can open the files more easily with a program.  I wouldn't recommend asking the users to manually remove it, since it is a good idea to save UTF-8 files with a BOM.
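
Roughly what I have in mind, as a Java sketch: the file keeps its BOM on disk, and whichever tool reads it drops the character after decoding.  The class and method names are only for illustration (not from the project source), and it assumes the text has already been decoded from UTF-8 into a String, where the BOM shows up as the single character U+FEFF:

```java
// Hypothetical helper, not part of the DictionaryForMIDs tools.
final class BomStripper {

    // Returns the text without a leading U+FEFF; any other input,
    // including null and the empty string, is returned unchanged.
    static String stripLeadingBom(String text) {
        if (text != null && text.length() > 0 && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        String firstLine = "\uFEFFtable";
        System.out.println(stripLeadingBom(firstLine).length()); // prints 5
    }
}
```

Only the first character of the file needs this treatment, so it would be applied once to the first line read and not to every entry.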

I can manually remove the BOM and do some more testing for the time being.  I'll do some more testing tomorrow.
Jeff

Gert

Jeff,

thank you for your link ! Yes, indeed it seems that this is a legal character. Hmmm, so why could this cause a problem in the font generation ?

Gert

Tomcollins

I think DFM, as it is now, doesn't 'know' this BOM, so the first entry may not show up properly... (at least I think I had problems once)

But you are right, Jeff, I also think that we should work with BOMs! Also dictionaryInputFiles with BOMs should be possible, since there may be users who don't know how to easily remove it.

Jeff: I think there is more than one problem, so I think you can wait till I have located them and updated the bitmapfontGeneration, before you test it again. Strange that it works with other fonts!?!

Sebastian