Can someone help me with Russian transcription? I want to write a readme to help users write Russian transcription in DfM, but I don't know Russian myself.
DfM has 4 Cyrillic transcription Normation classes:
1. NormationRus2.java
2. NormationUkr.java
3. NormationRusC.java
4. NormationUkrC.java
A description of the normation classes is here:
http://dictionarymid.sourceforge.net/newdictNormationLang.html
NormationRus2.java:
Allows you to search for words both in Cyrillic and in Latin transcription (according to GOST 1971, except that the yers ъ and ь are written as 'x' and no apostrophes are used).
I found GOST 16876-71 here:
http://en.wikipedia.org/wiki/GOST_16876-71
But, NormationRus2 is a little different from GOST 16876-71.
Cyrillic  GOST 16876-71  Rus2  Ukr   RusC  UkrC
а         a              a     a     a     a
б         b              b     b     b     b
в         v              v     v     v     v
г         g              g     h     g     h
д         d              d     d     d     d
е         e              e     e     e     e
ё         jo             yo    yo    jo    jo
ж         zh             zh    zh    z     z
з         z              z     z     z     z
и         i              i     i     i     i
ї                              yi          ji
й         jj             y     y     j     j
к         k              k     k     k     k
л         l              l     l     l     l
м         m              m     m     m     m
н         n              n     n     n     n
о         o              o     o     o     o
п         p              p     p     p     p
р         r              r     r     r     r
с         s              s     s     s     s
т         t              t     t     t     t
у         u              u     u     u     u
ф         f              f     f     f     f
х         kh             kh    kh    ch    ch
ц         c              c     c     c     c
ч         ch             ch    ch    c     c
ш         sh             sh    sh    s     s
щ         shh            shh   shh   sc    sc
ъ                        x     x     x     x
ы         y              y     y     y     y
ь         '              x     x     x     x
э         eh             eh    eh    e     e
ю         ju             yu    yu    ju    ju
я         ja             ya    ya    ja    ja
ґ                              g           g
Here are the 4 changes:
Cyrillic  GOST  NormationRus2
ё         jo    yo
й         jj    y
ю         ju    yu
я         ja    ya
Were these 4 changes intentional? Or, are they a mistake?
Also, NormationRusC and NormationUkrC state "according to the Czech ISO norm". Does anyone know the ISO number?
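For readers who, like me, do not read Cyrillic: the Rus2 column of the table amounts to a per-letter lookup. The Java sketch below only illustrates that idea. The class name Rus2TableSketch and the method transliterate are made up for this example and are not taken from NormationRus2.java, whose real implementation may well differ. The Cyrillic letters are addressed by code point (U+0430 to U+044F, plus U+0451 for ё) so that the source stays plain ASCII.

```java
import java.util.Locale;

// Hypothetical helper, not part of DfM: applies the Rus2 column of the table above.
class Rus2TableSketch {
    // Rus2 values for the 32 letters U+0430..U+044F, in alphabet order
    // (the letter U+0451 is outside this range and handled separately).
    private static final String[] RUS2 = {
        "a", "b", "v", "g", "d", "e", "zh", "z", "i", "y", "k", "l", "m", "n", "o", "p",
        "r", "s", "t", "u", "f", "kh", "c", "ch", "sh", "shh", "x", "y", "x", "eh", "yu", "ya"
    };

    static String transliterate(String word) {
        StringBuilder out = new StringBuilder();
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 0x0430 && c <= 0x044F) {
                out.append(RUS2[c - 0x0430]);   // regular Russian letter
            } else if (c == 0x0451) {
                out.append("yo");               // the letter "yo" per the Rus2 column
            } else {
                out.append(c);                  // anything else is passed through unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Russian word for "apple", written with Unicode escapes
        System.out.println(transliterate("\u044F\u0431\u043B\u043E\u043A\u043E")); // prints "yabloko"
    }
}
```

With such a table, a user who types "yabloko" on a phone without a Cyrillic keyboard and a dictionary entry written in Cyrillic would both reduce to the same Latin key.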
Just a side remark: these Normation classes were set up by Michael Kopecky; maybe you could try to contact him?
Gert
Sounds good.
Gert
Can you please add these 3 normation classes to the next version of DfM? They are for Cyrillic.
- NormationCyr1.java (Russian, Ukrainian, Macedonian)
- NormationCyr2.java (Russian, Ukrainian, Macedonian)
- NormationBul.java (Bulgarian)
NormationCyr1.java replaces NormationRus2.java and NormationUkr.java.
NormationCyr2.java replaces NormationRusC.java and NormationUkrC.java.
I fixed a few errors in the transcriptions and added some missing transcriptions from the old normation classes. We can keep NormationRus2.java, NormationUkr.java, NormationRusC.java, and NormationUkrC.java in DfM for the Czech dictionaries. But, we can remove the information from newdictNormationLang.html.
Once the new normation classes are added to DfM, I'll edit newdictNormationLang.html to show the new normation classes.
Jeff
Just for reference, here are the normation classes. This might save someone the work of learning it later.
Cyrillic NormationCyr1 NormationCyr2 NormationBul
а a a a
б b b b
в v v v
г g g g
д d d d
е e e e
ё jo yo
ж zh zh zh
з z z z
и i i i
й j j y
к k k k
л l l l
м m m m
н n n n
о o o o
п p p p
р r r r
с s s s
т t t t
у u u u
ф f f f
х h h h
ц c c ts
ч ch ch ch
ш sh sh sh
щ shh shh sht
ъ " " aj aj
ы y y
ь x x x
э eh eh
ю ju yu yu
я ja ya ya
і ij ij
ѳ fh
ѣ je
ѵ yh
ґ gj gj
ѓ gj gj
є ye ye
ї yi yi
ѕ dz dz
ј jj jj
љ lj lj
њ nj nj
ќ kj kj
џ dj dj
ў uj uj
Jeff,
Ok, great - I will add these classes, probably in about two weeks.
Best regards,
Gert
Jeff,
I just tried to build a new version (so you wouldn't have to wait 2 weeks ...).
But I encountered a problem:
[wtkbuild] Compiling 4 source files to C:\Projects\DictionaryForMIDs\Build\DictionaryForMIDs\classes
[wtkbuild] C:\Projects\DictionaryForMIDs\DictionaryForMIDs\src\de\kugihan\dictionaryformids\translation\normation\NormationBul.java:1: illegal character: \65279
[wtkbuild] /*
[wtkbuild] ^
[wtkbuild] C:\Projects\DictionaryForMIDs\DictionaryForMIDs\src\de\kugihan\dictionaryformids\translation\normation\NormationCyr1.java:1: illegal character: \65279
[wtkbuild] /*
[wtkbuild] ^
[wtkbuild] C:\Projects\DictionaryForMIDs\DictionaryForMIDs\src\de\kugihan\dictionaryformids\translation\normation\NormationCyr2.java:1: illegal character: \65279
[wtkbuild] /*
[wtkbuild] ^
[wtkbuild] 3 errors
Hmmm, maybe I still can find the problem.
Regards,
Gert
I did some research. I noticed one difference between the new normation classes and the old NormationRus2.java:
- NormationRus2.java does not have a BOM.
- I put a BOM in the new normation classes.
The UCN for the BOM is \uFEFF; the HTML numeric character reference is &#65279; (the same 65279 that appears in the compiler error).
I have now re-saved the normation classes without a BOM. They should work now.
Maybe sometime in the future, it would be nice to support a BOM in DfM normation classes. But, it's a low priority.
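For anyone hitting the same compiler error: in a UTF-8 file the BOM is the three bytes EF BB BF at the very start (the encoding of U+FEFF, which is 65279 in decimal). Below is a rough Java sketch of re-saving a file without it. The class name BomStripSketch and its methods are made up for this example; most editors have a "save without BOM" option that does the same thing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical utility, not part of DfM: removes a leading UTF-8 BOM from files.
class BomStripSketch {
    // UTF-8 encoding of U+FEFF
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    static boolean startsWithBom(byte[] data) {
        return data.length >= UTF8_BOM.length
                && data[0] == UTF8_BOM[0]
                && data[1] == UTF8_BOM[1]
                && data[2] == UTF8_BOM[2];
    }

    // Returns the content without the BOM, or the same array if there is none.
    static byte[] stripBom(byte[] data) {
        return startsWithBom(data)
                ? Arrays.copyOfRange(data, UTF8_BOM.length, data.length)
                : data;
    }

    // Usage: java BomStripSketch NormationCyr1.java NormationCyr2.java NormationBul.java
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            Path path = Paths.get(name);
            byte[] original = Files.readAllBytes(path);
            if (startsWithBom(original)) {
                Files.write(path, stripBom(original));
                System.out.println("Removed BOM from " + name);
            }
        }
    }
}
```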
Jeff
Jeff,
thanks for the update - I will try with your new files.
Quote: Maybe sometime in the future, it would be nice to support a BOM in DfM normation classes.
You mean that the normation classes should filter out the BOM character?
Best greetings,
Gert
Quote: You mean that the normation classes should filter out the BOM character?
Yes, I think that would be good. Other people in the future may write normation classes with a BOM. Personally, I save all UTF-8 files with a BOM. Then the file is guaranteed to open correctly in programs.
Jeff,
Quote: You mean that the normation classes should filter out the BOM character?
Quote: Yes, I think that would be good. Other people in the future may write normation classes with a BOM. Personally, I save all UTF-8 files with a BOM. Then the file is guaranteed to open correctly in programs.
Hmmmm ...
1) The Java compiler does not seem to handle the BOM character in .java files. Actually I am surprised about that; anyway, we cannot change the Java compiler. Well, searching for a compiler configuration that handles the BOM may be an option.
2) The implementation of the normation class may filter out the BOM character. Didn't I already add that to NormationLib.defaultNormation? Hmmmm, I intended to do so, but I guess I did not do that yet ... too deep in the night now to think and remember ...
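A minimal sketch of what option 2 might look like, assuming the filtering is done on the text before any other normation step. The class BomFilterSketch and the method removeBom are hypothetical names for this example; this is not what NormationLib.defaultNormation currently contains.

```java
// Hypothetical filter, not the actual DfM code: drops every U+FEFF from a string.
class BomFilterSketch {
    private static final char BOM = '\uFEFF';

    static String removeBom(String text) {
        if (text.indexOf(BOM) < 0) {
            return text; // common case: nothing to filter
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != BOM) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String firstEntry = BOM + "word";
        System.out.println(firstEntry.equals("word"));            // prints "false"
        System.out.println(removeBom(firstEntry).equals("word")); // prints "true"
    }
}
```

The first comparison shows why an unfiltered BOM is a problem for lookups: the character is invisible, but a string that carries it is not equal to the same word without it.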
Ok, I will compile your updated files soon.
Best regards,
Gert
Thanks. Filtering out the BOM is not a high priority. Just, if you have the chance, it'd be nice to add the feature.
Jeff
Quote: Thanks. Filtering out the BOM is not a high priority. Just, if you have the chance, it'd be nice to add the feature.
Wasn't that already implemented ... ?
Gert
You just added the feature to remove the BOM from dictionary_input.csv files, because the first entry of every dictionary was not being indexed (this was fixed while we were updating the Chinese normation class). But this new problem is about the BOM in the normation Java files themselves (NormationEng2.java, NormationRus2.java, etc.).
Jeff,
I ran your 3 Normation classes through the compiler and uploaded version 3.5.6 of DictionaryGeneration_empty and JarCreator to the File Release System (the JarCreator update is required because of a 'bad' dependency on the Normation classes that I still need to remove some time in the future).
The compile worked fine; however, I have not yet tested the new version. Maybe you could have a look at that?
Thank you for these 3 classes !!
And sorry for the long time that I needed to compile your files.
Best regards,
Gert
Hmm, it did not work. I got this error:
Thrown de.kugihan.dictionaryformids.general.DictionaryClassNotLoadedException: Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul / Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul
I looked inside version 3.5.6 of DictionaryForMIDs.jar:
DictionaryForMIDs.jar\de\kugihan\dictionaryformids\translation\normation\
But I did not see NormationBul.class, NormationCyr1.class, or NormationCyr2.class in the directory.
Should the new normation files be there? Or did I make a mistake? Or do I need a new version of DictionaryGeneration too?
Ahhhh, I think I forgot to add these files to the build file. The names of the Normation classes need to be excluded from obfuscation in the build file.
Will correct that.
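For background, a sketch of why the names matter. It assumes that the class named in the dictionary properties is looked up by that name at run time, which is what the getObjectForClass frame in the stack trace suggests; this has not been checked against the DfM source. The class LoadByNameSketch and its methods are hypothetical.

```java
// Hypothetical illustration, not the DfM loader: a class looked up by name
// can only be found if it still carries exactly that name in the jar.
class LoadByNameSketch {
    static Object loadByName(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);           // fails if no class has this name
        return cls.getDeclaredConstructor().newInstance(); // needs an accessible no-arg constructor
    }

    static boolean canLoad(String className) {
        try {
            loadByName(className);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canLoad("java.lang.StringBuilder"));        // prints "true"
        System.out.println(canLoad("no.such.package.NormationBul"));   // prints "false"
    }
}
```

An obfuscator that renames NormationBul to something shorter leaves the class in the jar, but a lookup by the original name then fails in the same way as the second call.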
Gert
Jeff,
I am sorry for the inconvenience !
Now I uploaded 3.5.7 (also of JarCreator) - hope that version is ok now.
Best greetings,
Gert
Hmm, it still doesn't work. I got this error again:
Thrown de.kugihan.dictionaryformids.general.DictionaryClassNotLoadedException: Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul / Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul
I looked inside DictionaryForMIDs.jar and saw the 3 new files.
Do I need a new version of DictionaryGeneration with the new normation classes? At the end of the error message it refers to DictionaryGeneration.java:
at de.kugihan.dictionaryformids.dictgen.DictionaryGeneration.main(DictionaryGeneration.java:95)
Here is the full error message:
Thrown de.kugihan.dictionaryformids.general.DictionaryClassNotLoadedException: Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul / Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul
de.kugihan.dictionaryformids.general.DictionaryClassNotLoadedException: Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul
    at de.kugihan.dictionaryformids.dataaccess.DictionaryDataFile.getObjectForClass(DictionaryDataFile.java:306)
    at de.kugihan.dictionaryformids.dataaccess.DictionaryDataFile.initValues(DictionaryDataFile.java:257)
    at de.kugihan.dictionaryformids.general.UtilWin.readProperties(UtilWin.java:36)
    at de.kugihan.dictionaryformids.dictgen.DictionaryGeneration.main(DictionaryGeneration.java:95)
Thrown de.kugihan.dictionaryformids.general.DictionaryClassNotLoadedException: Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul / Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul
de.kugihan.dictionaryformids.general.DictionaryClassNotLoadedException: Class could not be loaded: de.kugihan.dictionaryformids.translation.normation.NormationBul
    at de.kugihan.dictionaryformids.dataaccess.DictionaryDataFile.getObjectForClass(DictionaryDataFile.java:306)
    at de.kugihan.dictionaryformids.dataaccess.DictionaryDataFile.initValues(DictionaryDataFile.java:257)
    at de.kugihan.dictionaryformids.general.UtilWin.readProperties(UtilWin.java:36)
    at de.kugihan.dictionaryformids.dictgen.DictionaryGeneration.main(DictionaryGeneration.java:95)
Jeff,
I am sorry - in the future I really need to test things before I throw them out ...
Yes, the dependency also exists for DictionaryGeneration. I will provide an updated version there as well.
Gert
... I just uploaded DictionaryGeneration 3.5.7, just in case you are keen to test it (honestly speaking, I put it there straight out of the compiler, without testing; I will have time to test in a few days).
Best greetings,
Gert
I tested it. Everything works great. Thank you very much.
Jeff