Author Topic: Is this file good for conversion?  (Read 141 times)

Itman

  • Newbie
  • *
  • Posts: 12
    • View Profile
Is this file good for conversion?
« on: 17. July 2017, 12:51:29 »
Hi, here is a file that comes from a StarDict dictionary. I don't know whether it is suitable for conversion. Could you please have a look at the attached image?
Thanks


Gert

  • DFM J2ME/Mobile Developer and Project Leader
  • Administrator
  • *****
  • Posts: 859
    • View Profile
    • DictionaryForMIDs
Re: Is this file good for conversion?
« Reply #1 on: 17. July 2017, 14:25:38 »
I just had a look at the screenshot.

I guess it is not easy to convert this dictionary for DictionaryForMIDs. What words/expressions does it look up? I mean, is it for looking up expressions such as "anders denkende"? How is such an expression identified? Does it follow a newline/carriage return?

Well, some preprocessing will certainly be required: tags such as <blockquote> or <c> would need to be converted into something that DictionaryForMIDs understands (one of its "contents").

Overall I think yes, it will be possible to convert this dictionary for DictionaryForMIDs, but it will take some effort.

Besides, what 'dictionary' is it?

Regards,
Gert

Itman

Re: Is this file good for conversion?
« Reply #2 on: 17. July 2017, 15:05:53 »
Thank you for your answer. It's a standard Duden German-German dictionary.
In .dsl format it looks like this:



Since I am creating .dsl dictionaries myself, it is not a problem to convert it to a different format manually (by regular expressions). How is the DFM format built? What tags are you using?

Gert

Re: Is this file good for conversion?
« Reply #3 on: 17. July 2017, 16:11:38 »
The DfM format is documented on our web pages.
You could look at:
http://dictionarymid.sourceforge.net/DfM-Creator/index.html, there "Complete Documentation"
http://dictionarymid.sourceforge.net/DfM-Creator/newdict.html
And for tags: http://dictionarymid.sourceforge.net/DfM-Creator/newdictContent.html

Duden, I assume, would be copyrighted material and could not be made available for public download (the licensing conditions would tell more).

Regards,
Gert

Itman

Re: Is this file good for conversion?
« Reply #4 on: 17. July 2017, 16:25:19 »
I downloaded the Merriam-Webster dictionary. Is it possible to convert it back to txt?

And how do I mark italics and bold text and colors in the txt file?
« Last Edit: 17. July 2017, 16:41:21 by Itman »

Gert

Re: Is this file good for conversion?
« Reply #5 on: 17. July 2017, 16:46:28 »
1. Merriam-Webster
What do you mean by "convert it back to txt"? Do you mean converting it into a format that DfM-Creator can read (the "Input CSV file")?


2. Italics etc.
You can define italics etc. via the Contents. From http://dictionarymid.sourceforge.net/DfM-Creator/newdictContent.html:

languageXContentNNFontStyle
Defines the font style for the content. Allowed values are provided in the ComboBoxes as follows:
bold
italic
underlined
plain

Examples from the screen-shot:

language1Content01FontStyle:plain
language1Content02FontStyle:italic

language2Content01FontStyle:bold
language2Content02FontStyle:italic

language3Content01FontStyle:underlined
language3Content02FontStyle:bold

This property is optional, the default value is plain.
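
If the property lines are generated by a script rather than typed by hand, a small helper can keep the naming pattern and the allowed values consistent. This is only a sketch: the function name font_style_property is made up for illustration, and it assumes each property is written as one "name:value" line, as in the examples above.

```python
# Allowed values as listed in the documentation excerpt above.
ALLOWED_FONT_STYLES = {"plain", "bold", "italic", "underlined"}


def font_style_property(language: int, content: int, style: str) -> str:
    """Build one languageXContentNNFontStyle line (hypothetical helper)."""
    if style not in ALLOWED_FONT_STYLES:
        raise ValueError(f"unsupported font style: {style!r}")
    if not 1 <= content <= 99:
        raise ValueError("content number must fit in two digits")
    # The content number is always written with two digits (01, 02, ...).
    return f"language{language}Content{content:02d}FontStyle:{style}"


for line in (
    font_style_property(1, 1, "plain"),
    font_style_property(2, 1, "bold"),
    font_style_property(2, 2, "italic"),
):
    print(line)
```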

Itman

Re: Is this file good for conversion?
« Reply #6 on: 17. July 2017, 16:51:04 »
Sorry, I don't get it. Look at my first screenshot. I mark bold text with <b> </b>. How do I tell languageXContentNNDisplayText to make it bold?

Yes, I mean back to the input file.

Gert

Re: Is this file good for conversion?
« Reply #7 on: 17. July 2017, 17:22:55 »
Example from your last screenshot:

Source:
[b][c blue]Abartung[/c][/b]

Convert for DfM to:
[01Abartung]

To make "Abartung" bold and blue (assuming it is the column for language2):
language2Content01FontStyle:bold
language2Content01FontColour:0,0,255
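
Since the conversion is done with regular expressions anyway, the rewrite of the source markup into a DfM content could look roughly like the following sketch. It assumes the DSL source uses exactly the [b][c blue]...[/c][/b] combination for this kind of text, and that content 01 is the content declared as bold and blue; the names BOLD_BLUE and dsl_to_dfm are made up for this example.

```python
import re

# Assumption for this sketch: text wrapped in [b][c blue]...[/c][/b]
# in the DSL source is mapped to DfM content 01.
BOLD_BLUE = re.compile(r"\[b\]\[c blue\](.*?)\[/c\]\[/b\]")


def dsl_to_dfm(text: str) -> str:
    """Rewrite the bold+blue DSL markup as a DfM content-01 marker."""
    return BOLD_BLUE.sub(lambda m: f"[01{m.group(1)}]", text)


print(dsl_to_dfm("[b][c blue]Abartung[/c][/b]"))  # [01Abartung]
```

Each further formatting combination in the source would need its own pattern and its own content number, with matching FontStyle/FontColour properties.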



Here is the complete example from http://dictionarymid.sourceforge.net/DfM-Creator/newdictContent.html:

Content tags for the dictionaries

In the dictionaries the content parts are marked with the following syntax:

Each content has a start delimiter at the beginning and an end delimiter at the end.

Start delimiter:
[NN
where NN is the content number. This needs to be a two-digit number!

End delimiter:
]

To use a [ or ] character in the text (without content syntax) a \ (backslash) must be prepended:  \[ and \]
A newline-character is \n and a tab-character is \t
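
When the dictionary text is produced by a script, the escaping rules above can be applied in one place. A minimal sketch, with a made-up function name; it covers only the four cases named above and leaves literal backslashes untouched, because the excerpt does not say how those should be written.

```python
def escape_dfm_text(text: str) -> str:
    """Escape literal text before it is placed in a dictionary column."""
    return (
        text.replace("[", "\\[")      # literal [ becomes \[
            .replace("]", "\\]")      # literal ] becomes \]
            .replace("\n", "\\n")     # real newline becomes the two characters \n
            .replace("\t", "\\t")     # real tab becomes the two characters \t
    )


print(escape_dfm_text("see [1]\nnext line"))  # see \[1\]\nnext line
```

The content delimiters themselves ([NN and ]) have to be added after escaping, otherwise they would be escaped too.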

Here is an example for a language2 column:

dictionary [01dikshionari] [02noun] [03\nA book that contains translations for words.]
(Content numbers are boldfaced only for didactical purposes)
In that example the following properties are declared:

language2Content01DisplayText:contentPronunciation
language2Content02DisplayText:contentGrammaticalCategory
language2Content03DisplayText:contentNotes
 

Contents can also be nested. Example:

dictionary [01dikshionari] [03\nA book that contains translations for words. [02noun]\nAlso exists in electronic form]
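
To check a converted line before running it through DfM-Creator, it can help to split a column into its content runs. The following is only a rough sketch of how the syntax described above could be read, not the parser that DictionaryForMIDs actually uses; split_contents is a made-up name, and unknown backslash sequences are simply kept as they are.

```python
import re

OPEN = re.compile(r"\[([0-9]{2})")  # start delimiter: [ plus two digits
ESCAPES = {"n": "\n", "t": "\t", "[": "[", "]": "]"}


def split_contents(column: str):
    """Return (content_number, text) runs; the number is None outside contents."""
    stack, buf, runs = [], [], []

    def flush():
        # Close the current run and attribute it to the innermost open content.
        if buf:
            runs.append((stack[-1] if stack else None, "".join(buf)))
            buf.clear()

    i = 0
    while i < len(column):
        ch = column[i]
        start = OPEN.match(column, i)
        if ch == "\\" and i + 1 < len(column):
            nxt = column[i + 1]
            buf.append(ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        elif start:
            flush()
            stack.append(start.group(1))
            i = start.end()
        elif ch == "]" and stack:
            flush()
            stack.pop()
            i += 1
        else:
            buf.append(ch)
            i += 1
    flush()
    return runs


for number, text in split_contents("word [01pron] [03a [02noun] b]"):
    print(number, repr(text))
```

For nested contents the text after the inner content is attributed to the outer content again, which matches the nested example above.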

Itman

Re: Is this file good for conversion?
« Reply #8 on: 17. July 2017, 17:58:16 »
What am I doing wrong?



« Last Edit: 17. July 2017, 17:59:49 by Itman »

Gert

Re: Is this file good for conversion?
« Reply #9 on: 17. July 2017, 18:11:53 »
language1 (= column 1) does not have content declarations -> Language-1 number of content declarations = 0
language2 (= column 2) does have content declarations -> Language-2 number of content declarations = 2 or 3 (or other value)

Strings such as "<c>", "<blockquote>" and "&lt" need to be replaced in column 2.

Regards,
Gert

Itman

Re: Is this file good for conversion?
« Reply #10 on: 17. July 2017, 18:14:47 »
language1 (= column 1) does not have content declarations -> Language-1 number of content declarations = 0


It's not possible. Each time I save it, it jumps back to Language-1 (3) and Language-2 (0).
« Last Edit: 17. July 2017, 18:21:42 by Itman »

Itman

Re: Is this file good for conversion?
« Reply #11 on: 17. July 2017, 18:27:29 »
DfM-Creator doesn't let me have 0 declarations in Language 1.

Gert

Re: Is this file good for conversion?
« Reply #12 on: 17. July 2017, 18:32:14 »
Hmmm, that is probably something that should be looked at by Karim (the developer of DfM-Creator).

You could simply set it to 1 for language1; this should not cause any problems.

Regards,
Gert

Itman

Re: Is this file good for conversion?
« Reply #13 on: 17. July 2017, 18:54:29 »
It did not change anything. But maybe I am doing something else wrong?




Gert

Re: Is this file good for conversion?
« Reply #14 on: 17. July 2017, 19:13:02 »
From a quick look, the configuration in DfM-Creator seems OK to me.

Still, strings such as "<c>", "<blockquote>" and "&lt" in column 2 of the dictionary need to be replaced. DictionaryForMIDs does not know any of these strings, so when you use the dictionary they will be displayed literally as plain text.

When you convert the dictionary from the source, you need to convert/filter strings like "<c>", "<blockquote>", ...
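
As a starting point, such a filter could look like the sketch below. It simply drops every HTML-like tag and decodes character entities; the names TAG and strip_markup are made up for this example, and a real conversion would more likely map selected tags to DfM contents instead of discarding them.

```python
import html
import re

# Matches opening and closing HTML-like tags such as <c>, </c>, <blockquote>.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def strip_markup(text: str) -> str:
    """Remove HTML-like tags, then decode entities such as &lt; and &gt;."""
    # Tags are removed first so that decoded entities are not mistaken for tags.
    without_tags = TAG.sub("", text)
    return html.unescape(without_tags)


print(strip_markup("<blockquote><c>anders</c> denkend &lt;adj&gt;</blockquote>"))
```

Any literal [ or ] that remains in the result still has to be escaped with a backslash before the line goes into the input file.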

Regards,
Gert