Topic: Very big dictionaries

jn0101

  • Developer
Very big dictionaries
« on: 03. May 2010, 16:00:07 »
Hi, I'm compiling a reasonably big wordlist (1.6 MB, 34000 entries).

The dictionary/ subdirectory grows to 20 MB uncompressed, and the compressed JAR file is 7 MB.

Have there been any attempts to make a more compact format than a ZIP of CSV files?
I'm thinking of http://en.wikipedia.org/wiki/Trie and similar structures.
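
To illustrate the idea, here is a minimal sketch of a trie in plain Java SE (an assumption: the J2ME target has neither generics nor HashMap, so this would need adapting, and the class name TrieSketch is made up for this example). It only shows how words with a common prefix share nodes. It does not show how much a serialized trie would save compared to the zipped CSV files.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal trie sketch: words sharing a prefix share the nodes for that prefix. */
public class TrieSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean endOfWord; // true if a stored word ends at this node
    }

    private final Node root = new Node();
    private int nodeCount = 0; // nodes created so far, not counting the root

    /** Adds a word, creating nodes only for characters not already on the path. */
    public void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
                nodeCount++;
            }
            node = next;
        }
        node.endOfWord = true;
    }

    /** Returns true only if the exact word was inserted (prefixes do not count). */
    public boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.endOfWord;
    }

    public int nodeCount() {
        return nodeCount;
    }

    public static void main(String[] args) {
        TrieSketch trie = new TrieSketch();
        String[] words = {"house", "houses", "household", "hour"};
        int totalChars = 0;
        for (String w : words) {
            trie.insert(w);
            totalChars += w.length();
        }
        // Compare the raw character count with the number of trie nodes.
        System.out.println("characters: " + totalChars + ", trie nodes: " + trie.nodeCount());
    }
}
```

For these four sample words, 24 characters end up in 11 nodes. An in-memory trie with one object per node is not small by itself, so any real saving would depend on how the nodes are packed on disk.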

Take a look at the sizes at http://www.tinylex.com/download.php (a sister project, but it lacks a good GUI and a reasonable number of supported platforms).


Jacob

Gert

  • DFM J2ME/Mobile Developer and Project Leader
  • Administrator
Re: Very big dictionaries
« Reply #1 on: 03. May 2010, 19:44:30 »
Oh, I did not know about TinyLex before! And it has such nice dictionaries... :o

Hmm, the Trie sounds interesting. Do you have any idea how much space it could save compared to zip files? Are there ready-to-use Java implementations available?

About getting 20 MB out of a 1.6 MB inputdictionaryfile: I just made a general posting here: http://dictionarymid.sourceforge.net/forum/index.php?topic=223. Maybe that could help in your case too?

Best regards,
Gert