Very big dictionaries

Started by jn0101, 03. May 2010, 17:00:07

jn0101

Hi, I'm compiling a reasonably big wordlist (1.6 MB, 34000 entries).

The dictionary/ subdirectory comes to 20 MB uncompressed, and the compressed JAR file is 7 MB.

Have there been any attempts to make a more compact format than a ZIP of CSV files?
I'm thinking of a trie (http://en.wikipedia.org/wiki/Trie) and similar structures.
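
To illustrate what I mean, here is a minimal sketch of a trie in Java. The class name WordTrie and its methods are made up for this post and are not part of any existing dictionary format. It only shows how words with a common prefix share nodes. Whether that beats a ZIP of CSV files depends entirely on how the nodes are written to disk, which this sketch does not cover. It also assumes the standard Java SE collections, which a Java ME target may not provide.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class for illustration only; not an existing library or file format.
class WordTrie {
    // One node per distinct prefix; children are keyed by the next character.
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // true if the path from the root to this node spells a stored word
    }

    private final Node root = new Node();
    private int nodeCount = 1; // the root counts as one node

    // Adds a word, creating nodes only for the part that is not already stored.
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
                nodeCount++;
            }
            node = next;
        }
        node.isWord = true;
    }

    // Returns true only for complete stored words, not for bare prefixes.
    boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.isWord;
    }

    int nodeCount() {
        return nodeCount;
    }

    public static void main(String[] args) {
        WordTrie trie = new WordTrie();
        for (String w : new String[] {"car", "card", "care", "cat"}) {
            trie.insert(w);
        }
        System.out.println(trie.contains("card")); // stored word
        System.out.println(trie.contains("ca"));   // prefix only, not a stored word
        System.out.println(trie.nodeCount());      // nodes used for 14 input characters
    }
}
```

In the example the four words contain 14 characters in total but need only 6 nodes besides the root, because "ca" and "car" are stored once. The in-memory objects above are not compact by themselves; any real saving would have to come from a packed on-disk encoding of the nodes.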

Take a look at the sizes at http://www.tinylex.com/download.php (a sister project, but it lacks a good GUI and supports too few platforms).


Jacob

Gert

Oh, I did not know about TinyLex before! And it has such nice dictionaries... :o

Hmm, the article about the trie is an interesting read. Do you have any idea how much space it could save compared to ZIP files? Are there ready-to-use Java implementations available?

About getting 20 MB out of a 1.6 MB input dictionary file: I just made a general posting here http://dictionarymid.sourceforge.net/forum/index.php?topic=223. Maybe that could help in your case too?

Best regards,
Gert