DictionaryForMids Forum

Dictionaries => General discussions => Topic started by: jn0101 on 03. May 2010, 17:00:07

Title: Very big dictionaries
Post by: jn0101 on 03. May 2010, 17:00:07
Hi, I'm compiling a reasonably big wordlist (1.6 MB, 34000 entries).

The dictionary/ subdir gets to 20MB uncompressed and the compressed JAR file is 7MB.

Have there been any attempts to make a more compressed format than a ZIP of csv-files?
I'm thinking of http://en.wikipedia.org/wiki/Trie and similar structures.
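
To make the trie idea concrete, here is a minimal sketch in Java. It is not code from DictionaryForMids or TinyLex; the class name TrieSketch and the sample words are made up for illustration. It only shows how words with common prefixes share nodes. An in-memory trie built from Java objects like this one is not necessarily smaller than a zipped text file, so any real space saving would depend on a compact on-disk encoding of the nodes, which this sketch does not attempt.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class, not part of any existing project.
public class TrieSketch {
    // One node per distinct prefix; children are keyed by the next character.
    static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // true if the path from the root to here spells a stored word
    }

    private final Node root = new Node();
    private int nodeCount = 0; // number of nodes, not counting the root

    // Walks the word character by character and creates only the missing nodes.
    public void insert(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            Node next = cur.children.get(c);
            if (next == null) {
                next = new Node();
                cur.children.put(c, next);
                nodeCount++;
            }
            cur = next;
        }
        cur.isWord = true;
    }

    // Returns true only for complete stored words, not for bare prefixes.
    public boolean contains(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            cur = cur.children.get(word.charAt(i));
            if (cur == null) {
                return false;
            }
        }
        return cur.isWord;
    }

    public int nodeCount() {
        return nodeCount;
    }

    public static void main(String[] args) {
        String[] words = {"dictionary", "diction", "dictate", "dice"};
        TrieSketch trie = new TrieSketch();
        int totalChars = 0;
        for (String w : words) {
            trie.insert(w);
            totalChars += w.length();
        }
        // Compare the raw character count with the number of trie nodes.
        System.out.println("characters in word list: " + totalChars);
        System.out.println("trie nodes: " + trie.nodeCount());
    }
}
```

For these four sample words the list holds 28 characters while the trie needs 14 nodes, because the prefixes "dic", "dict" and "diction" are stored only once.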

Take a look at the sizes at http://www.tinylex.com/download.php (a sister project, but it lacks a good GUI and supports only a few platforms).


Jacob
Title: Re: Very big dictionaries
Post by: Gert on 03. May 2010, 20:44:30
Oh, I did not know about TinyLex before! And it has such nice dictionaries... :o

Hmm, that Trie article reads interesting. Do you have a clue what space savings it could give compared to ZIP files? Are there ready-to-use Java implementations available?

About getting 20 MB out of a 1.6 MB input dictionary file, I just made a general posting on that (http://dictionarymid.sourceforge.net/forum/index.php?topic=223). Maybe that could help in your case too?

Best regards,
Gert