how to create a *.syn file

Post by Vanilme » Tue Sep 04, 2012 4:50 pm

Hi all.
I'd like to use both a monolingual Italian dictionary and a bilingual Italian -> French dictionary on an e-book reader.
I already have both dictionaries in StarDict format.

The problem is that neither dictionary handles inflected forms.
For example, they can't find "trattenne", which is an inflected form of the verb "trattenere". Ideally, looking up "trattenne" in an Italian dictionary should lead to "trattenere".

However, I found an Italian-English dictionary (StarDict format) that does handle inflected forms.
My idea was to take the morphology data from the Italian-English dictionary and add it to both the monolingual Italian dictionary and the bilingual Italian -> French one.
It seems that the *.syn file is what handles morphology.
I tried adding the Italian-English *.syn file to the monolingual Italian dictionary files and the line "synwordcount=420349" to the *.ifo file, but it failed: when I search for a word, I get a completely different one.

How can I add morphological data (*.syn) to my two dictionaries? Can I build a *.syn file with the StarDict tools and ispell/aspell?
Vanilme
 
Posts: 6
Joined: Tue Sep 04, 2012 4:04 pm

Re: how to create a *.syn file

Post by Tvangeste » Tue Sep 04, 2012 5:17 pm

You need to install the morphology dictionaries (in myspell/hunspell format); they typically come with GoldenDict in the "morphology" folder.

You can download them from here: http://sourceforge.net/projects/goldend ... ogies/1.0/

(Don't forget to unzip them.)

Once you unzip and install them, GoldenDict will recognize them; then add the appropriate morphology dictionary to the shelf for the specific language.
Tvangeste
 
Posts: 893
Joined: Thu Jun 02, 2011 11:42 am

Re: how to create a *.syn file

Post by Vanilme » Tue Sep 04, 2012 5:40 pm

@Tvangeste: there is a misunderstanding. I already use the GoldenDict morphology dictionaries on my desktop (KDE), and they are great. But they are not recognized by e-book readers (Onyx Boox, Sony PRS, Bebook, Cybook, Kobo...). That's why I'm trying to put the morphological data directly into the StarDict dictionary, as the Italian->English Babylon dictionary does.
Vanilme
 
Posts: 6
Joined: Tue Sep 04, 2012 4:04 pm

Re: how to create a *.syn file

Post by Vanilme » Tue Sep 04, 2012 5:45 pm

It should be possible to create a *.syn file from a GoldenDict morphology dictionary, but I don't know how.
Vanilme
 
Posts: 6
Joined: Tue Sep 04, 2012 4:04 pm

Re: how to create a *.syn file

Post by Vanilme » Tue Sep 04, 2012 5:57 pm

http://code.google.com/p/babiloo/wiki/StarDict_format
This file [".syn"] is optional, and you should notice tree dictionary needn't this file.
Only StarDict-2.4.8 and newer support this file.

The .syn file contains information for synonyms, that means, when you input a
synonym, StarDict will search another word that related to it.

The format is simple. Each item contain one string and a number.
synonym_word; // a utf-8 string terminated by '\0'.
original_word_index; // original word's index in .idx file.
Then other items without separation.
When you input synonym_word, StarDict will search original_word;

The length of "synonym_word" should be less than 256. In other
words, (strlen(word) < 256).
original_word_index is a 32-bits unsigned number in network byte order.
Two or more items may have the same "synonym_word" with different
original_word_index.
The items must be sorted by stardict_strcmp() with synonym_word.
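
The record layout quoted above can be sketched in a few lines of Python. This is only an illustration of the description (a UTF-8 string, a '\0' terminator, then a 32-bit unsigned index in network byte order); the function names `pack_syn_entry` and `iter_syn_entries` are made up for this example and are not part of any StarDict tool.

```python
import struct


def pack_syn_entry(synonym: str, original_word_index: int) -> bytes:
    """Encode one .syn item: UTF-8 word, '\\0', 32-bit big-endian index."""
    word = synonym.encode("utf-8")
    if len(word) >= 256:
        # The format description requires strlen(word) < 256.
        raise ValueError("synonym_word must be shorter than 256 bytes")
    return word + b"\x00" + struct.pack(">I", original_word_index)


def iter_syn_entries(data: bytes):
    """Yield (synonym_word, original_word_index) pairs from raw .syn bytes."""
    pos = 0
    while pos < len(data):
        end = data.index(b"\x00", pos)          # end of the synonym string
        (index,) = struct.unpack_from(">I", data, end + 1)
        yield data[pos:end].decode("utf-8"), index
        pos = end + 1 + 4                       # terminator + 4 index bytes
```

For instance, `list(iter_syn_entries(open("dict.syn", "rb").read()))` would list every synonym with the index it points to ("dict.syn" being a placeholder file name).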

I don't know exactly how to build it.
Vanilme
 
Posts: 6
Joined: Tue Sep 04, 2012 4:04 pm

Re: how to create a *.syn file

Post by Vanilme » Wed Sep 05, 2012 4:56 pm

Here is an excerpt from a *.syn file:
"\xfaalcoolica\x00\x00\x00\n\xfealcooliche\x00\x00\x00\n\xfealcoolici\x00\x00\x00\n\xfealcoolismi\x00\x00\x00\n\xffalcooliste\x00\x00\x00\x0b\x00alcoolisti\x00\x00\x00\x0b\x00alcoolizza\x00\x00\x00\x0b\x01alcoolizzai\x00\x00\x00\x0b\x01alcoolizzammo\x00\x00\x00\x0b\x01alcoolizzando\x00\x00\x00\x0b\x01alcoolizzano\x00\x00\x00\x0b\x01alcoolizzante\x00\x00\x00\x0b\x01alcoolizzanti\x00\x00\x00\x0b\x01alcoolizzarono\x00\x00\x00\x0b\x01alcoolizzasse\x00\x00\x00\x0b\x01alcoolizzassero\x00\x00\x00\x0b\x01alcoolizzassi\x00\x00\x00\x0b\x01alcoolizzassimo\x00\x00\x00\x0b\x01alcoolizzaste\x00\x00\x00\x0b\x01alcoolizzasti\x00\x00\x00\x0b\x01alcoolizzata\x00\x00\x00\x0b\x01alcoolizzata\x00\x00\x00\x0b\x02alcoolizzate\x00\x00\x00\x0b\x01alcoolizzate\x00\x00\x00\x0b\x02alcoolizzati\x00\x00\x00\x0b\x01alcoolizzati\x00\x00\x00\x0b\x02alcoolizzato\x00\x00\x00\x0b\x01alcoolizzava\x00\x00\x00\x0b\x01alcoolizzavamo\x00\x00\x00\x0b\x01alcoolizzavano\x00\x00\x00\x0b\x01alcoolizzavate\x00\x00\x00\x0b\x01alcoolizzavi\x00\x00\x00\x0b\x01alcoolizzavo\x00\x00\x00\x0b\x01alcoolizzerai\x00\x00\x00\x0b\x01alcoolizzeranno\x00\x00\x00\x0b\x01alcoolizzerebbe\x00\x00\x00\x0b\x01alcoolizzerebbero\x00\x00\x00\x0b\x01alcoolizzerei\x00\x00\x00\x0b\x01alcoolizzeremmo\x00\x00\x00\x0b\x01alcoolizzeremo\x00\x00\x00\x0b\x01alcoolizzereste\x00\x00\x00\x0b\x01alcoolizzeresti\x00\x00\x00\x0b\x01alcoolizzerete\x00\x00\x00\x0b\x01alcoolizzer\xc3\xa0\x00\x00\x00\x0b\x01alcoolizzer\xc3\xb2\x00\x00\x00\x0b\x01alcoolizzi\x00\x00\x00\x0b\x01alcoolizziamo\x00\x00\x00\x0b\x01alcoolizziate\x00\x00\x00\x0b\x01alcoolizzino\x00\x00\x00\x0b\x01alcoolizzo\x00\x00\x00\x0b\x01alcoolizz\xc3\xb2\x00\x00\x00\x0b\x01alcove\x00\x00\x00\x0b\x04alcun\x00\x00\x00\x0b\talcun\x00\x00\x00\x0b\nalcun'\x00\x00\x00\x0b\talcun'\x00\x00\x00\x0b\n"

The layout is (each word is followed by its '\0' terminator):
SYNONYM_WORD_1original_word_index_1SYNONYM_WORD_2original_word_index_2SYNONYM_WORD_3original_word_index_3...

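For example:
alcoolica\x00\x00\x00\n\xfealcooliche\x00\x00\x00\n\xfealcoolici\x00\x00\x00\n\xfe

... with SYNONYM_WORD_1 = alcoolica, then the terminating \x00, then original_word_index_1 = \x00\x00\n\xfe (4 bytes in network byte order, i.e. 0x00000AFE = 2814).

Consequently, you can first create a dictionary from a tab file (⇒ *.dict.dz, *.idx, *.ifo) and then a *.syn file that refers to the entries of that *.idx.
Going by the format description quoted earlier, "\x00\x00\x00\n\xfe" is not a C address or pointer: it is the string terminator followed by the position of the original word in the *.idx file. That would also explain why the Italian-English *.syn returns unrelated words when copied next to another dictionary: its indices point into a different *.idx.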
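
Following that workflow, a *.syn file could be generated roughly as below. This is a minimal sketch under several assumptions: the index is taken to be the 0-based position of the headword in the *.idx file, `stardict_strcmp` is assumed to compare ASCII case-insensitively first and byte-wise on ties, and `build_syn` with its arguments is a hypothetical helper, not a StarDict tool. The (form, headword) pairs would have to come from some morphology source.

```python
import struct
from functools import cmp_to_key


def stardict_strcmp(a: bytes, b: bytes) -> int:
    """Assumed ordering: ASCII case-insensitive, then plain byte order."""
    la, lb = a.lower(), b.lower()   # bytes.lower() only touches ASCII letters
    if la != lb:
        return -1 if la < lb else 1
    return (a > b) - (a < b)


def build_syn(headwords: list[str], pairs: list[tuple[str, str]]) -> tuple[bytes, int]:
    """Return (raw .syn bytes, synwordcount).

    headwords: words in the same order as they appear in the .idx file.
    pairs:     (inflected_form, headword) tuples; forms whose headword is
               not in the dictionary are skipped.
    """
    # If a headword occurs twice in the .idx, the last position wins here.
    position = {word: i for i, word in enumerate(headwords)}
    entries = [
        (form.encode("utf-8"), position[head])
        for form, head in pairs
        if head in position
    ]
    # Items must be sorted by synonym_word.
    entries.sort(key=cmp_to_key(lambda x, y: stardict_strcmp(x[0], y[0])))
    blob = b"".join(w + b"\x00" + struct.pack(">I", i) for w, i in entries)
    return blob, len(entries)
```

The second return value is what would go into the *.ifo file as "synwordcount=...", so the count always matches the *.syn that was actually written.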

Here is an excerpt from the *.idx file:
alcool\x00\x00\x02\xf5k\x00\x00\x00,alcoolico\x00\x00\x02\xf5\x97\x00\x00\x004alcoolismo\x00\x00\x02\xf5\xcb\x00\x00\x00=alcoolista\x00\x00\x02\xf6\x08\x00\x00\x007alcoolizzare\x00\x00\x02\xf6?\x00\x00\x00ealcoolizzato\x00\x00\x02\xf6\xa4\x00\x00\x00=alcooltest\x00\x00\x02\xf6\xe1\x00\x00\x00\x0c
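
Reading the excerpt above, each *.idx item looks like a '\0'-terminated word followed by two 4-byte big-endian numbers (offset and size of the article in the *.dict file). A small sketch for listing the headwords in order, assuming 32-bit offsets (a dictionary with "idxoffsetbits=64" in its *.ifo would need 8 offset bytes instead); `read_idx_entries` is a name invented for this example.

```python
import struct


def read_idx_entries(data: bytes) -> list[tuple[str, int, int]]:
    """Parse raw .idx bytes into (word, data_offset, data_size) tuples.

    Assumes 32-bit offsets, i.e. 8 bytes of numbers after each word.
    """
    entries = []
    pos = 0
    while pos < len(data):
        end = data.index(b"\x00", pos)              # end of the headword
        offset, size = struct.unpack_from(">II", data, end + 1)
        entries.append((data[pos:end].decode("utf-8"), offset, size))
        pos = end + 1 + 8                           # terminator + 2 x 4 bytes
    return entries
```

The position of a word in the returned list is what a *.syn entry would refer to, assuming positions are counted from 0.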


I'm still looking for a solution...
Vanilme
 
Posts: 6
Joined: Tue Sep 04, 2012 4:04 pm

Re: how to create a *.syn file

Post by Vanilme » Wed Sep 05, 2012 6:02 pm

stardict-index inspects StarDict index (.idx) files, synonym (.syn) files, and resource database index (.ridx) files.
But I'm using Debian Squeeze, and "stardict-index" isn't available for that release yet (it exists in Wheezy).
Vanilme
 
Posts: 6
Joined: Tue Sep 04, 2012 4:04 pm
