Thank you both for your answers!
First, I should explain where the contents of my dictionaries come from: when I translate a text, I do it in a 3-column table in Word.
- Column 1 is the source language (English).
- Column 2 is the target language (French).
- Column 3 contains various notes: bits of definitions and synonyms I'll use when I'm proofreading and trying to improve the translation (that way, I don't need to look up the words again), alternative possibilities when I'm trying to translate a joke, excerpts from Wikipedia (which can be quite long) when we're dealing with cultural matters, etc.
bradm wrote: why not try to parse/segment those big chunks of your translated text into so much shorter headword/synonym lines? You can try having 2-3, or more words syntagms or syntactic units as headwords. Maybe even whole sentence segments? Not whole paragraphs, though.
The problem with this is that A B C is not always translated as A B C. Sometimes, it's translated as C B A!
Take dialogues, for example. The typographic conventions of French and English are so different that this is the type of result you'll get for pretty much any line of dialogue:
"But", said Bob. He frowned. "I mean..." He paused, yadda yadda long description of what he does. "And then we should leave", he concluded.
— Mais, commença Bob. (Il fronça les sourcils.) Enfin... (Il s'interrompit.) Et ensuite, on devrait partir, conclut-il.
The translation of "yadda yadda long description of what he does" goes in a new paragraph because, in French, it is not acceptable to include a text this long in the middle of the dialogue.
And that's for the two "easy" columns, English and French! At least there, the first cell of the row, containing one or more paragraphs, is the exact equivalent in content of the second cell of the row, which also contains one or more paragraphs (typically, but not always, there's one paragraph in English and several in French).
Now, for the Notes column... In the example above, I scribbled notes about two words: panicked and ditched. But sometimes, I'll write as many notes as there are words in the source paragraph, and sometimes, I won't write a single one! So I cannot segment this automatically.
And of course, I have tens of thousands of rows in my lovely tables! So there's no way I'm doing this manually!
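By contrast, keeping each row whole needs no segmentation at all, which is why one entry per row is so tempting. Here is a rough Python sketch of that idea, under stated assumptions: the names `Row` and `row_to_entry`, the " | Notes: " separator and the tab-separated output are all made up for illustration, and this is not the actual dictionary format or conversion used for these tables.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One row of the 3-column translation table (hypothetical model)."""
    source: str      # column 1: English paragraph(s)
    target: str      # column 2: French paragraph(s)
    notes: str = ""  # column 3: free-form notes, possibly empty


def flatten(text: str) -> str:
    """Collapse line breaks, tabs and repeated spaces into single spaces."""
    return " ".join(text.split())


def row_to_entry(row: Row, prefix: str = "") -> str:
    """Build one 'headword<TAB>article' line from a whole row.

    The headword is the entire source cell (optionally prefixed with a
    marker code); the article is the translation followed by the notes.
    """
    headword = prefix + flatten(row.source)
    article = flatten(row.target)
    notes = flatten(row.notes)
    if notes:
        article += " | Notes: " + notes
    return headword + "\t" + article


row = Row("But,\nsaid Bob.", "Mais, dit Bob.", "frown: froncer")
entry = row_to_entry(row)
```

The function never needs to know which note belongs to which word, so it works the same whether a row has twenty notes or none. The price is exactly the one discussed here: the headword is as long as the source paragraph.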
Tvangeste wrote:It would be hard and not very convenient to enter such huge headwords, to search for them, to see them in history and to distinguish between similar but not identical huge headwords.
to see them in history: I don't use it! :p
to distinguish between similar but not identical huge headwords: With the type of text I translate, I believe they would all be unique.
to enter such huge headwords, to search for them: no, I don't think so. For example, if I want to know what I wrote the last time I had to translate the verb "ditch", I type "ditch" and see that there is an entry starting with "YYY " (that's why I added those codes, to distinguish these "special" entries from the rest). The headword is so long that I can't see the word "ditch" in the menu, but that's not a problem, since I probably want to browse all my previous translations of "ditch" anyway. I just click it and see both my translation and my other ideas.
Tvangeste wrote:This is GoldenDict "limitation", since it explicitly protects users from huge headwords that are basically not usable in GD UI. In the past, we had bugs related to broken dictionaries when due to format errors they contained garbage with huge headwords, etc. That could cause GD crash, since any binary data could be in headwords and articles. Hence this protection.
Well... what if I don't want to be protected? Is that possible?