GoldenDict Forum

by **Tvangeste** » Mon Jan 07, 2013 9:37 pm

francesinha wrote:1) how do you insert a line break in a definition ? Is it possible to insert a line break in a headword ?

Just wrap your text in m0, m1, m2, m3... tags. These tags control the indentation of the text, and they make the text appear on a separate line:

Code:

    Headword
        [m1] Indentation 1 [/m]
        [m2] Indentation 2 [/m]
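For anyone generating such cards from a script, here is a minimal Python sketch of the same idea. The helper name `dsl_article` is hypothetical, and it assumes the usual DSL layout (headword at column 0, body lines indented with a tab); it is an illustration, not part of GoldenDict.

```python
def dsl_article(headword, lines):
    """Render one DSL card: headword at column 0, body lines indented.

    `lines` is a list of (indent_level, text) pairs; each pair becomes one
    [mN]...[/m] paragraph, so each one is displayed on its own line.
    """
    body = [f"\t[m{level}]{text}[/m]" for level, text in lines]
    return "\n".join([headword] + body) + "\n"


card = dsl_article("Headword", [(1, "Indentation 1"), (2, "Indentation 2")])
print(card)
```

The output can be appended to a .dsl file as one card.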

No, it is not possible to insert a line break in a headword. Headwords must fit on a single line, which is natural.

francesinha wrote:There seems to be a limit to the length of headwords and definitions.

No, there should be no such limit. DSL allows for huge articles and GoldenDict can handle them with no problems.

by **francesinha** » Mon Jan 07, 2013 11:08 pm

Thanks for your answer, Tvangeste !

Tvangeste wrote:Just wrap your text with m0, m1, m2, m3... tags. These tags control the indentation of the text, and they make the text to appear on separate line

I should have mentioned that I wanted to create empty lines.
I tried writing [m0][/m0] or [m0] [/m0], but it didn't work. This, on the other hand, works :

Code:

    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt
        But I must explain to you how all this mistaken idea
        [m0][c white]white texte[/c][/m0]
        [m0][c white]white texte[/c][/m0]
        [m0][c dimgray]example 1[/c][/m0]
        [m0][c white]white texte[/c][/m0]
        [m0][c dimgray]example 2[/c][/m0]
        [m0][c white]white texte[/c][/m0]
        [m0][c dimgray]example 3[/c][/m0]

Not very elegant, but it does the job ! Here's a screengrab :

Tvangeste wrote:
francesinha wrote:There seems to be a limit to the length of headwords and definitions.

No, there should be no such limit. DSL allows for huge articles and GoldenDict can handle them with no problems.

Here's a file with headwords that don't show up in my Goldendict. Under Edit/Dictionaries/Dictionaries, the software lists the right number of total articles. But when I try to look up the longest entries, I can't find them. Did I do something wrong?

(I use v. 1.0.1-409-g96dfa23 of GoldenDict, the latest, and Windows 7.)

by **bradm** » Tue Jan 08, 2013 1:36 am

francesinha wrote:I should have mentioned that I wanted to create empty lines.

Here's how you insert an empty line: type an initial space, a backslash "\", and a space after it.
As some text editors trim end-of-line spaces, I put another backslash after the final space.

There is a fine, user-friendly "sample.dsl" file at this address: https://github.com/VVSiz/SampleDSL/blob ... sample.dsl, as already mentioned in one of Tvangeste's recent posts.
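If the cards are produced by a script, the empty-line trick can be sketched in Python as below. The names `BLANK_LINE` and `with_blank_lines` are hypothetical, and the marker is simply the space, backslash, space sequence described in this post; the extra trailing backslash is left out here.

```python
BLANK_LINE = " \\ "  # space, backslash, space, as described above


def with_blank_lines(paragraphs):
    """Join body paragraphs, putting an empty-line marker between them.

    Each paragraph gets a leading space so that it counts as a body line.
    """
    out = []
    for i, text in enumerate(paragraphs):
        if i:
            out.append(BLANK_LINE)
        out.append(" " + text)
    return "\n".join(out)


body = with_blank_lines(["first paragraph", "second paragraph"])
print(body)
```

Keep in mind that an editor that trims trailing spaces will damage the marker line after the file is written.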

by **Tvangeste** » Tue Jan 08, 2013 8:56 am

francesinha wrote:Not very elegant, but it does the job ! Here's a screengrab :

As bradm said, you could use the space-backslash-space trick to insert empty lines. Look at the sample he linked, line 16 for example.

Btw, you seem to be using the default theme for GoldenDict. I find the Lingvo theme more pleasant, though.

francesinha wrote:Here's a file with headwords that don't show up in my Goldendict. Under Edit/Dictionaries/Dictionaries, the software lists the right number of total articles. But when I try to look up the longest entries, I can't find them. Did I do something wrong?

Interesting... I see that headwords longer than 256 characters are not shown. And looking at the code, I see that this is done intentionally, with the following comment:

// Safeguard us against various bugs here. Don't attempt adding words
// which are freakishly huge.

So, I should update my statement. There is no limit on the card content, but there is a limit of 256 characters for headwords (which seems reasonable enough to me).
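To find out which entries of a dictionary are affected, a rough Python check along these lines may help. The function name `long_headwords` is hypothetical, the 256 figure is the one quoted in this thread rather than a value read from GoldenDict, and Python's `len` may not count characters exactly the way GoldenDict does.

```python
MAX_HEADWORD = 256  # limit reported in this thread; an assumption, not read from GoldenDict


def long_headwords(dsl_text, limit=MAX_HEADWORD):
    """Return (line_number, length) for headword lines longer than `limit`.

    Assumes the usual DSL layout: '#' lines are directives, lines starting
    with a space or tab belong to a card body, anything else is a headword.
    """
    found = []
    for number, line in enumerate(dsl_text.splitlines(), start=1):
        if not line.strip() or line.startswith("#") or line[0] in " \t":
            continue
        if len(line) > limit:
            found.append((number, len(line)))
    return found


sample = '#NAME "Demo"\n\nshort\n\tbody\n' + "x" * 300 + "\n\tbody\n"
print(long_headwords(sample))
```

For a real file, read it with the encoding it was saved in and pass the resulting text to the function.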

by **francesinha** » Tue Jan 08, 2013 2:17 pm

bradm wrote:Here's how you insert an empty line: type an initial space, a slash "\", and a space after it.

Thanks, bradm !

Tvangeste wrote:you seem to be using the default theme for GoldenDict. I find Lingvo-theme to be more pleasant though

Thanks for pointing that out. I'll give it a try !

Tvangeste wrote:I see that headwords bigger than 256 are not shown. And looking at the code, I see that it is done intentionally, with the following comment:
// Safeguard us against various bugs here. Don't attempt adding words
// which are freakishly huge.

So, I should update my statement. There is no limit for the cards content, but there is a limit of 256 for the headwords (which seems reasonable enough to me).

It does seem reasonable for regular glossaries, but in this case, I'm trying to create a dictionary from my past translations, like this :

As you can see, the dictionary contains 1) the English 2) the French 3) my comments/ideas.

Here's the corresponding code. I put both the English and the comments as headwords; that way, if in the future I search for "aghast", for example, I'll find the ideas I had already had when translating "panicked" :

Code:

    #NAME "ExampleNovel"
    #INDEX_LANGUAGE "English"
    #CONTENTS_LANGUAGE "French"

    XXX nervous, apprehensive, aghast, horror-stricken / en panique, aux abois XNEWPARX on se faisait déjà larguer par / larguait lourdait / jeter, lâcher, laisser choir, planter là, plaquer endXXX
    YYY “But – ” Mary began, sounding panicked. In Tokyo for two minutes and already ditched by her mother? Not cool at all. endYYY
        “But – ” Mary began, sounding panicked. In Tokyo for two minutes and already ditched by her mother? Not cool at all.
        \ 
        — Mais..., commença Mary, qui avait l’air de paniquer. On n'était à Tokyo que depuis deux minutes et sa mère nous plantait déjà ? Vraiment pas cool.
        \ 
        \ 
        [m0][c dimgray]nervous, apprehensive, aghast, horror-stricken / en panique, aux abois[/c][/m0]
        \ 
        [m0][c dimgray]on se faisait déjà larguer par / larguait lourdait / jeter, lâcher, laisser choir, planter là, plaquer[/c][/m0]

To do this, I definitely need longer headwords ! Sometimes I have very long comments, and even the source paragraphs can, in some cases, span one, or maybe even two full pages of a novel ! (most often, it's 2-8 lines, but some paragraphs can be really long...)

So, back to my earlier questions :
1) Is it a .DSL limitation or a Goldendict limitation ? Should I try using another dictionary format ?
2) Is it possible to override that limitation with .DSL files in Goldendict ?

Thanks for your help !

by **bradm** » Tue Jan 08, 2013 3:46 pm

francesinha wrote:... if in the future I search for "aghast", for example, I'll find the ideas I had already had when translating... To do this, I definitely need longer headwords!

As far as I can tell, you're interested in some sort of full-text search workaround. Since there is a rather logical 256-character limit on headwords, why not try to parse/segment those big chunks of your translated text into much shorter headword/synonym lines? You can try using syntagms or syntactic units of two, three, or more words as headwords. Maybe even whole sentence segments? Not whole paragraphs, though. In recent EA builds, fortunately, there is no longer any limit on the number of headwords/synonyms in GD.

This way, you'll end up being able to search through your past translation solutions. I wouldn't suggest searching for another dictionary format.
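As a rough illustration of that segmentation, here is a Python sketch that cuts a paragraph into sentence-sized headword candidates. The function name `sentence_headwords` and the split rule are assumptions made for this example; real text (dialogue, abbreviations) will need something smarter.

```python
import re


def sentence_headwords(paragraph, limit=256):
    """Split a paragraph into sentence-like chunks of at most `limit` chars.

    The split rule (after '.', '!' or '?' followed by whitespace) is a crude
    assumption; over-long sentences are dropped rather than truncated.
    """
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p and len(p) <= limit]


text = (
    "“But – ” Mary began, sounding panicked. In Tokyo for two minutes "
    "and already ditched by her mother? Not cool at all."
)
for chunk in sentence_headwords(text):
    print(chunk)
```

Each chunk could then be written as one of several headword lines above the same card body.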

by **Tvangeste** » Tue Jan 08, 2013 4:14 pm

francesinha wrote:Here's the corresponding code. I put both the English and the comments as headwords; that way, if in the future I search for "aghast", for example, I'll find the ideas I had already had when translating "panicked"

I see. This is a very interesting and unique way to create dictionaries!

And GoldenDict is definitely not optimized for such a scenario. It would be hard and not very convenient to enter such huge headwords, to search for them, to see them in history and to distinguish between similar but not identical huge headwords.

So I'd suggest adapting your approach a little bit. How about having the headword "panicked"? *That* would be much, much easier and faster to find.
Moreover, you could have more than one headword for the same article. For example:

Code:

    nervous
    apprehensive
    aghast
    horror-stricken
        [b]nervous, apprehensive, aghast, horror-stricken / en panique, aux abois XNEWPARX on se faisait déjà larguer par / larguait lourdait / jeter, lâcher, laisser choir, planter là, plaquer[/b]
        “But – ” Mary began, sounding panicked. In Tokyo for two minutes and already ditched by her mother? Not cool at all.
        \ 
        — Mais..., commença Mary, qui avait l’air de paniquer. On n'était à Tokyo que depuis deux minutes et sa mère nous plantait déjà ? Vraiment pas cool.

That way, you'd have 4 headwords (nervous, aghast, ...), all with the same content.
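If the cards come out of a script, the several-headwords layout could be produced with something like the Python sketch below. The name `multi_headword_card` is hypothetical, the body text is shortened for the example, and a tab is assumed as the body indent.

```python
def multi_headword_card(headwords, body_lines):
    """Render one DSL card that can be reached from several headwords.

    Headwords go on consecutive lines starting at column 0; the indented
    body that follows is shared by all of them.
    """
    if not headwords:
        raise ValueError("a card needs at least one headword")
    body = ["\t" + line for line in body_lines]
    return "\n".join(list(headwords) + body) + "\n"


card = multi_headword_card(
    ["nervous", "apprehensive", "aghast", "horror-stricken"],
    ["[b]en panique, aux abois[/b]", "Mary began, sounding panicked."],
)
print(card)
```

Each headword has to stay under the length limit on its own, while the shared body can be as long as needed.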

As bradm mentioned, full-text search would be handy here as well, but we are not there yet: the feature is being discussed at the moment, but it is not yet implemented.

francesinha wrote:1) Is it a .DSL limitation or a Goldendict limitation ? Should I try using another dictionary format ?

This is definitely not a format limitation. DSL does not have such limitations. This is a GoldenDict "limitation", since it explicitly protects users from huge headwords that are basically not usable in the GD UI. In the past, we had bugs with broken dictionaries that, due to format errors, contained garbage with huge headwords, etc. That could crash GD, since any binary data could end up in headwords and articles. Hence this protection.

I don't think other formats would change anything here. Besides, DSL is the easiest to work with, and is a pure text dictionary format.

by **francesinha** » Tue Jan 08, 2013 6:53 pm

Thank you both for your answers !

First, I should explain where the contents of my dictionaries come from : when I translate a text, I do it in a 3-column table in Word.

Column 1 is the source language (English).
Column 2 is the target language (French).
Column 3 contains various notes : bits of definitions and synonyms I'll use when I'm proofreading and trying to improve the translation (that way, I don't need to look up the words again), alternative possibilities when I'm trying to translate a joke, excerpts from Wikipedia (which can be quite long) when we're dealing with cultural matters, etc.

bradm wrote:why not try to parse/segment those big chunks of your translated text into so much shorter headword/synonym lines? You can try having 2-3, or more words syntagms or syntactic units as headwords. Maybe even whole sentence segments? Not whole paragraphs, though.

The problem with this is that A B C is not always translated as A B C. Sometimes, it's translated as C B A !

Take dialogues, for example. The typographic conventions of French and English being so different, this is the type of result you'll have for pretty much any line of dialogue :

"But", said Bob. He frowned. "I mean..." He paused, yadda yadda long description of what he does. "And then we should leave", he concluded.

— Mais, commença Bob. (Il fronça les sourcils.) Enfin... (Il s'interrompit.) Et ensuite, on devrait partir, conclut-il.
Translation of "yadda yadda long description of what he does" in a new paragraph, because in French, it was not acceptable to include a text this long in the middle of the dialogue.

And that's for the two "easy" columns, English and French ! At least there, the first cell of the row, containing one or more paragraphs, is the exact equivalent in content to the second cell of the row that contains one or more paragraphs (typically - but not always -, there's one paragraph in English and several in French).

Now, for the Notes column... In the example above, I scribbled notes about two words : panicked and ditched. But sometimes, I'll write as many notes as there are words in the source paragraph, and sometimes, I won't write a single one ! So I cannot segment this automatically.

And of course, I have tens of thousands of rows in my lovely tables ! So there's no way I'm doing this manually !

Tvangeste wrote:It would be hard and not very convenient to enter such huge headwords, to search for them, to see them in history and to distinguish between similar but not identical huge headwords.

to see them in history: I don't use it ! :p

to distinguish between similar but not identical huge headwords: With the type of text I translate, I believe they would all be unique.

to enter such huge headwords, to search for them: no, I don't think so. For example, if I want to know what I wrote the last time I had to translate the verb "ditch": I type "ditch", and see that there is an entry starting with "YYY " (that's why I added those codes, to distinguish these "special" entries from the rest). The headword is so long that I can't see the word "ditch" in the menu, but that's not a problem, since I probably want to browse all my previous translations of "ditch" anyway. I just click it and see both my translation and my other ideas :

Tvangeste wrote:This is GoldenDict "limitation", since it explicitly protects users from huge headwords that are basically not usable in GD UI. In the past, we had bugs related to broken dictionaries when due to format errors they contained garbage with huge headwords, etc. That could cause GD crash, since any binary data could be in headwords and articles. Hence this protection.

Well... what if I don't want to be protected ? Is that possible ? :twisted:

by **bradm** » Tue Jan 08, 2013 7:48 pm

francesinha wrote:And of course, I have tens of thousands of rows in my lovely tables ! So there's no way I'm doing this manually !

If you do heavy translation work and want to keep history, comments, and all your translation versions in one place, I'd suggest you grab a CAT tool like SDL Trados or Across Personal Edition.

Let GoldenDict be what it is: a terrific, genuine dictionary lookup program, an excellent work in progress, our everyday language companion and friend. Meanwhile, there are translation environment tools that specialize in managing translation memory and terminology. As a registered freelance translator, you can get a full version of Across Personal Edition for free.
(http://www.across.net/en/across-for-fre ... ators.aspx)
But beware of its rather steep learning curve! Good luck!

by **francesinha** » Thu Jan 10, 2013 10:36 pm

Thank you for the suggestion, bradm, but I don't want to translate using a translation memory. I just want glossaries, and if possible, glossaries that are all in the same place. And Goldendict is great for that.

It's a pity, because I created the exact right tool for my personal use, and I was really happy with it! And now, all that's blocking me is three little digits: 2, 5, 6...

Okay, how do I go about changing the code and compiling my own version of Goldendict? Is there any doc available about that? (I have obviously never done such a thing...)

Thanks to anyone who'll be able to help!

How to Create a DSL dictionary for Goldendict
