
making a dictionary from data

Postby Dhammadarsa Bhikkhu » Wed Sep 21, 2011 5:55 am

Venerable/Kind Sirs and Ladies

A friend has just recommended Golden Dictionary to me. He sent me some dictionaries we both use to install. It all works fine.

I have an HTML file containing a one-page dictionary, which I got from: http://mahajana.net/texts/kopia_lokalna ... odous.html. Each entry has one or more Chinese characters (sometimes with a transliterated Sanskrit equivalent), followed by definitions.

I want to transform it to a dictionary I can use with Golden Dictionary. Is there an easy way to do that?

I don't know any programming language.

Unfortunately the links to html.zip and xml.zip are broken. If those files would help, I could try to contact the author of the page.

Kind Regards
Dhammadarsa Bhikkhu
Dhammadarsa Bhikkhu
 
Posts: 13
Joined: Wed Sep 21, 2011 5:28 am

Re: making a dictionary from data

Postby C2BlEv » Thu Sep 22, 2011 6:29 pm

I don't think we have any Sirs or Ladies here :lol:

In my opinion, the simplest dictionary format supported by GD is Lingvo DSL.

Code:
Headword
<-tab->Article
Headword2
<-tab->Article2


Tags are formed much like HTML tags, but use [] instead of <>. So:
Code:
[i]italic[/i], [b]bold[/b]


It is pretty easy to convert an HTML file into DSL with the help of a Unicode text editor such as EmEditor. You can study the following sample DSL dictionary (file sample.dsl) to see its formatting: https://github.com/VVSiz/SampleDSL/blob ... sample.dsl
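
For anyone who would rather script it than use an editor, here is a minimal Python sketch of the layout described above: each headword on its own line, followed by its article on a tab-indented line. The function name `to_dsl` and the placeholder entries are made up for illustration, and the sketch covers only this basic layout; the sample file linked above shows the rest of the format.

```python
def to_dsl(entries):
    """Render (headword, article) pairs in the headword / tab-indented article layout."""
    lines = []
    for headword, article in entries:
        lines.append(headword.strip())        # headword starts at column 0
        lines.append("\t" + article.strip())  # article line starts with a tab
    return "\n".join(lines) + "\n"


# Placeholder entries only, not real dictionary data.
sample_entries = [
    ("Headword", "Article with [b]bold[/b] text"),
    ("Headword2", "Article2"),
]
dsl_text = to_dsl(sample_entries)
print(dsl_text, end="")
```

The text returned by `to_dsl` can then be written to a `.dsl` file in whatever encoding your dictionary program expects.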
C2BlEv
Moderator
 
Posts: 215
Joined: Tue May 05, 2009 3:45 pm

Re: making a dictionary from data

Postby Dhammadarsa Bhikkhu » Fri Sep 23, 2011 6:18 am

oh, that's really easy, thanks
Dhammadarsa Bhikkhu
 
Posts: 13
Joined: Wed Sep 21, 2011 5:28 am

Re: making a dictionary from data

Postby Alec » Fri Sep 23, 2011 12:07 pm

I have converted the Soothill Buddhist Dictionary and put the files (.dict, .idx, and .ifo, plus the source file as .txt) in a Zip file, size 1.76 MB.

I'm working at home today, but as soon as possible I will post it for download on my personal website, http://www.personal.leeds.ac.uk/~ecl6tam/, which appears to be misbehaving at present. I'll fix that as soon as I can get access to it.

Until that is done, I can email the dictionary off-list, as an attachment, to anyone who wants it.

Alec
Alec
 
Posts: 57
Joined: Thu Apr 15, 2010 2:28 pm

Re: making a dictionary from data

Postby Dhammadarsa Bhikkhu » Fri Sep 23, 2011 2:09 pm

Wow! that's great. I'll contact you for a copy.

I had a look at the data and couldn't think of an easy way to separate the headword from the definition, as there was only a space between them.

Your help is much appreciated.

Kind Regards
Dhammadarsa Bhikkhu
 
Posts: 13
Joined: Wed Sep 21, 2011 5:28 am

Re: making a dictionary from data

Postby Alec » Fri Sep 23, 2011 2:42 pm

I used a very simple technique:
1) open the file in a word processor;
2) record a macro that looks for the paragraph mark (^p in Word), then moves one word to the right (Control-right-arrow in Word), then selects the space and replaces it with a tab (^t in Word);
3) stop recording and run the macro on every paragraph in the file.

That was the easy bit. The time-consuming bit was in dealing with the many duplicated entries and the occasional malformed paragraph. Because of the limited formatting possibilities, I'm afraid I had to simply concatenate the duplicate entries under a single headword.
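
The same two steps (swap the first space of each paragraph for a tab, then merge duplicate headwords) could be sketched in Python roughly as below. This is not the macro itself, just an equivalent under some assumptions: one entry per paragraph, the headword containing no spaces, and duplicates joined with "; " (the joiner is an arbitrary choice). The name `to_tab_lines` and the sample paragraphs are made up for illustration.

```python
def to_tab_lines(paragraphs, joiner="; "):
    """Split each paragraph at its first space and merge duplicate headwords.

    Returns (lines, malformed): tab-separated "headword<TAB>definition" lines,
    plus the paragraphs that had no space and need fixing by hand.
    """
    merged = {}      # headword -> list of definitions, in first-seen order
    malformed = []
    for para in paragraphs:
        para = para.strip()
        if not para:
            continue  # skip empty paragraphs
        headword, sep, definition = para.partition(" ")
        if not sep:
            malformed.append(para)  # no space, so nothing to split on
            continue
        merged.setdefault(headword, []).append(definition.strip())
    lines = [hw + "\t" + joiner.join(defs) for hw, defs in merged.items()]
    return lines, malformed


# Made-up paragraphs, only to show the behaviour.
lines, malformed = to_tab_lines(
    ["一 first sense", "二 another entry", "一 second sense", "", "nospace"]
)
print("\n".join(lines))
```

Paragraphs that come back in `malformed` still have to be corrected manually, which matches the time-consuming part described above.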

Alec.
Alec
 
Posts: 57
Joined: Thu Apr 15, 2010 2:28 pm

Re: making a dictionary from data

Postby Dhammadarsa Bhikkhu » Fri Sep 23, 2011 10:44 pm

ok, yes, I understand.

I was working from the format described above, in which the definition, not the headword, is indented.

Kind Regards
Dhammadarsa Bhikkhu
 
Posts: 13
Joined: Wed Sep 21, 2011 5:28 am

Buddhist Dictionary

Postby Alec » Mon Sep 26, 2011 9:15 am

The dictionary should now be downloadable from http://www.personal.leeds.ac.uk/~ecl6tam/

Alec.
Alec
 
Posts: 57
Joined: Thu Apr 15, 2010 2:28 pm

Re: making a dictionary from data

Postby Dhammadarsa Bhikkhu » Tue Sep 27, 2011 12:29 am

thanks for that.

I reread your steps for the formatting and now see how it works. :-)
Dhammadarsa Bhikkhu
 
Posts: 13
Joined: Wed Sep 21, 2011 5:28 am

Re: making a dictionary from data

Postby Dhammadarsa Bhikkhu » Tue Sep 27, 2011 12:38 am

but I'm guessing that to get the format suggested:

Headword
<-tab->Article
Headword2
<-tab->Article2

one has to replace the space with ^p^t

I don't see why one would have to do it for every paragraph; couldn't one just select the whole document and run the macro once?

Then cut out the lines such as "1. One Stroke", or do that beforehand.
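
A rough Python sketch of that idea, for anyone who can run a script: replace the first space of each paragraph with a line break plus a tab, and drop section lines like "1. One Stroke". It assumes such section lines always start with a number followed by a full stop, which may not hold for the whole file; the name `paragraphs_to_dsl` and the sample paragraphs are made up for illustration.

```python
import re

# Assumption: section lines look like "1. One Stroke" (digits, a dot, a space).
SECTION_HEADING = re.compile(r"^\d+\.\s")


def paragraphs_to_dsl(paragraphs):
    """Turn "headword definition" paragraphs into headword / tab-indented article lines."""
    out = []
    for para in paragraphs:
        para = para.strip()
        if not para or SECTION_HEADING.match(para):
            continue  # skip blank lines and section headings
        headword, sep, article = para.partition(" ")
        if not sep:
            continue  # no space to split on; leave for manual checking
        out.append(headword)
        out.append("\t" + article.strip())  # same effect as replacing the space with ^p^t
    return "\n".join(out) + "\n"


# Made-up paragraphs, only to show the behaviour.
result = paragraphs_to_dsl(["1. One Stroke", "一 one", "二 two"])
print(result, end="")
```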

Kind Regards
Dhammadarsa Bhikkhu
 
Posts: 13
Joined: Wed Sep 21, 2011 5:28 am
