
Unknown dictionary format (wb)

Posted: Thu Sep 15, 2011 8:00 am
by atordo
I've downloaded a Basque-Spanish dictionary which is supposed to work with Babylon here: http://es.freelang.net/diccionario/euskara.php (English version at http://www.freelang.net/dictionary/basque.php). After installing the file (I had to use wine, as there was no way to extract it with 7zip or cabextract) I end up with some binaries I'm not interested in and what seem to be the main data files:

Code:
-rw-r--r-- 1 atordo users      67956 sep 13 20:42 Español_Euskara.wb
-rw-r--r-- 1 atordo users      70308 may 14  2004 Euskara_Español.wb


Those files are binary and aren't recognized by "file" or goldendict, but running "strings" on them shows word lists and definitions. Any clue how to convert them to some format usable by goldendict? Or can someone suggest a Basque dictionary which works with goldendict?

TIA.

Re: Unknown dictionary format (wb)

Posted: Fri Sep 16, 2011 5:42 pm
by atordo
I've been able to figure out the first file and create a DSL file from it. I'm including the script I used, since the Freelang page has many dictionaries and someone might find it useful or want to refine it. Note that I had to convert the resulting file to UTF-16, as goldendict didn't recognize the original iso-8859-1 text (use recode, iconv or whatever suits you).

Usage: ./free2dsl.pl name.wb > name.dsl.

The forum doesn't allow .pl or .txt attachments, so I'll just paste it here:

Code:
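#!/usr/bin/perl
# free2dsl.pl: dump the text runs of a .wb file as a DSL dictionary.

use strict;
use warnings;
use File::Basename;

my $cp="1252";  # code page, only used by the commented-out header line
my $c=0;        # number of text runs printed so far
my $l;          # byte just read
my $txt="";     # text run being collected

die "Usage: $0 filename\n" if ($#ARGV<0);

# The DSL header is derived from the file name without its extension.
my ($name,$path,$suffix) = fileparse($ARGV[0]);
&cabecera(substr($name,0,rindex($name,'.')),$cp);

open(ORIGEN,"<",$ARGV[0]) or die "Can't open $ARGV[0]: $!\n";
binmode(ORIGEN);
# Read byte by byte. Any control byte (0-31) ends the current text run.
# Odd runs are printed as headwords, even runs as indented definitions.
while (read (ORIGEN, $l, 1)) {
        if (ord($l)>31) {
                $txt.=$l;
        }
        else {
                if (length($txt)) {
                        $c++;
                        print "  " unless ($c%2);
                        print $txt."\n";
                        print "\n" unless ($c%2);
                        $txt="";
                }
        }
}

close(ORIGEN);
print STDERR int($c/2)." definitions found.\n";

# Print the DSL header; the languages come from the Index_Contents file name.
sub cabecera {
        my $nombre=shift;
        my $cp=shift;
        print "#NAME \"$nombre\"\n";
        print "#INDEX_LANGUAGE \"".substr($nombre,0,rindex($nombre,'_'))."\"\n";
        print "#CONTENTS_LANGUAGE \"".substr($nombre,1+index($nombre,'_'))."\"\n\n";
        #print "#SOURCE_CODE_PAGE \"$cp\"\n\n";
}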
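
For the UTF-16 step mentioned above, here is a minimal sketch in Perl for anyone who prefers not to use recode or iconv. It assumes the script's output really is iso-8859-1, and that UTF-16LE with a byte order mark is an acceptable form of UTF-16 for goldendict (an assumption, not something verified in this thread). The name dsl2utf16.pl and the function latin1_to_utf16le are made up for illustration.

```perl
#!/usr/bin/perl
# Sketch: re-encode ISO-8859-1 text as UTF-16LE with a byte order mark.
use strict;
use warnings;
use Encode qw(decode encode);

# Take a byte string holding ISO-8859-1 text and return UTF-16LE bytes,
# prefixed with the BOM (FF FE). The choice of LE + BOM is an assumption.
sub latin1_to_utf16le {
    my ($bytes) = @_;
    my $text = decode('ISO-8859-1', $bytes);
    return "\xFF\xFE" . encode('UTF-16LE', $text);
}

# With two arguments, convert a whole file.
if (@ARGV == 2) {
    my ($src, $dst) = @ARGV;
    open(my $in, '<:raw', $src) or die "Can't open $src: $!\n";
    my $data = do { local $/; <$in> };   # slurp the file as raw bytes
    close($in);
    open(my $out, '>:raw', $dst) or die "Can't write $dst: $!\n";
    print {$out} latin1_to_utf16le(defined $data ? $data : '');
    close($out);
}
```

Usage would be something like: ./dsl2utf16.pl name.dsl name16.dsl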

Re: Unknown dictionary format (wb)

Posted: Fri Sep 16, 2011 10:33 pm
by atordo
This version can convert both files. Note that these are quite short and simple dictionaries; I don't know whether the script can cope with the other dictionaries available on that page.

Code:
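#!/usr/bin/perl
# Convert a .wb file to DSL, reading fixed-size records:
# a 31-byte headword followed by a 53-byte definition, each NUL-terminated.

use strict;
use warnings;
use File::Basename;

my $c=0;   # number of definitions written
my $l;     # read buffer
my $pal;   # headword
my $def;   # definition

die "Usage: $0 filename\n" if ($#ARGV<0);

# The DSL header is derived from the file name without its extension.
my ($name,$path,$suffix) = fileparse($ARGV[0]);
&cabecera(substr($name,0,rindex($name,'.')));

open(ORIGEN,"<",$ARGV[0]) or die "Can't open $ARGV[0]: $!\n";
binmode(ORIGEN);
while (read (ORIGEN, $l, 31)) {
        $pal=campo($l);
        print "$pal\n";
        if (read (ORIGEN, $l, 53)) {
                $def=campo($l);
                print "\t$def\n\n";
                $c++;
        }
}

close(ORIGEN);
print STDERR "$c definitions found.\n";

# Return the text before the first NUL byte, or the whole field if it has
# none (index returns -1 then, which would otherwise drop the last byte).
sub campo {
        my $campo=shift;
        my $fin=index($campo,"\0");
        return $fin<0 ? $campo : substr($campo,0,$fin);
}

# Print the DSL header; the languages come from the Index_Contents file name.
sub cabecera {
        my $nombre=shift;
        print "#NAME \"$nombre\"\n";
        print "#INDEX_LANGUAGE \"".substr($nombre,0,rindex($nombre,'_'))."\"\n";
        print "#CONTENTS_LANGUAGE \"".substr($nombre,1+index($nombre,'_'))."\"\n\n";
}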
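
The record layout this script relies on (31 bytes of headword followed by 53 bytes of definition, both NUL-terminated) can also be decoded with unpack, which may be easier to adapt if another Freelang dictionary turns out to use different field widths. This is only a sketch: the widths are the ones taken from the read calls above and are not known to hold for other files, and the function name parse_records is made up for illustration.

```perl
#!/usr/bin/perl
# Sketch: split a .wb buffer into [headword, definition] pairs with unpack.
use strict;
use warnings;

# Assumed field widths, taken from the read sizes in the script above.
my $WORD_LEN = 31;
my $DEF_LEN  = 53;
my $REC_LEN  = $WORD_LEN + $DEF_LEN;

# Return a list of [headword, definition] array refs. The Z template keeps
# the bytes before the first NUL of each field and discards the rest.
# A trailing partial record is ignored.
sub parse_records {
    my ($data) = @_;
    my @entries;
    for (my $off = 0; $off + $REC_LEN <= length($data); $off += $REC_LEN) {
        my ($word, $def) = unpack("Z$WORD_LEN Z$DEF_LEN",
                                  substr($data, $off, $REC_LEN));
        push @entries, [$word, $def];
    }
    return @entries;
}

# With a file name as argument, print the entries in DSL body form.
if (@ARGV == 1) {
    open(my $in, '<:raw', $ARGV[0]) or die "Can't open $ARGV[0]: $!\n";
    my $data = do { local $/; <$in> };
    close($in);
    print "$_->[0]\n\t$_->[1]\n\n" for parse_records(defined $data ? $data : '');
}
```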

Re: Unknown dictionary format (wb)

Posted: Sun Dec 07, 2014 12:52 pm
by Stirlitz
Hi! Can you tell me whether it is possible to download the Freelang dictionaries and use them in Goldendict? How do I convert them?