A free Italian dictionary with more than 100,000 words …

It is June, it is raining and our dog has very damp paws. His paws are like sponges and there are enormous wet paw marks on the floor. Happy days. (I’ll post a picture of said hound soon.)

I haven’t posted for a while, and that is partly because my every waking moment (slight exaggeration) has been spent on trying to create dictionaries using Wiktionary (as permitted by the Creative Commons license).


I looked into licensing dictionaries to add to SL. I naively thought that a big dictionary could be licensed for maybe £100 a year. It turns out that the real cost is prohibitive. Add some 00s kind of prohibitive. Welcome to the real world.

But I need dictionaries …

1. I need them for part of my secret and as yet unstarted project. But dictionaries will be required.

2. I need them so I can create free dictionary apps for iPhone and Android devices.

3. There is a lack of free dictionaries (of a decent size) on the internet for languages with fewer speakers.

4. I want a dictionary page on SL with dictionaries and dictionary games – designed to work well on tablets, pads and phones.

And so my only option was to create dictionaries myself. The way to do this is to use the data from Wiktionary, parse it using some whizzy coding, create a flat file, index it and create a database.
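To make that pipeline a bit more concrete, here is a minimal sketch in Python. It is not the code behind SL. The sample pages, the function names and the table layout are all made up for illustration, and real Wiktionary markup is far messier than these three tiny entries. It only shows the shape of the idea: pull out the Italian section of each page, keep the definition lines, write a tab-separated flat file, then load that into an indexed SQLite database.

```python
import re
import sqlite3

# Hypothetical sample data: (page title, wikitext) pairs standing in for
# pages read from a Wiktionary dump. Real pages are much larger and messier.
SAMPLE_PAGES = [
    ("cane", "==Italian==\n===Noun===\n# [[dog]]\n"
             "# {{lb|it|figurative}} [[scoundrel]]\n"
             "==Latin==\n===Verb===\n# [[sing]]\n"),
    ("gatto", "==Italian==\n===Noun===\n# [[cat]]\n"),
    ("hund", "==Swedish==\n===Noun===\n# [[dog]]\n"),
]


def italian_section(wikitext):
    """Return the text under the level-2 ==Italian== heading, or ''."""
    match = re.search(
        r"^==Italian==[ \t]*$(.*?)(?=^==[^=]|\Z)", wikitext, re.M | re.S
    )
    return match.group(1) if match else ""


def clean_definition(line):
    """Strip simple templates and unwrap [[links]] (assumed simplification)."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", line)  # drop {{...}} templates
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)  # unwrap links
    return " ".join(text.split())


def extract_entries(pages):
    """Collect (word, definition) pairs from lines that start with '# '."""
    entries = []
    for title, wikitext in pages:
        for line in italian_section(wikitext).splitlines():
            if line.startswith("# "):
                definition = clean_definition(line[2:])
                if definition:
                    entries.append((title, definition))
    return entries


def to_flat_file(entries):
    """Render the entries as tab-separated text, one definition per line."""
    return "".join(f"{word}\t{definition}\n" for word, definition in entries)


def build_database(flat_text, path=":memory:"):
    """Load the flat file into SQLite and index it by word."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE entries (word TEXT, definition TEXT)")
    rows = [line.split("\t", 1) for line in flat_text.splitlines()]
    conn.executemany("INSERT INTO entries VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX idx_word ON entries (word)")
    conn.commit()
    return conn


def lookup(conn, word):
    """Return all definitions stored for a word, in insertion order."""
    cur = conn.execute(
        "SELECT definition FROM entries WHERE word = ? ORDER BY rowid", (word,)
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    flat = to_flat_file(extract_entries(SAMPLE_PAGES))
    database = build_database(flat)
    print(lookup(database, "cane"))
```

The flat file step looks redundant in a toy like this, but it gives you something you can open and check by eye before anything goes near a database.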

I started with Italian (as I’m learning the language), and assumed that I would crack this relatively trivial task within days. An hour here. An hour there. Bing. I would have a dictionary.

It turns out that while the Wiktionary is very easy for a human to read, it is a non-trivial task to write a program to parse it and spit out a dictionary. In fact, it is tedious, unforgiving and difficult.

Anyway, I’m a l33t programmer and I have teh skillz:)

So, after a lot of frustration, I have almost written some code to parse the Wiktionary and produce a dictionary.

I almost have an Italian dictionary with hundreds of thousands of words. And when I do, I will add it to SL. I hope it will be as good as or better than any of the expensive branded dictionaries.

It is, as they say, coming soon …



And for all you Welsh language learners …

I was sent this today. A new series of cariad@iaith (love4language) is starting this Sunday, 19 May 2013 on S4C. It’s a series that follows ten learners of Welsh (for a week), living in tents, being tutored in Welsh, performing challenges and so on. I live near the border and it is raining hard here (the gulf stream has apparently jammed again), so good luck to them!

So if you are thinking about learning Welsh, or need a bit of inspiration, look no further :-

Welsh for beginners

I’ve never learnt any Welsh, but will probably follow along. Only a generation or so ago there were Welsh speakers in my family. How quickly things change …




Polish and some other stuff

I’ve added and now fixed a new word game on SL, and here is an example in Afrikaans chosen at random.

This leads neatly into – because of the awesome programming of SL 😉 …

As a programmer I’m good enough and pretty much know what I’m doing, but when learning languages I feel much less certain. I’m writing this as I’ve just finished reading some motivational blogs re language learning to encourage me with my Polish struggles, and it amazes me how confident people are.

In real life, I’m quite outgoing, noisy even (maybe hard work if you are uncharitable?). But writing a blog I feel much less confident. I don’t like making definite statements in general and certainly not regarding languages. Life and learning aren’t black and white. There is often no right or wrong but subtleties, nuances and shades (of grey). These shades are interesting. Apparently.

I don’t even like making definite statements about programming or running websites – areas in which I feel much more certain and have more experience. There is so often more than one approach to a problem. “More than one way to skin a cat” and “One man’s meat is another man’s poison” sum it up. I like maxims. Pithy and to the point.

So, regarding my slow journey with Polish, there is nothing definite, other than it is difficult but rewarding.


I’ve been learning Polish since the start of the year. Not full time. I have a family, job, dog, cat (low maintenance), this website and all sorts of activities going on – mainly arranged by my wife. Luckily. Otherwise I wouldn’t do much other than walk the dog.

The above are excuses, sorry, the setting within which I’m learning Polish – or trying to. OK, so where am I? The last time I wrote about this (a month ago?) I decided that I needed to focus on vocabulary acquisition.

Polish Vocabulary Acquisition

I learn Polish words and then forget them.

I’ve learnt about 1000 words. 🙂

I remember about 100 🙁

I can’t speak for anyone else, and I don’t think I am any worse at language learning than the rest of the world, but I am finding it difficult to retain Polish vocab.

What can I say in Polish?

It is not all doom and gloom.

I can introduce myself. I can say jestem rolnikiem (I am a farmer), jestem kelnerem (I am a waiter) or jestem studentem (I am a student). So I can almost say what I do!

I can say where I am from, ‘jestem z Anglii’, and so on.

I can (more or less) order food – proszę frytki (chips please)

I can say what I like – lubię chodzić (I like walking)

So basically I can say a few things, but if you were stuck in a lift with me (and only spoke Polish) it would rapidly become boring.

This will, I think, drive the next top secret expansion of SL. This is sort of a hint as to how I hope to (time permitting) expand SL. Only five people read this blog yesterday, and if you five can keep a secret I’ll tell you now …

But what if one of you can’t? Hmmmm … the risk is too great.

So to change the subject :-

I’m about to add audio for the European Portuguese pages. Yippee!!

With luck this should be done by next week – ready for the summer hols. So happy days for anyone going to Portugal – who doesn’t want to buy a language course!! And just wants to learn a few words and phrases. Actually quite a few. Over a thousand…




Or Madeira?