Here you can download the Serbian language support package for GNU aspell. The latest package version (0.02) is compatible with aspell version 0.6 and contains three dictionaries (Cyrillic, Latin and combined) and two word lists, each generated from 343242 words and their forms.
Intro
Created and released by Горан Ракић, <gox [a t] devbase [d o t] net>.
The package is released under the GNU LGPL license.
The package contains three dictionaries (Cyrillic, Latin and combined) and two word lists (Cyrillic and Latin). The combined dictionary is the default. If you want to spellcheck text written only in Cyrillic or only in Latin script, you can use the "--variety" option. Words are encoded in UTF-8 and normalized to a special 8-bit code page, l-sr. The GNU aspell files l-sr.cset and l-sr.cmap are included in this package. You can find a description of this code page in the file misc/l-sr.txt. Support for accented vowels is built in, but in the current release the dictionary does not contain any words with accented vowels.
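As an illustration of the "--variety" option mentioned above, here is a minimal Python sketch that builds an aspell command line and pipes UTF-8 text through `aspell list`. The helper names `aspell_command` and `find_misspelled` are hypothetical, and the sketch assumes that the language code is "sr" and that "cyrl" (the name of the Cyrillic-only dictionary) is the value to pass to "--variety". The name of the Latin variety is not stated here, so it is left out.

```python
import subprocess


def aspell_command(variety=None, lang="sr"):
    """Build the argument list for `aspell list`.

    Assumption: the package is installed under language code "sr" and
    the variety name matches the dictionary name (for example "cyrl").
    With variety=None the default (combined) dictionary is used.
    """
    cmd = ["aspell", "--lang=" + lang, "--encoding=utf-8"]
    if variety:
        cmd.append("--variety=" + variety)
    cmd.append("list")  # `list` prints the misspelled words read from stdin
    return cmd


def find_misspelled(text, variety=None):
    """Run aspell on `text` and return the words it reports as misspelled.

    Requires aspell and the Serbian package to be installed.
    """
    result = subprocess.run(
        aspell_command(variety),
        input=text,
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout.split()
```

Calling `find_misspelled(text, variety="cyrl")` would then check the text against the Cyrillic-only dictionary, while omitting `variety` falls back to the combined one.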
News
version 0.02, 11/09/2005
- First official release included in Dictionary list on GNU aspell homepage.
- Change in version naming. The previous test release was 0.1; the next release will be 0.03.
- Renamed the dictionary containing only words written in Cyrillic script from "cyr" to "cyrl".
- New code page with support for accented vowels. The code page is renamed from "u-serbian" to "l-sr". Further updates of the code page are not planned.
- Incorrect words are removed from the word list, and new words are added.
version 0.1 (first release), 08/08/2005
- First unofficial release with 229239 words, published on http://srpski.org/aspell
Sources
The word list also includes 133986 words from the Corpus of Contemporary Serbian Language, built at the Faculty of Mathematics, University of Belgrade, with a total size of 25 MW, created by the Natural Language Processing group at the Faculty of Mathematics. That word list was provided under the GNU LGPL license for the purpose of creating Serbian language dictionaries for GNU aspell and MySpell. Additionally, the correctness of 284420 words from a corpus formed from texts published on the Internet was checked by comparing them with the Corpus of Contemporary Serbian Language. As a result, 150080 words were marked as potentially incorrect. I am thankful to Prof. Dusko Vitas from the Faculty of Mathematics for all his help and the resources he provided. You can find more information about the Corpus of Contemporary Serbian Language at http://korpus.matf.bg.ac.yu
I am thankful to Viktor Kerkez
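The comparison of Internet-sourced words against the reference corpus described above can be sketched roughly as follows. This is only an illustration: the exact matching rule used for the package is not documented here, so the sketch assumes a plain exact-match lookup, and the function name `flag_unattested` and the sample words are hypothetical.

```python
def flag_unattested(candidates, reference):
    """Split candidate words into attested and potentially incorrect ones.

    Assumption: a candidate counts as attested only if it appears
    verbatim in the reference word list; everything else is flagged
    for manual review rather than discarded.
    """
    reference_set = set(reference)  # set lookup keeps each check O(1)
    attested, suspect = [], []
    for word in candidates:
        if word in reference_set:
            attested.append(word)
        else:
            suspect.append(word)
    return attested, suspect


# Tiny made-up sample; real input would be the two word lists.
attested, suspect = flag_unattested(
    ["реч", "рећ", "књига"],
    ["реч", "књига", "град"],
)
```

Flagged words are only candidates for removal, which matches the "potentially incorrect" wording: a word missing from the reference corpus may still be valid.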
Contact
I would be glad if you, as users, took the initiative to correct errors in the dictionary or to add new words to it. New pages are planned at the package location where users will be able to import their personal dictionaries and take part in word validation.
You can contact me at <gox [a t] devbase [d o t] net>