php-languid: statistical language guesser

Back in the not-so-good old days, before I had seen the green grass in the land of Python, I was building websites in PHP. One of them needed a way to identify (i.e. guess) the language of arbitrary text.

One of the best-known open source tools for this seems to be Maciej Ceglowski’s languid, written in Perl. Three years ago, I had it ported to PHP through a RAC project (meaning: while I do own the copyright, I did not write the code). Today, while cleaning up my repositories, I stumbled over it again and decided I might just as well put it out there.

So here it is:

https://launchpad.net/php-languid
http://github.com/miracle2k/php-languid

I also briefly considered porting it to Python, but fortunately someone else has already done that:

http://code.google.com/p/guess-language/

3 thoughts on “php-languid: statistical language guesser”

  1. Hi Michael,

    I would really need this class for some text-processing I’m doing. Could you send it to me or point me to a page where I can download it, since I only found a page where you can download all files one-by-one, and that takes a lot of time 🙂


  2. Okay, thanks a lot! Didn’t know about that versioning system.

    Now I’m off to analyze the internet 🙂 We’re making a database of all webpages with their respective languages, so this comes in really handy!

