Decoding Russian Spam and Mojibake

Just overnight, 50 new spam comments have appeared in my spam filter. Out of those, 25 were in the Russian language and the others were all in English, although 2 or 3 of those had Russian addresses. Russian spam seems to be increasing exponentially at the same time that I’m getting bored with it.

For a while I was letting a few through on one thread. From this I was able to determine that the Russian word блог is blog, Спасибо is thank you, and сексапил is seksapil (sex appeal?). Finally, from a comment on another post, I was terribly excited to find out that гавнокоментов (спама) is the word for gavnokomentov (spam).

Translation is simple enough. Just highlight the text, copy it, paste it into the box of a machine translation tool like Google Translate or Foxlingo, and hit the “translate” button to see a quick approximation. If you like to play with languages a lot, you can even install the Firefox language toolbar.

Once in a while a bit of Russian spam turns up that’s completely unintelligible. In the WordPress spam filter this comes across as a series of question marks and can’t be decoded, because the original characters have already been thrown away. In some other formats, though, it shows up as a string of accented Latin letters and symbols, and that can be decoded. Here is an example from another website:

Ñïàñèáî çà ñòàòüþ.. Àêòóàëüíî ìíå ñåé÷àñ.. Âçÿëà ñåáå åùå ïåðå÷èòàòü.

This is called mojibake, and it is what happens when software reads text using the wrong character set, a common problem with Cyrillic and other non-Latin scripts. For this you need a mojibake decoder, which tells us that this is Windows-1251 text misread as ISO 8859-15 (ISO 8859-15 => Windows-1251) and yields the following Cyrillic characters:

Спасибо за статью.. Актуально мне сейчас.. Взяла себе еще перечитать
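If you would rather not rely on a website, the same repair can be sketched in a few lines of Python. This is only a rough illustration, not how any particular decoder works: the function name fix_mojibake and its parameter names are made up for this example, and it assumes the text really is Windows-1251 that was displayed as ISO 8859-15, as the decoder reported.

```python
def fix_mojibake(garbled: str, shown_as: str = "iso8859_15", really: str = "cp1251") -> str:
    """Undo a wrong decoding (hypothetical helper, assumes the two encodings are known)."""
    # Step 1: turn the garbled characters back into the raw bytes they came from.
    raw = garbled.encode(shown_as)
    # Step 2: decode those bytes with the character set the author actually used.
    return raw.decode(really)


# The first word of the example comment above.
print(fix_mojibake("Ñïàñèáî"))  # Спасибо

# Why the question-mark version can't be rescued: once every character
# has been replaced by "?", there is nothing left to decode.
lost = "Спасибо".encode("ascii", errors="replace")
print(lost)  # b'???????'
```

If the guess about the two character sets is wrong, Python will either raise an error or hand back a different kind of nonsense, so it may take a couple of tries with other encodings.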

The Cyrillic characters can now be translated with your preferred system.  In Google Translate this gives:

Thanks for the article .. News to me now .. Took another reread.

In all fairness I would have to say that not all the Russian spam has been useless. So far I have discovered a very amusing Russian horoscope page; a blog about herbs (Lechimsya herbs: recipes and tips for travolecheniyu); one for therapeutic nutrition (all about a healthy diet); and one with interesting pictures of presumably Russian buildings (Stroyblog: blog on construction and real estate), all brought by the same one or two spam bots and none with any comments. If this is someone’s business plan, I sure can’t figure it out, but I’m not about to give them any links either.

Note: Within five minutes of publishing, this post was republished on an English-language, Russia-based splog, a blog made up entirely of content ripped off from other blogs without crediting the original. Maybe that’s the business plan: the material on their blogs is scraped from other people’s.