Myths about captcha breaking from McAfee


Written on 28. November 2008 – 06:49 | by admin

I just had to write a short comment, since I was reading an article about the new XRumer 5 and found the following statement totally clueless, especially coming from McAfee. It’s in German, but I’ll translate it:

Toralv Dirro, der Sicherheitsexperte von McAfee, sagt: “Programme, um Captchas automatisch zu analysieren, gibt es seit etwa eineinhalb Jahren”

Toralv Dirro, security expert at McAfee, says: “Programs that automatically analyze captchas have existed for about one and a half years”

OK, Mr. Smart Mouth should get his facts straight and google for OCR :P Optical Character Recognition, which is the basis of all the captcha breaking algorithms, is much older.

While captchas may be a new thing, the tools used to break these images have existed for ages. For example GOCR, a popular tool on the Linux platform for captcha breaking, existed before captchas were even invented. In fact the whole OCR stuff dates back as far as 1929, as you can read here:

The History of Optical Character Recognition


In 1929, Gustav Tauschek obtained a patent on OCR in Germany, followed by Handel, who obtained a US patent on OCR in 1933 (U.S. Patent 1,915,993). In 1935 Tauschek was also granted a US patent on his method (U.S. Patent 2,026,329).

Tauschek’s machine was a mechanical device that used templates. A photodetector was placed so that when the template and the character to be recognised were lined up for an exact match and a light was directed towards them, no light would reach the photodetector.
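To make the template idea concrete, here is a minimal sketch in Python. It is not how Tauschek’s hardware worked in detail; the 3x3 bitmaps, the names `TEMPLATES`, `light_through` and `recognise`, and the simplification that "light" equals the number of cells where template and character differ are all assumptions made for illustration.

```python
# Toy model of template matching: a character is recognised when it lines up
# exactly with a stored template, i.e. no "light" gets past the pair.

# Hypothetical 3x3 templates, "1" = ink, "0" = blank.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
}


def light_through(template, character):
    """Count the cells where template and character differ.

    This stands in for the light reaching the photodetector: zero means
    the two shapes cover each other exactly.
    """
    return sum(
        t != c
        for t_row, c_row in zip(template, character)
        for t, c in zip(t_row, c_row)
    )


def recognise(character, templates=TEMPLATES):
    """Return the label of the first exactly matching template, or None."""
    for label, template in templates.items():
        if light_through(template, character) == 0:
            return label
    return None
```

Calling `recognise(["100", "100", "111"])` walks through the templates and stops at the one that leaves no mismatching cell. An exact-match scheme like this breaks as soon as the font or position changes, which is the limitation the later image-analysis machines addressed.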

In 1950, David H. Shepard, a cryptanalyst at the Armed Forces Security Agency in the United States, was asked by Frank Rowlett, who had broken the Japanese PURPLE diplomatic code, to work with Dr. Louis Tordella to recommend data automation procedures for the Agency. This included the problem of converting printed messages into machine language for computer processing. Shepard decided it must be possible to build a machine to do this, and, with the help of Harvey Cook, a friend, built “Gismo” in his attic during evenings and weekends. This was reported in the Washington Daily News on 27 April 1951 and in the New York Times on 26 December 1953 after his U.S. Patent Number 2,663,758 was issued. Shepard then founded Intelligent Machines Research Corporation (IMR), which went on to deliver the world’s first several OCR systems used in commercial operation. While both Gismo and the later IMR systems used image analysis, as opposed to character matching, and could accept some font variation, Gismo was limited to reasonably close vertical registration, whereas the following commercial IMR scanners analyzed characters anywhere in the scanned field, a practical necessity on real world documents.

The original article about XRumer and the stupid McAfee comment can be found here (in German): http://www.virenschutz.info/beitrag-neu-Hacker-Software-XRumer-5-2556.html


The 10 worst Spam ISPs today 11-2008


Written on 27. November 2008 – 04:12 | by admin

The folks at Spamhaus compiled a list of the 10 worst spam-infested ISPs. Funnily enough, Microsoft is listed there in 5th place. Not bad. Spammers know that the anti-spam lists won’t block IPs from Microsoft, so they obviously use its social services as a favorite target for placing the 3 P’s (Pills, Poker, Porn). Here you can find the list:




According to Microsoft they remove all the spammy stuff, but to me it seems they don’t care about their customers as much as they should. Good for other marketers who don’t aggressively post hundreds of spammy posts, but place selected advertisements there *hint* *hint*


Captcha Breaking can be so easy


Written on 18. November 2008 – 21:13 | by admin

Really, why bother with difficult methods like OCR with neural networks when it could all be so easy? Of course there are really (well, kind of) *secure* captcha implementations, but let’s face the facts: most of the current captcha systems have basic design flaws that allow us to bypass the captcha test, or to work out the requested result by reversing the captcha generation algorithm, as was done with the Pligg captcha.

This is not as hard as it may sound. Take RegenAntiSpam, which we use as our example here: if you look at its source, it has a really basic design flaw that lets us re-use a solved captcha over and over again, even on random websites. The RegenAntiSpam captcha is generated from a token that is supplied by the website and is sent along with the solved captcha text in the POST request to the signup form.

Let’s have a look at the captcha URL:

http://blog.tld/wp-content/plugins/captcha.php/?token=jo26gz

So here we have the token that is used to identify the captcha. Now all you have to do is solve one captcha and save the token as well as the result text. Then sniff the HTTP POST request so you can build your own submission script, and just add the token and solved text as static values. Yes, believe it or not, it will work. It’s really that simple.

For WordPress this would look like:

stage=validate-user-signup&user_name=fsddsf&user_email=fdsfds%40fsdfds.com&spamCode=fb9f7c&mcode=jo26gz&signup_for=blog&chkread=&submit=Next+%C2%BB

Obviously spamCode and mcode are used as static values in each request. This is just one example of a basic design flaw in a certain captcha implementation. I’m sure you can spot more out there ;)
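Why the replay works comes down to the server keeping no state per token. The sketch below is a hypothetical model in Python, not RegenAntiSpam’s actual code: `SECRET`, `expected_code`, `verify_stateless` and `SingleUseVerifier` are made-up names, and deriving the text from the token with a hash is only an assumption to have something deterministic. It shows a check that accepts the same token/answer pair forever, next to one that invalidates the token after the first attempt.

```python
import hashlib
import secrets

SECRET = "server-side-secret"  # hypothetical value, for illustration only


def expected_code(token):
    """Hypothetical: derive the captcha text deterministically from the token."""
    return hashlib.sha256((SECRET + token).encode()).hexdigest()[:6]


def verify_stateless(token, answer):
    """Flawed design: nothing is remembered, so a solved pair stays valid."""
    return answer == expected_code(token)


class SingleUseVerifier:
    """Safer design: each token is issued by the server and consumed on use."""

    def __init__(self):
        self._pending = {}  # token -> expected captcha text

    def issue(self):
        token = secrets.token_hex(3)
        self._pending[token] = expected_code(token)
        return token

    def verify(self, token, answer):
        # pop() removes the token, so a second submission finds nothing.
        code = self._pending.pop(token, None)
        return code is not None and answer == code
```

With `verify_stateless`, a pair that was solved once passes on every later request. With `SingleUseVerifier`, the second call to `verify` for the same token returns False, and tokens the server never issued are rejected outright.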


XSS for SEO


Written on 13. November 2008 – 18:07 | by seo23

Here we go for some really dark arts. XSS (Cross Site Scripting) is a way of manipulating web pages to inject HTML or possibly JavaScript code. You can do much more with it, such as stealing cookies etc., but we are only interested in backlinks and redirects. I won’t present any exploit details, as you can find tons of them on the net and I’m not sure whether it is illegal in some countries, so if you are really interested in how this works in detail, check out this page. This article is just meant to send the interested newbie blackhat SEO in the right direction ;)

So what can you do with XSS? You can create your own HTML code inside the web page, filled with keywords and your link(s), for Google to pick up. The better way, however, is to find 301 redirects in scripts that take the target as a normal GET parameter, e.g. some.php?new_url=http://yourdomain.com . Hopefully you see the advantage of this trick. If not, just move on, you aren’t ready yet :)


Enable JavaScript in Wordpress


Written on 12. November 2008 – 16:14 | by seo23

WordPress filters JavaScript out of post content for security reasons. That is good, but on our own blogs we certainly want to use JavaScript. To enable it on your own blog, edit the PHP file kses.php inside the wp-includes folder and comment out the following lines:

// add_filter('content_save_pre', 'wp_filter_post_kses');
// add_filter('excerpt_save_pre', 'wp_filter_post_kses');
// add_filter('content_filtered_save_pre', 'wp_filter_post_kses');

Now JavaScript is enabled again :)


Hotlink hijack


Written on 12. November 2008 – 15:39 | by seo23

So you get annoyed because too many people hotlink your images? Well, here is a little solution. We just hijack those requests to display an advertisement for our own page whenever someone requests your images from a remote page. We do this using .htaccess! Just put the following code into your .htaccess file (like we did in the redirects post). If you don’t have one, just create it in your www directory and put the following lines in it:

RewriteEngine on
# Let requests without a referer through (direct visits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Let requests from our own pages through, with or without www.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mydomain\.com(/.*)?$ [NC]
# Never rewrite the replacement image itself, or the redirect would loop.
RewriteCond %{REQUEST_URI} !/REPLACEMENT\.jpg$ [NC]
# Everyone else gets redirected to the replacement image.
RewriteRule \.(jpg|jpeg|gif|png|bmp)$ /REPLACEMENT.jpg [R,NC,L]

Obviously you have to change the values first to reflect your website. Put up the image that you want to display to hotlinkers; in this case I named it REPLACEMENT.jpg, but you can name it anything you want. There you go, now no one is able to hotlink your precious images anymore!
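If the rewrite conditions look cryptic, the decision they make can be written out as a small Python sketch. This is only a model of the logic for checking your own patterns, not something Apache runs; `should_replace`, `OWN_SITE` and `IMAGE` are names I made up, mydomain.com is the placeholder from the rules above, and accepting both http and https referers from your own domain is an assumption on my part.

```python
import re

# Referers that count as "our own site" (placeholder domain, with or without www).
OWN_SITE = re.compile(r"^https?://(www\.)?mydomain\.com(/.*)?$", re.IGNORECASE)
# Requests that count as images.
IMAGE = re.compile(r"\.(jpg|jpeg|gif|png|bmp)$", re.IGNORECASE)


def should_replace(path, referer):
    """Return True if the request should be answered with the replacement image."""
    if not IMAGE.search(path):
        return False  # not an image, nothing to do
    if not referer:
        return False  # empty referer: direct visit, let it through
    # Any referer that is not one of our own pages is a hotlinker.
    return OWN_SITE.match(referer) is None
```

For example, `should_replace("/img/cat.jpg", "http://evil.example/page")` is True, while the same path requested from `http://www.mydomain.com/post` or with an empty referer is left alone. Note that a lookalike host such as `http://mydomain.com.evil.example/` is still treated as a hotlinker, because the pattern only allows a slash or the end of the string after the domain.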


Rich Content is golden


Written on 12. November 2008 – 14:53 | by seo23

While some people may argue that you can use any content as long as you throw enough backlinks at the site, it is better to have a well-structured, content-rich website with semantics. Put some thought into deep linking and interlinking. Use the small tricks such as anchor text or bold text. Write your content cleverly, with use of semantics, and write as much text as possible. Do not scrape content from other sites, except maybe for trackbacks. Get some related pages that rank well for your keyword to link back to you. If a site that already ranks high for your term links back to you, that’s always a good thing for the ranking of your own site; a good neighborhood is nice ;) Of course I can’t argue about the backlink issue: if you throw enough backlinks at a site, it will most likely rank well even without rich content. But the results are not always the same, so while you can push one site really high with that method, other sites will just go down.

In the long run you’re always better off (especially for whitehat sites) with rich content and some thought put into your linking.


Cloaking for SEO purposes


Written on 12. November 2008 – 14:39 | by seo23

Cloaking is a way of presenting different content to users who match certain criteria, for example the country they are coming from. That method is called Geo-Cloaking, to be more specific. Anyway, we keep it simpler here. What I will show you is how to present different content to the Google bot (or any other spider), while your normal users see the default website.

First of all you need to know which IP addresses the Google bot is coming from. Don’t worry, there are existing spider IP lists out there for free. Now that you have the list, prepare your keyword-filled spider page and save it as “index2.php”, and move your original index to “index1.php”. Of course you can give these files any name, just be sure to change the PHP accordingly. Next, put up your IP list somewhere, in this example domain.tld/ip.txt. Now put the following simple cloaking code up as index.php:

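<?php
// Visitor's IP address as seen by the web server.
$ip = $_SERVER['REMOTE_ADDR'];

// Load the spider IP list, one address per line (empty list if the fetch fails).
$list = file('http://domain.tld/ip.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$spiderIps = ($list === false) ? array() : array_map('trim', $list);

// Exact match instead of strpos(), so 1.2.3.4 no longer matches 11.2.3.45.
if (in_array($ip, $spiderIps, true)) {
    include 'index2.php'; // spider page
} else {
    include 'index1.php'; // normal visitors
}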

Great, you have cloaked your first site. This is a very simple cloaking example; there is more advanced software out there, but basically it all boils down to this. Officially, Google’s webmaster guidelines list cloaking as a violation, so don’t expect its terms to bless it. In practice Google seems to be OK with it as long as you don’t present a completely different site to the Google bot only, like 1000s of keywords in <H1> :P If you just modify your original site to include more keywords etc., it seems to work fine. Even done in a spammy way it seems to work most of the time, but it can get your site banned from the Google index, so use it at your own risk.


Link Bait


Written on 12. November 2008 – 05:31 | by seo23

So what’s link bait? It is a way of automagically getting backlinks to your site. You accomplish this by giving people a reason to link to your site. Easier said than done, you may think, but by providing some kind of service you can easily get people to link to you. Or you can offer something else of value that people might share around, e.g. a WordPress theme where you place your URL in the footer. People mostly won’t remove it out of respect, because you created the theme for free. That way you can constantly get good backlinks, depending on the quality of your theme. This is just one example, now go figure ;)


Captcha breaking using OCR


Written on 12. November 2008 – 05:09 | by seo23

What is OCR? OCR stands for Optical Character Recognition and is usually used for things like transforming scanned documents back into editable text. When breaking a captcha, we first have to train the OCR software to recognize various aspects of the captcha image, such as the font used or the size of the letters and numbers. Artificial neural networks are of great help for this task and can even be implemented in PHP.

An example of OCR software is GOCR, which can be used from the command line in Linux. In combination with image manipulation software it can be trained to break certain captcha images. Basically, what you have to do is make the characters as visible and as well aligned as possible. By removing background noise and colors you make the image more readable for the OCR software later. You do this by applying filters to the captcha image using image manipulation software such as ImageMagick for Linux.
