From click fraud to click quality... or from web analytics to web mining

Seattle, February 15th, 2006.

Data Shaping Solutions, in partnership with Authenticlick, will present the first click quality seminar in San Francisco in 2006. Click quality is a better way to detect click fraud and optimize web traffic. State-of-the-art techniques and real click fraud cases will be discussed. The companies have expertise in credit card fraud, click fraud detection, click scoring, web mining, text mining, web analytics, search engine intelligence, and impression and ad relevancy fraud, including the detection of fraudulent attempts to eliminate a competitor from paid or organic search results on Google (GOOG), Yahoo (YHOO), MSN (MSFT), Miva (MIVA), Kanoodle, Mamma (MAMA), Ask Jeeves (IACI), 7Search or AOL (TWX).

As the only companies to compute statistically sound patent-pending click scores, Data Shaping Solutions and Authenticlick offer solutions that avoid adding code (Javascript, clear gif) or tags to advertisers' websites. We are able to detect a substantially higher percentage of click fraud and poor-quality traffic, and eliminate all false positives.

Examples of false positives that we were able to identify include a large corporation, let's call it Acme, and the US Army. In the case of Acme, an alarm was raised because of thousands of clicks per day, day after day, from the same IP and the same browser, all seemingly coming from the same user. However, the keywords associated with the clicks (both paid and unpaid), the velocity and timing, and the proportion of paid clicks and referrals did not show unusual patterns. It was found that Acme uses one IP and one browser for all its employees. Similarly, after investigating a bucket of clicks with highly suspicious spoofed IPs, it was found that the addresses were used by the US Army to hide their true origin. This prevents potential criminals from being indirectly informed (by checking IP addresses in their server logs) that they are being monitored by the Army. Again, the clicks were legitimate.

Conversely, we correctly identified another set of spoofed IP addresses as fraudulent with our powerful metric mix, which incorporates proprietary keyword categorizations and multivariate statistical distributions. Email spammers, whose web robots accidentally clicked on paid links while harvesting email addresses, made a few mistakes: they generated the same number of clicks per IP per day, at least on the IP addresses that they did not share with legitimate users. In another case, our linkage analysis revealed that thousands of IP addresses were switched off by one distribution partner caught in click fraud. When they reappeared, they were attached to a new partner, clearly showing that the fraud involved clickware or adware. The fraudster knew which computers were infected and possibly sold this information to another criminal.
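
The "same number of clicks per IP per day" pattern described above can be illustrated with a minimal sketch. This is not the companies' proprietary scoring method, only a simple illustration of the idea; the function name `constant_rate_ips`, the `(ip, day)` input format and the `min_days` threshold are all assumptions made for this example.

```python
from collections import defaultdict


def constant_rate_ips(clicks, min_days=3):
    """Return IPs whose daily click count is identical on every active day.

    `clicks` is an iterable of (ip, day) pairs, one pair per click.
    `min_days` is a hypothetical threshold: IPs active on fewer days
    are ignored, since a constant count over one or two days is weak evidence.
    """
    # Count clicks per IP and per day.
    per_ip_day = defaultdict(lambda: defaultdict(int))
    for ip, day in clicks:
        per_ip_day[ip][day] += 1

    flagged = []
    for ip, days in per_ip_day.items():
        distinct_counts = set(days.values())
        # A single distinct daily count means the rate never varied.
        if len(days) >= min_days and len(distinct_counts) == 1:
            flagged.append(ip)
    return sorted(flagged)
```

A flagged IP would only be a candidate for further review: as the Acme and US Army cases show, a single unusual metric is not sufficient to conclude that traffic is fraudulent.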

Data Shaping and Authenticlick deal with counterfeit clicks, fake impressions and bogus conversions. Click scoring is a complex problem: bogus conversions involve purchases with stolen credit cards or users paid to fill in forms and provide fake information. They can make poor clicks look good if undetected. However, we have developed a methodology that preserves the quality of our click scoring system. Interestingly, one of our clients, a law firm, had been using a click fraud detection system that failed to capture these bogus conversions in a fraud scheme, because that system relied on Javascript and clear gif.

Another critical issue is how to attach a conversion to a click. We have developed patent-pending technology that enables us to correctly identify a unique AOL user, whether genuine, bogus or spoofed. The algorithm even recognizes when a sale from one IP originates from a click on a totally different IP address. It also detects when a sale and a click from the same IP are actually generated by unrelated users who share that IP address, whether permanently or only temporarily. In most cases, we are also able to explain the missing clicks: clicks listed in Google reports but not seen in server logs. These amount to 50% of billed clicks in some cases. In one severe case of missing clicks, we were able to reduce the discrepancy from 50% to 0% and maximize savings to the client.
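
The missing-click discrepancy mentioned above (billed clicks with no counterpart in the server logs) can be measured with a simple reconciliation. The sketch below is an illustration only, not the patent-pending technology; the function name `missing_click_rate` and the idea of matching clicks on a shared key, such as a (keyword, day) pair, are assumptions made for this example.

```python
from collections import Counter


def missing_click_rate(billed, logged):
    """Fraction of billed clicks that have no matching server-log entry.

    `billed` and `logged` are iterables of hashable click keys, for
    example (keyword, day) tuples. Matching is multiset-based: each
    logged click can account for at most one billed click.
    """
    billed_counts = Counter(billed)
    logged_counts = Counter(logged)
    total_billed = sum(billed_counts.values())
    if total_billed == 0:
        return 0.0
    # Counter subtraction keeps only keys with a positive remainder,
    # i.e. billed clicks that the logs cannot account for.
    missing = billed_counts - logged_counts
    return sum(missing.values()) / total_billed
```

For example, four billed clicks of which only two appear in the logs give a rate of 0.5, corresponding to the 50% discrepancy described in the text.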

About Data Shaping Solutions

Data Shaping Solutions provides innovative analytical services and resources to clients in industry and academia around the world, particularly in the internet, CRM, business intelligence and finance sectors. The CEO and Principal Statistician held post-doctoral positions at Cambridge University (England) and the University of North Carolina at Chapel Hill before moving to the private sector in 1997. He has published more than 40 papers, many in top statistical journals, and presented at numerous conferences worldwide. Recent clients include Visa, Wells Fargo (WFC), LowerMyBills, GoWoleSales, Grant Media, and major search engines and internet companies. The company has published a number of white papers on advertising technology, fraud detection, decision trees, data mining and pattern recognition applied to stock trading, index trading and the technical analysis of the stock market.

Data Shaping Solutions also specializes in general fraud detection, predictive analytics, marketing mix optimization and many aspects of predictive modeling. The company has many years of experience extracting actionable conclusions from very large and weakly structured data sets, and has developed patent-pending mechanisms to build efficient decision trees, boosted trees and tree forests 200 times faster than competitors, without RAM limitations. These techniques are used for fraud detection and for other problems such as automated user feedback categorization.

The company also offers click fraud, marketing and statistical seminars. Most recently, the company developed query intelligence, website ranking and copyright / trademark infringement detection tools. Details can be found on the company's website.

Thanks to many critical partnerships and an efficient portfolio of advertising campaigns, the company's website attracts more than 150,000 targeted visitors per year, mostly from North America, Europe and Australia. According to Alexa, it is one of the most visited websites in its category, offering advertising opportunities to analytical recruiters, statistical software companies and other data mining solution providers.

Data Shaping is currently developing a useful meta-directory for data mining resources, and will add several hundred quality vendors, consulting firms and software companies in the next few months.

