(no title)

I don't understand enough about the ad business to answer this myself. If there's a legitimate reason to allow 3P scripts to run code, it would seem like creating a domain-specific language that Google safely translates into JS would be so much better. Allowing 3Ps to run arbitrary JS just seems so shockingly wrong.

No amount of manual auditing can catch malicious code. It's way too complex for a human to parse.

Is there a legitimate business need that anyone's aware of to have code run in Ads? If so, why not use a DSL?

wcfields|8 years ago

I used to work for https://www.interpolls.com/ in 2007, so the tech may have changed quite a bit, but here's how the business worked then:

We were allowed to load a 30kb JS file, which could (depending on the ad network) serve a ~300kb Flash SWF file. We ran Cold Fusion hooks in the SWF to phone home to our JS file to trigger 1x1 pixels for 3rd party trackers. We scraped our raw Akamai HTTP request logs after the fact on a cron job to create our reporting system. There was a small cluster of FreeBSD servers that crunched the HTTP request logs. Every mouse-over / click was registered via these pixel HTTP GETs. We had timers too that would trigger every few seconds. The reporting system probably had a 2-3 hour delay due to the immense amount of traffic we received.
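To make the pixel-plus-log-scraping flow above concrete, here is a minimal sketch of how such a report could be built. It is not the original system: the Common Log Format lines, the `/px.gif` path, and the `ad` and `ev` query parameters are all made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical access-log lines; path and parameter names are assumptions.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2007:13:55:36 +0000] "GET /px.gif?ad=poll42&ev=mouseover HTTP/1.1" 200 43
203.0.113.5 - - [10/Oct/2007:13:55:39 +0000] "GET /px.gif?ad=poll42&ev=click HTTP/1.1" 200 43
198.51.100.7 - - [10/Oct/2007:13:56:01 +0000] "GET /px.gif?ad=poll42&ev=mouseover HTTP/1.1" 200 43
198.51.100.7 - - [10/Oct/2007:13:56:02 +0000] "GET /creative.swf HTTP/1.1" 200 301122
"""

def count_pixel_events(log_text, pixel_path="/px.gif"):
    """Count (ad, event) pairs from GET requests for the tracking pixel."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split('"')          # the request line sits between quotes
        if len(parts) < 3:
            continue
        request = parts[1].split()       # e.g. ["GET", "/px.gif?...", "HTTP/1.1"]
        if len(request) != 3 or request[0] != "GET":
            continue
        url = urlsplit(request[1])
        if url.path != pixel_path:       # skip non-pixel traffic such as the SWF
            continue
        query = parse_qs(url.query)
        ad = query.get("ad", ["?"])[0]
        ev = query.get("ev", ["?"])[0]
        counts[(ad, ev)] += 1
    return counts

report = count_pixel_events(SAMPLE_LOG)
for (ad, ev), n in sorted(report.items()):
    print(ad, ev, n)
```

Because every event is just a logged GET, the counts can be rebuilt from the raw logs at any time, at the cost of whatever delay the batch job introduces.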

We specialized in "polls", which were plain old HTML radio buttons on top of the SWFs. After you answered, they gave you a quick answer and sometimes had digital takeaways in the popup answer window (icons, wallpapers, etc.).

At the time all of our ads were handmade. We had a design team and a programming team that would create these together and code them specifically to the client's request. By the time I left we had started to automate it into a drag+drop system for clients.

Sidenote: The biggest job screw-up I've ever made was not putting the correct 3rd party tracking pixel into a 300x250 that ran for 1 day on the AOL.com homepage. It ended up being fine since we got the results back for the typo from the raw logs, but it could have been a $200k mistake!

nopriorarrests|8 years ago

I don't know anything about JS-based cryptomining, but I wonder if you can't stop such ads without breaking 90% of legit ads.

I mean, it all probably boils down to number-crunching? So the DSL you are envisioning would have to block really basic language parts, like loops and math operations.

If I'm wrong and mining actually could be easily blocked on language level using some DSL, I'm all ears.

mcphage|8 years ago

It would be nice if things could be blocked by CPU usage... even if you’re not mining cryptocurrency, if your ad uses more than 5% of my CPU it should be killed.

tripzilch|8 years ago

I've come to decide that the only ads I consider "legit" are where the site owner strikes a deal with another business that is interested in advertising on their site, the site owner hosts the ad on their own server, as a picture banner or text or perhaps a nice block in a side column that says "sponsored content" or whatever, and just links to the other business.

Site owner controls all the content. Any tracking will be done mainly via server logs; if the site owner wants to, they can use a bit of script to quickly shove in a redirect onmousedown, in order to track exactly when the user clicked what link. But frankly I've found even that technique a privacy insult ever since I noticed Google doing this in their own search results.

This is analogous to how paper newspapers used to manage their ad space. No third party shit, and if the magazine was proud of itself it would curate the ads to only deal with advertisers that wouldn't annoy their reader base (too much).

A bit of a hassle maybe, but it shows your readers that you actually care about what content is displayed on your site (let alone what code is run). But most importantly, no adblocker will block these kinds of ads. Because they're just image links, after all. Adblocker can't see if that's an ad banner or just a thumbnail linking to an external domain. And I would maybe even bother to whitelist those if they did (right until one shows me crap I don't want to see, like being confronted with nudity or sex when I'm not in the mood for it).

singingboyo|8 years ago

Evidently blocking math is probably not okay.

However, mining is useless without a way to send it back out to the network. I doubt ads need networking capabilities - so just prevent that bit. That should do it, as far as I can tell.

dmytrish|8 years ago

Ads could run on something like the Ethereum virtual machine, having a limited amount of "gas" (instructions) to execute.
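A minimal sketch of that idea, assuming a toy stack machine that charges one unit of gas per instruction (the real EVM prices opcodes differently, and the instruction set here is made up):

```python
class OutOfGas(Exception):
    """Raised when a program exceeds its instruction budget."""

def run(program, gas):
    """Run a tiny stack program, charging one unit of gas per instruction."""
    stack = []
    pc = 0
    while pc < len(program):
        if gas <= 0:
            raise OutOfGas(f"budget exhausted at pc={pc}")
        gas -= 1
        op, *args = program[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jmp":
            pc = args[0]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown op {op!r}")
    return stack, gas

# A well-behaved "ad": computes 2 + 3 and stops.
ok = [("push", 2), ("push", 3), ("add",), ("halt",)]
# A "miner": keeps adding forever unless the budget cuts it off.
spin = [("push", 1), ("push", 1), ("add",), ("jmp", 1)]

print(run(ok, gas=100))
try:
    run(spin, gas=1000)
except OutOfGas as err:
    print("killed:", err)
```

Loops and math stay available; the runaway program is stopped by the budget rather than by banning language features.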

yegle|8 years ago

Looks like https://developers.google.com/caja/ can be used for the purpose.

Disclaimer: I work for Google but not on the ads side and have limited knowledge of JavaScript or general frontend stuff.

tripzilch|8 years ago

It was only last week or so when I read some security researcher pulling some tricks to bypass part of caja's sandbox (while looking for something else, even). Sure this was a whitehat researcher and they got a (very) nice bug bounty from Google Project Zero. But if they use this to secure the 3rd party scripts that are apparently allowed these days on Google Ads, they're being hugely irresponsible.

I never heard of Caja until a few weeks ago, but apparently it started in 2007, could be that I forgot when I heard about it though. Back in 2007 though I was still a frontend web developer with a very keen eye on JS security and all the XSS/CSRF problems of those days. Back then JS/ECMAScript did not have sufficiently advanced features to properly sandbox code. This was a bit of a fool's errand back then. By now it's gotten a lot more of these features, mainly revolving around protecting the super-flexible fluid JS objects from modification and abuse. But I really kind of wonder if that locks everything really watertight? Because browsers are going to have unofficial/proprietary features, and you need just one to accidentally get this slightly wrong, get a reference to a non-sandboxed Window object via-via-via, and it all falls apart.

You don't let untrusted people run code on stuff you serve. Code is just too slippery, Turing complete etc., and ECMAScript perhaps even more so than many other languages. Can't we just take that as a given by now, instead of trying to be cleverer than the previous smart person who failed at it?

abecedarius|8 years ago

Sort of: I don't think it's meant to restrict time/space usage, just access to capabilities. If you don't give the ad-code access to the network, it'd have no way to access the blockchain, but it could still chew up your CPU cycles.
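The distinction can be shown with a toy interpreter. This is only an illustration of capability-style restriction, not how Caja works internally; the command names (`draw_text`, `fetch`, `spin`) are hypothetical.

```python
def run_ad(script, capabilities):
    """Interpret a list of (name, *args) commands.

    The script can only reach the outside world through `capabilities`;
    anything not granted fails. Pure computation is never checked.
    """
    steps = 0
    for name, *args in script:
        if name == "spin":               # pure computation, needs no capability
            total = 0
            for i in range(args[0]):
                total += i * i
                steps += 1
            continue
        handler = capabilities.get(name)
        if handler is None:
            raise PermissionError(f"capability {name!r} not granted")
        handler(*args)
        steps += 1
    return steps

rendered = []
caps = {"draw_text": rendered.append}    # note: no "fetch" capability granted

# Allowed: draw something, then burn 100,000 loop iterations unchecked.
steps = run_ad([("draw_text", "Vote now!"), ("spin", 100_000)], caps)
print(rendered, steps)

# Denied: the script has no way to reach the network.
try:
    run_ad([("fetch", "https://pool.example/submit")], caps)
except PermissionError as err:
    print("blocked:", err)
```

Withholding `fetch` cuts the script off from the network, but nothing here limits how many cycles `spin` consumes; that would need a separate budget.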

f1notformula1|8 years ago

Thanks for this - That actually looks like a much more pragmatic approach than an all-new DSL.

Macha|8 years ago

Ultimately, it's because the advertisers don't trust Google or other middlemen and insist on running code from third party vendors that grabs more information and promises better metrics, or that tries to determine if the site or user is in some way fraudulent.