gwu78 | 8 years ago

Long before AMP, Google began prefixing search result URLs with "google.tld/url?q=" and adding Google parameters as suffixes such as "sa=", "ved=", etc.

Unless I am mistaken this parasitic cruft only serves Google, not end users.
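To make the shape of these links concrete, here is a minimal sketch in C that unwraps a single href. The helper name unwrap_href and the example URLs in the usage note are made up for illustration. It assumes the href appears in the HTML source as "/url?q=TARGET&amp;sa=...", and it does not percent-decode the target.

```c
#include <string.h>

/* Hypothetical helper, not part of any Google or lex API.
   Copies the real target of a wrapped href into out (at most outsz bytes,
   always NUL-terminated). An href without the "/url?q=" wrapper is copied
   unchanged. */
void unwrap_href(const char *href, char *out, size_t outsz)
{
    const char *prefix = "/url?q=";
    const char *start = strstr(href, prefix);
    const char *end = NULL;
    size_t n;

    if (outsz == 0)
        return;
    if (start != NULL) {
        start += strlen(prefix);
        /* Assumption: the tracking parameters begin at the first "&amp;". */
        end = strstr(start, "&amp;");
    } else {
        start = href;
    }
    n = end ? (size_t)(end - start) : strlen(start);
    if (n >= outsz)
        n = outsz - 1;              /* truncate rather than overflow */
    memcpy(out, start, n);
    out[n] = '\0';
}
```

Called on "/url?q=http://example.com/page&amp;sa=U&amp;ved=abc", this leaves "http://example.com/page" in out.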

Below is a quick and dirty program to filter out the above. Replace .com with .cctld as needed.

Requirements: cc, lex

Usage:

   curl -o 1.htm 'https://www.google.com/search?q=xyz'
   yyg < 1.htm > 2.htm
   your-ad-supported-web-browser 2.htm

To compile this I use something like

   flex -Crfa -8 -i g.l;
   cc -Wall -pipe lex.yy.c -static -o yyg;

Save the text below as file g.l, then compile as above.

   %%
   [^\12\40-\176]
   \/url[?]q= 
   "http://www.google.com/gwt\/x?hl=en&amp;u=" 
   "&amp;"[^\"]* 
   %%
   int main(void){yylex();return 0;}
   int yywrap(void){return 1;}

As for AMP, I read that it needs to use iframes (and JavaScript). Yikes. We can easily write a program to strip out iframe targets as well as links to JavaScript.

amphtml does look great in a text-only browser that does not load iframes automatically.
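A rough sketch of such a stripper in C, in the same quick and dirty spirit. The function names are made up, and this is plain string matching rather than real HTML parsing: it drops everything from "<iframe" or "<script" up to and including the matching closing tag, and if the closing tag is missing it drops the rest of the input.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive check that s begins with prefix. */
static int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Hypothetical helper: copy in to out (at most outsz bytes, always
   NUL-terminated), skipping whole <iframe> and <script> elements. */
void strip_elements(const char *in, char *out, size_t outsz)
{
    static const char *open_tag[]  = { "<iframe", "<script" };
    static const char *close_tag[] = { "</iframe>", "</script>" };
    size_t o = 0, i;

    if (outsz == 0)
        return;
    while (*in && o + 1 < outsz) {
        int skipped = 0;
        for (i = 0; i < 2; i++) {
            if (starts_with_ci(in, open_tag[i])) {
                /* Scan forward to the closing tag, or to end of input. */
                const char *p = in;
                while (*p && !starts_with_ci(p, close_tag[i]))
                    p++;
                in = *p ? p + strlen(close_tag[i]) : p;
                skipped = 1;
                break;
            }
        }
        if (!skipped)
            out[o++] = *in++;
    }
    out[o] = '\0';
}
```

To use it as a filter like yyg, read the page into a buffer, call strip_elements on it, and write the result to stdout.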

SomewhatLikely|8 years ago

It's really annoying trying to copy and paste URLs from Google results. It also seems largely unnecessary; can't they detect clicks using JavaScript? I have noticed they have started doing this with links sent through Google Hangouts messages as well. I do remember a time when they weren't doing this, and it was very refreshing because everyone else was.