maggit | 6 years ago

>Also encoding ampersands into a URI (URL) using HTML encoding schemes is also common, but that is incorrect.

To encode any string (for example a URL) containing & in HTML, you must HTML-encode that &. Using &amp; in the value of the href attribute of an a-tag must result in a URL containing just & in place of the entire entity. This is a property of HTML that has nothing to do with URLs or URL encodings.
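
As a rough illustration of that point, here is a minimal sketch using Python's standard `html.parser` module. The `HrefCollector` class and the sample URL are made up for the example. It only shows that an HTML parser hands back the attribute value with the entity already decoded, while percent-escapes are left alone.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: records href values as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; entity references in the
        # values have already been replaced by the parser at this point.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


collector = HrefCollector()
collector.feed('<a href="http://domain.com/?name=data&amp;tag=c%26t">somewhere</a>')
href = collector.hrefs[0]
print(href)  # the &amp; is gone, the %26 is untouched
```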

austincheney|6 years ago

So let's say you have a raw address containing two ampersands. The second one is part of a value, not a separator, so it needs to be encoded. Otherwise a URI parser is confused into thinking there are 3 query string data segments when there are only 2:

    http://domain.com/?name=data&tag=c&t
You will need to encode that ampersand so that it is interpreted as something other than syntax:

    http://domain.com/?name=data&tag=c%26t
Now the first ampersand is not encoded but the second one is. You are correct that ampersands are also syntax characters in HTML/XML so if you wanted to place that address in HTML code it would need to be escaped in HTML:

    http://domain.com/?name=data&amp;tag=c%26t
That address can now be inserted as the value of an HTML anchor tag as such:

    <a href="http://domain.com/?name=data&amp;tag=c%26t">somewhere</a>
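
The two steps above can be sketched in Python with only the standard library. This is an illustration under assumptions, not anything from the original post: the `params` dictionary is invented to match the example address, `urlencode` handles the percent-encoding layer, and `html.escape` handles the HTML layer.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical query values matching the example address; "c&t" holds a
# literal ampersand that is data, not a separator.
params = {"name": "data", "tag": "c&t"}

# Layer 1: URI encoding. The separator stays "&", the data ampersand becomes %26.
url = "http://domain.com/?" + urlencode(params)

# Layer 2: HTML escaping, applied only when the URL is placed into markup.
anchor = '<a href="{}">somewhere</a>'.format(escape(url, quote=True))

print(url)
print(anchor)
```

Keeping the layers as separate calls mirrors the distinction being made here: the percent-encoded form is the address itself, and the `&amp;` form exists only inside HTML source.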
The important point to distinguish is that addresses are often used in contexts outside of HTML, even in the browser. For example, the address bar at the top of the browser is outside the context of the viewport that displays HTML content, and so the appropriate text there is:

    http://domain.com/?name=data&tag=c%26t
This is so because a URI has only one encoding scheme, which is percent-encoding.
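
To see why the second ampersand has to be percent-encoded in the first place, here is a small sketch with Python's `urllib.parse`. It is only an illustration, and the variable names are invented. It parses the raw and the encoded address and compares the query segments each one yields.

```python
from urllib.parse import parse_qsl, urlsplit

raw_address = "http://domain.com/?name=data&tag=c&t"
encoded_address = "http://domain.com/?name=data&tag=c%26t"

# keep_blank_values=True keeps the stray "t" segment visible instead of
# silently dropping it.
raw_segments = parse_qsl(urlsplit(raw_address).query, keep_blank_values=True)
encoded_segments = parse_qsl(urlsplit(encoded_address).query, keep_blank_values=True)

print(raw_segments)      # three segments: the value "c&t" was split apart
print(encoded_segments)  # two segments: %26 decodes back to "&" inside the value
```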