While writing a custom RSS feed for my PHP program, I ran into the issue that the ampersand (&) character has to be converted to &amp;. Are there other characters like this that need escaping? Thanks for your information.

This is invalid:

<?xml version="1.0" encoding="UTF-8" ?>         
<rss version="2.0">
<channel>
    <title>custom user feed</title>                 
        <item>
            <description>
                <div>a & b</div>
            </description>
        </item>
</channel>      
</rss>

Reference: Why can't RSS handle the ampersand?

Teno

1 Answer

Yes, at a bare minimum, it should be obvious that < will cause you issues, since it would be taken as a tag start. It is usually encoded as &lt;.

See http://en.wikipedia.org/wiki/XML#Escaping for more detail.
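
As a rough illustration of the escaping described above, here is a minimal sketch. It is written in Python rather than the asker's PHP (where `htmlspecialchars()` serves a similar purpose), and `rss_description` is a hypothetical helper name, not part of any RSS library. It assumes the HTML in the description should be carried as escaped text, which is one common way to embed markup in a feed.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def rss_description(text):
    """Hypothetical helper: wrap text in <description>, escaping XML markup."""
    # escape() replaces & with &amp;, < with &lt; and > with &gt;
    return "<description>" + escape(text) + "</description>"


fragment = rss_description("<div>a & b</div>")
print(fragment)

# The escaped fragment is well-formed, so parsing no longer fails,
# and the parser hands back the original text unchanged.
element = ET.fromstring(fragment)
print(element.text)
```

The parser decodes the entities again on the way out, so a feed reader would still see the original `<div>a & b</div>` string.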

paxdiablo
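  • Not only "usually": `The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section.` – fvu Oct 14 '12 at 11:08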
  • @paxdiablo Thanks for such a quick reply and the information. I tried `a &amp; b &lt; c > "d" 'e'` and it seems that double quotes, single quotes, and `>` can be used without escaping. So `&` and `<` are the only ones that need escaping? – Teno Oct 14 '12 at 11:09
  • @fvu, I agree with you. I was just saying that it was usually encoded in that way, not that it can sometimes be in there as the literal `<`. – paxdiablo Oct 14 '12 at 11:23
  • @Teno, those are the two key syntax markers that are forbidden. Most of the others are conveniences rather than dictates. – paxdiablo Oct 14 '12 at 11:26
  • @paxdiablo indeed - `If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively.` (XML spec). – fvu Oct 14 '12 at 11:28
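
The conclusion of the comments above can be checked with a small experiment. This is only a sketch using Python's standard-library parser; `is_well_formed` and the sample strings are made up for illustration, and other parsers may report errors differently.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Element content with raw and escaped special characters.
samples = {
    "raw ampersand": "<d>a & b</d>",
    "raw less-than": "<d>b < c</d>",
    "raw greater-than": "<d>c > d</d>",
    "raw quotes": "<d>\"d\" 'e'</d>",
    "escaped": "<d>a &amp; b &lt; c</d>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, "->", "accepted" if ok else "rejected")
```

Only the raw `&` and `<` samples are rejected. Note that this covers element content only: inside an attribute value, the quote character used as the delimiter also has to be escaped, and the XML spec requires `>` to be escaped when it appears in the sequence `]]>` outside a CDATA section.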