As the sitemap spec says, the XML itself must be entity-escaped, which the gem already does.
However, the URL should first be encoded per RFC 3986 (for URIs) or RFC 3987 (for IRIs), and only then entity-escaped. The sitemap generator does not seem to follow RFC 3986 at the moment.
add 'linkTestEntityEscape&<> and RFC3986ü中文'
# output: <loc>https://website.test/linkTestEntityEscape&<> and RFC3986ü中文</loc>
# should be: <loc>https://website.test/linkTestEntityEscape%26%3C%3E%20and%20RFC3986%C3%BC%E4%B8%AD%E6%96%87</loc>
add 'ü中文?aaa=bbb'
# output: <loc>https://website.test/ü中文?aaa=bbb</loc>
# should be: <loc>https://website.test/%C3%BC%E4%B8%AD%E6%96%87?aaa=bbb</loc>
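A minimal sketch of the order described above: percent-encode first, entity-escape last. `escaped_loc` is a hypothetical helper, not part of the gem, and it assumes the caller already knows which part is the path and which is the query. It relies only on `ERB::Util.url_encode` from the standard library and `String#encode(xml: :text)` from core Ruby.

```ruby
require "erb"

# Hypothetical helper, not part of sitemap_generator.
# Assumes the caller passes the raw path and an already-encoded query separately.
def escaped_loc(host, path, query = nil)
  # Step 1: percent-encode each path segment (everything except unreserved characters).
  encoded_path = path.split("/", -1).map { |segment| ERB::Util.url_encode(segment) }.join("/")
  url = "#{host}/#{encoded_path}"
  url += "?#{query}" if query
  # Step 2: XML entity escape last, so a "&" between query parameters becomes "&amp;".
  url.encode(xml: :text)
end

puts escaped_loc("https://website.test", "linkTestEntityEscape&<> and RFC3986ü中文")
puts escaped_loc("https://website.test", "ü中文", "aaa=bbb")
puts escaped_loc("https://website.test", "ü中文", "aaa=bbb&c=d")
```

The first two calls should print the "should be" values listed above; the third shows the entity escape acting on the query separator.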
Can someone check whether my conclusion is right? I'm a junior programmer and not sure about it.
If everything is OK, I will send a PR for this issue later.
Best Regards,
Lisbeth
Hi @lisbethw1130 I think you're right. When I wrote this gem years ago it wasn't internationalized to handle UTF-8 and that wasn't as prevalent as it is today. It would be great if you could add that functionality, with tests :)
URL escaping can't be done inside the sitemap generator, so I added a note about it to the README.
For example, we can't reliably separate the path from the query in an unescaped URI:
https://example.com/dd?dd=?aa=vv could mean either https://example.com/dd%3Fdd=?aa=vv or https://example.com/dd?dd=%3Faa=vv
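A small sketch of that ambiguity using the standard `uri` library. `split_path_and_query` is a hypothetical helper written for illustration. A parser has to pick one reading of the unescaped URL (it splits at the first `?`), while the two escaped forms each carry a different, unambiguous intent.

```ruby
require "uri"

# Hypothetical helper: returns the [path, query] pair that URI.parse sees.
def split_path_and_query(url)
  uri = URI.parse(url)
  [uri.path, uri.query]
end

# Unescaped: the parser splits at the first "?", whatever the author meant.
p split_path_and_query("https://example.com/dd?dd=?aa=vv")
# Escaped, reading 1: the first "?" belongs to the path.
p split_path_and_query("https://example.com/dd%3Fdd=?aa=vv")
# Escaped, reading 2: the second "?" belongs to the query value.
p split_path_and_query("https://example.com/dd?dd=%3Faa=vv")
```

Only the caller knows which reading was intended, which is why the escaping has to happen before the URL is handed to the generator.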
Ruby doesn't escape single quotes the way the XML spec describes, so I just opened an issue to track down the root cause.