I'm using URI.encode to generate HTML data URLs:

visit "data:text/html,#{URI::encode(html)}"

After upgrading to Ruby 2.7.1, the interpreter started warning:

warning: URI.escape is obsolete

The recommended replacements are CGI.escape and URI.encode_www_form_component. However, they don't do the same thing:

2.7.1 :007 > URI.escape '<html>this and that</html>'
(irb):7: warning: URI.escape is obsolete
 => "%3Chtml%3Ethis%20and%20that%3C/html%3E"
2.7.1 :008 > CGI.escape '<html>this and that</html>'
 => "%3Chtml%3Ethis+and+that%3C%2Fhtml%3E"
2.7.1 :009 > URI.encode_www_form_component '<html>this and that</html>'
 => "%3Chtml%3Ethis+and+that%3C%2Fhtml%3E"

The result of these slight encoding differences is an HTML page where spaces are replaced by +. My question is: what's a good replacement for URI.encode for this use case?

engineersmnky
Tadas Sasnauskas
    Take a look at [`ERB::Util.url_encode`](https://ruby-doc.org/stdlib-2.7.1/libdoc/erb/rdoc/ERB/Util.html#url_encode-method) – it encodes space as `%20` and also encodes `/` as `%2F` (which is perfectly fine) – Stefan Dec 23 '20 at 12:53

3 Answers

There is actually a drop-in replacement.

require 'uri'

s = '<html>this and that</html>'
parser = URI::Parser.new # named `parser` so it does not shadow Kernel#p
parser.escape(s)
# => "%3Chtml%3Ethis%20and%20that%3C/html%3E"

Docs: https://docs.w3cub.com/ruby~3/uri/rfc2396_parser

I found this through a comment under this article: https://docs.knapsackpro.com/2020/uri-escape-is-obsolete-percent-encoding-your-query-string

I also tested this against some other strings in my setup. It retains commas the same way URI.escape does, in contrast to ERB::Util.url_encode, which encodes them.
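To make the comma difference concrete, here is a small comparison sketch. It assumes `URI::RFC2396_Parser` is the class that `URI::Parser` pointed to in Ruby 2.7 (the class is named explicitly here so the snippet does not depend on that alias), and the sample string is made up for illustration:

```ruby
require "uri"
require "erb"

# Hypothetical sample containing a comma, a space, and a slash.
sample = "<p>this, that</p>"

# Legacy RFC 2396 behaviour (what URI.escape did): reserved characters
# such as "," and "/" are left untouched.
legacy = URI::RFC2396_Parser.new.escape(sample)

# ERB's encoder: everything except letters, digits, "-", "_", "." and "~"
# is percent-encoded, so "," and "/" are escaped too.
erb = ERB::Util.url_encode(sample)

# Form encoding: like the above, but a space becomes "+".
form = URI.encode_www_form_component(sample)

puts legacy # %3Cp%3Ethis,%20that%3C/p%3E
puts erb    # %3Cp%3Ethis%2C%20that%3C%2Fp%3E
puts form   # %3Cp%3Ethis%2C+that%3C%2Fp%3E
```

Only the form encoder turns spaces into `+`; the other two differ mainly in how they treat reserved characters.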

NOTE: Since this answer has become so popular, it's worth mentioning that you should not blindly change your code to use URI::Parser unless you are certain your project doesn't need a standards-compliant encoder. URI.escape was deprecated for a reason, so before simply switching to URI::Parser, make sure you have read and understood https://stackoverflow.com/a/13059657/6376353

Stefan Horning
    Using `URI::Parser#escape` is _exactly_ the same as using `URI::escape`, except without the warning. And that's fine, if that's what you want, but it's important to realize that this is not a different solution. If you look at the source code, `URI::escape` calls `URI::DEFAULT_PARSER.escape`, and `URI::DEFAULT_PARSER` is an instance of `URI::Parser`. – Steve Oct 05 '21 at 01:53

There is no official RFC 3986-compliant URI escaper in the Ruby standard library today.

See "Why is URI.escape() marked as obsolete and where is this REGEXP::UNSAFE constant?" for background.

Several methods exist, each with one or more of the following issues, as you have discovered and pointed out in the comment:

  • They produce deprecation warnings
  • They do not claim standards compliance
  • They are not escaping in accordance with RFC 3986
  • They are implemented in tangentially related libraries
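For readers who want to see what escaping against RFC 3986 could look like, here is a minimal sketch under one stated assumption: every byte outside the RFC 3986 "unreserved" set (letters, digits, `-`, `.`, `_`, `~`) is percent-encoded, which is the most conservative choice. `percent_encode` is a hypothetical helper, not part of the standard library:

```ruby
# Hypothetical helper: percent-encode every byte that is not in the
# RFC 3986 "unreserved" set. Works byte by byte, so multi-byte UTF-8
# characters become several %XX triplets.
def percent_encode(str)
  str.b                                        # binary copy: one char per byte
     .gsub(/[^A-Za-z0-9\-._~]/) { |byte| format("%%%02X", byte.ord) }
     .force_encoding(Encoding::US_ASCII)       # result is plain ASCII
end

puts percent_encode("<html>this and that</html>")
# %3Chtml%3Ethis%20and%20that%3C%2Fhtml%3E
```

Because reserved characters such as `/` and `,` are escaped as well, this is only suitable for encoding a single component or payload, not a whole URL.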
D. SM

From Apidock.com

require "erb"
include ERB::Util

puts url_encode("Programming Ruby:  The Pragmatic Programmer's Guide")

Generates

Programming%20Ruby%3A%20%20The%20Pragmatic%20Programmer%27s%20Guide
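Applied to the data URL from the question, the same call might be used as follows. This is a sketch only; the `html` value is the sample string from the question and `data_url` is an illustrative variable name:

```ruby
require "erb"

html = "<html>this and that</html>"

# url_encode emits %20 for spaces (never "+"), so the page text survives intact.
data_url = "data:text/html,#{ERB::Util.url_encode(html)}"

puts data_url
# data:text/html,%3Chtml%3Ethis%20and%20that%3C%2Fhtml%3E
```

Calling `ERB::Util.url_encode` with its module prefix avoids the top-level `include ERB::Util`, which would otherwise add the helper methods to every object.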

gordie