Re: [servlet-dev] Path Parameters



On Wed, 17 Feb 2021 at 13:47, Mark Thomas <markt@xxxxxxxxxx> wrote:
On 17/02/2021 12:16, Greg Wilkins wrote:

> However, that is not compliant with RFC 3986 which says that
> normalization should happen before decoding.

Where does it say that? I just looked through all the references to
normalization and couldn't find it.

I think it says so in section 2.4 (https://tools.ietf.org/html/rfc3986#section-2.4), which says:

   When a URI is dereferenced, the components and subcomponents
   significant to the scheme-specific dereferencing process (if any)
   must be parsed and separated before the percent-encoded octets within
   those components can be safely decoded, as otherwise the data may be
   mistaken for component delimiters.  The only exception is for
   percent-encoded octets corresponding to characters in the unreserved
   set, which can be decoded at any time. 
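
To make the quoted rule concrete, here is a minimal sketch in Python (not container code; the function names and the sample path are made up for illustration). It contrasts splitting a path into segments before percent-decoding with decoding first:

```python
from urllib.parse import unquote


def split_then_decode(raw_path):
    """Separate on literal '/' first, then percent-decode each segment."""
    return [unquote(segment) for segment in raw_path.split("/")]


def decode_then_split(raw_path):
    """Decode the whole path first; an encoded %2F then acts as a delimiter."""
    return unquote(raw_path).split("/")


raw = "/a%2Fb/c"  # hypothetical path with an encoded slash inside one segment

print(split_then_decode(raw))  # ['', 'a/b', 'c']
print(decode_then_split(raw))  # ['', 'a', 'b', 'c']
```

Splitting first keeps "a/b" as a single segment. Decoding first silently produces an extra segment, which is the "mistaken for component delimiters" case the RFC text warns about.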

>  That is all fine if you
> remember the segment boundaries so that segments like "%2e%2e", "%2f"
> and "..;" would be seen as decoded segments after normalization of "..",
> "/" and "..".

My reading of section 2.2 is that reserved characters should not be %nn
decoded prior to normalization.

Exactly. So if we decode %2f or %2e%2e after normalization, we end up with a string that could be wrong if it is normalized again.
As long as our implementations apply their algorithms consistently, this will not be a problem. But if we give an application a path that contains
a ".." or a "/" that is part of a segment rather than a divider, then the path is ambiguous and the application can rightly get confused. Or it may
just give that path back to the container, which will normalize it again and get confused.
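
A rough Python sketch of that round trip, under stated assumptions: normalize() below is a simplified dot-segment remover written for this example (it is not the full RFC 3986 section 5.2.4 algorithm and not any container's implementation), and the sample path is hypothetical.

```python
from urllib.parse import unquote


def normalize(path):
    """Simplified dot-segment removal on literal segments of an absolute path."""
    kept = []
    for segment in path.split("/")[1:]:
        if segment == ".":
            continue
        if segment == "..":
            if kept:
                kept.pop()  # step back one segment
            continue
        kept.append(segment)
    return "/" + "/".join(kept)


raw = "/app/%2e%2e/secret"          # hypothetical request path

once = normalize(raw)               # "%2e%2e" is not a literal "..", so nothing changes
decoded = unquote(once)             # decoding after normalization exposes ".."
twice = normalize(decoded)          # a second normalization now walks up a level

print(once)     # /app/%2e%2e/secret
print(decoded)  # /app/../secret
print(twice)    # /secret
```

The decoded string no longer records that ".." was data inside a segment, so whoever normalizes it next resolves a different path than the first pass did.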
 

I think the Servlet spec is the place to be more explicit about this.
This has been on my radar for a while:

https://github.com/eclipse-ee4j/servlet-api/issues/18


Good one.  I'll note a summary of this email there.

 