To further echo some issues I brought up earlier, I think we need to start rejecting proposed standards that include grammars that are more complicated than required. For example, the SIP RFC is littered with examples like this:
from-spec = ( name-addr / addr-spec )
            *( SEMI from-param )
from-param = tag-param / generic-param
tag-param = "tag" EQUAL token
For this example, just pay attention to the *( SEMI from-param ) and its fallout. This says that there can be zero or more occurrences of SEMI followed by a from-param, where one example of a from-param is "tag" EQUAL token.
When I first encountered this about three years ago I was annoyed, and it has only become more annoying the more I think about it. The problem is in the definitions of "SEMI" and "EQUAL". They are:
SEMI = SWS ";" SWS ; semicolon
EQUAL = SWS "=" SWS ; equal
And "SWS" is:
LWS = [*WSP CRLF] 1*WSP ; linear whitespace
SWS = [LWS] ; sep whitespace
So, something as simple as:
From: sip:me@example.com;tag=12345
can be written as:
From: sip:me@example.com
 ;
 tag
 =
 12345
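and the two forms are expected to be equivalent. (Each continuation line has to begin with at least one space or tab, because LWS requires 1*WSP after the CRLF.)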
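To make the equivalence concrete, here is a minimal Python sketch that transcribes the SWS, SEMI, and EQUAL rules above into regular expressions and parses both forms to the same result. The function name parse_from_value, the simplified token pattern, and the restriction to a bare addr-spec (no display name or angle brackets) are illustrative assumptions, not anything taken from the RFC.

```python
import re

# ABNF rules from the excerpt, transcribed as regular expressions.
WSP = r"[ \t]"
LWS = rf"(?:{WSP}*\r\n)?{WSP}+"   # LWS = [*WSP CRLF] 1*WSP
SWS = rf"(?:{LWS})?"              # SWS = [LWS]
SEMI = rf"{SWS};{SWS}"            # SEMI = SWS ";" SWS
EQUAL = rf"{SWS}={SWS}"           # EQUAL = SWS "=" SWS

# Assumption: a simplified stand-in for the RFC's token rule.
TOKEN = r"[A-Za-z0-9.!%*_+`'~-]+"

# One "SEMI from-param", where the parameter is name[=value].
PARAM = re.compile(rf"{SEMI}({TOKEN})(?:{EQUAL}({TOKEN}))?")
# Assumption: the address is a bare addr-spec with no ";" or whitespace.
ADDR = re.compile(r"[^;\s]+")


def parse_from_value(value):
    """Parse the value of a From header into (address, params)."""
    m = ADDR.match(value)
    if m is None:
        raise ValueError("missing address")
    addr, pos = m.group(0), m.end()
    params = {}
    # *( SEMI from-param ): consume parameters until the input is exhausted.
    while pos < len(value):
        pm = PARAM.match(value, pos)
        if pm is None:
            raise ValueError(f"unexpected input at offset {pos}")
        params[pm.group(1).lower()] = pm.group(2)
        pos = pm.end()
    return addr, params


COMPACT = "sip:me@example.com;tag=12345"
FOLDED = "sip:me@example.com\r\n ;\r\n tag\r\n =\r\n 12345"

# Both spellings yield the same address and parameters.
assert parse_from_value(COMPACT) == parse_from_value(FOLDED)
```

All of the whitespace handling lives in the SWS pattern, which gets expanded four times into every single parameter match. That is the cost of the grammar's flexibility in miniature.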
Am I insane to think that this flexibility in the input is insane and unreasonable? Now, it is not really that hard to parse the two forms as equivalent, but my question is this: why write a standard that says either form is acceptable? Proponents of text-based protocols (and I am not one of them) would certainly argue that it allows better readability in some cases, but I think this degree of flexibility is too much.