Monday, February 5, 2007

Requirements MUST be strict

To further echo some issues I brought up earlier, I think we need to start rejecting proposed standards whose grammars are more complicated than required. For example, the SIP RFC (RFC 3261) is littered with productions like this:

  from-spec   =  ( name-addr / addr-spec )
               *( SEMI from-param )
  from-param  =  tag-param / generic-param
  tag-param   =  "tag" EQUAL token
 
For this example, just pay attention to the *( SEMI from-param ) and its fallout. It says that there can be zero or more repetitions of SEMI followed by a from-param, where one example of a from-param is "tag" EQUAL token.

When I first encountered this about 3 years ago I was annoyed, and it has only become more annoying the more I think about it. The problem is in the definitions of "SEMI" and "EQUAL". They are:

   SEMI    =  SWS ";" SWS ; semicolon
   EQUAL   =  SWS "=" SWS ; equal
 
And "SWS", together with the "LWS" it is built on, is:
   LWS  =  [*WSP CRLF] 1*WSP ; linear whitespace
   SWS  =  [LWS] ; sep whitespace
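
To make the fallout concrete, here is a minimal sketch that translates those productions into Python regular expressions. The names WSP_RE, LWS_RE, SWS_RE, SEMI and is_semi are mine, not anything from the RFC, and this only illustrates the whitespace rules; it is not a real SIP parser.

```python
import re

# Hypothetical regex translation of the ABNF quoted above.
WSP_RE = r"[ \t]"                              # WSP: space or horizontal tab
CRLF_RE = r"\r\n"
LWS_RE = rf"(?:{WSP_RE}*{CRLF_RE})?{WSP_RE}+"  # LWS = [*WSP CRLF] 1*WSP
SWS_RE = rf"(?:{LWS_RE})?"                     # SWS = [LWS]
SEMI = re.compile(rf"{SWS_RE};{SWS_RE}")       # SEMI = SWS ";" SWS


def is_semi(text: str) -> bool:
    """Return True if the whole string is a valid SEMI under these rules."""
    return SEMI.fullmatch(text) is not None
```

Under this reading, is_semi(";"), is_semi(" ; ") and is_semi("\r\n    ; \r\n    ") all hold, while is_semi("\r\n;") does not, because a line break must be followed by at least one space or tab. A single separator character can therefore span three lines.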
 
So, something as simple as:
  From: sip:me@example.com;tag=12345
 
can be written as:
  From: sip:me@example.com
    ; 
    tag
    =
    12345
 
and the two forms are expected to be equivalent.
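
As a rough illustration of what "equivalent" demands of a parser, the sketch below splits a From value on SEMI and EQUAL and shows both spellings collapsing to the same result. The function name parse_from_value is hypothetical, and I am assuming a bare addr-spec with no quoted strings or angle brackets, so naive splitting on ";" is safe here; a real parser cannot assume that.

```python
import re

# SWS = [LWS], LWS = [*WSP CRLF] 1*WSP, written as one optional group.
SWS = r"(?:(?:[ \t]*\r\n)?[ \t]+)?"
SEMI = re.compile(SWS + ";" + SWS)    # SEMI  = SWS ";" SWS
EQUAL = re.compile(SWS + "=" + SWS)   # EQUAL = SWS "=" SWS


def parse_from_value(value: str):
    """Split a From header value into (address, params).

    Assumption: no quoted strings or <...> wrappers, so every ";" is a
    parameter separator.
    """
    addr, *raw_params = SEMI.split(value.strip())
    params = {}
    for raw in raw_params:
        parts = EQUAL.split(raw, maxsplit=1)
        # A parameter without "=" is stored with a value of None.
        params[parts[0]] = parts[1] if len(parts) == 2 else None
    return addr, params


compact = "sip:me@example.com;tag=12345"
folded = "sip:me@example.com\r\n    ; \r\n    tag\r\n    =\r\n    12345"

# Both spellings reduce to the same address and parameters.
assert parse_from_value(compact) == parse_from_value(folded)
```

The code is short, but note that every separator now has to be a pattern rather than a single character, and every implementation has to get that pattern exactly right.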

Am I insane to think that this flexibility in the input is insane and unreasonable? Now, it is not really that hard to parse those two forms as equivalent, but my question is this: why write a standard that says either form is acceptable? Certainly proponents of text-based protocols (and I am not one of them) would argue that it allows for better readability in some cases, but I think this type of flexibility is a little too much.