8 - Schema
Schema parsers
We use a set of classes to parse schema elements. There are 11 flavors of schema elements: 8 of them are described in an RFC, and 3 of them are specific to ApacheDS:
- AttributeType
- DitContentRule
- DitStructureRule
- LDAPSyntax
- MatchingRule
- MatchingRuleUse
- NameForm
- ObjectClass
and
- LdapComparator
- Normalizer
- SyntaxChecker
We need to be able to parse those schema elements because they can be added to the server as a description (i.e., a String representing one of those schema elements as defined by the RFC). For the same reason, the LDAP API needs to validate those schema elements before sending them to an LDAP server, and to properly parse what it gets back from an LDAP server.
Strict vs quirks mode
Here we have a problem: most LDAP server implementations violate the RFC, so we can't simply expect the String representing a schema element to be compliant with it. Some typical deviations are:
- OpenLDAP uses macros instead of OIDs. This is convenient, as it allows the root OID to be defined with a name and reused in the associated schema elements
- AD and many other servers expect some specific characters to be accepted, like '_', ':', '#', …
- Sometimes values come without quotes where quotes are required
- etc.
We define the strict mode as a mode which follows the RFC tightly, and the quirks mode (also called relaxed mode below) as a more permissive version of the parser. One can select either the strict or the relaxed mode using a flag.
Strict mode
Even in strict mode, we relax one thing: the order in which the various parts appear in a schema description. We don't expect them to be ordered as described in the RFC.
The various parts are defined using a few syntaxes:
- NAME: qdescrs
- DESC: qdstring
- SUP (ObjectClass), MUST, MAY, APPLIES, AUX, NOT: oids
- SUP (AttributeType), EQUALITY, ORDERING, SUBSTR, FORM, OC: oid
- SYNTAX (AttributeType): noidlen
- SYNTAX (MatchingRule): numericoid
- SUP (DitStructureRule): ruleids
- descr: oid, qdescrs
- qdescr: qdescrs, qdescrlist
A qdescrs may contain one or many qdescr, and an oids may contain one or many oid.
descr, strict
The descr construct is used by oid and qdescrs (an OID can be given as a name instead of a numeric OID). The strict mode uses this grammar:
descr ::= keystring
keystring ::= leadkeychar keychar*
leadkeychar ::= ALPHA
keychar ::= ALPHA | DIGIT | HYPHEN
ALPHA ::= ['A'..'Z'] | ['a'..'z']
DIGIT ::= ['0'..'9']
HYPHEN ::= '-'
SQUOTE ::= '\''
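As an illustration of the grammar above, here is a minimal Python sketch of a strict descr check. It is not the LDAP API's own parser; the name `is_strict_descr` is hypothetical, and the grammar is simply transcribed into a regular expression.

```python
import re

# descr ::= leadkeychar keychar*
# leadkeychar is an ASCII letter; keychar is an ASCII letter, a digit or a hyphen.
STRICT_DESCR = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def is_strict_descr(text: str) -> bool:
    """Return True if the whole text matches the strict descr grammar."""
    return STRICT_DESCR.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("objectClass", "user-password", "2cn", "user_name"):
        print(candidate, is_strict_descr(candidate))
```

With this sketch, `objectClass` and `user-password` are accepted, while `2cn` (leading digit) and `user_name` (underscore) are rejected.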
qdstring, strict
A qdstring can contain any UTF-8 character except the single quote and the backslash, which must be escaped (as \27 and \5C respectively). It's always surrounded by single quotes:
qdstring ::= SQUOTE dstring SQUOTE
dstring ::= ( QS | QQ | QUTF8 )*
QQ ::= ESC %x32 %x37
QS ::= ESC %x35 ( %x43 | %x63 )
QUTF8 ::= QUTF1 | UTFMB
QUTF1 ::= %x00-26 | %x28-5B | %x5D-7F
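The escaping rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `encode_qdstring` and `decode_qdstring` are hypothetical, and ESC is taken to be the backslash, so that QQ is `\27` (the single quote) and QS is `\5C` or `\5c` (the backslash).

```python
import re

# An escape sequence is a backslash followed by 27 (quote) or 5C/5c (backslash).
_ESCAPE = re.compile(r"\\(27|5[Cc])")
# Anything a strict dstring must not contain: a bare quote,
# or a backslash that does not start one of the two escape sequences.
_INVALID = re.compile(r"'|\\(?!27|5[Cc])")


def encode_qdstring(value: str) -> str:
    """Escape backslashes and single quotes, then wrap in single quotes."""
    # Backslashes go first, so the ones added for quotes are not escaped again.
    body = value.replace("\\", "\\5C").replace("'", "\\27")
    return "'" + body + "'"


def decode_qdstring(text: str) -> str:
    """Strictly decode a qdstring; raise ValueError if it is malformed."""
    if len(text) < 2 or text[0] != "'" or text[-1] != "'":
        raise ValueError("a qdstring must be surrounded by single quotes")
    body = text[1:-1]
    if _INVALID.search(body):
        raise ValueError("unescaped quote or backslash in qdstring")
    # A single pass, so a decoded backslash is never reinterpreted as an escape.
    return _ESCAPE.sub(lambda m: "'" if m.group(1) == "27" else "\\", body)


if __name__ == "__main__":
    encoded = encode_qdstring("it's a \\ test")
    print(encoded)
    print(decode_qdstring(encoded))
```

Decoding in a single pass matters: a value such as a literal backslash followed by `27` is encoded as `\5C27`, and must decode back to the backslash followed by `27`, not to a quote.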
qdescr, strict
A qdescr is a quoted name, where the first character must be alphabetic and the following characters must be alphabetic, digits or hyphens. Here is the ABNF for qdescr:
qdescr ::= SQUOTE descr SQUOTE
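A minimal Python sketch of strict qdescr parsing follows. The function names are hypothetical. The parenthesized list form handled by `parse_qdescrs` is not spelled out in this page; it is assumed here to follow the RFC 4512 shape, `( 'name1' 'name2' )`.

```python
import re

# qdescr ::= SQUOTE descr SQUOTE, with descr = letter followed by letters, digits, hyphens.
STRICT_QDESCR = re.compile(r"'([A-Za-z][A-Za-z0-9-]*)'")


def parse_qdescr(text: str) -> str:
    """Return the name inside a strict qdescr, or raise ValueError."""
    match = STRICT_QDESCR.fullmatch(text)
    if match is None:
        raise ValueError("not a strict qdescr: " + repr(text))
    return match.group(1)


def parse_qdescrs(text: str) -> list:
    """Parse a single qdescr or a parenthesized, space-separated list of them."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        # A descr never contains whitespace, so splitting on it is safe here.
        return [parse_qdescr(item) for item in text[1:-1].split()]
    return [parse_qdescr(text)]


if __name__ == "__main__":
    print(parse_qdescrs("'cn'"))
    print(parse_qdescrs("( 'cn' 'commonName' )"))
```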
noidlen, strict
Relaxed mode
qdstring, relaxed
There
descr, relaxed
The relaxed descr accepts more characters: underscore, semicolon, dot, colon and sharp. The leading alphabetic character (leadkeychar) is no longer mandatory either. Here is the ABNF we will accept:
relaxed-descr ::= relaxed-keystring
relaxed-keystring ::= relaxed-keychar+
relaxed-keychar ::= ALPHA | DIGIT | HYPHEN | UNDERSCORE | SEMICOLON | DOT | COLON | SHARP
ALPHA ::= ['A'..'Z'] | ['a'..'z']
DIGIT ::= ['0'..'9']
HYPHEN ::= '-'
UNDERSCORE ::= '_'
SEMICOLON ::= ';'
COLON ::= ':'
DOT ::= '.'
SHARP ::= '#'
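For comparison with the strict grammar, the relaxed descr can be sketched in Python as below. The name `is_relaxed_descr` is hypothetical, and the character class is a direct transcription of the relaxed key characters listed above.

```python
import re

# One or more of: letter, digit, hyphen, underscore, semicolon, dot, colon, sharp.
# No leading-letter requirement in relaxed mode.
RELAXED_DESCR = re.compile(r"[A-Za-z0-9_;.:#-]+")


def is_relaxed_descr(text: str) -> bool:
    """Return True if the whole text matches the relaxed descr grammar."""
    return RELAXED_DESCR.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("user_name", "2cn", "myRoot:1", "has space", ""):
        print(repr(candidate), is_relaxed_descr(candidate))
```

Names such as `user_name` or `2cn`, which the strict grammar rejects, are accepted here; whitespace and the empty string are still rejected.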
qdescr, relaxed
Compared to the strict mode, we will accept a non-quoted String, or a String using double quotes.
relaxed-qdescr ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr
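The three accepted forms (single-quoted, double-quoted, unquoted) can be sketched in Python as follows. The name `parse_relaxed_qdescr` is hypothetical. The sketch assumes that quotes must match: a value opened with a single quote must be closed with a single quote.

```python
import re

_DESCR = r"[A-Za-z0-9_;.:#-]+"
# Three alternatives: 'descr', "descr", or a bare descr.
RELAXED_QDESCR = re.compile("'({0})'|\"({0})\"|({0})".format(_DESCR))


def parse_relaxed_qdescr(text: str) -> str:
    """Return the name without its quotes, or raise ValueError."""
    match = RELAXED_QDESCR.fullmatch(text)
    if match is None:
        raise ValueError("not a relaxed qdescr: " + repr(text))
    # Exactly one of the three capture groups is set.
    return next(group for group in match.groups() if group is not None)


if __name__ == "__main__":
    for candidate in ("'cn'", '"user_name"', "cn"):
        print(candidate, "->", parse_relaxed_qdescr(candidate))
```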
oid, relaxed
In relaxed mode, we accept single-quoted and double-quoted OIDs and names. Here is the supported ABNF:
oid-relaxed ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr |
SQUOTE numericoid SQUOTE | DQUOTE numericoid DQUOTE | numericoid
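Here is a minimal Python sketch of relaxed oid parsing that also reports whether the value is a numeric OID or a name. The function name and the returned labels are hypothetical. The numericoid pattern assumes the RFC 4512 shape (at least two dot-separated numbers, no leading zeros), and `myRoot:1` is only a made-up macro-style value.

```python
import re

NUMERICOID = re.compile(r"(?:0|[1-9][0-9]*)(?:\.(?:0|[1-9][0-9]*))+")
RELAXED_DESCR = re.compile(r"[A-Za-z0-9_;.:#-]+")


def parse_relaxed_oid(text: str) -> tuple:
    """Return (kind, value) where kind is 'numericoid' or 'descr'."""
    # Remove one pair of matching single or double quotes, if present.
    for quote in ("'", '"'):
        if len(text) >= 2 and text[0] == quote and text[-1] == quote:
            text = text[1:-1]
            break
    # Try numericoid first: every numericoid also matches the relaxed descr.
    if NUMERICOID.fullmatch(text):
        return ("numericoid", text)
    if RELAXED_DESCR.fullmatch(text):
        return ("descr", text)
    raise ValueError("not a relaxed oid: " + repr(text))


if __name__ == "__main__":
    for candidate in ("2.5.4.3", "'cn'", '"myRoot:1"'):
        print(candidate, "->", parse_relaxed_oid(candidate))
```

Because the relaxed descr allows digits and dots, the numeric form has to be tested first; otherwise every numeric OID would be classified as a name.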
noidlen, relaxed
Here, we allow textual syntax names to be used, not only OIDs. For instance, something like SYNTAX IA5String will be allowed.
We also allow quoted and double quoted OIDs.
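A relaxed noidlen could be parsed along these lines. This is a sketch under assumptions, not the LDAP API's implementation: the function name is hypothetical, the optional `{len}` suffix follows the RFC 4512 noidlen shape, the length is assumed to sit inside the quotes when quotes are used, and textual syntax names are assumed to follow the strict descr rules.

```python
import re

NUMERICOID = re.compile(r"(?:0|[1-9][0-9]*)(?:\.(?:0|[1-9][0-9]*))+")
SYNTAX_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
# The syntax identifier, optionally followed by a length in curly braces.
NOIDLEN = re.compile(r"([^{}]+)(?:\{([0-9]+)\})?")


def parse_relaxed_noidlen(text: str) -> tuple:
    """Return (syntax, length), where length is an int or None."""
    # Remove one pair of matching single or double quotes, if present.
    for quote in ("'", '"'):
        if len(text) >= 2 and text[0] == quote and text[-1] == quote:
            text = text[1:-1]
            break
    match = NOIDLEN.fullmatch(text)
    if match is None:
        raise ValueError("not a relaxed noidlen: " + repr(text))
    syntax, length = match.group(1), match.group(2)
    if not (NUMERICOID.fullmatch(syntax) or SYNTAX_NAME.fullmatch(syntax)):
        raise ValueError("invalid syntax identifier: " + repr(syntax))
    return (syntax, int(length) if length is not None else None)


if __name__ == "__main__":
    print(parse_relaxed_noidlen("IA5String"))
    print(parse_relaxed_noidlen("'1.3.6.1.4.1.1466.115.121.1.26{256}'"))
```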