One more challenge to the XSD capability,

My clients send me XML files that may contain zero or more undefined (call them unexpected) tags, possibly anywhere in the hierarchy. These tags are redundant for me, so I have to ignore them, but alongside them there is a set of tags that must be validated.

This is a sample XML:

<root>
  <undefined_1>one</undefined_1>
  <undefined_2>two</undefined_2>
  <node>to_be_validated</node>
  <undefined_3>two</undefined_3>
  <undefined_4>two</undefined_4>
</root>

And the XSD I tried with:

  <xs:element name="root" type="root"></xs:element>
  <xs:complexType name="root">
    <xs:sequence>
      <xs:any maxOccurs="2" minOccurs="0"/>
      <xs:element name="node" type="xs:string"/>
      <xs:any maxOccurs="2" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

XSD doesn't allow this: the validator reports the content model as ambiguous.
The example above is just a sample. The real XML comes with a complex hierarchy of tags.

Kindly let me know if you know a hack for this.

By the way, the alternative solution is to insert an XSL transformation before the validation process. I am avoiding it because I would need to change the .NET code that triggers validation, and my company is reluctant to support that.

InfantPro'Aravind'

5 Answers

In case you're not already done with this, you might try the following:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root" type="root"></xs:element>
  <xs:complexType name="root">
    <xs:sequence>
      <xs:any maxOccurs="2" minOccurs="0" processContents="skip"/>
      <xs:element name="node" type="xs:string"/>
      <xs:any maxOccurs="2" minOccurs="0" processContents="skip"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Under Linux this works fine with xmllint using libxml version 20706.

alk
  • however it still doesn't allow first element to be ANY! :( – InfantPro'Aravind' Nov 15 '11 at 06:51
  • What exactly is the issue, please? – alk Nov 15 '11 at 07:46
  • this is the error I am getting : **Wildcard '##any' allows element 'node', and causes the content model to become ambiguous. so and so** – InfantPro'Aravind' Nov 15 '11 at 12:45
  • This looks like a conceptual problem. For details please see here: http://www.w3.org/TR/xmlschema-1/#cos-nonambig An interesting fact to me is that different tools obviously handle this case differently. As I wrote, the solution provided by me does work with the tools mentioned. – alk Nov 16 '11 at 07:59
  • Well, thanks for the info and your valuable time :) It was helpful, though I cannot implement it because I have to deal only with .NET :) – InfantPro'Aravind' Nov 16 '11 at 10:26

Conclusion:

This is not possible with XSD. Every approach I tried to meet the requirement was flagged as "ambiguous" by the validation tools, along with a bunch of errors.
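
The ambiguity can be seen with a minimal input. This is only a sketch, assuming the schema from the question (an optional xs:any wildcard, then node, then another optional wildcard); the comment describes how a validator could read it.

```xml
<!-- The single <node> below can be matched in two ways:
       1. by the first xs:any wildcard (##any accepts every element, node included), or
       2. by the xs:element declaration named "node".
     XSD 1.0 requires each element to map to exactly one particle without
     look-ahead (Unique Particle Attribution), so the schema is rejected. -->
<root>
  <node>to_be_validated</node>
</root>
```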

InfantPro'Aravind'

You could make use of a new feature in XSD 1.1 (XML Schema 1.1) called "Open Content". In short, it allows you to specify that additional "unknown" elements can be added to a complex type in various positions, and what the parser should do if it hits any of those elements.

Using XSD 1.1, your complex type would become:

<xs:element name="root" type="root" />
<xs:complexType name="root"> 
  <xs:openContent mode="interleave">
    <xs:any namespace="##any" processContents="skip"/>
  </xs:openContent>

  <xs:sequence> 
    <xs:element name="node" type="xs:string"/> 
  </xs:sequence> 
</xs:complexType>
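
As a sketch of what this schema should accept, here is an instance document with unknown elements on both sides of node. It assumes the fragment above is wrapped in an xs:schema element and checked by an XSD 1.1 processor; the undefined_* and nested names are made up for illustration.

```xml
<root>
  <!-- Unknown element before node: allowed by the interleaved open content. -->
  <undefined_1>
    <!-- processContents="skip" means nested content is not validated either. -->
    <nested>anything</nested>
  </undefined_1>
  <!-- The only element the sequence itself declares and validates. -->
  <node>to_be_validated</node>
  <!-- Unknown element after node, with an undeclared attribute. -->
  <undefined_2 attr="x"/>
</root>
```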

If you have a lot of complex types, you can also set a "default" open content mode at the top of your schema:

<xs:schema ...>
  <xs:defaultOpenContent mode="interleave">
    <xs:any namespace="##any" processContents="skip"/>
  </xs:defaultOpenContent>

  ...
</xs:schema>

The W3C spec for Open Content can be found at http://www.w3.org/TR/xmlschema11-1/#oc, and there's a good writeup of this at http://www.ibm.com/developerworks/library/x-xml11pt3/#N102BA.

Unfortunately, .NET doesn't support XSD 1.1 as of yet, and I can't find any free XSD 1.1 processors, but a couple of paid-for options are:

tony19
Colin Smith

Maybe it is possible to use namespaces:

<xs:element name="root" type="root"></xs:element> 
  <xs:complexType name="root"> 
    <xs:sequence> 
      <xs:any maxOccurs="2" minOccurs="0" namespace="http://ns1.com" /> 
      <xs:element name="node" type="xs:string"/> 
      <xs:any maxOccurs="2" minOccurs="0" namespace="http://ns2.com"/> 
    </xs:sequence> 
  </xs:complexType>

This will probably validate.
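
For illustration, an instance document matching this schema might look like the sketch below. It assumes the clients can put their extra tags into the two namespaces; the prefixes a and b are arbitrary. It also assumes both wildcards additionally carry processContents="skip" or "lax", because the default ("strict") requires a declaration for every element the wildcard matches.

```xml
<root xmlns:a="http://ns1.com" xmlns:b="http://ns2.com">
  <!-- Matched by the first wildcard (namespace http://ns1.com). -->
  <a:undefined_1>one</a:undefined_1>
  <a:undefined_2>two</a:undefined_2>
  <!-- In no namespace, so neither wildcard can claim it: no ambiguity. -->
  <node>to_be_validated</node>
  <!-- Matched by the second wildcard (namespace http://ns2.com). -->
  <b:undefined_3>two</b:undefined_3>
  <b:undefined_4>two</b:undefined_4>
</root>
```

The catch is that unknown tags in no namespace, as in the question's sample, would still be rejected.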

W van Noort

I faced the same problem.

Since I called the validation from .NET, I decided to suppress the specific validation error in ValidationEventHandler as a workaround. It worked for me.

    private void ValidationEventHandler(object sender, ValidationEventArgs e)
    {
        switch (e.Severity)
        {
            case XmlSeverityType.Warning:
                // Processing warnings
                break;
            case XmlSeverityType.Error:
                if (IgnoreUnknownTags
                    && e.Exception is XmlSchemaValidationException
                    && new Regex(
                        @"The element '.*' has invalid child element '.*'\."
                        + @" List of possible elements expected:'.*'\.")
                       .IsMatch(e.Exception.Message))
                {
                    return;
                }
                // Processing errors
                break;
            default:
                throw new InvalidEnumArgumentException("Severity should be one of the valid values");
        }
    }

For this to work, Thread.CurrentUICulture must be set to English or CultureInfo.InvariantCulture for the current thread, because the regex matches the English text of the error message.

Andrej Adamenko