Hi,
I have a problem validating an XML file against the schema that is linked to it. I opened the file with XMLSpy and Liquid XML, and both have problems validating it. The errors seem to center on a specific part of the XSD structure, where an XSD file imports a DTD file. This XML structure is already used by many companies, which is why I don't believe this is a syntax error.
The elements of this problem are as follows (I will post the files and the links):
XML test file:
<?xml version="1.0" encoding="UTF-8"?><dic:eclass_dictionary xmlns:dic="urn:eclass:xml-schema:dictionary:2.0" xmlns:ontoml="urn:iso:std:iso:is:13584:-32:ed-1:tech:xml-schema:ontoml" xmlns:cmn="urn:eclass:xml-schema:common:2.0" xmlns:hea="urn:eclass:xml-schema:header:2.0" xmlns:idt="urn:iso:std:iso:ts:29002:-5:ed-1:tech:xml-schema:identifier" xmlns:cat="urn:iso:std:iso:ts:29002:-10:ed-1:tech:xml-schema:catalogue" xmlns:basic="urn:iso:std:iso:ts:29002:-4:ed-1:tech:xml-schema:basic" xmlns:val="urn:iso:std:iso:ts:29002:-10:ed-1:tech:xml-schema:value" xmlns:ext="urn:x-ontoml-extensions:schema:core" xmlns:dt="urn:eclass:template:xml-schema:data-type:2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:eclass:xml-schema:dictionary:2.0 http://www.eclass.eu/static/eClassXML/2.0/eCl@ssXML/dictionary.xsd"><ontoml:ontoml><dictionary><contained_classes><ontoml:class xsi:type="ontoml:CATEGORIZATION_CLASS_Type" id="0173-1#01-AEI956#002"><date_of_original_definition>2011-11-18Z</date_of_original_definition><date_of_current_version>2012-11-27Z</date_of_current_version><date_of_current_revision>2012-11-27Z</date_of_current_revision><revision>1</revision><status>66</status><source_language country_code="US" language_code="en"/><preferred_name><label country_code="DE" language_code="de">Notebook</label></preferred_name><definition><text country_code="DE" language_code="de">kleiner tragbarer Personal Computer, der üblicherweise klappbar ausgeführt ist</text></definition><its_superclass class_ref="0173-1#01-ACL245#004"/><hierarchical_position>19010202</hierarchical_position><keywords><label country_code="DE" language_code="de">Sub-/Ultra-Portable-Notebooks</label><label country_code="DE" language_code="de">Mini Notebook</label><label country_code="DE" language_code="de">Laptop</label><label country_code="DE" language_code="de">Subnotebook</label><label country_code="DE" language_code="de">Tragbarer PC</label><label country_code="DE" 
language_code="de">Sub-Notebook</label></keywords></ontoml:class></contained_classes></dictionary></ontoml:ontoml></dic:eclass_dictionary>
XSD1:
<xs:schema xmlns="urn:iso:std:iso:is:13584:-32:ed-1:tech:xml-schema:ontoml" xmlns:idt="urn:iso:std:iso:ts:29002:-5:ed-1:tech:xml-schema:identifier" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:iso:std:iso:is:13584:-32:ed-1:tech:xml-schema:ontoml" elementFormDefault="unqualified" attributeFormDefault="unqualified"><xs:import namespace="urn:iso:std:iso:ts:29002:-5:ed-1:tech:xml-schema:identifier" schemaLocation="./ISO29002/identifier.xsd"/><xs:simpleType name="APosterioriSemanticRelationId"><xs:restriction base="idt:IRDI_type"><xs:pattern value="([0-9]{4})\-([A-Z0-9:_\.]{1,35})(\-([A-Z0-9:_\.]{1,35})((\-[019])(\-([A-Z0-9]{1,10})_([A-Z0-9]{0,10})_([0-9]{1,5}))?)?)?#CE\-([A-Z0-9:_\.]{1,71})#[0-9]{1,9}"/><xs:pattern value="([0-9]{4})\-([A-Z0-9:_\.]{1,35})\-([A-Z0-9:_\.]{1,35})\-\-([A-Z0-9]{1,10})_([A-Z0-9]{0,10})_([0-9]{1,5})#CE\-([A-Z0-9:_\.]{1,71})#[0-9]{1,9}"/><xs:pattern value="([0-9]{4})\-([A-Z0-9:_\.]{1,35})\-\-\-([A-Z0-9]{1,10})_([A-Z0-9]{0,10})_([0-9]{1,5})#CE\-([A-Z0-9_:\.]{1,71})#[0-9]{1,9}"/></xs:restriction></xs:simpleType> ...</xs:schema>
XSD2 (imported by XSD1):
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE xs:schema [<!ENTITY % identifier.dtd SYSTEM "identifier.dtd"> %identifier.dtd; ]><xs:schema targetNamespace="urn:iso:std:iso:ts:29002:-5:ed-1:tech:xml-schema:identifier" xmlns:id="urn:iso:std:iso:ts:29002:-5:ed-1:tech:xml-schema:identifier" xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified"><!-- IRDI --><xs:element name="IRDI" type="id:IRDI_type"/><xs:element name="IRDI_list" type="id:IRDI_list_type"/><xs:simpleType name="IRDI_type"><xs:restriction base="xs:string"><xs:pattern value="&irdi1;"/><xs:pattern value="&irdi2;"/><xs:pattern value="&irdi3;"/></xs:restriction></xs:simpleType><!-- IRDI sequence --><xs:complexType name="IRDI_sequence_type"><xs:sequence><xs:element ref="id:IRDI" minOccurs="0" maxOccurs="unbounded"/></xs:sequence></xs:complexType><!-- IRDI list--><xs:simpleType name="IRDI_list_type"><xs:list itemType="id:IRDI_type"/></xs:simpleType></xs:schema>
Both XML editors (XMLSpy and Liquid XML) seem to have a problem with XSD2, specifically with this part:
<xs:pattern value="&irdi1;"/>
where "&irdi1;", "&irdi2;" and "&irdi3;" are entities defined in the imported DTD.
DTD:
<!-- Digits: 0-9 -->
<!ENTITY digit "0-9">
<!-- Internal separator character -->
<!ENTITY res ":_\.">
<!-- Alphanumeric character -->
<!ENTITY alnum "0-9A-Z">
<!-- Safe character -->
<!ENTITY safe "&alnum;&res;">
<!-- International Code Designator (ICD) -->
<!ENTITY icd "[&digit;]{4}">
<!-- Organization Identifier (OI) -->
<!ENTITY oi "[&safe;]{1,35}">
<!-- Organization Part Identifier (OPI) -->
<!ENTITY opi "[&safe;]{1,35}">
<!-- Organization Part Identifier Source (OPIS) -->
<!ENTITY opis "[&alnum;]{1,1}">
<!-- Additional Information (AI)-->
<!ENTITY ai "[&safe;]{1,70}">
<!-- Code Space Identifier (CSI) -->
<!ENTITY csi "[&alnum;]{2,2}">
<!-- Item Code (IC) -->
<!ENTITY ic "[&safe;]{1,131}">
<!-- Registration Authority Identifier (RAI) -->
<!ENTITY rai1 "&icd;-&oi;(-&opi;(-&opis;(-&ai;)?)?)?">
<!ENTITY rai2 "&icd;-&oi;(-&opi;)?--&ai;">
<!ENTITY rai3 "&icd;-&oi;---&ai;">
<!-- Data Identifier -->
<!ENTITY di "&csi;-&ic;">
<!-- Version Identifier (VI) -->
<!ENTITY vi "[0-9]{1,10}">
<!-- International Registration Data Identifier (IRDI) -->
<!ENTITY irdi1 "&rai1;(#&di;#&vi;)?">
<!ENTITY irdi2 "&rai2;(#&di;#&vi;)?">
<!ENTITY irdi3 "&rai3;(#&di;#&vi;)?">
When I try to read the XML file like this:
XmlReaderSettings xmlReaderSettings = new XmlReaderSettings();
xmlReaderSettings.ValidationType = ValidationType.Schema;
xmlReaderSettings.IgnoreComments = true;
xmlReaderSettings.IgnoreWhitespace = true;
xmlReaderSettings.IgnoreProcessingInstructions = true;
xmlReaderSettings.ValidationFlags = System.Xml.Schema.XmlSchemaValidationFlags.ProcessSchemaLocation;
XmlReader reader = XmlReader.Create("test_file.xml", xmlReaderSettings);
while (reader.Read()) {}
I get an exception on the second element of the XML file:
'urn:iso:std:iso:ts:29002:-5:ed-1:tech:xml-schema:identifier:IRDI_type' is not declared, or is not a simple type SourceUri "http://www.eclass.eu/static/eClassXML/2.0/ontoML/identifier.xsd" (The URL for XSD1)
So when my program reads XSD1, it encounters idt:IRDI_type, which is defined in XSD2, the schema that imports the DTD. Some error occurs there (probably because the DTD is not imported), so XSD2 is not processed and the program has no definition for idt:IRDI_type.
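One way to test the hypothesis that the DTD is never processed is to load the schemas explicitly through a reader that permits DTDs, instead of relying on ProcessSchemaLocation. The following is only a sketch of that experiment, not a confirmed fix. It assumes .NET 4 or later (for XmlReaderSettings.DtdProcessing), that the schema and DTD URLs are reachable, and that the schema set applies the same DTD setting when it follows xs:import, which I have not verified. The class name and the console output format are made up for illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

class ValidateWithDtdEnabledSchemas // hypothetical name
{
    static void Main()
    {
        // Settings used only for reading the schema documents: allow the
        // DOCTYPE in XSD2 and let the resolver fetch identifier.dtd.
        XmlReaderSettings schemaSettings = new XmlReaderSettings();
        schemaSettings.DtdProcessing = DtdProcessing.Parse;
        schemaSettings.XmlResolver = new XmlUrlResolver();

        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.XmlResolver = new XmlUrlResolver();
        // Report schema problems through the handler rather than as exceptions.
        schemas.ValidationEventHandler += (sender, e) =>
            Console.WriteLine("Schema {0}: {1}", e.Severity, e.Message);

        // Namespace and location copied from xsi:schemaLocation in the test file.
        using (XmlReader schemaReader = XmlReader.Create(
            "http://www.eclass.eu/static/eClassXML/2.0/eCl@ssXML/dictionary.xsd",
            schemaSettings))
        {
            schemas.Add("urn:eclass:xml-schema:dictionary:2.0", schemaReader);
        }
        schemas.Compile();

        // Validate against the preloaded set; ProcessSchemaLocation stays off.
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler += (sender, e) =>
            Console.WriteLine("Instance {0}: {1}", e.Severity, e.Message);

        using (XmlReader reader = XmlReader.Create("test_file.xml", settings))
        {
            while (reader.Read()) { }
        }
    }
}
```

If the "IRDI_type is not declared" message disappears with this setup, that would point to DTD processing being disabled during schema loading rather than to an error in the schemas themselves.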
If I open XSD2 in the Visual Studio 2010 web browser, &irdi1;, &irdi2; and &irdi3; are replaced with the regular expressions defined in the DTD, so I assume that the syntax is OK. Why, then, do I get a validation error?
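The same expansion can be checked in code, which helps separate "the DTD is broken" from "the DTD is not being read". A minimal sketch, assuming .NET 4 or later and local copies of XSD2 and the DTD saved as identifier.xsd and identifier.dtd in the working directory (those file locations and the class name are my assumptions):

```csharp
using System;
using System.Xml;

class ShowExpandedPatterns // hypothetical name
{
    static void Main()
    {
        // Permit the DOCTYPE and allow the external DTD to be resolved.
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;
        settings.XmlResolver = new XmlUrlResolver();

        // Assumed local copy of XSD2; identifier.dtd must sit next to it.
        using (XmlReader reader = XmlReader.Create("identifier.xsd", settings))
        {
            while (reader.Read())
            {
                // Print the value attribute of every xs:pattern element.
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == "pattern")
                {
                    Console.WriteLine(reader.GetAttribute("value"));
                }
            }
        }
    }
}
```

If the entities resolve, each printed value should be a plain regular expression with no "&...;" references left in it. Changing DtdProcessing to Prohibit in the same sketch should instead make the reader throw on the DOCTYPE, which would show how XSD2 behaves when DTDs are not allowed.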
Is there something I can do to get the XML file validated?