
RelaxNG to JSON-LD: How XML Schema Validation Evolved into Modern Structured Data

[Figure: Timeline showing DTD, XSD, RelaxNG, and JSON-LD eras]

Schemas have been telling us our data is wrong since the 1980s. The names changed — DTD, XSD, RelaxNG, JSON-LD — but the discipline is identical: declare the shape, validate the payload, fail fast before consumers see broken data. XML schema validation didn’t disappear when the web pivoted to JSON; it migrated. The questions a RelaxNG pattern asked in 2003 are the same questions a Schema.org validator asks in 2026, just with different syntax and a different consumer at the end of the pipe.

This pillar walks the bridge. It connects the legacy RelaxNG textbook this domain has hosted since 2003, Eric van der Vlist’s RELAX NG book at books.xmlschemata.org/relaxng/, to the structured-data tools we ship today. If you came in through a RelaxNG chapter, the workflow you already know transfers directly. If you came in through SEO, understanding where this discipline came from will make your JSON-LD better.

The Schema Validation Problem (Then and Now)

Every system that exchanges structured data faces the same problem: the producer and the consumer have to agree on a shape. Without that agreement, the consumer either crashes, silently drops fields, or — worst case — happily accepts garbage and propagates it downstream. In 1998, this manifested as a B2B EDI feed rejecting an order because the date format was off by a colon. In 2026, it manifests as a Product rich result not showing up in Google because priceCurrency is missing from the Offer node.

Different decade, same failure mode. The fix has always been the same: a contract — a schema — that both sides can validate against. XML schema validation formalized this for the document-exchange world. JSON-LD validation continues it for the web-search world. The tooling changed; the engineering judgment didn’t.

For example, an XSD that requires <orderDate> to be of type xsd:dateTime is doing exactly what a Schema.org validator does when it requires an Event’s startDate to be of type DateTime. The constraint, the failure mode, and the remediation are isomorphic.
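To make the parallel concrete, here is a minimal sketch that applies one date-time check to both payloads. The sample documents and the is_datetime helper are invented for illustration, and datetime.fromisoformat only approximates the xsd:dateTime lexical rules (for instance, a trailing "Z" is only accepted on newer Python versions).

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime


def is_datetime(value: str) -> bool:
    """Return True if value parses as an ISO 8601 date plus time."""
    if "T" not in value:
        return False  # a bare date is not a date-time
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


# Hypothetical sample payloads, one per era.
order_xml = "<order><orderDate>2026-04-29T10:00:00-07:00</orderDate></order>"
event_jsonld = json.dumps({
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Schema meetup",
    "startDate": "April 29, 2026",  # human-readable, not ISO 8601
})

# Same constraint, two formats: pull the value out, then run the same check.
order_date = ET.fromstring(order_xml).findtext("orderDate")
start_date = json.loads(event_jsonld)["startDate"]

print("orderDate valid:", is_datetime(order_date))
print("startDate valid:", is_datetime(start_date))
```

Only the extraction step differs between the two formats; the check itself is shared.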

A Brief History: DTD to XSD to RelaxNG to JSON-LD

Here is the lineage, compressed. Each generation kept what worked and dropped what didn’t.

[Figure: Timeline showing DTD 1986, XSD 2001, RelaxNG 2003, and JSON-LD 2014 schema language evolution]

DTD (1986). Born inside SGML and inherited by XML in 1998. Element and attribute declarations only. No real datatype system — everything was effectively a string. Useful for shape, weak on content.

XSD / XML Schema (2001). The W3C’s answer. Introduced a rich datatype hierarchy (xsd:string, xsd:dateTime, xsd:decimal, custom restrictions), namespaces, and structural constraints. Verbose, but expressive. Still in production at every enterprise that ships SOAP or document-based B2B.

RelaxNG (2003). ISO/IEC 19757-2. Designed by James Clark and Murata Makoto as a cleaner, pattern-based alternative to XSD. Two equivalent syntaxes — XML and the readable compact form — with the same semantics. Eric van der Vlist’s textbook, hosted on this domain at books.xmlschemata.org/relaxng/, remains the canonical practitioner reference. RelaxNG won the readability argument; XSD won the corporate inertia argument.

JSON-LD + Schema.org (2014+). The W3C published JSON-LD 1.0 in January 2014. Schema.org, founded by Google, Microsoft, Yahoo, and Yandex in 2011, supplied the vocabulary. The combination skipped most of XSD’s formal apparatus and optimized for one consumer above all: the search engine. Pragmatism beat purity, and pragmatism was correct for the use case.

What XML Schema Got Right (and Wrong)

RelaxNG and XSD did three things genuinely well, and two things badly. Honest evaluation matters here, because the right things carried forward and the wrong things didn’t.

What they got right:

  • A real datatype system. xsd:dateTime is unambiguous in a way that “ISO date string” never was. Every modern schema language since has either copied this or paid the cost of skipping it.
  • Pre-publication validation. Catching shape errors before a document hit a parser saved untold hours of post-mortem debugging.
  • Composability. RelaxNG patterns and XSD types could be referenced and reused. This is what JSON Schema and Schema.org’s @type hierarchy now imitate.

What they got wrong:

  • Verbosity. A 200-line XSD to describe a 20-line document is a poor tradeoff for non-enterprise use cases. Developers fled to JSON for a reason.
  • Tooling fragmentation. XSD validators disagreed on edge cases. RelaxNG had Jing and rnv but never reached XSD’s market share. The web needed one stack everyone agreed on, and XML never delivered it.

The Web Pivots: Why JSON-LD Won

By 2010 the web was running on JSON, not XML. SOAP was retreating; REST + JSON had taken its place. The semantic web community had spent years pushing RDF/XML and getting nowhere. JSON-LD’s insight was minimal: take RDF’s semantics, hide them inside a familiar JSON object via @context, and ship it. Developers got JSON. Search engines got linked data. Nobody had to learn Turtle.
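A toy illustration of what @context buys you: the sketch below prefixes plain JSON keys with a vocabulary IRI so each one becomes a globally unambiguous identifier. It is a deliberate simplification, not the real JSON-LD expansion algorithm (which also handles term definitions, value objects, arrays, and more). The expand function and the inline @vocab context are assumptions made for the example.

```python
import json


def expand(node: dict, vocab: str) -> dict:
    """Toy expansion: prefix plain keys and @type values with the vocabulary IRI."""
    out = {}
    for key, value in node.items():
        if key == "@context":
            continue  # the context is consumed, not emitted
        if key == "@type":
            out["@type"] = vocab + value
            continue
        if isinstance(value, dict):
            value = expand(value, vocab)  # nested nodes share the same vocabulary
        out[key if key.startswith("@") else vocab + key] = value
    return out


# Looks like ordinary JSON to a developer; the context carries the semantics.
doc = {
    "@context": {"@vocab": "https://schema.org/"},
    "@type": "Product",
    "name": "Widget",
    "offers": {"@type": "Offer", "price": "29.99", "priceCurrency": "USD"},
}

expanded = expand(doc, doc["@context"]["@vocab"])
print(json.dumps(expanded, indent=2))
```

For production work, a full JSON-LD processor is the right tool; this sketch only shows why a short key like name can still be linked data.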

Schema.org provided the vocabulary that made the format useful in practice. Google started rewarding sites with rich results — review stars, recipe cards, FAQ accordions — and adoption followed economic incentive. By 2020, JSON-LD was the dominant structured-data format on the web, with Microdata and RDFa shrinking quietly.

Importantly, JSON-LD did not replace XSD or RelaxNG. Enterprise XML pipelines still run on them. What JSON-LD did was occupy the web-publishing niche that XML had failed to capture. The two ecosystems coexist; they serve different consumers. If you’re shipping invoices to a tax authority, you’re still in XSD territory. If you’re trying to get a rich result on a Product page, you’re in JSON-LD territory.

Datatypes Across Schema Languages

The cleanest way to see the continuity is to line up equivalent datatypes side by side. The names changed; the constraints didn’t.

Concept | XSD | RelaxNG | Schema.org / JSON-LD
ISO 8601 date+time | xsd:dateTime | inherits XSD types | "@type": "DateTime"
Date only | xsd:date | inherits XSD types | "@type": "Date"
Decimal number | xsd:decimal | inherits XSD types | "@type": "Number"
Integer | xsd:integer | inherits XSD types | "@type": "Integer"
URI / URL | xsd:anyURI | inherits XSD types | "@type": "URL"
Boolean | xsd:boolean | inherits XSD types | "@type": "Boolean"
Free text | xsd:string | text pattern | "@type": "Text"

RelaxNG defines almost no datatypes of its own; it plugs in the XSD datatype library instead. Chapter 19 of van der Vlist’s book covers xsd:dateTime in detail, and that reference stays useful in 2026 because Schema.org’s DateTime follows the same ISO 8601 lineage. If you can read an XSD dateTime spec, you can debug a malformed Schema.org startDate in twenty seconds. The mental model transfers.
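The datatype mapping can be sketched as a small lookup that pairs a Schema.org datatype name with its XSD counterpart and a rough lexical check. The LEXICAL_CHECKS table and conforms function are hypothetical names, and the checks are approximations: the URL test, for example, demands an absolute URL, which is stricter than xsd:anyURI.

```python
import re
from datetime import date
from urllib.parse import urlparse


def _is_date(text: str) -> bool:
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def _is_url(text: str) -> bool:
    parts = urlparse(text)
    return bool(parts.scheme and parts.netloc)  # absolute URLs only


# Schema.org datatype -> (XSD counterpart, approximate lexical check)
LEXICAL_CHECKS = {
    "Date": ("xsd:date", _is_date),
    "Number": ("xsd:decimal", re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)").fullmatch),
    "Integer": ("xsd:integer", re.compile(r"[+-]?\d+").fullmatch),
    "URL": ("xsd:anyURI", _is_url),
}


def conforms(schema_org_type: str, text: str):
    """Return (xsd_name, ok) for a value expressed as text."""
    xsd_name, check = LEXICAL_CHECKS[schema_org_type]
    return xsd_name, bool(check(text))


for type_name, sample in [("Number", "29.99"), ("Number", "$29.99"),
                          ("Integer", "3.5"), ("Date", "2026-04-29"),
                          ("URL", "https://schema.org/Product")]:
    print(type_name, repr(sample), conforms(type_name, sample))
```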

For a deeper continuation of this discipline applied to web pages, see our canonical URLs guide, which covers another contract between publisher and consumer that Google enforces.

Validation Workflow Evolution

The shape of the workflow hasn’t changed. The format and the consumer did.

[Figure: Side-by-side comparison of XML validator workflow and JSON-LD validator workflow showing identical pipeline structure]

In the XML era, you wrote a document, ran xmllint --schema schema.xsd doc.xml (or jing schema.rng doc.xml for RelaxNG), and either got doc.xml validates or a list of errors. Errors got fixed before the document ever reached a downstream consumer. Cheap and disciplined.

In the JSON-LD era, the cycle is identical. You embed JSON-LD inside a <script type="application/ld+json"> tag, paste the markup into a validator, and either get a clean pass or a list of errors. Errors get fixed before Googlebot sees the page. The Schema Markup Validator at xmlschemata.org/tools/schema-validator/ is the contemporary equivalent of xmllint — same job, different format.
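If you would rather script the first step than paste by hand, the sketch below pulls every JSON-LD block out of a page using only the Python standard library. The JsonLdExtractor class, the extract_jsonld function, and the sample page are illustrative assumptions. json.loads raises on a syntax error, which is the earliest possible fail-fast point.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    """Return the parsed JSON-LD documents found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [json.loads(block) for block in parser.blocks]


# Hypothetical page used only to exercise the extractor.
page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}
</script>
</head><body></body></html>"""

for node in extract_jsonld(page):
    print(node["@type"], node.get("name"))
```

The parsed nodes can then be handed to whatever validator you use, the same way xmllint takes a document path.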

If you’ve ever debugged an XSD complaint about cvc-datatype-valid.1.2.1, you already have the right reflexes for debugging Schema.org’s “missing required field” warning. The error categories map almost cleanly: required field missing, wrong type, malformed value, unknown property.

Common Validation Errors That Survived the Transition

The same five mistakes that plagued XML documents in 2005 plague JSON-LD payloads in 2026. The error messages got friendlier; the underlying mistakes did not change.

  1. Missing required field. XSD’s minOccurs="1" error becomes Schema.org’s “Missing field ‘name’” for a Product. Fix: add the field. There is no clever workaround.
  2. Wrong type. XSD’s cvc-datatype-valid becomes Schema.org’s “The value provided for ‘price’ is not of expected type Number.” Fix: use "price": "29.99" (string per Schema.org convention) or "price": 29.99, never "price": "$29.99" with a currency prefix.
  3. Malformed date. Both eras require ISO 8601. "2026-04-29T10:00:00-07:00" validates. "April 29, 2026" does not. Fix: format dates with a date library; never hand-write them.
  4. Unknown property. XSD complained about elements not in the schema. Schema.org warns about properties not in the type definition. Both mean the same thing: you misspelled a field name or used a property from a different type. Fix: check the official type page on Schema.org.
  5. Inconsistent nesting. An XSD complexType required exact child sequence; a Schema.org Recipe requires recipeIngredient as an array, not a string with commas. Fix: read the type definition, mirror its structure exactly.
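The five categories above can be sketched as one small checker. Everything here is a hypothetical rule set for illustration: RULES, CHECKS, and the sample payloads are invented, and the real requirements live on the Schema.org type pages and in Google’s structured data documentation.

```python
import re
from datetime import datetime

NUMBER = re.compile(r"-?\d+(\.\d+)?")


def _is_number(value) -> bool:
    return bool(NUMBER.fullmatch(str(value)))


def _is_iso_date(value) -> bool:
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical, deliberately tiny rule set (not the real Schema.org definitions).
RULES = {
    "Product": {"required": {"name"}, "known": {"name", "description", "offers"}},
    "Offer": {"required": {"price", "priceCurrency"},
              "known": {"price", "priceCurrency", "validFrom"}},
    "Recipe": {"required": {"name"}, "known": {"name", "recipeIngredient"}},
}
CHECKS = {
    "price": (_is_number, "wrong type, expected a number"),
    "validFrom": (_is_iso_date, "malformed date, expected ISO 8601"),
    "recipeIngredient": (lambda v: isinstance(v, list),
                         "inconsistent nesting, expected an array"),
}


def check(node: dict, path: str = "$") -> list:
    """Return a list of error strings for one JSON-LD node and its children."""
    rule = RULES.get(node.get("@type"))
    if rule is None:
        return [f"{path}: no rules for type {node.get('@type')!r}"]
    props = {k: v for k, v in node.items() if not k.startswith("@")}
    errors = [f"{path}: missing required field '{name}'"
              for name in sorted(rule["required"] - set(props))]
    for name, value in props.items():
        here = f"{path}.{name}"
        if name not in rule["known"]:
            errors.append(f"{here}: unknown property")
        elif name in CHECKS and not CHECKS[name][0](value):
            errors.append(f"{here}: {CHECKS[name][1]}")
        elif isinstance(value, dict):
            errors.extend(check(value, here))  # validate nested nodes
    return errors


product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "nmae": "Widget",  # misspelled: both a missing field and an unknown property
    "offers": {"@type": "Offer", "price": "$29.99", "priceCurrency": "USD",
               "validFrom": "April 29, 2026"},
}
recipe = {"@type": "Recipe", "name": "Toast", "recipeIngredient": "bread, butter"}

for error in check(product) + check(recipe):
    print(error)
```

Note how a single typo in a field name surfaces as two errors at once, which is exactly how it behaved in XSD.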

The continuity is the lesson. If your developers learned XML schema validation in 2008, they already know how to debug JSON-LD in 2026. Don’t retrain them — re-point them.

Modern Tooling: Schema Validator, SERP Preview, Structured Data Planner

The contemporary toolchain compresses the XML-era workflow into four free utilities, plus a robots.txt generator for staging. Each addresses a specific phase of the validation discipline.

[Figure: Modern structured data toolchain showing Schema Validator, Structured Data Planner, SERP Preview, and Hreflang Auditor connected to a central JSON-LD payload]

Plan first, then write. Use the Structured Data Coverage Planner to decide which Schema.org types each page template needs before you write a line of markup. This is the JSON-LD equivalent of designing your XSD before authoring documents — the cheapest stage at which to fix mistakes.

Validate before publish. The Schema Markup Validator catches type errors, missing required properties, and malformed values. Paste your JSON-LD, get a structured pass/fail report. This is your xmllint for the web.

Preview the result. Use the SERP Snippet Preview to see what Google is likely to render. Schema validation only confirms the markup is valid against the vocabulary; SERP preview shows the visible outcome. Both checks matter.

Audit international targeting. If your structured data spans multiple languages or regions, use the Hreflang Auditor to validate language and region targeting. Multilingual schemas have a higher error rate than monolingual ones, and silent hreflang errors will let Google index the wrong locale.

Crawl directives. If you’re shipping new structured data behind staging or partial-rollout URLs, use the robots.txt generator to keep crawlers out of test paths until validation passes. Same discipline: don’t expose unfinished data to the consumer.
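Before relying on a robots.txt rule, it is worth checking that it blocks what you think it blocks. The sketch below does that with Python’s standard urllib.robotparser. The /staging/ path, the example.com URLs, and the is_blocked helper are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of a staging path.
robots_txt = """\
User-agent: *
Disallow: /staging/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_blocked(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given user agent may not fetch the URL."""
    return not parser.can_fetch(agent, url)


print(is_blocked("https://example.com/staging/product-42"))
print(is_blocked("https://example.com/products/42"))
```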

When You Still Need XML Schema (and When JSON-LD Suffices)

JSON-LD is not a replacement for XSD or RelaxNG. It’s a complement, optimized for a different consumer. Choose the right tool for the actual use case.

Use case | Right choice | Why
B2B document exchange (orders, invoices, EDI) | XSD or RelaxNG | Strict typing, formal contracts, regulatory audit trail
Government / tax filings | XSD | Mandated by most authorities; non-negotiable
Configuration files (build tools, CI) | JSON Schema or YAML | Lightweight, developer-friendly, fast feedback
Web pages targeting search engines | JSON-LD + Schema.org | The format Google recommends for structured data and rewards with rich results
API contracts (REST, GraphQL) | OpenAPI / GraphQL SDL | Purpose-built for API surface validation
Scientific data exchange | XSD or domain DTDs | Established community schemas (e.g., MathML, CML)

In practice, most modern web teams need JSON-LD for SEO and JSON Schema or OpenAPI for their APIs. RelaxNG and XSD show up at the boundary with legacy enterprise systems, and they’re not going away. Therefore, knowing both ecosystems is still a real skill in 2026 — not an academic curiosity.

Where to Go Deeper

The depth of the XML schema literature is one of its underappreciated strengths. Forty years of practitioners debugged hard cases and wrote them down. That body of work remains useful even as the active development frontier moved to JSON-LD.

For RelaxNG and XSD fundamentals: Eric van der Vlist’s RELAX NG textbook, hosted on this domain since 2003, remains the practitioner reference. The chapters on datatypes and the comparison with W3C XML Schema age remarkably well. If you understand those two chapters, JSON-LD will feel like a familiar problem in different clothes.

For Schema.org and JSON-LD: The official Schema.org documentation is exhaustive and authoritative. Google’s structured data developer docs explain which types trigger rich results and which validation rules Google enforces beyond the Schema.org spec.

For broader page-quality context: Structured data is one signal among many. Our Core Web Vitals guide covers the performance signals that sit alongside schema validation in modern SEO.

Bottom Line

The validation discipline RelaxNG and XSD codified in the early 2000s did not die when the web pivoted to JSON. It migrated. Schema.org and JSON-LD are doing the same job — declare the shape, validate the payload, fail fast — for a different consumer (search engines and AI assistants) and a different audience (web publishers, not enterprise integrators). XML schema validation is a continuous tradition, not a deprecated technology.

If you’re publishing structured data today, the lineage is in your favor: forty years of accumulated practice on how to specify, validate, and debug structured data. Use it. Plan with the Structured Data Planner, validate with the Schema Markup Validator, preview the result, and ship. The discipline pays for itself the first time it catches a missing required field before a crawler does.

Next step: open the Schema Markup Validator, paste your JSON-LD, and see what survived.
