EOOXML objections

From Grokdoc

Revision as of 02:26, 23 January 2007

Disclaimer

WARNING: This document is under collaborative development and anyone can edit it after registering on the Grokdoc site and logging in. The document may contain serious errors at this time and therefore should not be quoted or relied upon before it is completed.


Notice of related cases

Reviewers of Ecma 376 should be aware that issues raised by the Ecma 376 proposal -- Microsoft's refusal to release the specifications for its legacy file formats and/or its refusal to support ISO 26300 -- variously are or may become involved in as many as four antitrust cases on two continents, thus raising a need for JTC-1's heightened scrutiny of the legal landscape before further processing of the Ecma 376 proposal. Not only National Standard Bodies are involved in the issues Ecma 376 raises. The courts and antitrust regulators are also involved. Thus, reviewers need to involve legal counsel in their review.

These cases include a European Commission antitrust investigation of Microsoft Office, European Union antitrust litigation, U.S. v. Microsoft, and Tangent Computer v. Microsoft. See the text below on related cases for more information.

Conventions used in this document

1. Ecma 376

"Ecma 376" is used to denote the ECMA-376 Office Open XML File Formats as submitted to the ISO JTC-1 committee.

2. ISO 26300

"ISO 26300" is used to denote ISO/IEC 26300:2006 'Open Document Format for Office Applications (OpenDocument) v1.0'

3. Page numbers

All page numbers refer to the PDF version of Ecma 376. The page numbers printed in the page footers of the spec do not match the page numbers of the PDF file. For example, the page labeled "304" is actually page 620 in the PDF; in this document we cite it as "page 620".

4. Compatibility Note

The sole justification given for Ecma 376 is that of compatibility with existing Microsoft Office documents. In this document we include several notes on the topic of compatibility, and they look like this:

Compatibility Note
Information pertinent to compatibility with existing Microsoft binary documents goes here.

5. SECURITY WARNING

Some portions of the Ecma 376 specification pose significant security risks to either end users or the industry. These are indicated with the following notation:

SECURITY WARNING #n
Information describing the security warning goes here.

Where "n" is a unique integer. This document contains 2 security warnings. They should be taken seriously by the JTC-1.

Questions to be decided

  1. Does the proposed standard contradict, conflict with or duplicate existing international standards, in particular ISO standards?
  2. Would the existence of OOXML as an International Standard in addition to ISO 26300 cause user confusion?
  3. Can national bodies reasonably be expected to competently evaluate a proposed standard exceeding 6,000 pages within the time and constraints of fast-track procedures, particularly given the host of issues raised by the particular proposal at issue?
  4. Would the proposed standard create 'obstacles to international trade' within the meaning of the Agreement on Technical Barriers to Trade?
  5. If so, is the proposed standard necessary nonetheless, i.e., is there a market requirement for a duplicative standard that contradicts ISO 26300?
  6. Would fast track processing of the proposed standard place JTC-1's reputation at risk?

For all questions, along with an answer and an argument, some degree of impact analysis would be useful. For example, how confused would users be? What is the practical impact?

Introduction and Summary

This document records grounds for objection to the preparation of Ecma 376 -- the Ecma Office Open XML standard -- as an ISO/IEC standard by Joint Technical Committee 1 ("JTC-1"). Specifically, Ecma 376 should be diverted from its present fast-track processing and should be remanded to Ecma International for: (i) harmonization with ISO/IEC 26300:2006 and numerous other standards that it contradicts; and (ii) development of more suitable intellectual property documents.

The overarching controlling law is the international Agreement on Technical Barriers to Trade ("TBT"). It provides in Article 2:

2.1 Members shall ensure that in respect of technical regulations, products imported from the territory of any Member shall be accorded treatment no less favourable than that accorded to like products of national origin and to like products originating in any other country.

2.2 Members shall ensure that technical regulations are not prepared, adopted or applied with a view to or with the effect of creating unnecessary obstacles to international trade. For this purpose, technical regulations shall not be more trade-restrictive than necessary to fulfill a legitimate objective, taking account of the risks non-fulfilment would create. Such legitimate objectives are, inter alia: national security requirements; the prevention of deceptive practices; protection of human health or safety, animal or plant life or health, or the environment. In assessing such risks, relevant elements of consideration are, inter alia: available scientific and technical information, related processing technology or intended end-uses of products.

ISO, IEC and JTC-1 have faithfully executed those instructions over the years, as reflected in the ISO primary goal of "one standard, one test, and one conformity assessment procedure accepted everywhere," producing a large body of international and open standards that have stimulated competition.

The JTC-1 decision whether Ecma 376 should remain on the fast track procedures under which it was submitted by Ecma presents issues of monumental importance within the standards development establishment, the software development industry, and for software users worldwide. Pared to its essentials, the issue is whether powerful vendors with established monopolies are thereby entitled to their own private and non-interoperable international standards, standards that cannot feasibly be implemented by any other vendor. Viewed from another perspective, the issue is whether ISO/IEC and JTC-1 are willing to accept the loss of reputation that would inevitably accompany a decision to process and adopt Ecma 376 on the fast track. Viewed from yet another perspective, the issue is whether the Digital Divide will widen even further because the amazingly successful ISO 26300 standard, already widely supported by both free and proprietary software, is rendered non-interoperable with an international standard that can only be implemented by a single vendor that charges monopoly prices for its software.

There can be no blinking past the fact that adoption of Ecma 376 as an international standard would dramatically tilt the economic playing field for competitors and is therefore trade restrictive, an obstacle to international trade within the meaning of the TBT. Viewed most narrowly, the relevant legal question is whether an enormous obstacle to international trade -- and there can be no reasonable doubt on that issue -- is nonetheless necessary, per TBT section 2.2.

In the interests of transparency in decision-making, it should be bluntly acknowledged that Ecma 376 is in reality a partial specification for yet another round of office productivity application file formats implemented by a single vendor that monopolizes the relevant market, Microsoft Corp., widely known for maintaining a moving target for interoperability with competitors' applications by frequent changes in its secretive file formats. Ecma 376 is Microsoft's response to the success of ISO 26300.

Having deliberately boycotted the multiple-vendor effort to develop ISO 26300, Microsoft now petitions JTC-1 through Ecma International for its own wholly duplicative and contradictory standard, offering a massive specification for review in the 30-day period provided for filing contradictions, a specification that was not made publicly available until less than 30 days before it was submitted to JTC-1 and for which there is still no full-featured reference implementation available for testing and evaluation purposes. Quite simply, Microsoft is attempting to game the system to obtain standardization of a specification that flies in the face of the TBT and the very purposes of standardization.

So that Microsoft's leverage was clear, Ecma 376 offered only a single justification for a specification that wholly duplicates the functionality of ISO 26300 -- except for that standard's interoperability features. That justification is the claimed need for compatibility with billions of documents stored worldwide in Microsoft Office's legacy file formats. Microsoft attempts to hold those documents hostage, insisting on a right to maintain its vendor lock on its existing customer base through a new file format that is incompatible with the pre-existing International Standard.

However, Microsoft's leverage is illusory: (i) ISO 26300 -- designed for interoperability -- is well capable of handling the requirements of Microsoft's applications; Microsoft's refusal to support ISO 26300 is based upon its desire to retain control over the massive investments of its users in existing documents; (ii) even were there unique Microsoft requirements that ISO 26300 did not support, the answer would be to extend the existing standard, not to adopt a competing, duplicative, and non-interoperable standard; (iii) even were that impossible, Ecma 376 must nonetheless minimize its conflicts with, and duplication of, existing standards; (iv) the need to access these older documents is unrelated to support for Ecma 376, since all of these legacy documents are stored in different and undocumented proprietary binary formats; (v) full backward compatibility could be achieved if Microsoft chose to publish the formats used in each of its 'legacy' products; and (vi) Ecma 376 must go through a normative process so that it can be implemented by other vendors.

Points & Authorities

??? To be written -- Research notes:


Ecma 376 contradicts numerous international standards

The Gregorian Calendar

The Gregorian calendar is the most widely used calendar in the world. A modification of the Julian calendar, it was decreed by Pope Gregory XIII on 24 February 1582. The Gregorian calendar forms the basis of many international standards such as ISO 8601.

Ecma 376 section 3.17.4.1 page 3305, “Date Representation”, conflicts with the Gregorian calendar in the calculation of dates. Specifically, it requires spreadsheet implementations to incorrectly treat the year 1900 as a leap year. This contradicts the Gregorian calendar, ISO 8601 and the civil calendar adopted by most nations of the world.

Compatibility Note
There is a known bug in Microsoft Excel that treats the year 1900 as a leap year. Changing the Gregorian calendar is not necessary (or the best way) to achieve compatibility with spreadsheets that depend on this bug.
A better solution is to define the spec correctly, and when converting old binary files to the new format, Microsoft Office would (for example) replace WEEKDAY() by WEEKDAY()+1 for any dates affected by this bug. Alternatively, since they have compatibility flags for several other legacy bugs, this could be handled that way as well, e.g., when importing a legacy Excel document, set a flag "LeapYearBug=true", but when creating a new OOXML document this flag would not be set and dates would be described correctly.
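The leap-year rule at issue can be made concrete with a short sketch. The function names below are illustrative only and do not appear in Ecma 376; the second function simply models the behavior the spec is described as requiring.

```python
import calendar
from datetime import date


def is_gregorian_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_year_with_1900_bug(year: int) -> bool:
    """Hypothetical model of the legacy behavior: 1900 is treated as a leap year."""
    return year == 1900 or is_gregorian_leap_year(year)


# Python's standard library follows the Gregorian rule.
assert calendar.isleap(1900) == is_gregorian_leap_year(1900)

print(is_gregorian_leap_year(1900))      # False
print(is_gregorian_leap_year(2000))      # True
print(is_leap_year_with_1900_bug(1900))  # True

# 29 February 1900 is not a valid Gregorian date, so the library rejects it.
try:
    date(1900, 2, 29)
except ValueError:
    print("1900-02-29 rejected")
```

The two functions disagree for exactly one year, which is why a single compatibility flag or an import-time correction would be enough to isolate the legacy behavior.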

ISO 8601 (Representation of dates and times)

ISO 8601 is the ISO standard for date and time representations.

Ecma 376 section 3.17.4.1 page 3305, “Date Representation” stipulates that dates must be represented as numeric codes counting from 1900 or 1904. This is in conflict with ISO 8601.

This section also forbids applications from supporting years before 1900, also in conflict with ISO 8601.
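As a rough illustration of the difference, the sketch below converts a 1900-based day count into an ISO 8601 date string. It assumes that serial 1 corresponds to 1 January 1900 and that serial 60 is the non-existent 29 February 1900, as in Microsoft Excel's 1900 date system; the function name is hypothetical and the exact numbering in Ecma 376 should be checked against the spec.

```python
from datetime import date, timedelta


def serial_to_iso(serial: int) -> str:
    """Convert a 1900-based day serial to an ISO 8601 calendar date.

    Assumption: serial 1 is 1900-01-01 and serial 60 is the fictitious
    1900-02-29 created by treating 1900 as a leap year.
    """
    if serial < 1:
        raise ValueError("serials below 1 (dates before 1900) are not representable")
    if serial == 60:
        raise ValueError("serial 60 is 29 February 1900, which does not exist")
    # Serials after the fictitious day are shifted back by one to skip it.
    offset = serial - 1 if serial < 60 else serial - 2
    return (date(1900, 1, 1) + timedelta(days=offset)).isoformat()


print(serial_to_iso(1))      # 1900-01-01
print(serial_to_iso(59))     # 1900-02-28
print(serial_to_iso(61))     # 1900-03-01
print(serial_to_iso(36526))  # 2000-01-01

# ISO 8601 itself has no difficulty with earlier dates.
print(date(1582, 10, 15).isoformat())  # 1582-10-15
```

An ISO 8601 string is self-describing, whereas a bare serial number is only meaningful once the reader knows the epoch and the leap-year convention in use.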

Commentary:

ISO 639 (Codes for the Representation of Names and Languages)

ISO 639 is the set of ISO standards that lists short codes for language names (such as ISO 639-1 and ISO 639-2).

Ecma 376 section 2.18.52 page 2530, "ST_LangCode (Two Digit Hexadecimal Language Code)" requires the use of a fixed list of numeric language codes rather than the already existing set provided by ISO 639. This is a conflict with ISO 639. The codes standardized by ISO 639 include the use of a Registration Authority to process requests for new language codes. This is preferable to a fixed list attached to a document standard.

In addition, if the "two digit hexadecimal language code" type is taken seriously (although it is likely an error as discussed below), the nonstandard encoding used in Ecma 376 is not able to encode the world's languages. ISO-639-3 already encodes most of the world's approximately 6,700 languages, but a 2-digit hexadecimal value can only encode at most 256 languages. A format unable to represent most of the world's languages is unacceptable for an international standard.
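The capacity argument is simple arithmetic, sketched here. The language count is the approximate figure quoted above, and the helper name is illustrative.

```python
def max_codes(hex_digits: int) -> int:
    """Number of distinct values a fixed-width hexadecimal string can hold."""
    return 16 ** hex_digits


LANGUAGES = 6700  # approximate figure quoted in the text above

print(max_codes(2))               # 256
print(max_codes(2) >= LANGUAGES)  # False

# Smallest fixed width that could give every language its own code.
digits = 1
while max_codes(digits) < LANGUAGES:
    digits += 1
print(digits)                     # 4
```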

Commentary:

ISO/IEC 8632 (Computer Graphics Metafile)

ISO/IEC 8632 is the ISO standard for computer graphics metafiles: "2D graphical (pictorial) information" consisting of "vector graphics", "raster graphics", and "text" (NIST, 1998). (Another possibility is the W3C SVG standard discussed below, which has similar capabilities.)

Ecma 376 section 6.2.3.17 page 5679, "Embedded Object Alternate Image Requests Types" and section 6.4.3.1 page 5738, "Clipboard Format Types" refer to Windows Metafiles or Enhanced Metafiles instead of using ISO/IEC 8632 or W3C SVG.

ISO/IEC 26300:2006 (Open Document Format for Office Applications (OpenDocument) v1.0)

ISO 26300 is the ISO/IEC standard for office productivity applications. It covers the functionality needed for text documents, spreadsheets, drawings and presentations for office applications.

Ecma 376 duplicates the functionality of the pre-existing ISO 26300 standard as its core purpose is to support text documents, spreadsheets, drawings and presentations for office applications. Ecma 376 contradicts ISO 26300.

Compatibility Note
As far as we can see, all the functionality in Ecma 376 can be represented (better) in ISO 26300, and the much greater length of the former specification is due to its poor design and failure to leverage existing standards. For example:
  • Where Ecma 376 specifies attributes like useWord97LineBreakRules, ISO 26300 uses more generic (and hence more powerful) attributes like style:line-break (used to select the set of line breaking rules to use for text).
  • Where Ecma 376 specifies 7 pages of clip art, ISO 26300 allows the insertion of an arbitrary image into the document.
However, as the specification is over 6,000 pages long, it is impossible to say this authoritatively.
Compatibility Note
If there really is functionality in Ecma 376 that legitimately needs to be standardized at the ISO level (which we doubt), it should be standardized as an extension of the existing ISO 26300 and not as a new, incompatible standard.
This point is discussed further in the section "Ecma 376 is not 'necessary' as defined by the Agreement on TBT".

W3C SVG (Scalable Vector Graphics)

SVG is the W3C standard "for describing two-dimensional vector and mixed vector/raster graphics in XML".

Ecma 376 section 14 page 132, "DrawingML" defines a vector drawing XML format in conflict with the industry standard W3C SVG.

Ecma 376 section 8.6.2 page 24, "VML", requires support for another drawing XML format in conflict with W3C SVG. Note that VML was proposed by Microsoft as a W3C standard in 1998, but was rejected in favour of SVG.

Compatibility Note
Defining a new standard for vector drawings is not necessary for compatibility with existing Microsoft Office documents. Even if Ecma 376 legitimately needs drawing functions not available in SVG, it should reference SVG for all features that are provided by SVG.
ISO 26300 illustrates how to solve this problem:
  • There are functions in ISO 26300 that are not present in SVG (3D drawings).
  • There are functions in SVG not present in ISO 26300 (cubic bezier curves).
Therefore, ISO 26300 uses an SVG-compatible namespace for all drawing functions that can be provided by SVG, and a separate namespace for 3D drawing functions, thereby avoiding any conflict with SVG.

Commentary:

W3C MathML (Mathematical Markup Language)

MathML is the W3C standard for "describing mathematical notation and capturing both its structure and content".

Ecma 376 section 7.1 "Math" (page 747) covers mathematical expressions, and defines a format in conflict and incompatible with the W3C Recommendation MathML.

Note: MathML is included in ISO 26300 in section 12.5 "Mathematical Content". As a result, Ecma 376 conflicts with an ISO specification for mathematical notation.

Compatibility Note
If there is functionality in Ecma 376 that legitimately cannot be represented in MathML, Ecma 376 must still use MathML-compatible tags for all the features that can be expressed in MathML. See the previous section ("W3C SVG") for an illustration of how ISO 26300 solves a similar situation with the SVG standard.

ISO/IEC 10118-3, W3C XML-ENC, and other cryptographic hash standards

The Ecma 376 standard ignores accepted standards for cryptographic hashes and defies expert standards for cryptography, by proposing its own hash algorithm which is almost certainly flawed.

Cryptography, including the construction of secure hash functions, is very difficult. Weaknesses are regularly discovered even in publicly-vetted cryptographic algorithms long thought secure, whereas proprietary cryptographic methods not subjected to intensive public scrutiny are nearly always found to be seriously flawed (see e.g. Schneier, 1999). Because this is accepted wisdom in the cryptographic community, all cryptographic algorithms that have been recommended by government agencies and standards bodies with expertise in encryption have first been subjected to extensive public scrutiny. For example, in the area of secure hash functions:

  • ISO has standardized the "Whirlpool" algorithm, among other hash functions, in ISO/IEC 10118-3.
  • The W3C, in its XML-ENC standard, includes a list of algorithms: SHA1, SHA256, SHA512, RIPEMD-160.
  • The European NESSIE project recommends: ISO 10118-3 ("Whirlpool"), SHA-256, SHA-384 and SHA-512.
  • In the USA, NIST recommends SHA1, SHA224, SHA256, SHA384, and SHA512.
  • In Japan, CRYPTREC recommends: MD5, RIPEMD-160, SHA1, SHA256, SHA384, and SHA512.

Ecma 376 section 2.15.1.28 (page 1941) does not follow the advice of any of these organizations. Instead, it defines a new hashing algorithm that has not undergone scrutiny by the cryptographic community. Section 3.3.1.69 page 2786 "protectedRange" has yet another implementation, called 'GetPasswordHash'.

The Ecma 376 hash function is almost guaranteed to be flawed and insecure. This poses two grave security risks:

SECURITY WARNING #1
The immediate risk is that document passwords may be recoverable from their hashed values. Since users often reuse document passwords for other documents and other systems (whether they should or not), including an inadequately reviewed hash function risks enabling attackers to commit forgery and identity theft against many other systems.
SECURITY WARNING #2
Defining a new hash function inside an ISO standard (giving it the ISO seal of approval) creates the expectation that this hash function has received proper scrutiny by the cryptographic community (like ISO 10118-3 has) and is secure. This is likely to lead the industry into using this new insecure hash function in a variety of security-critical applications, making many other security-critical applications directly vulnerable as well.
Compatibility Note
The rationale given for the insecure hash in 2.15.1.28 is "compatibility with legacy word processing applications which hashed their password solely using this mechanism." However, compatibility with these "document protection restrictions" could instead be assured by requiring the users to re-enter their password at the time the document is converted to Ecma 376, at which time the password could be verified against the old hash, and then re-hashed using a secure hash to be stored in the new document if desired. A similar approach could be used for 3.3.1.69.
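The conversion-time approach described in this note could look roughly like the sketch below. Everything here is an assumption made for illustration: `upgrade_password_hash` and `toy_legacy_hash` are hypothetical names, the CRC-32 stand-in is NOT the algorithm from section 2.15.1.28, and PBKDF2 with SHA-256 is just one publicly reviewed choice among the functions listed above.

```python
import hashlib
import hmac
import os
import zlib
from typing import Callable, Optional, Tuple


def upgrade_password_hash(
    password: str,
    legacy_hash: Callable[[str], bytes],
    stored_legacy_value: bytes,
    iterations: int = 100_000,
) -> Optional[Tuple[bytes, bytes]]:
    """Verify a re-entered password against the legacy hash, then re-hash it.

    Returns (salt, digest) for storage in the converted document, or None
    if the password does not match the legacy value.
    """
    # Constant-time comparison against the value stored in the old document.
    if not hmac.compare_digest(legacy_hash(password), stored_legacy_value):
        return None
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def toy_legacy_hash(password: str) -> bytes:
    """Hypothetical stand-in for a weak legacy hash (for demonstration only)."""
    return zlib.crc32(password.encode("utf-8")).to_bytes(4, "big")


stored = toy_legacy_hash("secret")
print(upgrade_password_hash("wrong", toy_legacy_hash, stored) is None)   # True
print(upgrade_password_hash("secret", toy_legacy_hash, stored) is None)  # False
```

With this design the new document never needs to carry the legacy hash at all; only the converter has to understand it.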

W3C SMIL (Synchronized Multimedia Integration Language)

SMIL is the W3C standard for "synchronized multimedia presentation". As the Recommendation states, with SMIL an author can:

  1. Describe the temporal behavior of the presentation.
  2. Describe the layout of the presentation on a screen.
  3. Associate hyperlinks with media objects.

Ecma 376 section 4.4 "Animation" (page 565) covers presentation animations (slide transitions), in conflict with the W3C Recommendation SMIL.

Compatibility Note
It is not necessary to define a new standard for slide transitions to achieve compatibility with existing Microsoft Office documents. Even if Ecma 376 legitimately needs slide transition functions not available in SMIL, it should reference SMIL for all features that are provided by SMIL.
ISO 26300 illustrates this point. ISO 26300 uses SMIL-compatible attributes for slide transitions whenever such an attribute exists. In a similar way, if there is functionality in Ecma 376 that legitimately cannot be represented in SMIL, Ecma 376 must still use SMIL-compatible tags for all the features that can be expressed in SMIL.

Ecma 376 is immature and inconsistent

Fabricates units of measurement

Many attributes throughout the Ecma 376 spec take values in "English Metric Units" (EMU). For example, attributes of type ST_PositiveCoordinate (5.1.12.42, page 4505) are measured in EMUs. This is not a known unit in the existing literature. It is defined only in passing, inside a paragraph in section 5.9.2.1 page 655, as "91440 EMUs/U.S. inch, 36000 EMUs/cm". Similarly, section 2.18.105 (page 1836) specifies "twips", which are twentieths of a point (1/1440th of an inch). This unit seems to be used primarily by Visual Basic.
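The relationships between these units can be checked with a few lines of arithmetic. The sketch below uses the EMU figures exactly as quoted above (they have not been verified against the spec here) and derives the twip constants from the "twentieth of a point" definition; all constant and function names are illustrative.

```python
from fractions import Fraction

EMU_PER_INCH = 91440  # figure as quoted in the text above
EMU_PER_CM = 36000    # figure as quoted in the text above

TWIPS_PER_POINT = 20  # a twip is a twentieth of a point
TWIPS_PER_INCH = 1440
POINTS_PER_INCH = TWIPS_PER_INCH // TWIPS_PER_POINT

# The two quoted EMU figures are consistent with 2.54 cm per inch.
cm_per_inch = Fraction(EMU_PER_INCH, EMU_PER_CM)
print(float(cm_per_inch))  # 2.54
print(POINTS_PER_INCH)     # 72


def twips_to_points(twips: int) -> Fraction:
    """Convert twips to typographic points."""
    return Fraction(twips, TWIPS_PER_POINT)


def twips_to_inches(twips: int) -> Fraction:
    """Convert twips to inches."""
    return Fraction(twips, TWIPS_PER_INCH)


print(twips_to_points(240))  # 12
print(twips_to_inches(720))  # 1/2
```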

Internal inconsistencies: the w:sz element

The w:sz element is an example of major internal inconsistencies in the spec:

  • For fonts, the w:sz element specifies the size in half points (19.2.2.37, page 758 ).
  • For frameset, the w:sz element has a string value that could be the ID of a style or a decimal number which is the caption of the parent structured document tag (19.14.2.40, page 1585).
  • However, as the child of rPr (20.4.19, page 2104), its value is in points.
  • For table borders, the w:sz attribute (not an element!) is specified in eighths of a point, unless the border style is an art border, in which case the width is in points (page 817).
  • When used as an attribute of restoredLeft (21.2.3.23, page 2542), it specifies the size of a dimension in normal view as a percentage of the screen.
  • In presentations, as an attribute of the ph element (21.3.1.65, page 2594), it is an enumerated value with choices "full", "half", and "quarter".
  • When sz is used as an attribute of defRPr (default character properties 21.1.4.1.4, page 2845), it is the size of a font in hundredths of a point (22.1.10.68, page 3166).
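To see what this means for an implementer, consider the same raw value interpreted under the four point-based conventions listed above. The dictionary keys and the function name below are illustrative labels, not identifiers from the spec.

```python
from fractions import Fraction

# Units per typographic point for the point-based uses of "sz" listed above.
SZ_UNITS_PER_POINT = {
    "font size (half points)": 2,
    "child of rPr (points)": 1,
    "table border (eighths of a point)": 8,
    "defRPr attribute (hundredths of a point)": 100,
}


def sz_in_points(raw: int, context: str) -> Fraction:
    """Interpret a raw sz value according to the context it appears in."""
    return Fraction(raw, SZ_UNITS_PER_POINT[context])


for context in SZ_UNITS_PER_POINT:
    print(context, "->", float(sz_in_points(24, context)), "pt")
```

The value 24 comes out as 12, 24, 3, or 0.24 points depending on where it appears, and that is before counting the uses of sz that are not lengths at all.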

Internal inconsistencies and omissions: ST_Border

Section 2.18.4 page 2414 lists numerous styles such as apples, scaredCat, heebieJeebies, etc. However, the specification does not fully define these styles (e.g., height, width, color depth, and orientation are missing). The style basicThinLine describes behavior for horizontal, vertical and corner scenarios, but many styles (e.g., babyRattle, balloonsHotair, etc.) provide no such details. As a result, a single style can be interpreted differently by different vendors and implementors.

Internal inconsistencies: the lengths of hexadecimal numbers

Ecma 376 is self-contradictory in its description of the lengths of several hexadecimal number types, such as ST_LangCode (2.18.52, page 2531), ST_ShortHexNumber (2.18.86, page 2591), ST_LongHexNumber (2.18.57, page 2542), ST_HexColorRGB (2.18.45, page 2520), ST_Panose (2.18.72, page 2569 and 5.1.12.37, page 4502), ST_UcharHexNumber (2.18.106, page 2620), ST_UnsignedIntHex (3.18.86, page 3712), ST_UnsignedShortHex (3.18.87, page 3713), and ST_HexBinary3 (5.1.12.28, page 4483).

There seems to be a consistent confusion in the specification between hexadecimal digits (each of which is expressed as a single character in a string representation) and octets (pairs of hexadecimal digits, which together can encode 8 bits, or a 1-byte character, but are expressed as a two-character string).
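The distinction between hexadecimal digits and octets can be shown in a few lines. The color value is an arbitrary example chosen here, not one taken from the spec.

```python
rgb_text = "FF00AA"              # an RGB color written as hexadecimal text
octets = bytes.fromhex(rgb_text)

print(len(rgb_text))  # 6 characters (hexadecimal digits)
print(len(octets))    # 3 octets


def hex_digits_for_octets(octet_count: int) -> int:
    """Each octet (8 bits) takes exactly two hexadecimal digits (4 bits each)."""
    return 2 * octet_count


print(hex_digits_for_octets(3))  # 6
```

A length stated in "characters" and a length stated in octets therefore differ by a factor of two, which is the discrepancy found in the type definitions discussed in this section.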

ST_LangCode

In addition to the introduction of ST_LangCode instead of only supporting ISO 639, the definition of ST_LangCode (2.18.52, pages 2531-2538) is not internally consistent.

  • It is described in the text as two digit hexadecimal, but the translation table is decimal (in that it lines up with the decimal values for languages as defined by Microsoft).
  • The decimal values in the table can not be represented in two hexadecimal digits, but rather four.
  • The example (page 2537) shows the value represented in decimal (1033), which is stated as being in hexadecimal.
  • The example (page 2537) has contents that are not exactly 2 characters long, contrary to the restriction stated in the text.
  • The XML Schema definition given states that length="2". In hexBinary schema types this corresponds to two octets, or 4 hex digits. However the text earlier states that this should be two hex digits.
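The points in this list can be checked directly using the example value 1033 cited above. The variable names are illustrative.

```python
LANG_CODE_DECIMAL = 1033  # the value shown in the example on page 2537

hex_text = format(LANG_CODE_DECIMAL, "X")  # shortest hexadecimal form
padded = format(LANG_CODE_DECIMAL, "04X")  # padded to whole octets

print(hex_text)                    # 409
print(padded)                      # 0409
print(len(bytes.fromhex(padded)))  # 2 octets, i.e. four hexadecimal digits
print(0xFF < LANG_CODE_DECIMAL)    # True: two hex digits top out at 255

# Reading the literal text "1033" as hexadecimal yields a different number.
print(int("1033", 16))             # 4147
```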

Commentary:

Other hexadecimal numeric types

The two types ST_ShortHexNumber and ST_LongHexNumber are described as representing a "Two Digit Hexadecimal Number Value" and a "Four Digit Hexadecimal Number Value", respectively, with their contents restricted to "have a length of exactly 2 characters" and "exactly 4 characters", respectively. However, all of the examples given show them as using 4 and 8 hexadecimal digits, respectively, and the description of ST_LongHexNumber explicitly calls it a "four octet (eight digit) hexadecimal number". For example, when these quantities are used to represent bitmasks (see the section on bitmasks, below), they are used to represent 16-bit and 32-bit quantities, respectively, which require 4 and 8 hexadecimal digits, not 2 and 4. For example, section 2.4.51 (p. 1211) uses ST_ShortHexNumber to represent numbers as large as 0x0400 (decimal 1024), which requires at least three hexadecimal digits, and shows examples with four digits. As another example, section 2.8.2.16 (p. 1541) uses ST_LongHexNumber to represent "a four digit hexadecimal encoding of the first 32 bits of the 64-bit code-page bit field"; although the text describes it as 4 digits, 8 hexadecimal digits are required to represent a 32-bit quantity, and the examples shown use 8 digits.

Similar problems are found in the definitions of numerous other hexadecimal types. For ST_HexColorRGB, the description states that the "contents must have a length of exactly 3 characters", but the XML Schema fragment defines it as 3 hexadecimal octets, or 6 hexadecimal digits. For ST_Panose, the description states that the "contents must have a length of exactly 10 characters", but the XML Schema fragment defines it as 10 hexadecimal octets, or 20 hexadecimal digits. (The XML Schema definition is consistent with the example in 2.18.72; there is no example in 5.1.12.37.) For ST_UcharHexNumber, the description states both that the "contents must have a length of exactly 1 characters" and that it is "specified as a two digit (one octet) hexadecimal number"; the example and the XML Schema definition both support the latter interpretation. For ST_UnsignedIntHex, the description states that the "contents must have a length of exactly 4 characters", but the XML Schema fragment defines it as 4 hexadecimal octets, or 8 hexadecimal digits. (ST_UnsignedIntHex is meant to hold a hexadecimal representation of an unsigned integer, implying a length of 4 bytes (8 hexadecimal digits) on all modern architectures.) For ST_UnsignedShortHex, the description states that the "contents must have a length of exactly 2 characters", but the XML Schema fragment defines it as 2 hexadecimal octets, or 4 hexadecimal digits. For ST_HexBinary3, the description states that the "contents must have a length of exactly 3 characters", but the XML Schema fragment defines it as 3 hexadecimal octets, or 6 hexadecimal digits. (Separately, this type is used for srgbClr@val and sysClr@lastClr, where in both cases it stores an RGB value. ST_HexColorRGB already exists specifically for this purpose.)

Unspecified terms: plain text

Ecma 376 section 11.3.1 "Alternative Format Import Part" (page 38), allows content in "plain text". The encoding for "plain text" is not specified (is it 7-bit ASCII? ISO 8859-1? UTF-8?). As specified it will not allow international interoperable use.
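The practical consequence of leaving the encoding unspecified is easy to demonstrate: the same bytes yield different text depending on which encoding the reader assumes. The sample word is arbitrary.

```python
raw = "café".encode("utf-8")  # the bytes a UTF-8 producer would write

print(raw.decode("utf-8"))       # café
print(raw.decode("iso-8859-1"))  # cafÃ©

# A consumer that assumes 7-bit ASCII cannot read the content at all.
try:
    raw.decode("ascii")
except UnicodeDecodeError:
    print("not valid 7-bit ASCII")
```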

Inconsistent naming conventions for elements and attributes

Ecma 376 contradicts the goals of XML which are:

6. XML documents should be human-legible and reasonably clear.
10. Terseness in XML markup is of minimal importance.

Ecma 376 has inconsistent abbreviated naming conventions likely to cause confusion among developers:

  • It is not necessary to have programming-centric naming conventions in XML specifications.
  • Abbreviations, truncations and vowel removals restrict readability.
  • International standards should be as accessible as possible to non–English-speaking developers, and confusing names will create misunderstandings.

Some examples:

  • In VML (5.1.10.45, page 4413), "outerShdw (Outer Shadow Effect)" has its second word devoid of vowels, yet its child elements and attributes follow different naming conventions, e.g. scrgbClr, algn, blurRad, dir, dist, rotWithShape.
  • In WordprocessingML (2.15.1.78, page 2020), "settings (Document Settings)" has a large list of child elements with mutually contradictory naming conventions, e.g. ActiveWritingStyle, attachedSchema, documentType, docVars, endnotePr, hdrShapeDefaults.

Commentary:

Inconsistent and inflexible notation for percentages

Ecma 376 uses four inconsistent notations for percentage units, at least one of which is particularly inflexible:

  • Section 2.15.1.95 (p. 2053) uses a decimal number giving the percentage
  • Section 2.18.85 (p. 2583) uses predefined symbols (like "pct15" for 15%) in 5 or 2.5 percent increments (which is inflexible and difficult to process with standard XML tools, compared to a generic number-valued field)
  • Section 2.18.97 (p. 2608) uses a number in 50ths of a percent
  • Section 5.1.12.41 (p. 4505) uses a number in 1000ths of a percent

In contrast, for example, the W3C SVG and W3C CSS standards both consistently use a single notation—decimal percentages followed by the "%" symbol—as described in section 7.10 of the W3C SVG 1.1 specification and section 4.3.3 of the CSS 2.1 specification.

Compatibility Note
There is no need for this inconsistency or inflexibility to achieve compatibility with pre-existing Microsoft Office documents. A generic decimal percentage would suffice to express all of the above numeric values, while being much easier to process with existing XML tools. The precision with which a number is expressed in the file can easily be independent of the precision with which it is implemented (e.g. a particular implementation may be limited to distinguishing 5-percent increments, and could achieve this by rounding the percentage internally as needed).
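A minimal sketch of the normalization argued for here: each of the four notations is reduced to a plain decimal percentage. The function names are illustrative, and the symbol table contains only the single symbol cited above, since the full list is not reproduced in this document.

```python
from fractions import Fraction

# Only the one symbol cited in the text; a real table would need every symbol.
SYMBOL_TO_PERCENT = {"pct15": Fraction(15)}


def from_decimal(text: str) -> Fraction:
    """Section 2.15.1.95 style: a decimal number giving the percentage."""
    return Fraction(text)


def from_symbol(symbol: str) -> Fraction:
    """Section 2.18.85 style: a predefined symbol, resolved by table lookup."""
    return SYMBOL_TO_PERCENT[symbol]


def from_fiftieths(value: int) -> Fraction:
    """Section 2.18.97 style: a number in 50ths of a percent."""
    return Fraction(value, 50)


def from_thousandths(value: int) -> Fraction:
    """Section 5.1.12.41 style: a number in 1000ths of a percent."""
    return Fraction(value, 1000)


# Four different encodings of the same 15 percent.
print(from_decimal("15"), from_symbol("pct15"),
      from_fiftieths(750), from_thousandths(15000))
```

Note that the symbolic form is the only one that needs a lookup table, which is what makes it awkward for generic XML tools.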

Commentary:

Inappropriate non-document settings (application settings)

Ecma 376 section 2.15.3.16 "doNotLeaveBackslashAlone" (page 2180). "This element specifies whether applications should automatically convert the backslash character into the yen character when it is added through user keyboard input".

This is an application setting, not a document setting.

Non-XML formatting codes

Section 2.16.5.79 (page 2355), "XE" (full name not defined), defines "\b" and "\i" as bold and italic switches, which is contrary to XML and CSS practice and inconsistent with the formatting codes used elsewhere in Ecma 376. The same applies to other sections in 2.16.5, such as 2.16.5.76–2.16.5.78 (p. 2353–2354), which define "\* Caps", "\* FirstCap", "\* Lower", and "\* Upper" to format the capitalization of preceding text.

Inflexible numbering format

Section 2.18.66 (page 2554) defines ST_NumberFormat, the numbering format used for numbered lists (2.9.18, page 1581), footnotes (2.11.17, page 1645), endnotes (2.11.18, page 1646), captions (2.15.1.16, page 1912) and page numbers (2.6.12, page 1412). It has the following problems:

  • Fixed to a few countries. Many regions are not included.
  • Contradicts W3C XSLT which ISO 26300 uses.
  • Contradicts Unicode ISO 10646.

Uses a Microsoft-specific namespace

Section 6.2.3.23 page 5197 Attribute "href" (Hyperlink Target) uses a Namespace "urn:schemas.microsoft.com:office:office".

An Ecma standard must not reference company-specific namespaces. This should be replaced by an Ecma namespace.

Ecma 376 uses bitmasks, inhibiting extensibility and use of standard XML tools

Ecma's extensive use of bitmasks is non-extensible, inconsistent, and prevents use of standard XML tools.

Background: bitmasks

The boolean (yes/no) type in most programming languages, such as C and C++, or other types used for the same purpose (such as "char" in ISO/IEC 9899:1990), corresponds to at least a single byte (8 bits) on all modern systems. In memory-constrained situations (as was common in the past), however, the inefficiency of using an 8-bit type to store 1 bit of information was a problem.

A bitmask is a technique to encode multiple values inside a single variable, by assigning a meaning to each individual bit of the variable. For example, the binary 10110001 (decimal 177) would mean Yes/No/Yes/Yes/No/No/No/Yes and contain the answers to 8 different yes/no questions.
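The example value can be unpacked with a few lines of code. This is only an illustrative sketch; the function name is hypothetical, and the answers are read from the most significant bit to the least significant, matching the order in which the binary digits are written above.

```python
def decode_bitmask(value, width=8):
    """Return the yes/no answers stored in `value`, most significant bit first."""
    return [bool((value >> shift) & 1) for shift in range(width - 1, -1, -1)]

# Binary 10110001 is decimal 177.
answers = decode_bitmask(0b10110001)
print(["Yes" if bit else "No" for bit in answers])
```

Note that nothing in the number itself says which question each bit answers; that mapping has to be supplied from outside.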

Modern XML-based formats generally avoid using bitmasks. Because the bitmasks specified by Ecma 376 are mostly of fixed length, they create extensibility problems: one cannot add bits, much less insert another bit into the "middle" of a bitmask. The small space savings are irrelevant, especially since these files are typically compressed (both Ecma 376 and ISO 26300 documents are typically stored in a "zip" file). In addition, many XML processing standards do not include any capabilities for handling bitmasks.

Bitmasks in Ecma 376

Many element attributes in Ecma 376 are defined as bitmasks. For example:

Ecma 376 section 2.8.2.16 (page 1541) "sig (Supported Unicode Subranges and Code Pages)" describes the <w:sig> element whose attributes are all bitmasks. For example, take the attribute csb1:

"Specifies a four digit hexadecimal encoding of the upper 32 bits of the 64-bit code-page bit field that identifies which specific character sets or code pages are supported by the parent font"

This attribute takes the following values:

Bit     Description
0-15    Reserved for OEM
16      IBM Greek
17      MS-DOS Russian
18      MS-DOS Nordic
19      Arabic
20      MS-DOS Canadian French
21      Hebrew
22      MS-DOS Icelandic
23      MS-DOS Portuguese
24      IBM Turkish
25      IBM Cyrillic
26      Latin 2
27      MS-DOS Baltic
28      Greek (former 437G)
29      Arabic (ASMO 708)
30      WE/Latin 1
31      US

The other attributes of <w:sig> have similar definitions as bitmasks.
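To show what a consumer has to do with such an attribute, here is a minimal sketch that turns a csb1 value into the list of code pages it names. The lookup table is copied from the bit descriptions above; the function name, the sample value, and the assumption that bit 0 is the least significant bit of the hexadecimal number are illustrative only and are not taken from the specification.

```python
# Bit assignments for csb1, as listed in the table above.
CSB1_BITS = {
    16: "IBM Greek",
    17: "MS-DOS Russian",
    18: "MS-DOS Nordic",
    19: "Arabic",
    20: "MS-DOS Canadian French",
    21: "Hebrew",
    22: "MS-DOS Icelandic",
    23: "MS-DOS Portuguese",
    24: "IBM Turkish",
    25: "IBM Cyrillic",
    26: "Latin 2",
    27: "MS-DOS Baltic",
    28: "Greek (former 437G)",
    29: "Arabic (ASMO 708)",
    30: "WE/Latin 1",
    31: "US",
}

def decode_csb1(hex_text):
    """List the code pages flagged in a csb1 value (assumes bit 0 is the lowest bit)."""
    value = int(hex_text, 16)
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits 0-15 have no individual meaning in the table.
            names.append(CSB1_BITS.get(bit, "Reserved for OEM"))
    return names

# Hypothetical sample value with bits 16 and 31 set.
supported = decode_csb1("80010000")
```

The table has to be hard-coded into every consumer, because the attribute value carries no names of its own.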

Many other element attributes in Ecma 376 have similar definitions as bitmasks. For example:

  • Section 2.3.1.8, Paragraph conditional formatting (page 842).
  • Section 2.4.7, Table cell conditional formatting (page 1085).
  • Section 2.4.8, Table row conditional formatting (page 1087).
  • Section 2.4.51, Table style conditional formatting settings (page 1211).
  • Section 2.4.52, Table style conditional formatting settings exceptions (page 1213)
  • Section 2.15.1.86, Suggested filtering for list of document styles (page 2034)
  • Section 2.15.1.87, Suggested sorting for list of document styles (page 2036)
  • Section 6.1.2.7, tableproperties attribute of shape group (page 5227)

The above-mentioned Conditional Formatting Bitmask (ST_Cnf) is defined in Section 2.18.11 (page 2478).

Bitmasks are not extensible

The bitmasks specified by Ecma 376 are mostly of fixed length (a fixed number of bits). For example, the bitmasks used in sections 2.4.51, 2.4.52, 2.15.1.86, and 2.15.1.87 are all of type ST_ShortHexNumber (2.18.86, p. 2591), which is defined as consisting of exactly 4 hexadecimal digits (16 bits, see below regarding conflicting definitions). The bitmasks in section 2.8.2.16 are of type ST_LongHexNumber (2.18.57, p. 2542) which is defined as consisting of exactly 8 hexadecimal digits (32 bits, see below regarding conflicting definitions). The bitmasks in sections 2.3.1.8, 2.4.7, and 2.4.8 are of type ST_Cnf (2.18.11, p. 2478), which is defined as consisting of exactly 12 binary digits (12 bits).

Because it is not possible to add new bits to a fixed-length bitmask, extensibility is extremely limited.

Also, bitmasks require that some other data be encoded into numbers to be used in the bitmasks. For example, see the language encodings discussed earlier: every language must be assigned an arbitrary numeric code before it can be used. Keeping this mapping up-to-date requires constant maintenance by some body. If not carefully handled, a single vendor could end up having de facto control over this mapping, and as a result that vendor could determine what could be done or not by the format (by refusing to assign mappings useful to a competitor).

Compatibility Note
XML formats have no need for bitmasks, since XML provides more flexible data structures. The reason for which bitmasks were invented (saving memory) does not apply to XML formats because the data is encoded in plain text and is surrounded by text tags, negating any memory benefit. In addition, the Ecma 376 format is zip-compressed anyway.
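As a rough sketch of the alternative, the same kind of information can be carried by ordinary named elements instead of bit positions. The element and attribute names below ("codePages", "codePage", "name") are hypothetical and are not proposed wording for any standard; the sketch only shows that such a structure needs no lookup table and no fixed length.

```python
import xml.etree.ElementTree as ET

def code_pages_to_xml(names):
    """Build a hypothetical <codePages> element with one named child per code page."""
    root = ET.Element("codePages")
    for name in names:
        ET.SubElement(root, "codePage", {"name": name})
    return root

def xml_to_code_pages(root):
    """Read the names back; no bit table is needed."""
    return [child.get("name") for child in root.findall("codePage")]

element = code_pages_to_xml(["IBM Greek", "US"])
print(ET.tostring(element, encoding="unicode"))

# Adding a value that has no assigned bit is just another child element.
ET.SubElement(element, "codePage", {"name": "Some future code page"})
```

A list of this kind can grow without changing the type of the attribute or renumbering existing entries.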

Bitmasks cause significant validation problems

Using bitmasks creates a new data model, separate from the XML data model. In particular, the meaning of the individual bits in a bitmask cannot be described in or validated by XML Schema, Relax NG, Schematron or any other XML schema language or validator.

Bitmasks defeat XSLT manipulation

XSLT is the W3C standard for manipulating and converting XML documents, and is by far the most popular tool for working with XML. XSLT provides no bitwise operators, since bitmasks are not part of the XML data model.
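In practice, a stylesheet author who needs to test one bit is left with arithmetic workarounds. The sketch below mirrors that workaround in Python, deliberately restricting itself to operations that have direct XPath 1.0 counterparts (multiplication, division, remainder, and character lookup in a string); the function names are hypothetical, and Python is used here only to keep the example runnable.

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_number(text):
    """Convert a hexadecimal string digit by digit, without a built-in hex parser."""
    number = 0
    for character in text.upper():
        number = number * 16 + HEX_DIGITS.index(character)
    return number

def bit_is_set(value, bit):
    """Test one bit using only integer division and remainder."""
    return (value // (2 ** bit)) % 2 == 1

mask = hex_to_number("80010000")
flags = [bit for bit in range(32) if bit_is_set(mask, bit)]
```

Every bitmask attribute would need this kind of hand-written helper, whereas a named element or attribute can be selected directly.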

Bitmasks in Ecma 376 are not internally consistent

The formats used to describe bitmasks are not internally consistent within Ecma 376.

  • Several bitmasks (sections 2.3.1.8, 2.4.7, and 2.4.8) use ST_Cnf (2.18.11 p. 2478), expressed as a string of 12 binary digits.
  • Several bitmasks (sections 2.4.51, 2.4.52, 2.15.1.86, and 2.15.1.87) use ST_ShortHexNumber (2.18.86 p. 2591), expressed as a string of 4 hexadecimal digits (although the specification contradicts itself: it says that this "type's contents must have a length of exactly 2 characters", but all the examples shown have 4 characters).
  • Section 2.8.2.16 uses ST_LongHexNumber (2.18.57, p. 2542), expressed as a string of 8 hexadecimal digits (although the specification contradicts itself: it says that this "type's contents must have a length of exactly 4 characters" and describes it as a "Four Digit Hexadecimal Number Value" but also says it is a "four octet (eight digit) hexadecimal number" and all of the examples shown have 8 digits).
  • At least one bitmask, in section 6.1.2.7 (p. 5227), uses an unspecified "string" format "represented as an integer", with a "decimal" number given as an example.

This internal inconsistency in not only the length, but the format (binary, hexadecimal, or decimal) of bitmasks makes uniform processing of Ecma 376 bitmasks even more difficult, in addition to the problems mentioned above.
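The practical consequence is that a consumer needs a separate parser for each notation before any bit can be examined. The following is a minimal sketch under the lengths described above (12 binary digits, 4 or 8 hexadecimal digits, or a decimal integer); the function names and sample values are hypothetical, and the sketch says nothing about which end of each string holds bit 0.

```python
import string

def parse_cnf(text):
    """ST_Cnf style: exactly 12 binary digits."""
    if len(text) != 12 or set(text) - {"0", "1"}:
        raise ValueError(f"expected 12 binary digits, got {text!r}")
    return int(text, 2)

def parse_hex(text, digits):
    """ST_ShortHexNumber (digits=4) or ST_LongHexNumber (digits=8) style."""
    if len(text) != digits or any(c not in string.hexdigits for c in text):
        raise ValueError(f"expected {digits} hexadecimal digits, got {text!r}")
    return int(text, 16)

def parse_decimal(text):
    """Section 6.1.2.7 style: an integer written in decimal."""
    return int(text, 10)

# Three different spellings of the same value, sixteen.
same_value = [
    parse_cnf("000000010000"),
    parse_hex("0010", 4),
    parse_decimal("16"),
]
```

The fixed lengths also show the extensibility limit discussed earlier: a thirteenth flag cannot be expressed in the 12-digit form at all.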

Bitmasks conflict with the Ecma TC45 charter

The TC45 is the Ecma Technical Committee charged with developing the Ecma 376 specification. The charter of the TC45 includes the specific goal of:

"...enabling the implementation of the Office Open XML Formats by a wide set of tools and platforms in order to foster interoperability across office productivity applications and with line-of-business systems"

Since bitmasks cannot be implemented in any of the standard tools for XML data formats, their use is in conflict with the TC45's charter.

Ecma 376 relies on undisclosed information

Undisclosed proprietary specifications

Section 6.2.3.17 "Embedded Object Alternate Image Requests Types" (page 5679) requires implementors to support the proprietary Windows Metafiles.

Section 5.1.3.4 (page 4077) defines a "quicktimeFile" element that "specifies the existence of a QuickTime file" which can be played as specified "within the timing node list". QuickTime is a multimedia container file format originating with Apple Computer, which was the basis of ISO/IEC 14496-14:2003. Not only does the Ecma 376 standard not specify a version of QuickTime that is to be supported, but it does not specify which of the many audio and video encoding formats ("codecs") that can be found in QuickTime containers are to be supported. Many of these codecs, such as the Sorenson Video codecs, are undocumented proprietary formats that may not be implementable by an independent software application.

Cloning the behaviour of proprietary applications

Several sections require the implementor to clone the behaviour of a proprietary product, where the behaviour to clone is not specified in the specification. For example:

  • Section 2.15.3.6 page 2161, autoSpaceLikeWord95.
  • Section 2.15.3.26 page 2199, footnoteLayoutLikeWW8.
  • Section 2.15.3.31 page 2209, lineWrapLikeWord6.
  • Section 2.15.3.32 page 2210, mwSmallCaps.
  • Section 2.15.3.41 page 2225, shapeLayoutLikeWW8.
  • Section 2.15.3.51 page 2245, suppressTopSpacingWP.
  • Section 2.15.3.53 page 2250, truncateFontHeightsLikeWP6.
  • Section 2.15.3.54 page 2252, uiCompat97To2003.
  • Section 2.15.3.63 page 2264, useWord2002TableStyleRules.
  • Section 2.15.3.64 page 2265, useWord97LineBreakRules.
  • Section 2.15.3.65 page 2266, wpJustification.
  • Section 2.15.3.66 page 2268, wpSpaceWidth.

More can be found by searching Ecma 376 for the word "Guidance".

Specifications that say "clone this product", instead of explicitly stating what behavior is required, have no place in an international standard. It may also be illegal in some jurisdictions to determine what such a non-specification means, as discussed below regarding end-user license agreements (EULAs).

Compatibility Note
Attributes like these have no place in an international standard, and are not needed for compatibility with existing documents. The correct way to achieve compatibility is through generic tags and/or by generating multiple tags as needed during conversion from a legacy format. For example:
  • autoSpaceLikeWord95 should be replaced by a generic character-spacing attribute that takes a numeric value or set of numeric values.
  • wpSpaceWidth should be replaced by a generic space-width tag that takes a numeric value or set of numeric values.
  • mwSmallCaps could be implemented by simply setting the appropriate font size for the small-caps text during conversion.
Even attributes as obscure as lineWrapLikeWord6 can be generalized into a line-wrap-style attribute. Using a more general solution offers far more extensibility and flexibility.

Relies on application-defined behaviors

Ecma 376 often relies on "application-defined" behaviors to support important functionality that should be documented or supported via existing standards. The reliance upon application-defined formats inhibits the goal of interoperability and prevents the exchange of valuable information contained within a document.

Examples include:

  • Section 6.1.2.19 p. 5436 defines the "equationxml" attribute of "shape" elements, "used to rehydrate an equation using the Office Open XML Math syntax". This information is apparently intended to allow mathematical equations in drawings to be edited and interpreted based on their underlying mathematical structure rather than as simple graphical objects, a critically important feature for users of equations in illustrations and presentations. However, the "actual format of the contents of this attribute are application-defined", which makes them impossible to exchange between applications. (Even though "they shall contain Office Open XML Math", this could be arbitrarily and unnecessarily obfuscated by the presence of other application-specific information, application-specific encodings, and so on.)
  • Section 6.1.2.19 p. 5438 defines a "gfxdata" attribute for the "shape" elements, which "contains DrawingML content" that is "base-64 encoded". However, the "contents of this package are application-defined", so even though they "shall use the Parts defined by this Standard whenever possible" there is not sufficient information for an independent implementation to read this data or display the "DrawingML content" contained therein. (The stated rationale for this attribute is to allow "VML to represent graphical content while still persisting DrawingML for consuming applications that support DrawingML" — but this only highlights the duplicative nature of Ecma 376, which defines two new vector-graphics XML formats, VML and DrawingML, instead of using a single standard one such as W3C SVG.)
  • Section 6.2.2.14 on p. 5596 defines an "ink" element which stores "ink annotations in an application-defined format." This is apparently intended to store Microsoft Ink annotations, used with tablet input devices to add hand-written annotations to documents. These annotations are often a vital part of documents and their specification is undefined in Ecma 376. Moreover, the use of unspecified formats is entirely unnecessary, as the W3C PNG specification could be used for transparent raster image data and the W3C SVG specification could be used for vector or mixed vector/raster data. Microsoft, in contrast, reports that it uses one of two proprietary formats for Ink content: an Ink Serialized Format (ISF) encoding the user's pen-stroke information as well as other metadata (using an undocumented compressed format), as well as a "fortified" GIF format including ISF meta-data.
  • Numerous elements are not required by the standard, but if omitted lead to "application-defined" default behaviors. This is a completely unnecessary barrier to interchange between applications (causing the same document with "default" styles to appear completely different in two conforming programs); the defaults could simply be defined in the standard. For example, section 2.7.4 (p. 1482) defines elements to specify default paragraph and run properties (docDefaults, pPr, pPrDefault, rPr, and rPrDefault); if these are omitted "the defaults are therefore application-defined". Similarly, section 2.16.4.1 (p. 2280) defines a date-and-time formatting switch that, if not present, leads to "a date or time result is formatted in an implementation-defined manner."

Relies on unspecified multimedia file formats

In several places, Ecma 376 specifies ways in which a document can incorporate external graphics, audio, and video files, without specifying even a minimal set of file formats that should be supported. This immediately creates a barrier to interoperability, because there is no reason to expect that different implementations of Ecma 376 will support the same multimedia file types.

For example:

  • Section 2.16.5.33 (page 2320) defines an "INCLUDEPICTURE" field that "retrieves the picture contained in" a named document. However, it does not specify what "picture" formats should be supported, despite the fact that there are many standard graphics formats that could be reasonably supported (such as W3C PNG, ISO 10918-1 JPEG, or W3C SVG).
  • Section 5.1.3.2 (page 4075) defines an "audioFile" element that "specifies the existence of an audio file" that can be played as specified "within the timing node list". However, it does not specify what "audio" formats should be supported (as opposed to specifying ISO/IEC 11172-3 "MP3" files, and/or some other well-documented formats such as Ogg Vorbis or FLAC).
  • Section 5.1.3.6 (page 4079) defines a "videoFile" element that "specifies the existence of a video file" which can be played as specified "within the timing node list". However, it does not specify what "video" formats should be supported (as opposed to specifying ISO/IEC 14496 MPEG-4 and/or some other documented standard formats).

Ironically, Ecma's Open XML White Paper touts Ecma 376's "independence from any particular type of source content" as a "requirement" for "interoperability": "OpenXML contains no restriction on image, audio or video types. For example, images can be in GIF, PNG, TIFF, PICT, JPEG or any other image type." The white paper does not say how implementations can possibly interoperate if they are required to support "any" conceivable image, audio, or video type.

Ecma 376 cannot be adequately evaluated within the 30-day evaluation period

At over 6,000 pages, the Ecma 376 specification is 10 times larger than the ISO 26300 specification. It is not possible to review over 200 pages per day with any hope of finding all the major problems in the specification.

Apart from this document, the Ecma 376 specification lacks any pre-existing industry review:

  • Ecma 376 was prepared over-hastily, with a mean page review/edit/approve rate of more than 18 pages/day, approximately 20 times faster than that of other markup standards.
  • Completion was rushed; it was finalized less than 60 days before submission to ISO.
  • Ecma 376 was developed behind closed doors, severely limiting external review during its development cycle and making it unlikely to be the result of an industry consensus.
  • The rules of the Ecma 376 committee specifically required compatibility with pre-existing proprietary file formats of a single vendor (Microsoft) that are incorporated by reference but whose specifications are not available. This restriction and the unavailability of the specifications for these formats blocks review and evaluation of Ecma 376's success in achieving its core goal of compatibility with these legacy binary file formats.
  • No reference applications that implement even a majority of the features of Ecma 376 were available for testing and evaluation purposes at the commencement of the period for JTC-1 review, nor are they available to the present day.

In spite of the short time available and the other constraints, this document preparation process has already uncovered many issues with the specification. This review is far from comprehensive. A comprehensive review less constrained by a short review period will undoubtedly uncover many more flaws.

Stability

ISO/IEC JTC 1 Directives, Edition 5, Version 2.0 states that in relation to PAS submissions: "The specification shall have had sufficient review over an extended time period to characterise it as being stable." (JTC1 Directives, Annex M The Transposition of Publicly Available Specifications into International Standards - A Management Guide, M.7.4.1.3)

Since the specification was submitted for fast-track resolution almost immediately after its development, and its development was behind closed doors, this requirement has not been met.

Under similar circumstances, JTC-1 rejected fast-track processing

  •  ??? Work with the WAPI precedent.

Ecma 376 cannot be reasonably implemented by other vendors

For a variety of reasons, noted below, Ecma 376 cannot be reasonably implemented by other vendors.

Ecma 376 requires implementation of undisclosed specifications

See the section Ecma 376 relies on undisclosed information above.

The "compatibility with legacy formats" can only be implemented by Microsoft

  • As indicated above, Ecma 376 requires implementors to emulate the behaviour of previous Microsoft products. As the behaviour is not specified, and the products are proprietary, only Microsoft can implement those portions of the specification.
  • As indicated above, Ecma 376 requires implementors to support Windows Metafiles instead of ISO 8632. As Windows Metafiles are a proprietary technology, only Microsoft can implement this portion of the specification reliably.
  • Ecma 376 section 11.3.1 "Alternative Format Import Part" allows implementations to insert content in alternate file formats such as RTF. RTF is a Microsoft proprietary format. Microsoft can support old binary documents simply by embedding the RTF content. But other implementors cannot reliably support those documents because the specification for RTF is not included in Ecma 376.

Patent rights to implement the Ecma 376 specification have not been granted

Read literally, the intellectual property ("IP") documents accompanying Ecma 376 grant no rights for vendors other than Microsoft Corp. to implement the specification. Even ignoring that problem, the IP documents are at best ambiguous as to the extent of rights granted and convey no rights for any other version of the proposed standard such as an improved version reflecting JTC-1 criticism. A single vendor in effect retains veto rights over any changes to the specification. Such defects render Ecma 376 unsuitable as an international standard candidate.

The bottom line is that the relevant intellectual property documents present legal quicksand of a depth that could only be determined through litigation. They are an unsuitable legal foundation for an international standard.

Rights to implement Ecma 376 are governed by two Microsoft Corp. covenants not to sue, the Microsoft Open Specification Promise ("OSP", http://www.microsoft.com/interop/osp/default.mspx) and an earlier Microsoft Covenant Regarding Office 2003 XML Reference Schemas ("CNS"). See the Microsoft Open Specification Promise page: "We are giving potential implementers of Ecma Office Open XML the ability to take advantage of either the CNS or the OSP, at their choice."

The Microsoft covenants not to sue grant no rights

Both the OSP and the CNS are worded using what are for practical purposes identical grammatical constructs that grant no rights whatsoever whilst leaving the superficial appearance of a grant of rights. In the OSP, Microsoft states that the rights granted are for "patents that are necessary to implement [the specification]." In the CNS, the rights granted are for "patent claims necessary to conform to the technical specifications[.]" It would make equal sense to say, "apples necessary to conform to the technical specifications." The problem is that no patents or patent claims are necessary to implement or conform to a software specification and the rights thus granted consist of an empty set.

Software is written in code and implemented using methods and concepts. Software is not written or implemented in patents or in patent claims. An implementation of a software specification can fully conform to that specification regardless of whether or not patents are thereby infringed.

A patent is a legal instrument analogous to a deed of ownership for real property. Patent claims are analogous to the description of real property in a deed. But neither a deed nor its property description are what is actually owned; a deed is a legal instrument, not the property owned, which is on a separate plane of existence. The property identified in a deed's description may have a real house or tree upon it; the deed does not. Just so, software may employ methods and concepts described in a patent's claims, but the patent claims are not the methods and concepts described therein. The patent claims are only a description of those methods and concepts. The methods and concepts described in patent claims may be necessary to implement or conform to a specification; however, their mere description in the patent claims is not "necessary" to the implementation of the specification. The patent claims and the methods and concepts exist on separate planes.

Therefore, the enabling language of the OSP and the CNS, read literally, describe empty sets of rights expressly granted. Recognizing the unfairness of misleading language in such documents, courts will often remedy an otherwise unjust result by implying corrective language from established norms or industry practices or by recognizing a right by way of a waiver or estoppel. But both the OSP and the CNS conclude with a sentence stating:

"[n]o other rights except those expressly stated in this promise [respectively, covenant] shall be deemed granted, waived or received by implication, or estoppel, or otherwise."

Because the rights "expressly stated" are an empty set, the sentence just quoted has the effect of blocking any judicial attempt to prevent an unjust result. Therefore, a court would most likely be forced to rest a waiver or estoppel on Microsoft's public statements about the openness of Ecma 376 rather than on the IP documents themselves.

Moreover, the problem with the grammatical construct was brought to Microsoft's attention in the only published major legal critique of the CNS, that by Marbux. See also footnote accompanying the linked text. That the same grammatical construct was nonetheless carried over to the later OSP is therefore cause for concern.

The fact that the relevant IP documents, read literally, grant no rights whatsoever provides an unacceptable legal foundation for an international standard. The grant of rights to implement Ecma 376 should be made explicit. Ecma 376 should be diverted from fast-track processing so that revised legal documents, if any are forthcoming, can be reviewed.


Microsoft intellectual property documents are ambiguous

The lack of a grant of any rights whatsoever is not the only serious defect in the Microsoft IP documents. They also suffer from show-stopper ambiguities. Those ambiguities are compounded by conflicting language and ambiguities in the specification itself in regard to conformance. However, as drafted neither the OSP nor the CNS allows resort to a separate document such as the Ecma 376 specification as a source of rights, because both IP documents prohibit the recognition of rights that are not expressly stated in the IP documents themselves. These are issues that can only be resolved by Microsoft.

Both the OSP and the CNS extend patent protection only to conformant implementations of the specification, with further ambiguous qualifiers discussed in later sections below. See OSP ("to the extent it conforms to a Covered Specification"); CNS ("claims necessary to conform to the technical specifications ... against those conforming parts of software products"). However, neither document defines the variants of the word conform that they use, apparently leaving that to be defined in the specification itself.

The documents should be redrafted to indicate unmistakably where the relevant definitions of conformance can be found, particularly given both covenants' inclusion of a sentence forbidding the implication of any rights not expressly stated in those documents. Both documents say:

No other rights except those expressly stated 'in this [covenant/promise]' shall be deemed granted, waived or received by implication, or estoppel, or otherwise.

As discussed in the preceding section, the rights granted are actually a null set. However, glossing over that problem by assuming arguendo that some rights were granted nonetheless, the existence and scope of those rights can only be determined by examining the specification itself to determine the meaning of conform and its variants. But those are rights identified in another document, the specification, not "in this [covenant/promise]." That is fatal to any attempt to use the Ecma 376 specification as a source of rights granted because the reader is forbidden from looking to another document as a source of implementer's rights not expressly stated in the IP documents.

Therefore, any revised IP documents should also unambiguously declare where the relevant definition of conform and its variants is located.


The Microsoft Open Specification Promise is ambiguous

Moreover, in the OSP we find additional language limiting rights:

“Microsoft Necessary Claims” are those claims of Microsoft-owned or Microsoft-controlled patents that are necessary to implement only the required portions of the Covered Specification that are described in detail and not merely referenced in such Specification.

That sentence contains a four-step reduction of implementer rights. First, one must somehow penetrate the nonsensical "patents that are necessary to implement" phrase, which is forbidden by the sentence that prohibits the implication of rights not expressly stated. Ignoring that barrier and implying that which is forbidden, one could refashion the subject phrase into something like the more typical "patents that are necessarily infringed by implementing the specification."

However, the putative patents that would necessarily be infringed by implementation are nowhere identified so that developers -- or reviewers of the draft specification -- might determine whether implementation would infringe a Microsoft patent. See ISO/IEC Patent Policy ("The originator of a proposal for a document shall draw the attention of the committee to any patent rights of which the originator is aware and considers to cover any item of the proposal. Any party involved in the preparation of a document shall draw the attention of the committee to any patent rights of which it becomes aware during any stage in the development of the document.")

That problem is exacerbated by the fact that Microsoft has not granted rights to implement the entire Ecma 376 specification. In the second step of the OSP's narrowing of implementers' rights, the phrase "only the required portions" excludes from patent protection any implementation of any portion of the Ecma 376 specification that is not mandatory. The Microsoft Open Specification Promise offers no patent protection whatsoever for implementation of the multitude of optional features in the Ecma 376 specification.

In the third step of narrowing implementers' rights, the OSP carves off patent protection for implementations of any specification features that are required by the specification unless they "are described in detail." How much detail? The term is not defined. Viewed one way, a single alphanumeric character is a detail. Is that sufficient? Viewed another way, one can purchase books at nearly any bookstore that describe how to write software programs in detail. Is that enough detail? Absent far less vague definitions or identification of specific portions of the specification excluded from patent protection, one can only obtain answers to such questions through litigation.

In the fourth step of narrowing rights, the OSP carves off patent protection for all implementations of any mandatory requirement that is "merely referenced in such Specification." Thereby, Microsoft denied any patent protection for the required implementation of, e.g.:

  • the Unicode Standard (pg. 13)
  • the UTF-8 and UTF-16 encoding form, as required by XML 1.0 (pg. 13)
  • W3C XML 1.0 (passim)
  • ISO B4 (pg. 2772)
  • ISO B5 (pg. 2772)
  • ISO 639-1 (pg. 1040)
  • ISO 690 (pg. 5965)
  • ISO 3166-1 (pg. 2530)
  • ISO 8061 (pg. 575)
  • ISO 8601 (page 184)
  • ISO 10646 (pg. 1544)
  • ISO 8859-15 (pg. 2699)
  • ISO/IEC 2382.1:1993 (pg. 6002)
  • ISO/IEC 9594-8 (page 184)
  • ISO/IEC 10646 (page 184)
  • ISO/IEC 10646-1 (pg. 13)

There are many other, non-ISO published standards that are "merely referenced" in the specification.

Closer to home, many Microsoft legacy file formats are also required by the specification to be implemented and are "merely referenced." Rob Weir of IBM has collected and referenced several such instances and discussed them in the context of conflicting provisions of the specification that both require and forbid their implementation. While it would be difficult for developers to implement those requirements because the formats' specifications have not for the most part been disclosed, should they succeed in doing so, e.g., through reverse engineering the formats, they are given no patent protection by the OSP because they are "merely referenced" in Ecma 376. Moreover, they are granted no rights to reverse engineer the applications to determine required behavior.

In the same vein, Ecma 376 is replete with a series of tags for supporting deprecated features of Microsoft applications. Examples are discussed in earlier portions of this document. Each such tag is accompanied by the following boilerplate guidance and can be quickly identified by searching the specification for portions of the guidance's text:

Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance

Because such tags "merely reference" behavior of older applications rather than specifying the required behavior in detail, Microsoft has granted no patent rights to study and replicate their unidentified behavior, despite the use of the mandatory must in the "guidance".


The Microsoft Covenant Not to Sue is irrelevant and ambiguous in any event

Despite Microsoft's statement on its Open Specification Promise page that developers can take their choice of implementing EOOXML under the patent protection of the OSP or the earlier Covenant Not to Sue (CNS), that supposed grant of rights is wholly ineffective and irrelevant. The CNS by its own terms is plainly limited to implementations of the Microsoft Office 2003 XML Reference Schemas. The CNS also bluntly states:

No other rights except those expressly stated in this covenant shall be deemed granted, waived or received by implication, or estoppel, or otherwise.

(Emphasis added.) Microsoft's statement on another web page that this CNS can be relied upon as a source of rights to implement Ecma 376 is ineffective. That would require that rights not expressly stated in the CNS (the right to implement a different specification) be "otherwise" granted, waived, or received. The CNS forbids its amendment as to rights granted by resort to another document. The CNS is irrelevant in determining what rights are granted to developers who implement Ecma 376.

Even ignoring that problem and those discussed earlier, the CNS is also highly ambiguous as to the extent of rights to implement, at least as ambiguous as the Open Specification Promise. However, rather than devote further text here to a wholly irrelevant IP document, we refer the reader to the only published detailed legal analysis of the CNS for a discussion of ambiguities within it.


Microsoft has granted no rights for standard improvements

???


End-User License Agreements (EULAs) may forbid full implementation

As noted above, many portions of the specification inappropriately require duplication of the functionality of various proprietary products, without a definition of exactly what that behavior is. Even worse, in some jurisdictions it may be illegal for competitors to try to determine what the specification actually means.

Many of these products' End-User License Agreements (EULAs) forbid attempts to determine exactly what these products do. It is difficult to find the EULA for Word 6, but later versions are instructive. For example, the EULA for "Microsoft Office Standard Edition 2003" (retrieved January 22, 2007) states under "LIMITATIONS ON REVERSE ENGINEERING, DECOMPILATION, AND DISASSEMBLY" that "You may not reverse engineer, decompile, or disassemble the Software, except and only to the extent that such activity is expressly permitted by applicable law notwithstanding this limitation."

Note that these involve copyright and/or contract issues, and thus Microsoft's patent grants do not appear to provide any relief from these provisions.

In some jurisdictions, these EULA statements are probably enforceable. Indeed, Virginia and Maryland in the United States have passed a law called "UCITA", and UCITA essentially gives EULAs the force of law. Some jurisdictions do permit reverse engineering for interoperability purposes, but this is not universally true, and in some cases it is not clear that these exceptions are enough to permit legal use. The U.S. Digital Millennium Copyright Act (DMCA) includes an exception to its prohibitions that permits reverse engineering for interoperability purposes, but it is unclear whether this DMCA provision would override EULAs in this case. In any case, this is a legal issue that must be resolved before this specification can even be considered. It is inappropriate to consider an international standard in which some suppliers might be forbidden by law to determine what the specification is.

Ecma 376 is a vendor lock-in specification

  • Adoption of Ecma 376 in its current state would frustrate the ISO goal [PDF] of "one standard, one test, and one conformity assessment procedure accepted everywhere." Yet Microsoft's Alan Yates has freely admitted that the primary goal of Ecma 376's sponsor is to have two standards instead of one: "Microsoft is asking for is that Massachusetts adopt two standards...".
  • Ecma 376 adoption would in effect grant Microsoft a monopoly on the conversion of its binary formats to XML
  • Ecma 376 is at least arguably violative of an existing antitrust injunction issued by the European Commission DG Competition*
  • Ecma 376 is at least arguably violative of an antitrust injunction issued in U.S. v. Microsoft
  • Microsoft's refusal to support ISO 26300 and to disclose specifications for its binary file formats is under antitrust investigation by the European Commission
    • EC's DG Competition statements about the ECIS complaint.
    • Footnote that the same refusals are in litigation in Tangent Computers v. Microsoft.

Ecma 376's full name confuses the marketplace

The name "Office Open XML" is terribly confusing and is often confused with "Open Office XML". This will confuse many readers into thinking this is the ISO 26300 (OpenDocument) standard, or that it relates to the OpenOffice.org product (one of several products to implement the ISO 26300 standard). Even Microsoft press releases make this mistake. For more information, see:

In a similar situation, JTC-1 NBs alleged a contradiction against C++/CLI when it was submitted under Fast Track last year.

Related cases

As noted earlier, reviewers of Ecma 376 should be aware that issues raised by the Ecma 376 proposal -- Microsoft's refusal to release the specifications for its legacy file formats and/or its refusal to support ISO 26300 -- variously are or may become involved in at least four antitrust cases on two continents, thus raising a need for JTC-1's heightened scrutiny of the legal landscape before further processing of the Ecma 376 proposal. Not only National Standard Bodies are involved in the issues Ecma 376 raises. The courts and antitrust regulators are also involved. Thus, reviewers need to involve legal counsel in their review.

These cases include the European Commission antitrust investigation, European Union antitrust litigation, U.S. v. Microsoft, and Tangent Computer v. Microsoft. See the Related Cases section below for further detail.

European Commission antitrust investigation

Acting on a February 22, 2006 complaint from the European Committee for Interoperable Systems ("ECIS") (ECIS press announcement), the European Commission's DG Competition is reportedly investigating whether Microsoft violated European Union antitrust laws by its refusal to support ISO 26300 in Microsoft Office and its refusal to divulge to competitors the specifications for its legacy Microsoft Office file formats. Both of those refusals are manifested in Ecma 376, as discussed in more detail below.

According to a transcript that was published by the press on July 12, 2006, Competition Director General Neelie Kroes was asked by a reporter:

[Y]ou've got another complaint before the Commission right now which looks at interoperability issues in other areas including, as I believe, Office. Have you made any decision on that complaint which was filed in February by the ECIS group [ED: http://www.e-c-i-s.org/archives/2006/02/brussels_22_feb.html] and do you have any feeling that the March 2004 Decision establishes a broader principle of interoperability that will animate the Commission's decisions as it goes forward?

Kroes responded:

The answer on the last part of your question is Yes, and the answer on the first part of your question is No, I don't have yet -- made a decision.

No later statement on the status of that investigation has been found. The matter is apparently still pending.


European Union antitrust litigation

The European Commission's DG Competition previously found [PDF] that Microsoft's refusal to disclose its interoperability protocols for Windows and Windows Server to competitors was an antitrust violation. But DG Competition did not limit its remedial order to just the Windows communications protocols; it ordered Microsoft to refrain "from any act or conduct having … equivalent object or effect." Id., pg. 299.

The press release describing the ECIS complaint discussed above calls for the limitations on Microsoft's conduct established in the 2004 order to be "rapidly and broadly enforced" by DG Competition. It is not clear whether DG Competition, should it decide to pursue the matter, would do so as part of the original proceedings or begin a new legal proceeding. Presumably, the refusal to support ISO 26300 would require a separate proceeding, since the refusal to support a standard within the meaning of the Agreement on Technical Barriers to Trade was not an issue in the original proceeding, now on appeal in the European Court of First Instance.


U.S. v. Microsoft

Microsoft is operating under a broad antitrust consent decree (injunction) in the case of U.S. v. Microsoft, in which the court has retained jurisdiction to supervise Microsoft's compliance. Microsoft Office is at least arguably within the scope of middleware as defined by section VI(K)(2)(b) of that decree. If it is so deemed, Microsoft is required to disclose documentation for the Office file formats to competitors and others. It is unknown whether any relevant complaints have been made to the U.S. Justice Department and the various state attorneys general who are plaintiffs in the case.


Tangent Computer v. Microsoft

Microsoft was sued on February 14, 2006 in U.S. District Court for the Northern District of California in the antitrust case of Tangent Computer v. Microsoft. The complaint alleges in relevant part that Microsoft committed Sherman Act and Clayton Act violations by refusing to support ISO 26300 in Microsoft Office, by failing to disclose Office's file formats and certain Office APIs and interoperability features, and by refusing to provide an interoperability standard for Object Linking and Embedding used in Office. Id., pp. 29-30, paragraphs 105-110. Consistent with the interpretation of the Consent Decree in U.S. v. Microsoft discussed above, the Complaint also alleges that Microsoft Office is a middleware platform product. Id., paragraph 107.


Approving Ecma 376 would violate the International Agreement on Technical Barriers to Trade

Overview of the Agreement on Technical Barriers to Trade (TBT)

Article 2, section 2.2 of the Agreement on Technical Barriers to Trade says:

Members shall ensure that technical regulations are not prepared, adopted or applied with a view to or with the effect of creating unnecessary obstacles to international trade.

(Emphases added.)

In this paper we argue that ISO adoption of Ecma 376 would create significant obstacles to trade, and furthermore, that those barriers are wholly unnecessary. Therefore, Ecma 376 must not be fast-tracked and must be rejected until such a time as the objections have been resolved.

If Ecma 376 becomes an ISO standard, that would create a significant barrier to trade

See the earlier section, Ecma 376 cannot be reasonably implemented by other vendors. The result of making Ecma 376 an ISO standard would be an international standard that only one vendor can implement successfully. That would cause a significant barrier to trade, as defined by the Agreement on TBT.

Ecma 376 is not 'necessary' as defined by the Agreement on TBT


When is a new standard 'necessary'?

Because Ecma 376 would, if adopted, create significant "obstacles to international trade" as demonstrated above, JTC-1 must determine whether it is necessary nonetheless.

The sole justification for the Ecma 376 specification is compatibility with "billions of existing Microsoft Office documents". Hence, whether there is a market requirement for Ecma 376 as defined by the Agreement on TBT rests on whether Ecma 376 is required to achieve this compatibility or whether better alternatives exist.

To start, we note that the compatibility provided by Ecma 376 is very limited, as it can only be implemented by one software application (Microsoft Office) as discussed earlier in this document.

In this section we illustrate an alternate, standards-based solution that delivers the same stated benefits as Ecma 376 without the corresponding barriers to trade.


Alternate, standards-based solution

We stated earlier that we don't believe that Ecma 376 contains any functionality that can't be expressed in the present ISO 26300 standard. But since Ecma 376 is over 6,000 pages long, it is impossible to know this with absolute certainty.

In this section we propose an alternate solution in the event that Ecma 376 does have functionality that cannot be implemented in ISO 26300. We propose that Ecma 376 be replaced by a specification that extends ISO 26300 with whichever new tags are needed to cover the additional functionality.

The ISO 26300 standard is designed to be extensible. It is designed to admit new namespaces to adopt new functionality (see section 1.5 of ISO 26300). ISO 26300 was designed by a wide variety of organizations, with a wide variety of needs. Assuming that Ecma 376 truly contains functionality that can't be represented in ISO 26300, the correct way to specify this functionality is by defining an extension of ISO 26300.

Just as ISO 26300 references existing standards (SVG, SMIL, MathML, Dublin Core, ISO 8601, etc) whenever they provide overlapping functionality, so too should Ecma 376 reference ISO 26300 for any and all functionality that can be represented in ISO 26300. Ecma 376 should be limited to only specifying functionality that is not already provided by an existing international standard.

Compatibility Note
There should be very little or no need to extend the ISO 26300 specification to support existing Microsoft Office documents fully. For example:
  • There is no need to add an attribute like autoSpaceLikeWord95 because ISO 26300 already includes the generic attribute style:font-independent-line-spacing.
  • There is no need to add an attribute like useWord97LineBreakRules because ISO 26300 already includes a generic style:line-break property to select the set of line breaking rules to use for text.
  • There is no need to specify 7 pages of clip art, because ISO 26300 already allows the inclusion of any image in the document.
Compatibility Note
It should be pointed out that even for additional functionality, a new standard should not be necessary. ISO 26300 already allows vendors to use their own custom tags as long as they do so in their own namespace.
For example, nothing prevents Microsoft from adding an attribute called microsoft:useWord97LineBreakRules to their files if they really want.
Of course, if an extension to ISO 26300 is to be made, it is highly preferable that said extension be specified as an international standard as well.
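
The namespace mechanism described in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vendor namespace URI urn:example:vendor:extensions is hypothetical, the microsoft:useWord97LineBreakRules attribute is the hypothetical example from the note above, and the helper split_attributes is invented here. How a consuming application should treat attributes it does not recognize is likewise an assumption of this sketch, not a statement of what ISO 26300 requires.

```python
import xml.etree.ElementTree as ET

# Namespace of the ISO 26300 (OpenDocument) style vocabulary.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
# Hypothetical vendor namespace, invented for this illustration only.
VENDOR_NS = "urn:example:vendor:extensions"

# A paragraph-properties element carrying one standard attribute and one
# vendor-specific attribute kept in the vendor's own namespace.
FRAGMENT = f"""
<style:paragraph-properties
    xmlns:style="{STYLE_NS}"
    xmlns:microsoft="{VENDOR_NS}"
    style:font-independent-line-spacing="true"
    microsoft:useWord97LineBreakRules="true"/>
""".strip()


def split_attributes(element, known_namespaces):
    """Separate attributes in known namespaces from foreign ones."""
    known, foreign = {}, {}
    for qname, value in element.attrib.items():
        # ElementTree stores namespaced names as "{uri}local".
        if qname.startswith("{"):
            uri, local = qname[1:].split("}", 1)
        else:
            uri, local = "", qname
        target = known if uri in known_namespaces else foreign
        target[(uri, local)] = value
    return known, foreign


element = ET.fromstring(FRAGMENT)
known, foreign = split_attributes(element, {STYLE_NS})
```

An application that understands only the standard vocabulary can process the attributes in known and skip those in foreign, while the vendor's own application can read both. Because the extension lives in its own namespace, the boundary between portable and vendor-specific content is visible in the file itself.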

Benefits of the alternate solution

This alternate solution (reworking Ecma 376 as an extension of ISO 26300 that only specifies additional functionality) has several important benefits:

  • It avoids duplication. This drastically reduces the barrier to trade, and better matches the ISO primary goal of "one standard, one test, and one conformity assessment procedure accepted everywhere."
  • It removes conflicts with existing standards. ISO 26300 is already compatible with ISO 8601, ISO 639, W3C SVG, W3C MathML, W3C SMIL, W3C XML-ENC and the Gregorian Calendar. Reworking Ecma 376 as an extension of ISO 26300 would remove almost all conflicts with existing standards.
  • It would reduce the spec size considerably. Much of the Ecma 376 specification merely duplicates functionality already specified in ISO and W3C standards. The considerable size reduction would make the specification much more accessible to scrutiny and implementation.
  • It encourages multi-supplier interoperability. Many suppliers already implement ISO 26300, as it has been available for quite some time; some implementations even provide publicly readable source code. Building on a base of an international standard specifically designed for inter-application interoperability, which has been around much longer, is a more sensible starting point for any international standard.

Feasibility of the alternate solution

The alternate standards-based solution proposed is feasible. First, no less a software luminary than Tim Bray, co-lead developer of the W3C XML 1.0 recommendation, has advocated the same alternate solution:

The ideal outcome would be a common shared office-XML dialect for the basics—and it should be ODF [ISO 26300] (or a subset), since that's been designed and debugged—than another extended vocabulary to support Microsoft features, whether they're cool new whizzy features or mouldy old legacy features (XML Namespaces are designed to support exactly this kind of thing). That way, if you stayed with the basic stuff you'd never need to worry about software lock-in; the difference between portable and proprietary would be crystal-clear. And, for the basic stuff that everybody uses, there'd be only one set of tags. This outcome is technically feasible. Who could possibly be against it?

Second, Microsoft developers have told the Commonwealth of Massachusetts it would be "trivial" for Microsoft to implement ISO 26300 in Microsoft Office:

But Kriss insisted that the ODF [ISO 26300] policy wasn’t intended to be anti-Microsoft. He said technical people at Microsoft told him it would be “trivial” to add support for ODF to the new Office 2007. The resistance to doing so came from the vendor’s business side, according to Kriss.

Doing so would provide software users with interoperability as to all Microsoft Office features that can be mapped to ISO 26300. That would provide software users immediate inter-vendor application interoperability for at least a huge portion of the functionality available in Microsoft Office. Many vendors' applications already support ISO 26300, and Corel recently announced that it was developing ISO 26300 support for WordPerfect Office, the last major office suite other than Microsoft's that does not already support it.

List of ISO 26300 supporting applications

Combining the market shares of the office suites that already support ISO 26300, those that are currently developing such support, and Microsoft's suite would provide broad interoperability for approaching 100 per cent of all office productivity software users.

Third, Microsoft's Alan Yates, General Manager, Microsoft Information Worker Business Strategy, has predicted that ISO 26300 and Ecma 376 will in fact be harmonized:

I would say, in the future, some time, you know, at some point, there will be convergence [of ISO 26300 and Ecma 376]. Convergence does happen over a period of time. Or there will be incorporation, there will be subsetting, supersetting. You know, the wireless standard, the A version merged into the B version, merged into the G version over a period of time to give better performance and functionality over a period of time. ... So, good news, I think, on that front is that this problem will be solved in time. It is not an easy, sort of snap-your-fingers sort of problem.

(Transcript of meeting audio; audio downloadable here in mp3 format.)

We cannot overstress the importance of Mr. Yates' statement. From it one may infer that Microsoft recognizes:

[i] that Ecma 376 in fact duplicates and/or overlaps with the ISO 26300 standard;

[ii] that Ecma 376 and ISO 26300 are at present insufficiently compatible;

[iii] that the contradiction of ISO 26300 thereby acknowledged will require a longer period of time to resolve than is available within the limitations of fast-track processing by JTC-1; and

[iv] that it is in fact feasible for ISO 26300 to be adopted/adapted to meet Microsoft's software requirements.


Microsoft needs an incentive to execute the alternate solution

To be blunt, the present situation amply proves that Microsoft lacks sufficient incentive to fully support ISO 26300, since it feels that it can subvert the international standards system in order to maintain its user lock-in strategy. JTC-1 is in the unique position of being able to provide Microsoft with the needed incentive by insisting that all Ecma 376 contradictions with ISO 26300 be removed before continuing the standardization process for Ecma 376.

Doing so would also execute on JTC-1's duty under the Agreement on Technical Barriers to Trade, Article 2 section 2.2:

Members shall ensure that technical regulations are not prepared, adopted or applied with a view to or with the effect of creating unnecessary obstacles to international trade.

(Emphasis added.) Ecma 376 presents a situation in which the "preparation" of a proposed standard has "the effect of creating unnecessary obstacles to international trade." Microsoft has a monopoly position in the relevant software market. Any delay in informing the world's populace that Ecma 376 must not contradict ISO 26300 will result in millions more people, businesses, and other institutions being persuaded to obtain and use software that contradicts and does not support the existing standard in a meaningful fashion. That poses a rapidly worsening non-interoperability barrier to trade and a weakening of the adoption rate of the existing standard.

JTC-1 has the word of Microsoft developers that it would be trivial for Microsoft to fully support ISO 26300 (see 'Feasibility of the alternate solution') and thereby resolve the interoperability barrier. A public announcement by JTC-1 that Microsoft will be required to work within the framework of the existing ISO 26300 standard if it wishes to use standardized formats would dramatically mitigate the harm to competition being inflicted each day that Microsoft is able to claim that its file formats are under consideration as an ISO standard.

Therefore, we respectfully urge JTC-1 to adopt the proposed alternate solution to resolve the Ecma 376 contradictions with ISO 26300.


Fast-tracking Ecma 376 risks injury to JTC-1's reputation

Fast-tracking Ecma 376 risks injury to JTC-1's reputation. Ecma 376 is already highly controversial and has the avid attention of the press:

  • JTC-1 risks being perceived as granting a single vendor a monopoly
  • JTC-1 risks being regarded as a rubber stamp body for vendor lock-in 'standards'
    •  ??? Cite and quote some articles demonstrating perceptions of Ecma's processing of Ecma 376.
  • JTC-1 risks being blamed for software incompatibilities by confused users
  • JTC-1 risks being perceived as rewarding a company that has refused to support the existing multi-year standard
    •  ??? Work in requests by European Commission for Microsoft to support ISO 26300 and for Microsoft to harmonize Ecma 376 with ISO 26300. Also work in Microsoft's refusal of Massachusetts' request to support ISO 26300.
  • JTC-1 risks becoming embroiled in a World Trade Organization dispute resolution