Talk:EOOXML objections

From Grokdoc

Revision as of 05:24, 23 January 2007; view current revision

This page is for discussion of the EOOXML objections, including any potential problems that you find in the Ecma 376 standard, or concerns about the wording of the objections. To add a new topic for discussion, click the "+" tab next to the "edit" tab at the top of the page.

British Standards Institute

The BSI is a P-member of the JTC committee and therefore has the ability to halt the fast-track process of EOOXML.

The BSI is governed under the following rules:

The Royal Charter

The Formal Memorandum of Understanding

The Public Policy Interest in the UK, Part 1

The Public Policy Interest in the UK, Part 2

And must follow "BS 0, A Standard for Standards":

Part 1, Development of Standards - Specification

Part 2, Structure and Drafting - Requirements and Guidance

There are many provisions in these documents that indicate that the BSI should not approve the fast-tracking of EOOXML. Here are some extracts from the first 4 listed above (the foundation documents of the BSI):

“BSI has committed to the UK government that it will take into account the UK public policy interest in conducting its NSB activities.”(BS0)

“British Standards, including standards formulated in ... international contexts ... are developed ... to serve the UK public policy interest” (MOU 1.4)

“..Companies themselves can have an incentive to .. promote their own specifications and exclude competition, perhaps by forming cartels.” (or monopolies) “Public Policy is required in order to compensate for these market imperfections.” (MOU 2.2.4)

“The Government and BSI are ... determined to promote effective standardisation policy in order to realise in full the potential socio-economic benefits of standardisation, including the promotion of the small and medium sized business sector and of ... consumer ... interests.” (MOU 2.4)

“BSI will ensure ... that it achieves ... the optimal promotion of UK interests”. (MOU 4.1.3)


And here is a very interesting snippet from BS 0, A Standard for Standards:

"Where BSI as the UK NSB has approved an international draft standard, the published standard shall be adopted as a national standard except in the following cases:

  • where the international standard is being adopted as a European standard; or
  • where there is an existing conflicting European standard; or
  • where there is an existing national standard which the Technical Committee considers is more appropriate for the UK."

Searching the BSI standards database gives the following result which indicates that ODF is already a British Standard. Need to follow this up to ensure that it is correct.

"BS ISO/IEC 26300 (2005/02775) BS ISO/IEC 26300 Information technology - Open Document Format for Office Applications (OpenDocument) v1.0"

Based upon the above provision, this could preclude OOXML becoming a British standard on the basis of it conflicting with ODF.

BS 0(2) states: "7.8.2.3 Normative Reference to Documents other than Public Standards.

Normative reference shall not be made to material that is:

  - not publicly available;
  - not readily accessible;
  - known to be of unstable or ephemeral nature."

(The text does not state whether these are AND or OR provisions, but it clearly would not make sense for a reference to be excluded only if all 3 of the above conditions were met.)

This passage would seem to be in direct conflict with the EOOXML specification's reliance on material external to the specification (i.e., the behaviour of MS Word 6).

I haven't read all of BS 0 yet, nor uploaded all of my quotes from the 'foundation documents' of the BSI.

JTC 1 Directives

13 Preparation and Adoption of International Standards - Fast-Track Processing

13.2 The proposal for the fast-track procedure shall be received by the ITTF which shall take the following actions:

• Settle the copyright or trademark situation, or both, with the proposer, so that the proposed text can be freely copied and distributed within ISO/IEC without restriction;

• Assess in consultation with the JTC 1 Secretariat that JTC 1 is the competent committee for the subject covered in the proposed standard and ascertain that there is no evident contradiction with other ISO/IEC standards;

• Distribute the text of the proposed standard (or amendment) as a DIS (or DAM), indicating that the standard belongs in the domain of JTC 1 (see Form G12). In case of particularly bulky documents the ITTF may demand the necessary number of copies from the proposer.


13.5 During the 30-day review period, a NB may identify to the JTC 1 Secretariat any perceived contradiction with other JTC 1, ISO or IEC standards.

If such a contradiction is alleged, the matter shall be resolved by the ITTF and JTC 1 Secretariat in accordance with Section 13.2 before ballot voting can commence. If no contradiction is alleged, the fast-track ballot voting commences immediately following the 30-day period.


M.7.4.3.3 Substitution and Replacement

a) What needs exist, if any, to replace an existing international standard? Rationale?
b) What is the need and feasibility of using only a portion of the specification as an international standard?
c) What portions, if any, of the specification do not belong in an international standard (e.g. too implementation- specific)?
(Annex M The Transposition of Publicly Available Specifications into International Standards)


International Numeric Formatting Standards?

In section 2.16.4.3 (p. 2283), a number of formatting strings are defined to specify how numbers are printed, such as "CHOSUNG" to use Korean Chosung format, or "CHINESENUM2" to use the Chinese simplified legal format, etc. Surely there is a more general standard for this sort of thing? Stevenj 22:30, 19 January 2007 (EST)

What's even worse, it's similar to the Inflexible Numbering format concern (p. 2554), which is a redefinition. I originally had 2.16.4.3 included in this concern, but somehow it's not there anymore... It's a separate issue from 'inflexible', but it's another concern where number formats are defined multiple times within the same spec. One is by an enumValue (p. 2554), and the other (p. 2283) is a ?? enum? string? --- Yoonkit 23:10, 22 January 2007 (EST)

INCLUDEPICTURE

On page 2320, section 2.16.5.33, it defines a field INCLUDEPICTURE that can be used to insert a "picture" from an external file:

Retrieves the picture contained in the document named by field-argument. If field-argument contains white space, it shall be enclosed in double quotes. If field-argument contains any backslash characters, each one shall be preceded directly by another backslash character.

However, it doesn't seem to define what kinds of "picture" file formats should be supported. (It gives an example of a .jpg file, presumably some version of the JPEG standard.)

Stevenj 23:05, 19 January 2007 (EST)
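
The quoting rule itself, unlike the picture formats, is precise enough to implement. A minimal sketch in Python, applying only the two rules in the quoted text; the helper names are hypothetical and not taken from the standard or any OOXML library:

```python
def quote_field_argument(path: str) -> str:
    """Apply the two quoting rules from the quoted INCLUDEPICTURE text.

    Hypothetical helper for illustration only.
    """
    # Each backslash is preceded directly by another backslash.
    escaped = path.replace("\\", "\\\\")
    # An argument containing white space is enclosed in double quotes.
    if any(ch.isspace() for ch in escaped):
        escaped = '"' + escaped + '"'
    return escaped


def include_picture_field(path: str) -> str:
    """Build the field text for a given file name (hypothetical helper)."""
    return "INCLUDEPICTURE " + quote_field_argument(path)


if __name__ == "__main__":
    print(include_picture_field(r"C:\My Pictures\logo.jpg"))
```

Nothing in this sketch can tell a consumer whether the named file is in a format it is expected to render, which is the gap noted above.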

Audio and Video part formats

Section 15.2.2 (p. 148) describes an "Audio Part" which may "contain an audio file" and be of "any supported audio type" and which "may be the target of a relationship in a Handout Master, Notes Slide, Notes Master, Slide, Slide Layout, or Slide Master part-relationship item." The standard does not specify, however, what audio file types (if any) are to be supported by a conforming application. It gives examples of AIFF, WMA, and MIDI files.

Similarly, section 15.2.16 (p. 167) defines a "Video Part" which "contains a video file" of "any supported video type", which can also be the target of a relationship in slides, etc. Again, it does not specify what video file types must be supported (it gives examples of AVI, MPEG, QuickTime, and Windows Media files).

(See also section 5.1.3, page 4073, which specifies how audio and video of arbitrary unspecified types can be embedded in DrawingML.)

I'm of two minds about this. On the one hand, if I'm using a Mac and my OOXML implementation uses QuickTime, it's certainly nice to be able to embed audio and video in any file format that happens to be supported by QuickTime. On the other hand, what good is a standard format for presentations if there is no guarantee that anyone will be able to see/hear critical parts of it? (Especially considering that, if I recall correctly, support for audio and video multimedia was one of the highly touted "features" of OOXML over OpenDocument.)

— Steven G. Johnson 00:51, 20 January 2007 (EST)

"Application-defined" defaults?

There are a number of elements which are not required, but if omitted lead to "application-defined" default behaviors. See

  • docDefaults (2.7.4.1): Document Default Paragraph and Run Properties
  • pPr (2.7.4.2): Paragraph Properties
  • pPrDefault (2.7.4.3): Default Paragraph Properties
  • rPr (2.7.4.4): Run Properties
  • rPrDefault (2.7.4.5): Default Run Properties

— Steven G. Johnson 02:05, 20 January 2007 (EST)

"Application-defined" and undefined "legacy" merge document formats?

In section 2.18.63, ST_MailMergeSourceType (Mail Merge ODSO Data Source Types), which is described as "purely a suggestion" about the source type, many of the possible enumeration values are undefined or poorly defined. It's not clear to me what the practical impact of this "suggestion" is, so I don't know how serious an omission this might be.

Two of the possible enumeration values are "document1" and "document2" on p. 2549 of the PDF, defined as:

Specifies that a given merged WordprocessingML document has been connected to another document format supported by the producing application. The format of this document is application-defined and outside the scope of this Office Open XML Standard.

Another of the possible enumeration values is "legacy", defined as:

Specifies that a given merged WordprocessingML document has been connected to a legacy document format supported by the producing application. The format of this legacy document is application-defined and outside the scope of this Office Open XML Standard.

More of OOXML's touted backwards-compatibility at work?

And then, in case there weren't enough application-defined mail-merge source types, another enumeration value is "native", defined as:

Specifies that a given merged WordprocessingML document has been connected to another document format native to the producing application. The format of this document is application-defined and outside the scope of this Office Open XML Standard.

Another format is "text", defined as:

Specifies that a given merged WordprocessingML document has been connected to a text file.

7-bit ASCII? Unicode? It doesn't say.

— Steven G. Johnson 02:38, 20 January 2007 (EST)
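
The encoding question is not academic. A small sketch in Python, using a made-up merge-data record, of how the same bytes read back differently depending on what the consuming application assumes:

```python
# A made-up record from a "text" mail-merge data source.
record = "Name,City\nZoë,Málaga\n"

# Suppose the producing application writes the file as UTF-8.
data = record.encode("utf-8")

# A consumer that assumes UTF-8 recovers the original text.
as_utf8 = data.decode("utf-8")

# A consumer that assumes Latin-1 silently gets different characters.
as_latin1 = data.decode("latin-1")

# A consumer that insists on 7-bit ASCII cannot read the file at all.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

if __name__ == "__main__":
    print(as_utf8 == record)
    print(as_latin1 == record)
    print(ascii_ok)
```

Without a stated encoding, each of these consumers could claim to support the "text" source type.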

Equation shape objects in drawings are not editable unless you are Microsoft?

Section 6.1.2.19 defines the "shape" element in VML, "the core object in VML". One of the attributes of this element, on page 4653 (PDF p. 5436), is "equationxml", defined as:

Specifies alternate XML markup which may be used to rehydrate an equation using the Office Open XML Math syntax. The actual format of the contents of this attribute are application-defined, but shall contain Office Open XML Math as well as any application-specific content.

My understanding is that this attribute is intended to be used if you have an equation object in your drawing, and you want to edit the equation using the math editor; in that case, you need the original equation XML to access the mathematical structure. However, since the contents of this attribute are application-defined, in practice this is apparently possible only if you want to edit the object in the same program that created it. (Saying that it "shall contain" OOXML math doesn't seem like much help...shall contain how? After 327 bytes of application-specific content and gzipped?)

— Steven G. Johnson 03:04, 20 January 2007 (EST)

"Application-defined" binary blobs within shape objects?

Section 6.1.2.19 defines the "shape" element in VML, "the core object in VML". One of the attributes of this element, on page 4655 (PDF p. 5438), is "gfxdata", defined as:

Specifies a base-64 encoded package as deifined [sic] in Part 2 of this Standard that contains DrawingML content. The contents of this package are application-defined, but the contents of the package shall use the Parts defined by this Standard whenever possible. [Rationale: This attribute allows an application to use VML to represent graphical content while still persisting DrawingML for consuming applications that support DrawingML. For example, a diagram stored within this attribute would have the four parts defined for a DrawingML diagram, as well as any number of application-defined parts and relationships. end rationale]

An "application-defined" base64-encoded binary blob in the middle of your drawing?

— Steven G. Johnson 03:09, 20 January 2007 (EST)
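
To make the concern concrete, here is a rough sketch in Python of what a consumer can and cannot do with such an attribute. It assumes a Part 2 package is an ordinary ZIP archive (the packaging conventions are ZIP-based) and, for brevity, omits the content-types and relationship parts a real package would carry. The part names and contents are invented.

```python
import base64
import io
import zipfile


def make_gfxdata(parts: dict) -> str:
    """Pack parts into a ZIP archive and base64-encode it (producer side)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        for name, payload in parts.items():
            pkg.writestr(name, payload)
    return base64.b64encode(buf.getvalue()).decode("ascii")


def list_parts(gfxdata: str) -> list:
    """Decode the attribute value and list the part names (consumer side)."""
    raw = base64.b64decode(gfxdata)
    with zipfile.ZipFile(io.BytesIO(raw)) as pkg:
        return sorted(pkg.namelist())


# Invented part names: one that looks like a standard-defined part, and one
# application-defined part whose bytes mean nothing to other applications.
attr = make_gfxdata({
    "diagrams/data1.xml": b"<dataModel/>",
    "vendor/private.bin": bytes(range(16)),
})

if __name__ == "__main__":
    print(list_parts(attr))
```

A consuming application can unpack the archive and see the part names, but nothing in the quoted text tells it how to interpret the application-defined parts.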

"Application-defined" binary blobs for Microsoft Ink™ data?

Section 6.2.2.14, on page 4813 (PDF p. 5596) of the standard defines the "ink" element as:

This element specifies the presence of an ink object. An ink object is a VML object which allows applications to store data for ink annotations in an application-defined format.

The actual data for the "ink" object is stored as a base64-encoded binary blob in the "i" attribute.

This is obviously a thinly-veiled way to include the proprietary Microsoft Ink tablet-PC annotation content in OOXML, without documenting the format used. (According to MS documentation, Ink can be stored as either "fortified" GIF or ISF (possibly wrapped inside of another file format [1]), but the canonical format is apparently the Ink Serialized Format (ISF), an apparently undocumented "highly compressed binary representation of the ink data from a lone ink object." "Fortified" GIF files apparently contain ISF via metadata [2].)

(Just because it's "annotation" data does not mean it's superfluous. Microsoft promotes Ink for use in exchanging copy-editing information as part of Office files, for example. Nor is it clear why they couldn't use a standard format like PNG for representing transparent bitmaps, or SVG if they want to represent stroke data; the small amount of additional metadata, such as original device coordinates, that ISF encodes could be documented and added as metadata in PNG or SVG. Not to mention the fact that base64-encoding it directly in the XML seems rather silly when it could just be stored as a separate file part.)

— Steven G. Johnson 03:16, 20 January 2007 (EST)

Another undefined "legacy" feature: the DigSig element

7.2.2.6 defines the DigSig element (page 5105):

This element contains the signature of a digitally signed document. [Note: This property is a mechanism used by legacy documents to store the digital signature of its binary representation, and should be considered deprecated in favor of the well-defined mechanism defined in Part 2. Any use of this property should be for legacy compatibility only, and is application-defined. end note]

(Note that it is "application-defined", and no other information is given.)

— Steven G. Johnson 03:22, 20 January 2007 (EST)

Bitmasks are not subject to endian issues when represented as numeric strings

Contrary to (apparently) popular misconception, the use of bitmasks does not inherently cause endian issues. Endian issues only arise in serialized binary formats, and not when numbers are represented by ASCII numerals in whatever base (as in OOXML). (I have a fair amount of experience in dealing with both binary file formats and bitmasks in portable software.)

Whenever a number is written as a string numeral, whether as a decimal number like "21" or as equivalent hexadecimal like "0x15" or as an equivalent binary string "00010101", there is no ambiguity: the digits are always arranged from most to least significant (in English and standards based on English, at least). It is only when this number is represented in a computer's memory by a sequence of bytes that the byte-sequence varies depending on endianness.

Similarly, bit operations in programming languages are not subject to endian problems, because they are specified in terms of either ASCII strings or bit-shifts relative to the "logical" least-significant bit, regardless of the binary form of the compiled machine code. For example, if I want to set the 13th least-significant bit of an integer "x" in a C program, I would do something like: "x | 4096" or "x | 0x1000" or "x | (1 << 12)". If I subsequently print out the value of "x", using printf, as a decimal or hexadecimal number, I will get the same result on both big- and little-endian systems. I will only have a problem if I typecast &x to "char *" and print out the bytes one by one.

As another example, consider the "chmod" command in Unix, which allows you to specify the file permissions as a bitmask in octal. "chmod 0644", where "0644" is an octal numeral, sets the same file permissions on both big- and little-endian systems.

So, in short, while OOXML's use of bitmasks has numerous problems, portability between different-endian systems is not one of them.

— Steven G. Johnson 20:52, 21 January 2007 (EST)
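
The same point can be sketched in Python rather than C. The numerals are identical everywhere; only the explicitly serialized bytes depend on byte order. The variable names are illustrative only.

```python
import struct

x = 0
x |= 1 << 12  # set the 13th least-significant bit

# Written as a numeral, the value looks the same on every platform.
decimal_text = str(x)
hex_text = format(x, "#x")

# Only the serialized byte sequence depends on the chosen byte order.
little = struct.pack("<I", x)  # 32-bit unsigned, little-endian
big = struct.pack(">I", x)     # 32-bit unsigned, big-endian

# Parsing either numeral back gives the same integer.
round_trip_ok = int(decimal_text) == int(hex_text, 16) == x

# chmod-style octal numeral: the same bitmask on any system.
mode = int("0644", 8)

if __name__ == "__main__":
    print(decimal_text, hex_text)
    print(little.hex(), big.hex())
    print(round_trip_ok, oct(mode))
```

Since OOXML stores its bitmasks as numerals in XML text, it is in the first situation, not the second.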

I agree. Months ago when I first noticed the bitmasks I had a "gotcha" moment where I was sure it was endian-dependent. But further reflection convinced me that it was OK. The stronger criticism here is that bitmasks defy validation by XML Schema, don't work well with XML tools like XSLT, and are just plain silly in an XML format that is going to be zipped up in the end.
--Cicero 22:34, 21 January 2007 (EST)

ST_Panose defined twice

ST_Panose is defined in 2.18.72 (page 2569) and 5.1.12.37 (page 4502). Is that legal? Or is there some implicit namespacing I'm missing at this low level? I discovered this when dredging through all the hexBinary issues. --Trollsfire 16:13, 22 January 2007 (EST)

If having multiple definitions is a problem, then ST_TwipsMeasure is defined both in 2.18.105 (page 2619) and 7.1.3.16 (page 5884). The definitions seem compatible, but still.
--Trollsfire 19:36, 22 January 2007 (EST)

Multiple types used for RGB coding

In digging through all the hexBinary inconsistencies, I discovered that there are three different variable types used for storing RGB data:

  • ST_HexColorRGB (2.18.45, page 2520) which, in my opinion, is the best named
  • ST_UnsignedIntHex (3.18.86, page 3712) which is 4 octets and may be used for things other than RGB color as well
  • ST_HexBinary3 (5.1.12.28, page 4483) which is accurately named for what it is (a length 3 (3 octet) hexBinary variable) but not what it does

I reference this in the section about the hex inconsistencies, but should this be extracted from there and put elsewhere? A section about internal redundancies (multiple definitions for the same things), maybe? --Trollsfire 16:20, 22 January 2007 (EST)

cryptography and fields

The end of 2.5.1.28, p 1165, worries me quite a bit: not only is there the ability to use any encryption method desired, but any possible external module might be used to create it. algIdExtSource + algIdExt just screams "embrace and extend". On a related topic, 2.6 (p 1487) talks about fields, and "The act of carrying out a field's codes is referred to as a field update. As to how or when any field is updated is outside the scope of this Office Open XML Standard." 2.16.4.1 has a similar comment: "If no date-and-time-formatting-switch is present, a date or time result is formatted in an implementation-defined manner." --Dogcow 16:46, 22 January 2007 (EST)

Good finds! I've added your 2.16.4.1 example to the section "Relies on application-defined behaviors". We might consider adding your other two examples if the problems with them can be clearly explained. (For example, how might inconsistent field-updating cause interoperability problems? Is this just a user-interface issue?) — Steven G. Johnson 19:26, 22 January 2007 (EST)

Paper Sizes

Section 3.3.1.61 (page 2770), "pageSetup" (Page Setup Settings), and section 3.3.1.62 (page 2774), "pageSetup" (Chart Sheet Page Setup), have an attribute named 'paperSize' whose value is an enumeration of 68 fixed paper types. This contradicts the naming conventions for paper sizes defined by ISO 216 (A,B,C sizes), used internationally, and ANSI Y14.1, used in the USA.

I agree that this is a relatively small issue, but it does show Ecma 376's restriction on the choice of paper sizes and its insistence on using internal lookup tables contrary to ISO names, which also defeats the purpose of readable XML. -- Yoonkit 23:02, 22 January 2007 (EST)
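
For comparison, an ISO 216 name carries its own meaning, so no lookup table is needed to recover the dimensions. A short Python sketch for the A series, using the standard's halving rule starting from A0 at 841 mm by 1189 mm; the function name is hypothetical:

```python
def iso216_a_size(n: int) -> tuple:
    """Return (width_mm, height_mm) of ISO 216 size A<n>, for n >= 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    width, height = 841, 1189  # A0
    for _ in range(n):
        # Halve the longer side, rounding down; the old width becomes
        # the new height.
        width, height = height // 2, width
    return width, height


if __name__ == "__main__":
    for n in range(7):
        print("A%d" % n, iso216_a_size(n))
```

An integer code in a 'paperSize' attribute, by contrast, means nothing to a reader without the spec's table in hand.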

Uses a Microsoft-specific namespace

This is what's currently written: Section 6.2.3.23 (page 5197), attribute "href" (Hyperlink Target), uses the namespace "urn:schemas.microsoft.com:office:office".

An Ecma standard must not reference company-specific namespaces. This should be replaced by an Ecma namespace.

--

I've done more checking, and the VML Reference Material, pg 5126 states:

"To maintain backward compatibility, all VML namespaces defined in this specification maintain the legacy namespace structure already used by millions of documents.

[Note: The VML format is a legacy format originally introduced with Office 2000 and is included and fully defined in this Standard for backwards compatibility reasons. The DrawingML format is a newer and richer format created with the goal of eventually replacing any uses of VML in the Office Open XML formats.

VML should be considered a deprecated format included in Office Open XML for legacy reasons only and new applications that need a file format for drawings are strongly encouraged to use preferentially DrawingML."

So it explains why the namespace is there, but it does not explain why a deprecated format is being included in this "modern" spec. How should we rephrase this concern?

Yoonkit 21:27, 22 January 2007 (EST)

I've taken a stab at rephrasing it. See the document. — Steven G. Johnson 21:39, 22 January 2007 (EST)
OK, that reads OK, Thanks! -- Yoonkit 22:55, 22 January 2007 (EST)

A string is used to define deprecated VML in DrawingML

section 6.5.22 (p. 5743) "textdata"

This element specifies optional supplementary text information associated with a legacy VML shape that is a node in a VML diagram when it cannot otherwise be stored within the DrawingML framework.

[Note: An application could use this to preserve a specific diagram format for backward compatibility, but it is strongly recommended to upgrade all VML shapes to DrawingML shapes. end note]

Is this the backward-compatibility support for "billions" of documents? Storing it as "textdata"?

Yoonkit 21:36, 22 January 2007 (EST)

Ecma 376 contradicts SVG colour names

This got dropped yesterday:

--

SVG color values

Ecma 376 section 2.18.46 page 2521, contradicts the SVG Color Keyword Names hexadecimal RGB values for given color names.

Colour Name   SVG      Ecma 376
Dark blue     00008B   000080
Dark cyan     008B8B   008080
Dark gray     A9A9A9   808080
Dark green    006400   008000
Dark red      8B0000   800000
Light gray    D3D3D3   C0C0C0

Compatibility Note

There is no need for redefining color names to achieve compatibility with existing Microsoft Office documents. Microsoft is free to use whatever color names it wishes on its office application and store the hexadecimal color value in the file.

--

I think it's quite important that we show how colours are changed in the spec, and will cause problems. As demonstrated by the colours, they are quite different, e.g. DarkGray and DarkGreen.

It's a direct contradiction with completely different RGB values. -- Yoonkit 22:58, 22 January 2007 (EST)
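
To put numbers on "quite different", here is a small Python sketch that takes the six pairs of values from the table above and reports the largest per-channel difference for each name. The helper names are made up for illustration.

```python
# (name, SVG value, Ecma 376 value), copied from the table above.
COLOURS = [
    ("Dark blue", "00008B", "000080"),
    ("Dark cyan", "008B8B", "008080"),
    ("Dark gray", "A9A9A9", "808080"),
    ("Dark green", "006400", "008000"),
    ("Dark red", "8B0000", "800000"),
    ("Light gray", "D3D3D3", "C0C0C0"),
]


def rgb(hex_value: str) -> tuple:
    """Split an RRGGBB hex string into three integer channels."""
    return tuple(int(hex_value[i:i + 2], 16) for i in (0, 2, 4))


def max_channel_difference(a: str, b: str) -> int:
    """Largest absolute difference between corresponding channels (0-255)."""
    return max(abs(x - y) for x, y in zip(rgb(a), rgb(b)))


DIFFERENCES = {name: max_channel_difference(svg, ecma)
               for name, svg, ecma in COLOURS}

if __name__ == "__main__":
    for name, diff in DIFFERENCES.items():
        print(name, diff)
```

Every one of the six names comes out with a non-zero difference, and Dark gray and Dark green have the largest, which matches the two examples singled out above.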

I agree. This section was apparently deleted without explanation by User:Dcarrera, who I've noticed has been making unexplained and repeated deletions in numerous sections. Not from malice, I think — I suspect he's doing it in the name of brevity and focusing on the most important points. He/she wrote, as a hidden comment in the article:
The OBJECTIVE of this document is NOT to have an exhaustive list of everything weird or unusual in the Ecma spec. The OBJECTIVE is to make a strong argument. A strong argument is not made stronger by adding weak points. Adding weak points makes the argument WEAKER because it looks like you are hair-splitting and makes the good arguments harder to find.
My feeling is that, while the most important points are surely the duplication of OpenDocument, SVG, and other existing XML standards, these little inconsistencies and oddities go a long way in demonstrating the haste and lack of care in the Ecma process. The thing with little inconsistencies and oddities, however, is that because they are each minor things individually, it is really by force of numbers that they make their case. Furthermore, this page (as far as I can tell) is mainly intended as a resource for those who wish to object to Ecma 376, since we cannot submit objections directly, and therefore should err on the side of inclusiveness. I'll restore the section. — Steven G. Johnson 00:09, 23 January 2007 (EST)