Talk:EOOXML objections
From Grokdoc
Revision as of 17:03, 8 February 2007 by Stevenj (→Non XML formatting codes). Previous revision: 23:48, 7 February 2007 by Rick Jelliffe (→Bitmasks cause significant validation problems).
Line 627:
XML does not define any element names. So it is incorrect to claim that any element name that is well-formed is "contrary" to XML. Rick Jelliffe 18:48, 7 February 2007 (EST)
- Did you actually look at that page in the ECMA 376 spec? It is not using "b" and "i" as XML element names similar to HTML. It is using "\b" and "\i" at the end of a line to make the preceding line bold and italic, respectively. I agree that the Grokdoc explanation of this is confusingly worded, however. -- Steven G. Johnson 12:03, 8 February 2007 (EST)
This page is for discussion of the EOOXML objections, including any potential problems that you find in the Ecma 376 standard, or concerns about the wording of the objections. To add a new topic for discussion, click the "+" tab next to the "edit" tab at the top of the page.
The EOOXML objections page is now locked to prevent editing. If you find further flaws in Ecma 376, please add them to:
British Standards Institute
The BSI is a P-member of the JTC 1 committee and therefore has the ability to halt the fast-track process for EOOXML.
The BSI is governed under the following rules:
The Formal Memorandum of Understanding
The Public Policy Interest in the UK, Part 1
The Public Policy Interest in the UK, Part 2
And must follow "BS 0, A Standard for Standards":
Part 1, Development of Standards - Specification
Part 2, Structure and Drafting - Requirements and Guidance
There are many provisions in these documents that indicate that the BSI should not approve the fast-tracking of EOOXML. Here are some extracts from the first 4 listed above (the foundation documents of the BSI):
“BSI has committed to the UK government that it will take into account the UK public policy interest in conducting its NSB activities.”(BS0)
“British Standards, including standards formulated in ... international contexts ... are developed ... to serve the UK public policy interest” (MOU 1.4)
“..Companies themselves can have an incentive to .. promote their own specifications and exclude competition, perhaps by forming cartels.” (or monopolies) “Public Policy is required in order to compensate for these market imperfections.” (MOU 2.2.4)
“The Government and BSI are ... determined to promote effective standardisation policy in order to realise in full the potential socio-economic benefits of standardisation, including the promotion of the small and medium sized business sector and of ... consumer ... interests.” (MOU 2.4)
“BSI will ensure ... that it achieves ... the optimal promotion of UK interests”. (MOU 4.1.3)
And here is a very interesting snippet from BS 0, A Standard for Standards:
"Where BSI as the UK NSB has approved an international draft standard, the published standard shall be adopted as a national standard except in the following cases: • where the international standard is being adopted as a European standard; or • where there is an existing conflicting European standard; or • where there is an existing national standard which the Technical Committee considers is more appropriate for the UK."
Searching the BSI standards database gives the following result which indicates that ODF is already a British Standard. Need to follow this up to ensure that it is correct.
"BS ISO/IEC 26300 (2005/02775) BS ISO/IEC 26300 Information technology - Open Document Format for Office Applications (OpenDocument) v1.0"
Based upon the above provision, this could preclude OOXML from becoming a British standard on the grounds that it conflicts with ODF.
BS 0(2) states: "7.8.2.3 Normative Reference to Documents other than Public Standards.
Normative reference shall not be made to material that is: - not publicly available; - not readily accessible; - known to be of unstable or ephemeral nature."
(The text does not state whether these are AND or OR provisions, but it clearly would not make sense for a reference to be excluded only if all 3 of the above conditions were met.)
This passage would seem to be in direct conflict with the EOOXML specification's reliance on material external to the specification (i.e. the behaviour of MS Word 6).
I haven't read all of BS 0 yet, nor uploaded all of my quotes from the 'foundation documents' of the BSI
JTC 1 Directives
13 Preparation and Adoption of International Standards - Fast-Track Processing
13.2 The proposal for the fast-track procedure shall be received by the ITTF which shall take the following actions:
• Settle the copyright or trademark situation, or both, with the proposer, so that the proposed text can be freely copied and distributed within ISO/IEC without restriction;
• Assess in consultation with the JTC 1 Secretariat that JTC 1 is the competent committee for the subject covered in the proposed standard and ascertain that there is no evident contradiction with other ISO/IEC standards;
• Distribute the text of the proposed standard (or amendment) as a DIS (or DAM), indicating that the standard belongs in the domain of JTC 1 (see Form G12). In case of particularly bulky documents the ITTF may demand the necessary number of copies from the proposer.
13.5 During the 30-day review period, a NB may identify to the JTC 1 Secretariat any perceived contradiction with other JTC 1, ISO or IEC standards.
If such a contradiction is alleged, the matter shall be resolved by the ITTF and JTC 1 Secretariat in accordance with Section 13.2 before ballot voting can commence. If no contradiction is alleged, the fast-track ballot voting commences immediately following the 30-day period.
M.7.4.3.3 Substitution and Replacement
a) What needs exist, if any, to replace an existing international standard? Rationale?
b) What is the need and feasibility of using only a portion of the specification as an international standard?
c) What portions, if any, of the specification do not belong in an international standard (e.g. too implementation-specific)?
(Annex M The Transposition of Publicly Available Specifications into International Standards)
International Numeric Formatting Standards?
In section 2.16.4.3 (p. 2283), a number of formatting strings are defined to specify how numbers are printed, such as "CHOSUNG" to use Korean Chosung format, or "CHINESENUM2" to use the Chinese simplified legal format, etc. Surely there is a more general standard for this sort of thing? Stevenj 22:30, 19 January 2007 (EST)
- What's even worse, it's similar to the Inflexible Numbering format (p. 2554), which is a redefinition. I originally had 2.16.4.3 included in this concern, but somehow it's not there anymore... It's a separate issue from 'inflexible', but it's another concern where number formats are defined multiple times within the same spec. In one case it's by an enumValue (p. 2554), and in the other (p. 2283) it's a ?? enum? string? --- Yoonkit 23:10, 22 January 2007 (EST)
INCLUDEPICTURE
On page 2320, section 2.16.5.33, it defines a field INCLUDEPICTURE that can be used to insert a "picture" from an external file:
- Retrieves the picture contained in the document named by field-argument. If field-argument contains white space, it shall be enclosed in double quotes. If field-argument contains any backslash characters, each one shall be preceded directly by another backslash character.
However, it doesn't seem to define what kinds of "picture" file formats should be supported. (It gives an example of a .jpg file, presumably some version of the JPEG standard.)
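For reference, the two quoting rules quoted above can be sketched in a few lines of Python. The helper name `quote_field_argument` and the example path are made up for illustration; only the two rules (double every backslash, wrap in double quotes when there is white space) come from the spec text.

```python
def quote_field_argument(path: str) -> str:
    """Hypothetical helper applying the two quoting rules quoted from 2.16.5.33."""
    # Rule: each backslash is preceded directly by another backslash.
    escaped = path.replace("\\", "\\\\")
    # Rule: an argument containing white space is enclosed in double quotes.
    if any(ch.isspace() for ch in escaped):
        escaped = '"' + escaped + '"'
    return escaped


# Example path is invented for illustration only.
print("INCLUDEPICTURE " + quote_field_argument("C:\\My Pictures\\logo.jpg"))
```

The quoted text does not say what to do with a double quote inside the argument, so this sketch leaves such characters untouched.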
Stevenj 23:05, 19 January 2007 (EST)
Audio and Video part formats
Section 15.2.2 (p. 148) describes an "Audio Part" which may "contain an audio file" and be of "any supported audio type" and which "may be the target of a relationship in a Handout Master, Notes Slide, Notes Master, Slide, Slide Layout, or Slide Master part-relationship item". The standard does not specify, however, what audio file types (if any) are to be supported by a conforming application. It gives examples of AIFF, WMA, and MIDI files.
Similarly, section 15.2.16 (p. 167) defines a "Video Part" which "contains a video file" of "any supported video type", which can also be the target of a relationship in slides, etc. Again, it does not specify what video file types must be supported (it gives examples of AVI, MPEG, QuickTime, and Windows Media files).
(See also section 5.1.3, page 4073, which specifies how audio and video of arbitrary unspecified types can be embedded in DrawingML.)
I'm of two minds about this. On the one hand, if I'm using a Mac and my OOXML implementation uses QuickTime, it's certainly nice to be able to embed audio and video in any file format that happens to be supported by QuickTime. On the other hand, what good is a standard format for presentations if there is no guarantee that anyone will be able to see/hear critical parts of it? (Especially considering that, if I recall correctly, support for audio and video multimedia was one of the highly touted "features" of OOXML over OpenDocument.)
— Steven G. Johnson 00:51, 20 January 2007 (EST)
"Application-defined" defaults?
There are a number of elements which are not required, but if omitted lead to "application-defined" default behaviors. See
- docDefaults (2.7.4.1): Document Default Paragraph and Run Properties
- pPr (2.7.4.2): Paragraph Properties
- pPrDefault (2.7.4.3): Default Paragraph Properties
- rPr (2.7.4.4): Run Properties
- rPrDefault (2.7.4.5): Default Run Properties
— Steven G. Johnson 02:05, 20 January 2007 (EST)
Equation shape objects in drawings are not editable unless you are Microsoft?
Section 6.1.2.19 defines the "shape" element in VML, "the core object in VML". One of the attributes of this element, on page 4653 (PDF p. 5436), is "equationxml", defined as:
- Specifies alternate XML markup which may be used to rehydrate an equation using the Office Open XML Math syntax. The actual format of the contents of this attribute are application-defined, but shall contain Office Open XML Math as well as any application-specific content.
My understanding is that this attribute is intended to be used if you have an equation object in your drawing, and you want to edit the equation using the math editor; in that case, you need the original equation XML to access the mathematical structure. However, since the contents of this attribute are application-defined, in practice this is apparently possible only if you want to edit the object in the same program that created it. (Saying that it "shall contain" OOXML math doesn't seem like much help...shall contain how? After 327 bytes of application-specific content and gzipped?)
— Steven G. Johnson 03:04, 20 January 2007 (EST)
"Application-defined" binary blobs within shape objects?
Section 6.1.2.19 defines the "shape" element in VML, "the core object in VML". One of the attributes of this element, on page 4655 (PDF p. 5438), is "gfxdata", defined as:
- Specifies a base-64 encoded package as deifined [sic] in Part 2 of this Standard that contains DrawingML content. The contents of this package are application-defined, but the contents of the package shall use the Parts defined by this Standard whenever possible. [Rationale: This attribute allows an application to use VML to represent graphical content while still persisting DrawingML for consuming applications that support DrawingML. For example, a diagram stored within this attribute would have the four parts defined for a DrawingML diagram, as well as any number of application-defined parts and relationships. end rationale]
An "application-defined" base64-encoded binary blob in the middle of your drawing?
— Steven G. Johnson 03:09, 20 January 2007 (EST)
"Application-defined" binary blobs for Microsoft Ink™ data?
Section 6.2.2.14, on page 4813 (PDF p. 5596) of the standard defines the "ink" element as:
- This element specifies the presence of an ink object. An ink object is a VML object which allows applications to store data for ink annotations in an application-defined format.
The actual data for the "ink" object is stored as a base64-encoded binary blob in the "i" attribute.
This is obviously a thinly-veiled way to include the proprietary Microsoft Ink tablet-PC annotation content in OOXML, without documenting the format used. (According to MS documentation, Ink can be stored as either "fortified" GIF or ISF (possibly wrapped inside of another file format [1]), but the canonical format is apparently the Ink Serialized Format (ISF), an apparently undocumented "highly compressed binary representation of the ink data from a lone ink object." "Fortified" GIF files apparently contain ISF via metadata [2].)
(Just because it's "annotation" data does not mean it's superfluous. Microsoft promotes Ink for use in exchanging copy-editing information as part of Office files, for example. Nor is it clear why they couldn't use a standard format like PNG for representing transparent bitmaps, or SVG if they want to represent stroke data; the small amount of additional metadata, such as original device coordinates, that ISF encodes could be documented and added as metadata in PNG or SVG. Not to mention the fact that base64-encoding it directly in the XML seems rather silly when it could just be stored as a separate file part.)
— Steven G. Johnson 03:16, 20 January 2007 (EST)
Bitmasks are not subject to endian issues when represented as numeric strings
Contrary to (apparently) popular misconception, the use of bitmasks does not inherently cause endian issues. Endian issues only arise in serialized binary formats, and not when numbers are represented by ASCII numerals in whatever base (as in OOXML). (I have a fair amount of experience in dealing with both binary file formats and bitmasks in portable software.)
Whenever a number is written as a string numeral, whether as a decimal number like "21" or as equivalent hexadecimal like "0x15" or as an equivalent binary string "00010101", there is no ambiguity: the digits are always arranged from most to least significant (in English and standards based on English, at least). It is only when this number is represented in a computer's memory by a sequence of bytes that the byte-sequence varies depending on endianness.
Similarly, bit operations in programming languages are not subject to endian problems, because they are specified in terms of either ASCII strings or bit-shifts relative to the "logical" least-significant bit, regardless of the binary form of the compiled machine code. For example, if I want to set the 13th least-significant bit of an integer "x" in a C program, I would do something like: "x | 4096" or "x | 0x1000" or "x | (1 << 12)". If I subsequently print out the value of "x", using printf, as a decimal or hexadecimal number, I will get the same result on both big- and little-endian systems. I will only have a problem if I typecast &x to "char *" and print out the bytes one by one.
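The same point can be checked in a short Python sketch (Python rather than C only so that both byte orders can be shown on one machine; the variable names are illustrative). The text form of the number is the same everywhere, and byte order only appears once the integer is serialized to raw bytes.

```python
x = 0
x |= 1 << 12  # set the 13th least-significant bit, same as x | 4096 or x | 0x1000

# Text renderings do not depend on the machine's byte order.
decimal_text = str(x)
hex_text = format(x, "#x")

# Byte order only matters when the integer is serialized to raw bytes.
big = x.to_bytes(4, byteorder="big")
little = x.to_bytes(4, byteorder="little")

# Parsing either text numeral gives back the same value.
assert int(decimal_text) == x
assert int(hex_text, 16) == x

print(decimal_text, hex_text, big.hex(), little.hex())
# prints: 4096 0x1000 00001000 00100000
```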
As another example, consider the "chmod" command in Unix, which allows you to specify the file permissions as a bitmask in octal. "chmod 0644", where "0644" is an octal numeral, sets the same file permissions on both big- and little-endian systems.
So, in short, while OOXML's use of bitmasks has numerous problems, portability between different-endian systems is not one of them.
— Steven G. Johnson 20:52, 21 January 2007 (EST)
- I agree. Months ago when I first noticed the bitmasks I had a "gotcha" moment where I was sure it was endian-dependent. But further reflection convinced me that it was OK. The stronger criticism here is that bitmasks defy validation by XML Schema, don't work well with XML tools like XSLT, and are just plain silly in an XML format that is going to be zipped up in the end.
- --Cicero 22:34, 21 January 2007 (EST)
ST_Panose defined twice
ST_Panose is defined in 2.18.72 (page 2569) and 5.1.12.37 (page 4502). Is that legal? Or is there some implicit namespacing I'm missing at this low level? I discovered this when dredging through all the hexBinary issues. --Trollsfire 16:13, 22 January 2007 (EST)
- If having multiple definitions is a problem, then ST_TwipsMeasure is defined both in 2.18.105 (page 2619) and 7.1.3.16 (page 5884). The definitions seem compatible, but still.
- --Trollsfire 19:36, 22 January 2007 (EST)
Multiple types used for RGB coding
In digging through all the hexBinary inconsistencies, I discovered that there are three different variable types used for storing RGB data:
- ST_HexColorRGB (2.18.45, page 2520) which, in my opinion, is the best named
- ST_UnsignedIntHex (3.18.86, page 3712) which is 4 octets and may be used for things other than RGB color as well
- ST_HexBinary3 (5.1.12.28, page 4483) which is accurately named for what it is (a length 3 (3 octet) hexBinary variable) but not what it does
I reference this in the section about the hex inconsistencies, but should this be extracted from there and put elsewhere? A section about internal redundancies (multiple definitions for the same things), maybe? --Trollsfire 16:20, 22 January 2007 (EST)
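To make the cost of this redundancy concrete, here is a rough Python sketch of what a reader has to do when colours can arrive as either 3 or 4 octets of hex. The function name `parse_rgb` is hypothetical, and the handling of the 4-octet case is an assumption: the discussion above only says that ST_UnsignedIntHex is 4 octets, not what the extra octet means.

```python
from typing import Tuple


def parse_rgb(text: str) -> Tuple[int, int, int]:
    """Hypothetical reader: return (red, green, blue) from a hex colour string."""
    raw = bytes.fromhex(text)
    if len(raw) == 3:
        # 3-octet forms (ST_HexColorRGB, ST_HexBinary3): one octet per channel.
        return raw[0], raw[1], raw[2]
    if len(raw) == 4:
        # ASSUMPTION: the leading octet is not a colour channel (e.g. alpha)
        # and the remaining three are red, green, blue. Not confirmed here.
        return raw[1], raw[2], raw[3]
    raise ValueError("expected 3 or 4 octets of hex, got %d" % len(raw))


print(parse_rgb("FF8000"), parse_rgb("00FF8000"))
```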
cryptography and fields
The end of 2.5.1.28, p 1165, worries me quite a bit: not only is there the ability to use any encryption method desired, but any possible external module might be used to create it. algIdExtSource + algIdExt just screams "embrace and extend". On a related topic, 2.6 (p 1487) talks about fields, and "The act of carrying out a field's codes is referred to as a field update. As to how or when any field is updated is outside the scope of this Office Open XML Standard." 2.16.4.1 has a similar comment: "If no date-and-time-formatting-switch is present, a date or time result is formatted in an implementation-defined manner." --Dogcow 16:46, 22 January 2007 (EST)
- Good finds! I've added your 2.16.4.1 example to the section "Relies on application-defined behaviors". We might consider adding your other two examples if the problems with them can be clearly explained. (For example, how might inconsistent field-updating cause interoperability problems? Is this just a user-interface issue?) -- Steven G. Johnson 19:26, 22 January 2007 (EST)
Paper Sizes
Section 3.3.1.61 page 2770 "pageSetup" (Page Setup Settings) and 3.3.1.62 page 2774 "pageSetup" (Chart Sheet Page Setup) have an Attribute entitled 'paperSize' which is enumerated to 68 fixed paper types. This is contradictory to the naming convention of paper sizes as defined by ISO 216 (A,B,C sizes) used internationally and ANSI Y14.1 used in the USA.
- Wikipedia: Paper Size
- I agree that this is a relatively small issue, but it does show that Ecma 376 restricts the choice of paper sizes and insists on internal lookup tables instead of ISO names, which also defeats the purpose of readable XML. -- Yoonkit 23:02, 22 January 2007 (EST)
- I agree also, and I re-added this section in the following form:
- Nonstandard, inflexible paper-size naming
- Sections 3.3.1.61 (page 2770) and 3.3.1.62 (page 2774), both of which involve printer settings, define a "paperSize" attribute whose value is an integer representing one of 68 fixed paper sizes. These paper-size codes are apparently based on corresponding paper-size registry codes in Microsoft Windows, rather than using the standard paper-size names as defined in ISO 216, ANSI Y14.1, and similar standards. In contrast, ISO 26300 employs a much more flexible scheme: it simply describes the paper size by recording the physical width and height of the page, leaving the assignment of symbolic paper-size names to the user interface.
- As you might have guessed, these inflexible 68 numeric codes are a direct dump of a Windows-registry data structure. — Steven G. Johnson 01:13, 23 January 2007 (EST)
- Thanks very much for including this in. -- Yoonkit 01:54, 23 January 2007 (EST)
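As a rough illustration of the width-and-height approach described above, the following Python sketch stores only physical dimensions and lets the user interface derive a name. The table, the function name `paper_name`, and the 1 mm tolerance are assumptions made for this example; it is not taken from ISO 26300 or from any implementation.

```python
# Illustrative subset of well-known sizes in millimetres (portrait orientation).
KNOWN_SIZES_MM = {
    "A3": (297.0, 420.0),
    "A4": (210.0, 297.0),
    "A5": (148.0, 210.0),
    "Letter": (215.9, 279.4),
}


def paper_name(width_mm, height_mm, tolerance_mm=1.0):
    """Hypothetical UI helper: map stored dimensions to a display name, or None."""
    for name, (w, h) in KNOWN_SIZES_MM.items():
        portrait = abs(width_mm - w) <= tolerance_mm and abs(height_mm - h) <= tolerance_mm
        landscape = abs(width_mm - h) <= tolerance_mm and abs(height_mm - w) <= tolerance_mm
        if portrait or landscape:
            return name
    # Any other size is still a valid page; it simply has no symbolic name.
    return None


print(paper_name(210, 297), paper_name(297, 210), paper_name(100, 100))
# prints: A4 A4 None
```

With this design a custom size costs nothing extra in the file format, whereas a fixed enumeration needs a new code for every size.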
Uses a Microsoft-specific namespace
This is what's currently written: Section 6.2.3.23 page 5197 Attribute "href" (Hyperlink Target) uses a Namespace "urn:schemas.microsoft.com:office:office".
An Ecma standard must not reference company-specific namespaces. This should be replaced by an Ecma namespace.
--
I've done more checking, and the VML Reference Material (p. 5126) states:
- "To maintain backward compatibility, all VML namespaces defined in this specification maintain the legacy namespace structure already used by millions of documents.
- [Note: The VML format is a legacy format originally introduced with Office 2000 and is included and fully defined in this Standard for backwards compatibility reasons. The DrawingML format is a newer and richer format created with the goal of eventually replacing any uses of VML in the Office Open XML formats.
- VML should be considered a deprecated format included in Office Open XML for legacy reasons only and new applications that need a file format for drawings are strongly encouraged to use preferentially DrawingML."
So it explains why the namespace is there, but it does not explain why a deprecated format is being included in this "modern" spec. How should we rephrase this concern?
Yoonkit 21:27, 22 January 2007 (EST)
- I've taken a stab at rephrasing it. See the document. — Steven G. Johnson 21:39, 22 January 2007 (EST)
- OK, that reads OK, Thanks! -- Yoonkit 22:55, 22 January 2007 (EST)
A string is used to define deprecated VML in DrawingML
Section 6.5.22 (p. 5743) "textdata"
This element specifies optional supplementary text information associated with a legacy VML shape that is a node in a VML diagram when it cannot otherwise be stored within the DrawingML framework.
[Note: An application could use this to preserve a specific diagram format for backward compatibility, but it is strongly recommended to upgrade all VML shapes to DrawingML shapes. end note]
Is this the backward-compatibility support for "billions" of documents? Storing it as "textdata"?
Yoonkit 21:36, 22 January 2007 (EST)
Ecma 376 contradicts SVG colour names
This got dropped yesterday:
--
SVG color values
Ecma 376 section 2.18.46 page 2521, contradicts the SVG Color Keyword Names hexadecimal RGB values for given color names.
- Compatibility Note
- There is no need for redefining color names to achieve compatibility with existing Microsoft Office documents. Microsoft is free to use whatever color names it wishes on its office application and store the hexadecimal color value in the file.
--
I think it's quite important that we show how colours are changed in the spec, and that this will cause problems. As demonstrated by the colours, they are quite different, e.g. DarkGray and DarkGreen.
It's a direct contradiction with completely different RGB values. -- Yoonkit 22:58, 22 January 2007 (EST)
- I agree. This section was apparently deleted without explanation by User:Dcarrera, who I've noticed has been making unexplained and repeated deletions in numerous sections. Not from malice, I think — I suspect he's doing it in the name of brevity and focusing on the most important points. He/she wrote, as a hidden comment in the article:
- The OBJECTIVE of this document is NOT to have an exhaustive list of everything weird or unusual in the Ecma spec. The OBJECTIVE is to make a strong argument. A strong argument is not made stronger by adding weak points. Adding weak points makes the argument WEAKER because it looks like you are hair-splitting and makes the good arguments harder to find.
- My feeling is that, while the most important points are surely the duplication of OpenDocument, SVG, and other existing XML standards, these little inconsistencies and oddities go a long way in demonstrating the haste and lack of care in the Ecma process. The thing with little inconsistencies and oddities, however, is that because they are each minor things individually, it is really by force of numbers that they make their case. Furthermore, this page (as far as I can tell) is mainly intended as a resource for those who wish to object to Ecma 376, since we cannot submit objections directly, and therefore should err on the side of inclusiveness. I'll restore the section. — Steven G. Johnson 00:09, 23 January 2007 (EST)
I've found that Section 5.1.12.48 (p. 4531) "ST_PresetColorVal" (Preset Color Value) correlates very well with the SVG colors. But of course it misses the opportunity to follow a well-defined standard and instead subverts it with different names: "darkGray" becomes "dkGray". -- Yoonkit 01:29, 23 January 2007 (EST)
- I've blogged this at: SVG Colour Contradiction --- Yoonkit 14:24, 26 January 2007 (EST)
Legacy Information
- there's more to be added here, another "minor" concern, but may be significant if we can catalogue all the "legacy" tags which can be represented or duplicated with similar functionality in the modern spec -- Yoonkit 01:13, 23 January 2007 (EST)
Preset Color
Section 5.8.2.6 (p. 638) "prstClr" (Preset Colour)
- "This is a legacy definition of colors which is no longer currently used. A preset color is a choice from among several presets provided in older versions of Office."
Why is this in a modern spec? Why not use SVG colours, or convert to the new spec colours as in 5.1.12.48 (p. 4531)?
"Application-defined" and undefined "legacy" merge document formats?
In section 2.18.63, ST_MailMergeSourceType (Mail Merge ODSO Data Source Types), which is described as "purely a suggestion" about the source type, many of the possible enumeration values are undefined or poorly defined. It's not clear to me what the practical impact of this "suggestion" is, so I don't know how serious an omission this might be.
Two of the possible enumeration values are "document1" and "document2" on p. 2549 of the PDF, defined as:
- Specifies that a given merged WordprocessingML document has been connected to another document format supported by the producing application. The format of this document is application-defined and outside the scope of this Office Open XML Standard.
Another of the possible enumeration values is "legacy", defined as:
- Specifies that a given merged WordprocessingML document has been connected to a legacy document format supported by the producing application. The format of this legacy document is application-defined and outside the scope of this Office Open XML Standard.
More of OOXML's touted backwards-compatibility at work?
And then, in case there weren't enough application-defined mail-merge source types, another enumeration value is "native", defined as:
- Specifies that a given merged WordprocessingML document has been connected to another document format native to the producing application. The format of this document is application-defined and outside the scope of this Office Open XML Standard.
Another format is "text", defined as:
- Specifies that a given merged WordprocessingML document has been connected to a text file.
7-bit ASCII? Unicode? It doesn't say.
— Steven G. Johnson 02:38, 20 January 2007 (EST)
Another undefined "legacy" feature: the DigSig element
7.2.2.6 defines the DigSig element (page 5105):
- This element contains the signature of a digitally signed document. [Note: This property is a mechanism used by legacy documents to store the digital signature of its binary representation, and should be considered deprecated in favor of the well-defined mechanism defined in Part 2. Any use of this property should be for legacy compatibility only, and is application-defined. end note]
(Note that it is "application-defined", and no other information is given.)
— Steven G. Johnson 03:22, 20 January 2007 (EST)
Microsoft Office names
Section 6.2.2 (p. 735) "Placement" "MSO-Position-Horizontal"
- "Specifies relative horizontal position data for objects in WordprocessingML."
What does 'MSO' represent? Microsoft Office? Horrors.
6.1.2.2 (p. 5145) "style" (Shape Styling Properties) is based on CSS2 and subverts it.
- Specifies the CSS2 styling properties of the shape. This uses the syntax described in the "Visual formatting model" of the Cascading Style Sheets, Level 2 specification, a Recommendation of the World Wide Web Consortium available here: http://www.w3.org/TR/REC-CSS2. Full descriptions of each property are not repeated here, but the VML treatment of each property is defined. Allowed properties include:
- mso-position-horizontal
- mso-position-horizontal-relative
- mso-position-vertical
...
- mso-wrap-style
- mso-direction-alt
Admittedly VML will be deprecated, but why is this here?
This is re-declared many times (I lost count, but more than 8 times!) up to p. 5640 ...
Yoonkit 01:40, 23 January 2007 (EST)
VML and DrawingML
Section 6.1 page 5126 "VML" states:
- "VML is a language for defining graphical objects in cases where DrawingML does not apply, such as text boxes and shapes in WordprocessingML documents and comments and controls in SpreadsheetML documents. This namespace provides the base elements and attributes for defining shape primitives. Other VML namespaces define elements that layer on information beyond the baseline graphical definition. To maintain backward compatibility, all VML namespaces defined in this specification maintain the legacy namespace structure already used by millions of documents."
- Does this mean that DrawingML (5 page 3994) is inadequate and not complete? How can 2000+ pages NOT supersede the features of VML?
- VML obviously provides functionality to WordprocessingML and SpreadsheetML, so it won't be going away anytime soon.
- VML (8.6.2 page 25) is mentioned as something that "should be considered a deprecated format". Is it really important to require everybody to implement this format if it's already deprecated? The VML specification is 617 pages (from 5126 to 5743).
OOXML defines two *different* hashing algorithms
- Ecma 376 section 2.15.1.28 (page 1941) does not follow the advice of any of these organizations. Instead, it defines a new hashing algorithm that has not undergone scrutiny by the cryptographic community. Section 3.3.1.69 page 2786 "protectedRange" has yet another implementation called 'GetPasswordHash'
I don't think this is correct. Sections 2.15.1.28 and 3.3.1.69 define two different insecure hash algorithms, not just different "implementations." (For example, 2.15.1.28 defines an "encryption matrix" of words to XOR with the key, but 3.3.1.69 uses no such matrix.)
I've fixed this twice already, and both times my changes have been reverted without explanation, most recently in an enormous edit by User:Marbux. This kind of behavior seems to be common on this page, and it's getting annoying (see below).
And now the page is protected. WTF?
— Steven G. Johnson 12:22, 23 January 2007 (EST)
Does nobody know wikiquette?
Just today, User:Marbux deleted several sections of the document, including several that had been discussed above, and reverted numerous other changes in the document. All in a single edit with no edit comment.
I'm beginning to hate editing this wiki, because the same things are happening repeatedly.
A few clues:
- Use edit comments to explain what you are doing for non-trivial edits.
- Edit a wiki page by making changes to the relevant sections, not by editing the whole document at once and blowing away everyone else's changes in the meantime.
- If you want to delete large sections of other people's edits, it's polite to give some explanation on the Talk page, and preferably wait for comment first.
- Don't revert the same text repeatedly without explanation.
— Steven G. Johnson 12:22, 23 January 2007 (EST)
The whole ISO 639 issue is not correct
Forget this whole issue about the ST_LangCode simple type that uses hex values. All references to languages in the Word document format use the ST_Lang simple type, which is described in clause 2.18.51, and which does allow for use of ISO 639.
See also Wouter van Vugt's blog
I suggest the issue is removed from the different pages. HAl 08:35, 30 January 2007 (EST)
- It is true that ST_Lang is either an ISO 639 language code or an ST_LangCode. However, that means that every application which can read these documents must implement both forms (and convert these forms into the application's appropriate internal format). If ST_LangCode does not offer any functionality that cannot be expressed with ISO 639 codes (I have not checked to see if there are any languages which are listed in ST_LangCode but are not in ISO 639, but I do know there are languages in ST_LangCode which are in ISO 639), why does it even need to be there? You are correct that it isn't proper to say that ISO 639 isn't supported; more properly, the standard requires handling ISO 639 and a second, hardcoded, redundant, and so far unjustified way of supporting languages. This burden is borne by all implementers for no real benefit. (Actually, I believe the benefit is solely to Microsoft since these codes correspond to the codings in Windows; standards should not confer benefits to one party with the costs borne by all other parties. However, this aside need not even be true for two language designations to be a bad idea in a standard.) --Trollsfire 13:32, 6 February 2007 (EST)
- I am glad that someone finally agrees this objection is not valid. I can see what you mean about supporting two language notations, but the OOXML standard actually places no requirement on what you do or do not support, so "requirement" is not exactly correct. I guess any implementation wanting to be fully compatible with MS Office would need the hardcoded table for reading files, but those applications will do so anyway and probably already have these tables. I would think that most applications will just use a hardcoded ISO 639 translation table that has an extra column with this numbering.
- However, most applications will only support the ISO language code when writing, and applications that write Office files are much, much more important than those that read them, as the Office formats are now within reach of normal .NET, Java or other applications that can easily create Office document output directly. It would be good if MS Office 2007 also always used at least the ISO coding, but I haven't got any files created with Office 2007 to see if it does. HAl 14:38, 6 February 2007 (EST)
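The "translation table with an extra column" idea discussed above can be sketched roughly as follows. This is only an illustration: the three table rows are well-known Windows locale identifiers, and it is an assumption here (based on the comment above that the codes correspond to the codings in Windows) that ST_LangCode uses the same hex values. The function name `normalize_lang` is made up for this sketch.

```python
# Hypothetical excerpt of a translation table: (hex code, ISO-style tag).
# Assumption: the hex values are Windows locale identifiers and ST_LangCode
# uses the same numbering; a real table would have many more rows.
LANG_TABLE = [
    ("0409", "en-US"),
    ("0407", "de-DE"),
    ("040C", "fr-FR"),
]

HEX_TO_TAG = {hex_code: tag for hex_code, tag in LANG_TABLE}
HEX_DIGITS = set("0123456789ABCDEF")


def normalize_lang(value):
    """Return an ISO-style tag for either form of language value.

    A four-character all-hex value is treated as a hex language code and
    looked up in the table (None if the excerpt does not contain it).
    Anything else is assumed to already be an ISO-style tag.
    """
    key = value.upper()
    if len(key) == 4 and set(key) <= HEX_DIGITS:
        return HEX_TO_TAG.get(key)
    return value


print(normalize_lang("0409"))   # hex form, looked up in the table
print(normalize_lang("en-US"))  # ISO-style form, passed through
```

A reader needs the whole table to accept both forms, while a writer that only emits ISO-style tags needs none of it, which is the asymmetry both comments above are pointing at.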
OOXML does use standard dates.
In most of the spec OOXML conforms to standard W3C XML Schema validation. W3C XML Schema does not use the whole of ISO 8601 but a subset of it. The ISO 8601 standard is extremely extensive and takes a lot of extra effort to implement fully, which is why W3C implements the subset that is reused by OOXML.
As for the issues raised about the 1900 and 1904 date formats: these formats are for use in spreadsheet cells only. It would be extremely difficult to implement ISO 8601 correctly in spreadsheets. For instance, when two dates are subtracted in ISO 8601, that leaves a 'period', which would be very awkward to implement in a spreadsheet.
Example:
Cell a contains 20070214T131031
Cell b contains 20070214T131030
Cell c contains 20070214T131030/20070214T131031
These are all correct ISO 8601, where cell c effectively holds a 1 second period that corresponds to the difference between cell a and cell b.
In an internal spreadsheet formula format that would be represented as:
Cell a 39127,548969907400
Cell b 39127,548958333300
Cell c 0,000011574113
This is an example of the internal Excel 1900 date format, where cell c also holds a 1 second period that corresponds to the difference between cell a and cell b. However, it is virtually impossible to go from ISO to the internal format and then convert back to ISO without data loss.
To prevent this, an application would therefore also have to use the ISO format as its internal format and convert it for every individual calculation, which would be extremely slow and isn't done by any spreadsheet. HAl 09:18, 30 January 2007 (EST)
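For readers who want to check the numbers in the example above, here is a minimal sketch of the conversion between the basic ISO 8601 timestamps shown and 1900-system serial values. It makes several assumptions: only whole-second timestamps on or after 1 March 1900 are handled, the epoch constant 1899-12-30 is chosen so that the usual serial numbers come out for such dates, and ISO periods (the cell c case) and time zones are not covered at all. The function names are invented for this sketch.

```python
from datetime import datetime, timedelta

# Assumption: for dates on or after 1900-03-01, a 1900-system serial equals
# the number of days since 1899-12-30 (this absorbs the fictitious 29 Feb 1900).
EPOCH_1900 = datetime(1899, 12, 30)
ISO_BASIC = "%Y%m%dT%H%M%S"


def iso_to_serial(text):
    """Convert a basic-format ISO 8601 timestamp to a serial day number."""
    delta = datetime.strptime(text, ISO_BASIC) - EPOCH_1900
    return delta.days + delta.seconds / 86400


def serial_to_iso(serial):
    """Convert a serial day number back, rounding to the nearest second."""
    days = int(serial)
    seconds = round((serial - days) * 86400)
    return (EPOCH_1900 + timedelta(days=days, seconds=seconds)).strftime(ISO_BASIC)


a = iso_to_serial("20070214T131031")
b = iso_to_serial("20070214T131030")
print(a, b, a - b)
print(serial_to_iso(a), serial_to_iso(b))
```

In this sketch the two timestamps survive the trip to a serial number and back. That says nothing about periods, sub-second values or other ISO 8601 forms, which is where the disagreement in this thread lies.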
---
But the point is not that EOOXML uses a start date to calculate days. The point is that it uses two start dates, 1900 and 1904, and that it calculates the wrong number of days when using 1900 as a start. Forcing all other applications to miscalculate leap years is rather awkward. Furthermore, EOOXML does not allow dates BEFORE 1900, which would make the use of EOOXML spreadsheets for historical research "difficult". For instance, you cannot use EOOXML spreadsheets to study Washington's book-keeping, or the costs of the war of independence. --Winter 03:10, 2 February 2007 (EST)
- The ISO 8601 standard only uses Gregorian time, which is only valid for dates after 1582, as before that time historians use Julian time. So the use of spreadsheet date fields is limited for historians whatever standard you use. The ISO standard definitely has more possibilities, but a big question mark can be put over its usefulness in spreadsheets. The date1900 format is much simpler for spreadsheet use and definitely serves a different requirement (namely processing) than the ISO format. Therefore it is not a contradiction as such. I do find the leap year issue a valid issue as well, but it is also a very minor one. Implementing it is just a matter of adding 1 extra if statement.
- Date1904 seems to be for use in converters and legacy Apple applications only. That is a valid item for legacy implementation, which is actually an important goal of the standard. So that is not an issue but a normal part of the spec, and one that almost no implementations will need to support. HAl 04:34, 2 February 2007 (EST)
- HAl, I don't understand; I need your help to explain this. Why can't the application write to file using ISO 8601, e.g. '20070214T131031', and upon loading, convert it to whatever internal number the application implements? E.g. if the internal value was '39127.548969907400' it will keep it as that until the need to save to file. Would that solve the 'period' and 'performance' problems? Is it such a chore to parse the ISO 8601 once on the read process? If the application has to load up Lotus 1-2-3 files then it just loads up '39127.548969907400' as '39127.548969907400'. No big deal. Instead MSOOXML is forcing the new writes to remain '39127.548969907400', thus propagating this bug forever. Yoonkit 06:29, 2 February 2007 (EST)
- The leap year bug is a separate issue from the question of which format to use. The leap year bug is there to maintain 100% validity with legacy spreadsheets. Because you cannot reliably detect which dates are affected, you cannot reliably convert dates in old spreadsheets to get rid of the bug. However, I think a better solution would have been possible.
- The format use issue. You claim that you can convert ISO dates into a processing format and back. This however is not per se correct. Because ODF, for example, specifies periods of time as a valid datatype, it is incredibly hard to convert them 100% reliably to a processing data format and back. To be able to do reliable conversions, the internal format has to be exactly as complex as the ISO format. That would take away the advantage of using a simple internal format for calculations.
Also, such a conversion, if it were possible, could take a long time if the spreadsheets have lots of date fields. (I currently have a few with almost a million dates, but that is because I was limited to only 65k rows before, so they will soon contain about 4,5 million dates or more.) Because date1900 is a date format made for easy processing in spreadsheets, it certainly does not contradict ISO 8601, as that certainly was NOT created as a processing format. HAl 07:49, 2 February 2007 (EST)
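The "1 extra if statement" mentioned earlier in this thread for the 1900 leap year issue can be sketched as below. The sketch assumes the commonly described behaviour of the 1900 date system: serial 1 is 1 January 1900, and serial 60 is a 29 February 1900 that never existed. Whole days only; the function names are invented here.

```python
from datetime import date, timedelta

DAY_ZERO = date(1899, 12, 31)   # so that 1900-01-01 gets serial 1
PHANTOM_SERIAL = 60             # assumed serial of the non-existent 1900-02-29


def date_to_serial_1900(d):
    """Real calendar date -> 1900-system serial, including the leap year quirk."""
    serial = (d - DAY_ZERO).days
    if d >= date(1900, 3, 1):   # the one extra "if": skip over the phantom day
        serial += 1
    return serial


def serial_1900_to_date(serial):
    """1900-system serial -> real calendar date; serial 60 has no real date."""
    if serial == PHANTOM_SERIAL:
        raise ValueError("serial 60 stands for 1900-02-29, which does not exist")
    if serial > PHANTOM_SERIAL:
        serial -= 1
    return DAY_ZERO + timedelta(days=serial)


print(date_to_serial_1900(date(1900, 2, 28)))   # last real day before the gap
print(date_to_serial_1900(date(1900, 3, 1)))    # first day after the gap
print(date_to_serial_1900(date(2007, 2, 14)))
```

The adjustment itself is small. What it cannot do is tell whether a serial stored in an old file was meant with or without the quirk, which is the detection problem raised above.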
Licensing
The information provided on licensing does not list the analysis by Baker & McKenzie that was done for Microsoft. http://www.bakernet.com/BakerNet/Resources/Publications/Recent+Publications/OpenXML.htm
As a legal analysis provided by a reputable legal firm contracted by Microsoft, it would seem to hold more legal status than any analysis done by Groklaw. Microsoft can't really go against a legal analysis that is provided and published by its legal advisors on behalf of Microsoft to inform its relations, customers and governments.
The fact that this legal analysis is not mentioned by Groklaw seems a serious flaw in any legal reasoning about the licensing of the Office Open XML format.
I wonder why Groklaw does not use that official analysis as an important source for interpreting the CNS but instead tries to create an alternate analysis ???? HAl 10:37, 31 January 2007 (EST)
---
You are right it should have been mentioned. I would really like to see that, because I cannot find any point where it is actually stated that MS grants anyone rights to use their patented technology to implement EOOXML in full. The closest part to stating something about this in the analysis I found was:
There are three qualifications detailed in the Microsoft CNS, which are also present in Sun’s Covenant and reflect standard industry practice. The first is designed to protect Microsoft from the actions of others. It states that the covenant will not apply where a person asserts or threatens to assert rights against Microsoft. The second qualification concerns the scope of Microsoft patents: it is designed to put users on notice that a conforming implementation of the Schema may not include a patent claimed by Microsoft or, if the conforming implementation does include such a patent, that the patent may not be enforceable. The third qualification addresses the intellectual property rights of others that any conforming implementation of the Schema may contain. Microsoft is not in a position to protect users from any such third party infringement. The second and third qualifications are designed to protect Microsoft from any liability arising from the implementation of the Schema. As such, neither impact on the ‘safe harbour’ users are given under the CNS from any Microsoft enforcement action.
IANAL. Now does MS grant me the right to implement EOOXML in full or not?
- Weird that you cite from the analysis a part that does not focus on rights but on explaining the restrictions of the CNS, which are similar to the restrictions posed by Sun's CNS on OpenDocument.
- When looking at the analysis for rights I would focus on very useful parts like this: The CNS is designed to simplify the implementation of the Schema by users. As noted above, this is achieved by removing any usage limitations on those implementing the Schema. In the course of implementing the Schema, it is likely, if not inevitable, that the Schema will be incorporated into products that are designed to interoperate with Microsoft’s Office 2007, as either complementary or competitive applications. The CNS does not affect users’ rights to create their own applications using the Schema specifications. For example, there are no restrictions in the CNS that would prohibit third parties from incorporating the standard into applications they create and distribute in source code form, or for other hardware or operating-system platforms.
- Do you see any restriction on implementing in that ??? Somehow it seems this kind of information isn't allowed on the Grokdoc, maybe because it might actually seem to contradict most of their claims ????
- You should probably also note that Microsoft cannot grant rights that are in the hands of the standards organisation Ecma International, which is the author and publisher of the final standard document. HAl 05:01, 2 February 2007 (EST)
The ISO/IEC 8632 (Computer Graphics Metafile) issue. More FUD
The objections about the OOXML spec contradicting ISO/IEC 8632 are really ridiculous. I know this FUD came from OASIS attorney Andy Updegrove, but did anyone ever read the OOXML spec on that subject?
I'll cite the blog by Wouter van Vugt as he seems to have worded it best:
Similarly, 6.2.3.17 Embedded Object Alternate Image Requests Types (page 5679) and section 6.4.3.1 Clipboard Format Types (page 5738) refer back to Windows Metafiles or Enhanced Metafiles
Either someone misread the OOX specs here, or they are just trying to throw in another ‘contradiction’ in there to try and foil the evil empire. OpenOffice uses Metafiles for embedded objects, so I doubt anyone out there would really think it’s a crime to do so. The Open XML spec does not contain any requirement on EMF or WMF. The section people are talking about (6.2.3.17) can use any format. The allowed values for clipboard format types are "Bitmap", "Pict", "PictOld", "PictPrint", and "PictScreen". The spec then gives you some potential formats based on those values, saying that "Pict" and "PictOld" could be mapped to the WMF and EMF formats. Though the spec could be more informative of this though.
HAl 05:19, 2 February 2007 (EST)
Poor semantics in Licensing discussion
From the objections: However, glossing over that problem by assuming arguendo that some rights were granted nonetheless, the existence and scope of those rights can only be determined by examining the specification itself to determine the meaning of conform and its variants. But those are rights identified in another document, the specification, not "in this [covenant/promise]." That is fatal to any attempt to use the Ecma 376 specification as a source of rights granted because the reader is forbidden from looking to another document as a source of implementer's rights not expressly stated in the IP documents.
This text is meant to state that the OOXML spec cannot be used to define conformance (which it actually does define in Part 1, Fundamentals).
But actually the spec CAN be used very well to show what conformance is. The cited clause from the objections page states that no 'rights' are granted that are not in the promise/covenant. However, conformance is not a 'right' as such, and the fact that it is not in the text of the OSP or CNS therefore does not matter.
It would actually be kind of hard for Microsoft to define conformance for a spec that is controlled by Ecma. Only Ecma can define conformance, and as such they have done that in the spec itself. Those conformance paragraphs make it clear that no full implementation is required to conform to the Ecma spec. So as Ecma has defined conformance in the spec, Microsoft can reference that definition in their CNS/OSP without any trouble. HAl 11:43, 2 February 2007 (EST)
Patent claim
The groklaw objections state that:
"software may employ methods and concepts described in a patent's claims, but the patent claims are not the methods and concepts described therein"
However, this is not correct at all. In software patents the patent claim is in general the method or system that implements the invention.
So in granting rights to a patent claim for software, you actually grant rights to the methods and systems that implement an invention. The objections state that you cannot grant patent rights to implement things, but that is therefore not correct. HAl 06:52, 3 February 2007 (EST)
Objections against examples ???
The following is listed as an objection: "Mismatched detailed description: The text that describes USERINITIALS, section 2.16.5.77 (p. 2353–2354), instead discusses USERNAME."
However, this is a very minor error, not in the description but in an example text. So it has no real effect on the standard whatsoever. Does anyone really consider this a serious objection that matters in ISO standardization? What's next? Are we going to list spelling errors and typos as objections too?
I just looked at the ODF specs. They have revision lists for version 1.0 revised edition and version 1.1 full of minor errors like this. Some very minor issues do not give objective reasons to object to the standardization, as is suggested here by Grokdoc/Groklaw. HAl 07:46, 6 February 2007 (EST)
A TWIPpy objection ???
A TWIP (twentieth of a point) is maybe not a standard ISO unit of measurement, but it is a derived measurement based on PostScript points. As PostScript, which is used globally as a printer format, uses point measurements, I cannot see that a small point-based measurement in an Office format closely related to PostScript would be a problem that warrants objections to ISO standardization. Is this really a serious listing of things that warrant objections against the standard, or just a collection of nitpicking items to create an atmosphere of FUD around the OOXML standardization? HAl 08:06, 6 February 2007 (EST)
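For reference, the arithmetic behind the unit can be sketched as follows, assuming the PostScript definition of a point as 1/72 inch (so 1440 twips per inch). The function names are invented for this sketch.

```python
TWIPS_PER_POINT = 20      # a twip is a twentieth of a point
POINTS_PER_INCH = 72      # assumption: PostScript points, 72 per inch
MM_PER_INCH = 25.4


def twips_to_points(twips):
    return twips / TWIPS_PER_POINT


def twips_to_inches(twips):
    return twips / (TWIPS_PER_POINT * POINTS_PER_INCH)


def twips_to_mm(twips):
    return twips_to_inches(twips) * MM_PER_INCH


print(twips_to_points(240))   # 240 twips expressed in points
print(twips_to_inches(1440))  # 1440 twips expressed in inches
print(twips_to_mm(1440))      # the same length in millimetres
```

Whether a derived unit like this belongs in an ISO standard is the question under discussion; the conversion to SI units is a fixed factor either way.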
Bitmasks cause significant validation problems
I would like to see the reasoning behind this claim. Schematron can definitely validate bitmasks, either as binary number strings or HEX numbers; Following is some code as a rough example
<rule context="x">
  <!-- Treats the element value as a decimal integer; XPath divides with "div", not "/" -->
  <let name="i" select="." />
  <let name="bit1" select=" $i - (floor($i div 2) * 2) " />
  <let name="bit2" select=" floor($i div 2) - floor($i div 4)*2 " />
  <let name="bit3" select=" floor($i div 4) - floor($i div 8)*2 " />
  <let name="bit4" select=" floor($i div 8) - floor($i div 16)*2 " />
  <let name="bit5" select=" floor($i div 16) - floor($i div 32)*2 " />
  <let name="bit6" select=" floor($i div 32) - floor($i div 64)*2 " />
  <let name="bit7" select=" floor($i div 64) - floor($i div 128)*2 " />
  <let name="bit8" select=" floor($i div 128)" />
  <assert test=" $bit1 = 1"> The first bit should be 1</assert>
  <!-- "<" must be escaped inside an attribute value -->
  <assert test=" $bit1 + $bit2 + $bit3 + $bit4 &lt;= 1 ">Only one of the first four bits can be 1</assert>
</rule>
Furthermore, W3C XML Schemas can validate bitmasks, for example using regular expressions or enumerations or unions or numeric ranges. There may be a few constraints it cannot express compared to ISO Schematron, but that is typical for many kinds of schemas. Indeed, it is the fundamental concept in ISO 8879 SGML that the "Document Type Declarations" cannot capture all of the "document type definition". Rick Jelliffe 18:40, 7 February 2007 (EST)
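As a rough illustration of the regular-expression approach mentioned above, the same two constraints as in the Schematron example can be written as patterns over a hexadecimal string. The sketch uses Python's `re.fullmatch` to mimic the way a W3C XML Schema pattern facet applies to the whole value. It assumes the bitmask is written as hex digits with the least significant digit last; the pattern and function names are made up for this example.

```python
import re

# Lowest bit set: the last hex digit must be odd.
LOW_BIT_SET = re.compile(r"[0-9A-Fa-f]*[13579BbDdFf]")

# At most one of the lowest four bits set: the last hex digit is 0, 1, 2, 4 or 8.
AT_MOST_ONE_OF_LOW_FOUR = re.compile(r"[0-9A-Fa-f]*[01248]")


def satisfies_both(hex_text):
    """True if the hex bitmask meets both constraints (last digit must then be 1)."""
    return (LOW_BIT_SET.fullmatch(hex_text) is not None
            and AT_MOST_ONE_OF_LOW_FOUR.fullmatch(hex_text) is not None)


for sample in ["01", "A1", "03", "10"]:
    print(sample, satisfies_both(sample))
```

Constraints that relate bits across several digits get clumsy quickly when written as patterns, which is one place where the Schematron arithmetic is the more natural tool.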
Non XML formatting codes
The names 'b', 'i' as bold and italic are HTML element names. They are hardly unfamiliar or uncommon or ambiguous. (Indeed, if my memory serves me, they are also used in a sample DTD by an ISO Technical Report (TR) from (now) SC34 on Techniques for Using SGML.) If CSS is allowed as standard for your purposes, why is HTML not? Rick Jelliffe 18:48, 7 February 2007 (EST)
XML does not define any element names. So it is incorrect to claim that any element name that is well-formed is "contrary" to XML. Rick Jelliffe 18:48, 7 February 2007 (EST)
- Did you actually look at that page in the ECMA 376 spec? It is not using "b" and "i" as XML element names similar to HTML. It is using "\b" and "\i" at the end of a line to make the preceding line bold and italic, respectively. I agree that the Grokdoc explanation of this is confusingly worded, however. — Steven G. Johnson 12:03, 8 February 2007 (EST)


