An efficient Xml implementation is slower and consumes more space than an efficient packed data equivalent. This is an inescapable conclusion which stems from the fact that Xml data is encoded in the clear, requires closing tags, has language constructs which do not by themselves convey any additional meaning, and must be parsed. It would be foolish to underestimate the resulting overhead, especially in a performance-sensitive application.
Consider a server application which transmits commands to its clients. In an Xml implementation, the command is identified by the name of the tag, and the command text is enclosed by the tag. In a packed data implementation, four bytes in each message could encode the message type, followed by two bytes for the message length, followed by the message itself. By comparison, the Xml implementation would use at least seven bytes just to encode the opening and closing tags, and would more likely require as many as fifteen. If the message text is binary, then an additional base-64 translation will be necessary, inflating the size even further.
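The size difference can be made concrete with a small sketch. The layout below (a four-byte type, a two-byte length, then the payload) follows the description above; the function names, the type code 7, and the tag name "restart" are hypothetical and chosen only for illustration.

```python
import base64
import struct

# Assumed layout from the text: 4-byte message type, 2-byte length, payload.
# "!" selects network byte order with no padding, so the header is 6 bytes.
HEADER = struct.Struct("!IH")

def pack_message(msg_type: int, payload: bytes) -> bytes:
    """Build a packed message: fixed 6-byte header followed by the payload."""
    return HEADER.pack(msg_type, len(payload)) + payload

def xml_message(tag: str, payload: bytes, binary: bool = False) -> bytes:
    """Build <tag>payload</tag>; binary payloads are base-64 encoded first.

    Simplification: text payloads are assumed to contain no characters
    that would need Xml escaping (such as "<" or "&").
    """
    name = tag.encode("ascii")
    body = base64.b64encode(payload) if binary else payload
    return b"<" + name + b">" + body + b"</" + name + b">"

text = b"restart now"                    # 11 bytes of command text
packed = pack_message(7, text)           # 6-byte header + 11 bytes
as_xml = xml_message("restart", text)    # 9 + 11 + 10 bytes

print(len(packed), len(as_xml))          # prints "17 30"
```

For a tag name of n characters, the opening and closing tags cost 2n + 5 bytes, which gives seven bytes for a one-character name and fifteen for a five-character name. Calling `xml_message("a", bytes(12), binary=True)` shows the base-64 cost: the 12 payload bytes become 16.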
The packed message’s four-byte type identifier may be processed by a switch-case construct, whereas the Xml protocol’s message type must be evaluated by a string searching routine after parsing takes place. Packed messages can be interpreted and acted upon in a single pass, whereas Xml data requires (at the least) two passes.
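A minimal sketch of the two dispatch paths follows. It assumes the same six-byte header as the packed layout described earlier; the command codes and handlers are hypothetical. A dictionary lookup stands in here for the switch-case construct.

```python
import struct
import xml.etree.ElementTree as ET

HEADER = struct.Struct("!IH")  # assumed layout: 4-byte type, 2-byte length

# Hypothetical command codes and handlers, for illustration only.
CMD_PING, CMD_SAY = 1, 2

def on_ping(body: bytes) -> str:
    return "pong"

def on_say(body: bytes) -> str:
    return "say:" + body.decode("utf-8")

BY_CODE = {CMD_PING: on_ping, CMD_SAY: on_say}   # integer dispatch
BY_TAG = {"ping": on_ping, "say": on_say}        # string dispatch

def handle_packed(data: bytes) -> str:
    """Read the fixed header, slice out the body, and dispatch in one pass."""
    msg_type, length = HEADER.unpack_from(data, 0)
    body = data[HEADER.size:HEADER.size + length]
    return BY_CODE[msg_type](body)

def handle_xml(data: bytes) -> str:
    """Parse the whole document first, then look the tag name up as a string."""
    root = ET.fromstring(data)       # first pass: parse into a tree
    handler = BY_TAG[root.tag]       # second step: match the tag name
    return handler((root.text or "").encode("utf-8"))

print(handle_packed(HEADER.pack(CMD_SAY, 2) + b"hi"))   # prints "say:hi"
print(handle_xml(b"<say>hi</say>"))                     # prints "say:hi"
```

The packed handler never inspects the payload before dispatching, whereas the Xml handler cannot know the command until the parser has consumed the whole message.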
Measures do exist, however, which can increase the efficiency of Xml. For example, two kinds of binary Xml implementation can improve both processing and space efficiency: a raw binary implementation and a state-machine implementation.
The state-machine implementation runs a state machine on both the client and server, and uses a packed data protocol in between. Tags and attributes are processed before transmission, and tag names are given identifiers the first time they are encountered. The Xml data is then reconstructed at the destination. The system’s space efficiency is close to that of a packed data protocol, but the time performance is clearly not comparable. State machines are elegant in that they can easily replace a pure Xml transmission system, but they do not escape the actual requirement that Xml data be parsed; they simply shift the task of parsing it from the client to the server.
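The idea of assigning an identifier to each tag name on first encounter can be sketched as follows. This is an illustrative toy, not any particular binary Xml product: the token names DEF, OPEN, and CLOSE are invented here, tokens are kept as Python tuples rather than serialized to bytes, and attributes and trailing text are left out for brevity.

```python
import xml.etree.ElementTree as ET

def encode(root, names):
    """Flatten an element tree into tokens, defining each tag name only once.

    `names` maps tag name -> id and persists across messages (sender state).
    """
    out = []
    def visit(el):
        if el.tag not in names:
            names[el.tag] = len(names)
            out.append(("DEF", names[el.tag], el.tag))   # first encounter
        out.append(("OPEN", names[el.tag], el.text or ""))
        for child in el:
            visit(child)
        out.append(("CLOSE",))
    visit(root)
    return out

def decode(tokens, names):
    """Rebuild the element tree from tokens.

    `names` maps id -> tag name and persists across messages (receiver state).
    """
    stack, root = [], None
    for tok in tokens:
        if tok[0] == "DEF":
            names[tok[1]] = tok[2]
        elif tok[0] == "OPEN":
            el = ET.Element(names[tok[1]])
            el.text = tok[2] or None
            if stack:
                stack[-1].append(el)
            else:
                root = el
            stack.append(el)
        else:  # CLOSE
            stack.pop()
    return root

sender_state, receiver_state = {}, {}
doc = ET.fromstring("<cmds><cmd>go</cmd><cmd>stop</cmd></cmds>")
first = encode(doc, sender_state)     # carries two DEF tokens
rebuilt = decode(first, receiver_state)
second = encode(doc, sender_state)    # carries no DEF tokens at all
```

Both sides must keep their tables in step, which is the state the technique is named for. The sender still has to parse the Xml before it can tokenize it, so the parsing cost has moved rather than disappeared.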
The raw binary implementation uses identification numbers for everything except the actual data. It maintains no state, but because the recipient does not translate the data back into Xml, the recipient must know the ID numbers used in place of tag and attribute names. This solution’s space and speed efficiencies are very close to those of a packed data protocol, but the obvious problem is that there isn’t actually any Xml in this implementation. In practice, raw binary implementations are little more than formalizations of packed data protocols, and because they cannot easily replace Xml systems, more forethought is required before implementation.

In any communication system, security should always be a consideration. Security is typically provided by a combination of obfuscation and cryptography, though the strongest implementations do not gain anything from obfuscation.
Before proceeding, it should be made clear that a determined hacker will be able to breach a weak cryptographic protocol, whether it is implemented in Xml or by a packed protocol. Ultimately, the security of a system should be rooted firmly in the security of the cryptographic algorithms used to implement the system. It should also be noted, however, that this kind of determined hacker is rare.
Except for cases where both peers share the same key, at least some of the communication between them will need to be in the clear. If a packed data protocol is being used, it takes time for an adversary to discover the protocols being used, and then even more time for the adversary to determine what attacks can be made on the system. Even if all of the communication between the client and the server is made in the clear, discovery of the data protocols can be a very difficult task.
With an Xml-based communication system, however, all of the security offered by protocol obscurity is lost. The system itself is exposed to the adversary with little or no analysis required. This is not a failing of Xml; quite the opposite, it is one of its features. The standardization is intended to make it easy for unlike applications to communicate, even without documentation on the communication protocol being used.

Configuration Files

Xml lends itself well to small configuration files, or files which are loaded infrequently. The first, and most obvious, benefit is that an Xml-based configuration file can be modified in emergencies where the normal configuration utility may not be available. This may be especially important if the configuration file is so corrupted that the utility is unable to open it, or if settings must be changed that cannot be changed with the utility due to user interface inadequacies.
Xml as a configuration format also allows a software developer to completely eliminate the need for a configuration utility in the first place. Provided that the target audience is capable enough to modify Xml, the raw Xml configuration file may even lend itself more easily to such kinds of modification.
Additionally, there are hundreds of software utilities available for the direct modification of Xml data. Internet Explorer has a built-in parser for Xml, and almost every software development IDE supports Xml syntax highlighting.
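As a rough illustration of how little code a small Xml configuration file demands, the sketch below reads and changes a setting using the parser in Python's standard library. The element and attribute names (config, server, logging, host, port, level) are made up for this example.

```python
import xml.etree.ElementTree as ET

# A hypothetical configuration file, small enough to edit by hand.
CONFIG = """<config>
  <server host="localhost" port="8080"/>
  <logging level="info"/>
</config>"""

root = ET.fromstring(CONFIG)

# Read a setting; attribute values are always strings, so convert explicitly.
server = root.find("server")
port = int(server.get("port"))

# Change a setting in memory and serialize the result back to text.
server.set("port", "9090")
updated = ET.tostring(root, encoding="unicode")
```

The same change could be made in any text editor, which is the point of the emergency-repair argument above.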
On the negative side, however, the Xml format can only store tree-structured data natively. This means that an application whose configuration file represents a graph (for example, a topology or a relationship table) must perform searches to reconstruct the graph when the file is loaded. Using sequential ID numbers is a potential solution, but this damages the format’s human modifiability.
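The ID-number workaround might look like the following sketch. The topology, element names, and attribute names are invented for illustration. The three links form a cycle, which nesting alone cannot express, so the loader has to resolve each ID through a lookup table.

```python
import xml.etree.ElementTree as ET

# Hypothetical topology: three nodes connected in a ring.
TOPOLOGY = """<topology>
  <node id="1" name="router"/>
  <node id="2" name="switch"/>
  <node id="3" name="printer"/>
  <link from="1" to="2"/>
  <link from="2" to="3"/>
  <link from="3" to="1"/>
</topology>"""

def load_graph(text):
    """Rebuild an adjacency list by resolving the ID references in each link."""
    root = ET.fromstring(text)
    # First collect every node so that IDs can be looked up by the links.
    names = {n.get("id"): n.get("name") for n in root.iter("node")}
    adjacency = {name: [] for name in names.values()}
    for link in root.iter("link"):
        adjacency[names[link.get("from")]].append(names[link.get("to")])
    return adjacency

graph = load_graph(TOPOLOGY)
print(graph["printer"])   # prints "['router']"
```

A person editing this file by hand has to keep the numeric IDs consistent across the node and link elements, which is the loss of human modifiability noted above.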


The process of standardization is itself an extremely difficult one. If one proposes a standard too early, the standard doesn’t consider enough information and is therefore not adopted. If the standard is proposed too late, too many people will have adopted their own techniques, and the standard will be ignored. Some examples of standards which came too early or too late would be JavaScript, the OSI layers, and the rewritable DVD.
Xml escapes some of this difficulty by allowing a standard to evolve within the context of an Xml format. Certainly, this isn’t a process that can occur on its own, but Xml provides the general framework and establishes the very basic things necessary for meaningful communication to occur.
An example of a format which has converged in just this way is RSS. RSS is an Xml-based format which provides news-streaming services to its clients. Before RSS existed, there were several formats already available for syndication. RSS itself went through many changes, and was even orphaned by its creator, Netscape. After five years of update and modification, the New York Times offered its news services in RSS format, and shortly thereafter RSS 2.0 became a de facto standard.

Keywords: Web Service Description Language (WSDL), AJAX, programming technique, Xml implementation, Xml data, binary-Xml equivalent, Xml-based configuration file, XML, Xml parsing, JavaScript, the OSI layers, the rewritable DVD, RSS format