package require expat
expat ?parsername? ?-namespace? ?arg arg ...?
or
xml::parser ?parsername? ?-namespace? ?arg arg ...?
The parsers do not validate the XML document. They do parse the internal DTD and, on request, the external DTD and external entities, provided you resolve the identifiers of the external entities with the -externalentitycommand script (see there).
Additionally, the Tcl extension code that implements this command provides an API for adding handlers coded at the C level. So far, there is one such parser extension command, "tdom". The handler set installed by this extension builds an in-memory "tDOM" DOM tree while the parser is parsing the input.
It is possible to register an arbitrary number of different handler scripts and C level handlers for most of the events. When the event occurs, they are called in turn.
If this option is set to "0" then the parser will not report certain errors if the XML data is not well-formed upon end of input, such as unclosed or unbalanced start or end tags. Instead some data may be saved by the parser until the next call to the parse method, thus delaying the reporting of some of the data.
If this option is set to "1" then documents which are not well-formed upon end of input will generate an error.
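The two settings can be combined to feed a document in pieces. The following is a minimal sketch, not a definitive recipe. It assumes that the option described here is named -final, that it can be changed with configure between calls to parse, and that the expat command is made available by loading the tdom package. The proc HandleStart and the variable seen are names invented for the illustration.

```tcl
package require tdom   ;# assumption: this package provides the expat command

set seen {}

# Hypothetical handler: record every element name as it is reported.
proc HandleStart {name attlist} {
    lappend ::seen $name
}

# -final 0: more input will follow, so unclosed tags are not an error yet.
set parser [expat -elementstartcommand HandleStart -final 0]
$parser parse {<doc><a/>}

# Announce that the next chunk is the last one, then feed it.
$parser configure -final 1
$parser parse {<b/></doc>}

puts $seen
$parser free
```

With -final left at 1 the first call would instead be expected to fail, because the doc element is still open at the end of that chunk.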
The attribute list is a Tcl list consisting of name/value pairs, suitable for passing to the array set Tcl command.
Example:
$parser configure -elementstartcommand HandleStart

proc HandleStart {name attlist} {
    puts stderr "Element start ==> $name has attributes $attlist"
}

$parser parse {<test id="123"></test>}
This would result in the following command being invoked:
HandleStart test {id 123}
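Because the attribute list is a flat list of name/value pairs, a handler can unpack it with array set. The sketch below is a hypothetical variant of HandleStart that returns a string instead of printing. It is called directly with the arguments shown above, so it runs without a parser.

```tcl
# Hypothetical handler: unpack the name/value list into an array.
proc HandleStart {name attlist} {
    array set att $attlist
    if {[info exists att(id)]} {
        return "$name has id $att(id)"
    }
    return "$name has no id"
}

# Invoked directly here, with the same arguments the parser would pass.
puts [HandleStart test {id 123}]   ;# prints: test has id 123
```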
Specifies a Tcl command to associate with the end tag of an element. The actual command consists of this option followed by at least one argument: the element type name. In addition, if the -reportempty option is set then the command may be invoked with the -empty configuration option to indicate whether it is an empty element. See the description of the -reportempty option for an example.
Example:
$parser configure -elementendcommand HandleEnd

proc HandleEnd {name} {
    puts stderr "Element end ==> $name"
}

$parser parse {<test id="123"></test>}
This would result in the following command being invoked:
HandleEnd test
It is not guaranteed that character data will be passed to the application in a single call to this command. That is, the application should be prepared to receive multiple invocations of this callback with no intervening callbacks from other features.
Example:
$parser configure -characterdatacommand HandleText

proc HandleText {data} {
    puts stderr "Character data ==> $data"
}

$parser parse {<test>this is a test document</test>}
This would result in the following command being invoked:
HandleText {this is a test document}
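Since the text of one element may arrive in several pieces, a common approach is to append each piece to a buffer and use the buffer only when the end tag is reported. The sketch below assumes a leaf element without child elements. The variables buffer and texts are invented for the illustration, and the handlers are called by hand to simulate a parser that splits the text in two.

```tcl
set buffer ""
set texts {}

# Hypothetical handlers: collect text pieces, then flush them at the end tag.
proc HandleText {data} {
    append ::buffer $data
}
proc HandleEnd {name} {
    lappend ::texts $::buffer
    set ::buffer ""
}

# Simulate the parser delivering one run of character data in two calls.
HandleText "this is a "
HandleText "test document"
HandleEnd test

puts [lindex $texts 0]   ;# prints: this is a test document
```

In real use the two procs would be registered with -characterdatacommand and -elementendcommand.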
Example:
$parser configure -processinginstructioncommand HandlePI

proc HandlePI {target data} {
    puts stderr "Processing instruction ==> $target $data"
}

$parser parse {<test><?special this is a processing instruction?></test>}
This would result in the following command being invoked:
HandlePI special {this is a processing instruction}
This handler script is special in two ways. First, it is required to return either the external entity opened as a Tcl channel or the content of the external entity as a string. Second, it cannot be stacked like the other handler scripts. Behind the scenes, the external entity referenced by the returned Tcl channel or string is parsed with an expat external entity parser that uses the same handler sets as the main parser. If parsing of the external entity fails, the whole parse is stopped with an error message. If a Tcl command registered as externalentitycommand is not able to resolve an external entity, it is allowed to return TCL_CONTINUE. In this case, the wrapper gives the next registered externalentitycommand a try. If no externalentitycommand is able to handle the external entity, parsing stops with an error.
The parser tries to resolve external entities via this handler script only if necessary. That is, external parameter entities trigger this handler only if -paramentityparsing is used with the argument "always", or if -paramentityparsing is used with the argument "notstandalone" and the document is not marked as standalone.
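A resolver along these lines might look as follows. This is a sketch under several assumptions that the text above does not confirm: that the script is called with the base URI, the system identifier and the public identifier, and that it returns a three-element list of type ("string" or "channel"), base URI and data. The proc name ResolveEntity and the in-memory table entities are invented for the illustration. The proc is exercised directly, without a parser.

```tcl
# Hypothetical in-memory catalogue mapping system identifiers to content.
set entities [dict create chapter1.xml {<para>Hello</para>}]

# Assumed argument order: base URI, system identifier, public identifier.
proc ResolveEntity {base systemId publicId} {
    global entities
    if {![dict exists $entities $systemId]} {
        # Not resolvable here: TCL_CONTINUE lets another command try.
        return -code continue
    }
    # Assumed return format: type, base URI, data.
    return [list string $base [dict get $entities $systemId]]
}

# Direct calls with the arguments a parser would be expected to pass.
puts [ResolveEntity "" chapter1.xml ""]
puts [catch {ResolveEntity "" unknown.xml ""}]   ;# prints: 4 (TCL_CONTINUE)
```

Registration would then be done with $parser configure -externalentitycommand ResolveEntity.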
Example:
$parser configure -commentcommand HandleComment

proc HandleComment {data} {
    puts stderr "Comment ==> $data"
}

$parser parse {<test><!-- this is <obviously> a comment --></test>}
This would result in the following command being invoked:
HandleComment { this is <obviously> a comment }
Examples:
proc elDeclHandler {name content} {
    puts "$name $content"
}

set parser [expat -elementdeclcommand elDeclHandler]
$parser parse {<?xml version='1.0'?>
<!DOCTYPE test [
<!ELEMENT test (#PCDATA)>
]>
<test>foo</test>}
This would result in the following command being invoked:
elDeclHandler test {MIXED {} {} {}}

$parser reset
$parser parse {<?xml version='1.0'?>
<!DOCTYPE test [
<!ELEMENT test (a|b)>
]>
<test><a/></test>}
This would result in the following command being invoked:
elDeclHandler test {CHOICE {} {} {{NAME {} a {}} {NAME {} b {}}}}
Example:
proc attlistHandler {elname name type default isRequired} {
    puts "$elname $name $type $default $isRequired"
}

set parser [expat -attlistdeclcommand attlistHandler]
$parser parse {<?xml version='1.0'?>
<!DOCTYPE test [
<!ELEMENT test EMPTY>
<!ATTLIST test
    id   ID    #REQUIRED
    name CDATA #IMPLIED>
]>
<test/>}
This would result in the following commands being invoked:
attlistHandler test id ID {} 1
attlistHandler test name CDATA {} 0
-idattributeindex Returns the index of the ID attribute passed in the last call to XML_StartElementHandler, or -1 if there is no ID attribute. Each attribute/value pair counts as 2; thus this corresponds to an index into the attributes list passed to the elementstartcommand.
-currentbytecount Returns the number of bytes in the current event. Returns 0 if the event is in an internal entity.
-currentlinenumber Returns the line number of the current parse location.
-currentcolumnnumber Returns the column number of the current parse location.
-currentbyteindex Returns the byte index of the current parse location.
Only one value may be requested at a time.
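These values are most useful when read from inside a handler, while the parser is at the event in question. The following sketch assumes that they are queried through a get method of the parser command and that the expat command comes from the tdom package. The variable lines and the global parser handle are choices made for the illustration.

```tcl
package require tdom   ;# assumption: this package provides the expat command

set lines {}

# Hypothetical handler: remember on which line each element starts.
proc HandleStart {name attlist} {
    lappend ::lines $name [$::parser get -currentlinenumber]
}

set parser [expat -elementstartcommand HandleStart]
$parser parse "<doc>\n<a/>\n</doc>"

puts $lines
$parser free
```

Because only one value may be requested at a time, a handler that needs both line and column would issue two separate get calls.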
A script invoked for any of the parser callback commands, such as -elementstartcommand, -elementendcommand, etc, may return an error code other than "ok" or "error". All callbacks may in addition return "break" or "continue".
If a callback script returns an "error" error code then processing of the document is terminated and the error is propagated in the usual fashion.
If a callback script returns a "break" error code then every handler script in this Tcl handler set is suppressed for the rest of the parse. This does not influence any other handler set.
If a callback script returns a "continue" error code then processing of the current element, and its children, ceases for every handler script in this Tcl handler set, and processing continues with the next (sibling) element. This does not influence any other handler set.
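In a Tcl script these codes are produced with return -code. The sketch below shows a hypothetical start handler that skips every element named draft, together with its subtree, by returning "continue". The element name is invented for the illustration. The handler is called directly inside catch so that the return code is visible without a parser.

```tcl
# Hypothetical handler: ignore <draft> elements and everything inside them.
proc HandleStart {name attlist} {
    if {$name eq "draft"} {
        return -code continue
    }
    puts "Element start ==> $name"
}

# catch reports the numeric completion code of the script it runs.
set skipped [catch {HandleStart draft {}}]
set normal  [catch {HandleStart para {}}]

puts "$skipped $normal"   ;# prints: 4 0  (TCL_CONTINUE and TCL_OK)
```

Returning -code break in the same place would instead switch off this handler set for the remainder of the document.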