Chilkat Software

HTML to XML Conversion Sample #3

This is the third of several examples showing how the Chilkat HTML-to-XML library converts HTML into well-formed XML.

Here is another HTML sample. This one contains several errors that the HTML-to-XML library corrects automatically: a missing </td> end tag, stray </abc> end tags that have no matching start tag, and a final <tr> that is never closed.

<html>
<head>
<title>This is a test</title>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
</head>
<body>
<table>
<tr>
<td>Row 1, column 1</td>
<td>Row 1, column 2</td>
<td>Row 1, column 3 Oops forgot the ending td
</tr>
<tr>
<td>Row 2, column 1 Oops...</abc>
<td>Row 2, column 2</td>
<td>Row 2, column 3</td>
</tr>
<tr>
<td>Row 2, column 1 Oops...</abc>
<td>Row 2, <div> This is a test </div> column 2</td>
<td>Row 2, column 3</td>
<!-- Oops, forgot to close the last tr -->
</table>

</body>
</html>
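
The errors in the sample above can also be located mechanically. The following is a minimal sketch using Python's standard html.parser module, not the Chilkat library and not a description of how Chilkat works internally. The names TagAuditor and audit are hypothetical, and the fragment is a shortened copy of two rows from the sample.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML (partial list, enough for this sample).
VOID_TAGS = {"meta", "br", "img", "hr", "input", "link"}


class TagAuditor(HTMLParser):
    """Hypothetical helper: records tag-nesting problems while parsing."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of elements that are currently open
        self.problems = []   # findings, in the order they are detected

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            # End tag with no matching start tag, such as </abc>.
            self.problems.append(f"stray </{tag}>")
            return
        # Everything opened after the matching start tag was never closed.
        while self.open_tags[-1] != tag:
            self.problems.append(f"unclosed <{self.open_tags.pop()}>")
        self.open_tags.pop()


def audit(html):
    """Return a list of nesting problems found in the given HTML text."""
    auditor = TagAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.problems


fragment = """<table>
<tr>
<td>Row 1, column 3 Oops forgot the ending td
</tr>
<tr>
<td>Row 2, column 1 Oops...</abc>
<td>Row 2, column 2</td>
</tr>
</table>"""

for problem in audit(fragment):
    print(problem)
```

For this fragment the sketch reports an unclosed <td>, a stray </abc>, and a second unclosed <td> (the cell whose end tag was replaced by </abc>). It only detects problems; it makes no attempt to repair them.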

The XML output is shown below.

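  • The XML below is well-formed and the HTML errors have been corrected: every element is closed and the stray </abc> tags are gone.
  • Empty elements such as <meta> are written with explicit end tags.
  • In this output, the second and third rows are nested inside the first <tr>.
  • HTML comments are saved within <comment> nodes.
  • All text content is placed under <text> nodes.
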
<?xml version="1.0" encoding="windows-1252" ?>

<root>
    <html>
        <head>
            <title>
                <text>This is a test</text>
            </title>
            <meta http-equiv="Content-Language" content="en-us"></meta>
            <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></meta>
        </head>
        <body>
            <table>
                <tr>
                    <td>
                        <text>Row 1, column 1</text>
                    </td>
                    <td>
                        <text>Row 1, column 2</text>
                    </td>
                    <td>
                        <text>Row 1, column 3 Oops forgot the ending td
                        </text>
                    </td>
                    <tr>
                        <td>
                            <text>Row 2, column 1 Oops...</text>
                        </td>
                        <td>
                            <text>Row 2, column 2</text>
                        </td>
                        <td>
                            <text>Row 2, column 3</text>
                        </td>
                    </tr>
                    <tr>
                        <td>
                            <text>Row 2, column 1 Oops...</text>
                        </td>
                        <td>
                            <text>Row 2, </text>
                            <div>
                                <text>This is a test </text>
                            </div>
                            <text>column 2</text>
                        </td>
                        <td>
                            <text>Row 2, column 3</text>
                        </td>
                        <comment>Oops, forgot to close the last tr</comment>
                    </tr>
                </tr>
            </table>
        </body>
    </html>
</root>
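
Because the output is well-formed, any standard XML parser can read it. As a minimal sketch, the following uses Python's xml.etree.ElementTree (not the Chilkat API) to pull the <text> and <comment> nodes out of a fragment copied from the output above. The function name collect is hypothetical, and the fragment leaves out the XML declaration and most of the document for brevity.

```python
import xml.etree.ElementTree as ET

# The last row of the converted table, copied from the output and wrapped in <root>.
xml_fragment = """<root>
    <tr>
        <td>
            <text>Row 2, </text>
            <div>
                <text>This is a test </text>
            </div>
            <text>column 2</text>
        </td>
        <td>
            <text>Row 2, column 3</text>
        </td>
        <comment>Oops, forgot to close the last tr</comment>
    </tr>
</root>"""


def collect(xml_source):
    """Return (texts, comments) found in the XML, in document order."""
    root = ET.fromstring(xml_source)  # raises ParseError if not well-formed
    texts = [node.text for node in root.iter("text")]
    comments = [node.text for node in root.iter("comment")]
    return texts, comments


texts, comments = collect(xml_fragment)
print(texts)
print(comments)
```

Since text and comments are ordinary elements in this format, a single iteration over element names is enough to separate page text from markup, with no special handling for mixed content or XML comment nodes.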

(The Chilkat HTML-to-XML API is offered across many programming languages: Ruby, Perl, Python, Java, C#, VB.NET, etc.)


Privacy Statement. Copyright 2000-2017 Chilkat Software, Inc. All rights reserved.
