Mht Python Reference Documentation

Mht

Current Version: 10.0.0

Chilkat MHT can generate email objects from HTML files and URLs, and convert HTML to MHT or EML files.

Object Creation

obj = chilkat2.Mht()

Properties

AbortCurrent
bool AbortCurrent
Introduced in version 9.5.0.58

When set to True, causes the currently running method to abort. Methods that always finish quickly (i.e., have no lengthy file operations or network communications) are not affected. If no method is running, then this property is automatically reset to False when the next method is called. When the abort occurs, this property is reset to False. Both synchronous and asynchronous method calls can be aborted. (A synchronous method call could be aborted by setting this property from a separate thread.)
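
Setting the property from a separate thread could look roughly like the sketch below. The helper name abort_after is hypothetical, and mht is assumed to be an object created with chilkat2.Mht(); only the AbortCurrent property comes from this page.

```python
import threading


def abort_after(mht, seconds):
    """Schedule `mht.AbortCurrent = True` after `seconds` seconds.

    Returns the started Timer so the caller can cancel it if the
    long-running method finishes on its own first.
    """
    def _request_abort():
        # Runs on the timer thread while the main thread is blocked
        # inside a synchronous Mht method call.
        mht.AbortCurrent = True

    timer = threading.Timer(seconds, _request_abort)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    return timer
```

A caller would typically start the timer, make the synchronous call, and then call cancel() on the returned timer. If the timer fires when no method is running, the property is reset to False on the next method call, as described above.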

BaseUrl
string BaseUrl

When processing an HTML file or string (not a website URL), this defines the base URL to be used when converting relative HREFs to absolute HREFs.

ConnectTimeout
int ConnectTimeout

The amount of time in seconds to wait before timing out when connecting to an HTTP server. The default value is 10 seconds.

DebugHtmlAfter
string DebugHtmlAfter

A filename to save the result HTML when converting a URL, file, or HTML string. If problems are experienced, the before/after HTML can be analyzed to help determine the cause.

DebugHtmlBefore
string DebugHtmlBefore

A filename to save the input HTML when converting a URL, file, or HTML string. If problems are experienced, the before/after HTML can be analyzed to help determine the cause.

DebugLogFilePath
string DebugLogFilePath

If set to a file path, causes each Chilkat method or property call to automatically append its LastErrorText to the specified log file. The information is appended such that if a hang or crash occurs, it is possible to see the context in which the problem occurred, as well as a history of all Chilkat calls up to the point of the problem. The VerboseLogging property can be set to provide more detailed information.

This property is typically used for debugging the rare cases where a Chilkat method call hangs or generates an exception that halts program execution (i.e. crashes). A hang or crash should generally never happen. The typical causes of a hang are:

  1. a timeout related property was set to 0 to explicitly indicate that an infinite timeout is desired,
  2. the hang is actually a hang within an event callback (i.e. it is a hang within the application code), or
  3. there is an internal problem (bug) in the Chilkat code that causes the hang.
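
A minimal sketch of turning this logging on while also checking for the first cause in the list. The helper name and the TIMEOUT_PROPERTIES tuple are hypothetical, mht is assumed to be a chilkat2.Mht() object, and treating 0 as "infinite" for these two particular properties is an assumption based on the list above.

```python
# Timeout-related properties documented on this page.
TIMEOUT_PROPERTIES = ("ConnectTimeout", "ReadTimeout")


def enable_debug_logging(mht, log_path, verbose=True):
    """Enable per-call logging and report timeout properties set to 0.

    Returns the names of timeout properties whose value is 0, which
    (by assumption) request an infinite timeout and can cause a hang.
    """
    mht.DebugLogFilePath = log_path
    mht.VerboseLogging = verbose
    return [name for name in TIMEOUT_PROPERTIES if getattr(mht, name) == 0]
```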

DebugTagCleaning
bool DebugTagCleaning

When True, causes the Mht class to be much more verbose in its logging. The default is False.

EmbedImages
bool EmbedImages

Controls whether images are embedded in the MHT/EML, or whether the IMG SRC attributes are left as external URL references. If False, the IMG SRC attributes are converted to absolute URLs (if necessary) and the images are not embedded within the MHT/EML.

EmbedLocalOnly
bool EmbedLocalOnly

If True, only images found on the local filesystem (i.e. links to files) will be embedded within the MHT.

FetchFromCache
bool FetchFromCache

If True, page parts such as images, style sheets, etc. will be fetched from the disk cache if possible. The disk cache root may be defined by calling AddCacheRoot. The default value is False.

IgnoreMustRevalidate
bool IgnoreMustRevalidate

Some HTTP responses contain a "Cache-Control: must-revalidate" header. If this is present, the server is requesting that the client always issue a revalidate HTTP request instead of serving the page directly from cache. If IgnoreMustRevalidate is set to True (and FetchFromCache is also set to True), then Chilkat MHT will serve the page directly from cache without revalidating until the page is no longer fresh.

The default value of this property is False.

IgnoreNoCache
bool IgnoreNoCache

Some HTTP responses contain headers of various types that indicate that the page should not be cached. Chilkat MHT will adhere to this unless this property is set to True.

The default value of this property is False.

LastErrorHtml
string LastErrorHtml (read-only)

Provides information in HTML format about the last method/property called. If a method call returns a value indicating failure, or behaves unexpectedly, examine this property to get more information.

LastErrorText
string LastErrorText (read-only)

Provides information in plain-text format about the last method/property called. If a method call returns a value indicating failure, or behaves unexpectedly, examine this property to get more information.

LastErrorXml
string LastErrorXml (read-only)

Provides information in XML format about the last method/property called. If a method call returns a value indicating failure, or behaves unexpectedly, examine this property to get more information.

LastMethodSuccess
bool LastMethodSuccess

Indicates whether the last method call succeeded or failed. A value of True indicates success, a value of False indicates failure. This property is automatically set for method calls. It is not modified by property accesses. The property is automatically set to indicate success for the following types of method calls:

  • Any method that returns a string.
  • Any method returning a Chilkat object, binary bytes, or a date/time.
  • Any method returning a standard boolean status value where success = True and failure = False.
  • Any method returning an integer where failure is defined by a return value less than zero.

Note: Methods that do not fit the above requirements will always set this property equal to True. For example, a method that returns no value (such as a "void" in C++) will technically always succeed.
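
One way to use this property is a small wrapper that turns a failed call into a Python exception. This is only a sketch: MhtError and call_checked are hypothetical names, and mht is assumed to be a chilkat2.Mht() object.

```python
class MhtError(RuntimeError):
    """Raised when an Mht method call reports failure."""


def call_checked(mht, method_name, *args):
    """Call `mht.<method_name>(*args)` and raise MhtError on failure.

    Relies on LastMethodSuccess, which is set after every method call,
    and attaches LastErrorText to the exception for diagnosis.
    """
    result = getattr(mht, method_name)(*args)
    if not mht.LastMethodSuccess:
        raise MhtError(f"{method_name} failed:\n{mht.LastErrorText}")
    return result
```

For example, call_checked(mht, "GetMHT", "page.html") would return the MHT string or raise, instead of returning None.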

NoScripts
bool NoScripts

Only applies when creating MHT files. Scripts are always removed when creating EML or emails from HTML. If set to True, all scripts are removed; if set to False (the default), scripts are not removed.

NtlmAuth
bool NtlmAuth

Setting this property to True causes the MHT component to use NTLM authentication (also known as IWA, or Integrated Windows Authentication) when authenticating with an HTTP server.

The default value of this property is False.

NumCacheLevels
int NumCacheLevels

The number of directory levels to be used under each cache root. The default is 0, meaning that each cached item is stored directly in a cache root directory. A value of 1 causes each cached page to be stored in one of 256 subdirectories named "0", "1", "2", ... "255" under a cache root. A value of 2 causes two levels of subdirectories ("0..255/0..255") under each cache root. The MHT control automatically creates subdirectories as needed. The reason for multiple levels is to alleviate problems that may arise when huge numbers of files are stored in a single directory. For example, Windows Explorer does not behave well when trying to display the contents of directories with thousands of files.
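
As a rough illustration of how quickly the directory count grows, the sketch below counts the lowest-level directories that can hold cached files. It assumes subdirectory names run from "0" through "255" (256 names per level); the function name is hypothetical and nothing here calls the library.

```python
def leaf_directory_count(num_cache_roots, num_cache_levels):
    """Return how many lowest-level directories can hold cached files.

    With 0 levels each cache root is itself the only directory; every
    additional level multiplies the count by 256 ("0" through "255").
    """
    if num_cache_roots < 0 or num_cache_levels < 0:
        raise ValueError("roots and levels must be non-negative")
    return num_cache_roots * 256 ** num_cache_levels
```

With four cache roots, one level gives 1,024 directories and two levels give 262,144, so a single level is likely enough for most caches.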

NumCacheRoots
int NumCacheRoots (read-only)

The number of cache roots to be used for the disk cache. Multiple roots allow the disk cache to be spread out over multiple disk drives. Each cache root is a string indicating the drive letter and directory path, for example "E:\Cache". To create a cache with four roots, call AddCacheRoot once for each directory root.

PreferIpv6
bool PreferIpv6

If True, then use IPv6 over IPv4 when both are supported for a particular domain. The default value of this property is False, which will choose IPv4 over IPv6.

PreferMHTScripts
bool PreferMHTScripts

This property provides a means for the noscript option to be selected when possible. If PreferMHTScripts = False, then scripts with noscript alternatives are removed and the noscript content is kept. If True (the default), then scripts are preserved and the noscript options are discarded.

Proxy
string Proxy

(Optional) A proxy host:port if a proxy is necessary to access the Internet. The proxy string should be formatted as "hostname:port", such as "www.chilkatsoft.com:100".

ProxyLogin
string ProxyLogin

If an HTTP proxy is used and it requires authentication, this property specifies the HTTP proxy login.

ProxyPassword
string ProxyPassword

If an HTTP proxy is used and it requires authentication, this property specifies the HTTP proxy password.
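
The three proxy properties are usually set together. The sketch below builds the "hostname:port" string and applies optional credentials; configure_http_proxy is a hypothetical helper, mht is assumed to be a chilkat2.Mht() object, and rejecting hostnames that contain a colon is a simplification made here, not a documented rule.

```python
def configure_http_proxy(mht, hostname, port, login=None, password=None):
    """Set Proxy to "hostname:port" and, if given, the proxy credentials.

    Returns the proxy string that was assigned.
    """
    if not hostname or ":" in hostname:
        raise ValueError(f"invalid proxy hostname: {hostname!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid proxy port: {port!r}")

    mht.Proxy = f"{hostname}:{port}"
    # Credentials are only needed when the proxy requires authentication.
    if login is not None:
        mht.ProxyLogin = login
    if password is not None:
        mht.ProxyPassword = password
    return mht.Proxy
```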

ReadTimeout
int ReadTimeout

The amount of time in seconds to wait before timing out when reading from an HTTP server. The ReadTimeout is the amount of time that needs to elapse while no additional data is forthcoming. During a long data transfer, if the data stream halts for more than this amount, it will timeout. Otherwise, there is no limit on the length of time for the entire data transfer.

The default value is 20 seconds.
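
A small sketch of setting both timeouts in one place. The helper name is hypothetical and mht is assumed to be a chilkat2.Mht() object. Refusing non-positive values is a choice made here to avoid accidentally requesting an unbounded wait; it is not a restriction imposed by the library.

```python
def set_timeouts(mht, connect_seconds=10, read_seconds=20):
    """Apply ConnectTimeout and ReadTimeout (both in seconds) to `mht`.

    The defaults mirror the documented default values.
    """
    for name, value in (("ConnectTimeout", connect_seconds),
                        ("ReadTimeout", read_seconds)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    mht.ConnectTimeout = connect_seconds
    mht.ReadTimeout = read_seconds
```

Note that ReadTimeout bounds the idle time between chunks of data, not the total transfer time, so a slow but steady download does not time out.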

RequireSslCertVerify
bool RequireSslCertVerify

If True, then the HTTP client will verify the server's SSL certificate. If the certificate is expired, or if the cert's signature is invalid, the connection is not allowed. The default value of this property is False.

SocksHostname
string SocksHostname

The SOCKS4/SOCKS5 hostname or IPv4 address (in dotted decimal notation). This property is only used if the SocksVersion property is set to 4 or 5.

SocksPassword
string SocksPassword

The SOCKS5 password (if required). The SOCKS4 protocol does not include the use of a password, so this does not apply to SOCKS4.

SocksPort
int SocksPort

The SOCKS4/SOCKS5 proxy port. The default value is 1080. This property only applies if a SOCKS proxy is used (if the SocksVersion property is set to 4 or 5).

SocksUsername
string SocksUsername

The SOCKS4/SOCKS5 proxy username. This property is only used if the SocksVersion property is set to 4 or 5.

SocksVersion
int SocksVersion

May be set to one of the following integer values:

0 - No SOCKS proxy is used. This is the default.
4 - Connect via a SOCKS4 proxy.
5 - Connect via a SOCKS5 proxy.
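
The Socks* properties depend on one another, so it can help to set them through a single function. The following is a sketch under stated assumptions: configure_socks is a hypothetical helper, mht is assumed to be a chilkat2.Mht() object, and raising an error for a SOCKS4 password is a choice made here to surface a likely mistake.

```python
def configure_socks(mht, version, hostname="", port=1080,
                    username="", password=""):
    """Configure (or disable) the SOCKS proxy settings on `mht`.

    version: 0 = no SOCKS proxy, 4 = SOCKS4, 5 = SOCKS5.
    """
    if version not in (0, 4, 5):
        raise ValueError(f"SocksVersion must be 0, 4, or 5, got {version!r}")
    if version == 4 and password:
        raise ValueError("SOCKS4 does not use a password")

    mht.SocksVersion = version
    if version == 0:
        return  # the other Socks* properties are ignored

    mht.SocksHostname = hostname
    mht.SocksPort = port
    mht.SocksUsername = username
    if version == 5:
        mht.SocksPassword = password
```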

UnpackDirect
bool UnpackDirect
Introduced in version 9.5.0.47

If True, then the UnpackMHT and UnpackMHTString methods will unpack the MHT directly with no transformations. Normally, the related parts are unpacked to a "parts" sub-directory, and the unpacked HTML is edited to update references to point to the unpacked image and script files. When unpacking direct, the HTML is not edited, and the related parts are unpacked to sub-directories rooted in the directory where the HTML file is created (i.e. the unpack directory). When unpacking direct, the "partsSubDir" argument of the UnpackMHT* methods is unused.

Note: It is only possible to directly unpack MHT files where the Content-Location headers DO NOT contain URLs. The MHT must be such that the Content-Location headers of the related items contain relative paths.

Note: The default value of this property is False.

UnpackUseRelPaths
bool UnpackUseRelPaths

Controls whether absolute or relative paths are used when referencing images in the unpacked HTML. The default value is True indicating that relative paths will be used. To use absolute paths, set this property value equal to False.

UpdateCache
bool UpdateCache

Controls whether the cache is automatically updated with the responses from HTTP GET requests. If True, the disk cache is updated; if False (the default), the cache is not updated.

UseCids
bool UseCids

Controls whether CID URLs are used for embedded references when generating MHT or EML documents. If UseCids is False, then URLs are left unchanged and the embedded items will contain "content-location" headers that match the URLs in the HTML. If True, CIDs are generated and the URLs within the HTML are replaced with "CID:" links.

The default value of this property is True.

UseFilename
bool UseFilename

If True, a "filename" attribute is added to each Content-Disposition MIME header field for each embedded item (image, style sheet, etc.). If False, then no filename attribute is added.

The default value of this property is True.

UseIEProxy
bool UseIEProxy

If True, the proxy host/port used by Internet Explorer will also be used by Chilkat MHT.

UseInline
bool UseInline

If True, an "inline" attribute is added to each Content-Disposition MIME header field for each embedded item (image, style sheet, etc.). If False, then no inline attribute is added.

The default value of this property is True.

VerboseLogging
bool VerboseLogging

If set to True, then the contents of LastErrorText (or LastErrorXml, or LastErrorHtml) may contain more verbose information. The default value is False. Verbose logging should only be used for debugging. The potentially large quantity of logged information may adversely affect performance.

Version
string Version (read-only)

The version of the component/library, such as "9.5.0.94".

WebSiteLogin
string WebSiteLogin

(Optional) Specifies the login if a Web page is accessed that requires a login.

WebSiteLoginDomain
string WebSiteLoginDomain

The optional domain name to be used with NTLM authentication.

WebSitePassword
string WebSitePassword

(Optional) Specifies the password if a Web page is accessed that requires a login and password.


Methods

AddCacheRoot
void AddCacheRoot(string dir)

If disk caching is used, this must be called once for each cache root. For example, if the cache is spread across D:\cacheRoot, E:\cacheRoot, and F:\cacheRoot, an application would set up the cache by calling AddCacheRoot three times: once with "D:\cacheRoot", once with "E:\cacheRoot", and once with "F:\cacheRoot".
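
A sketch of setting up the disk cache in one step, combining AddCacheRoot with the cache-related properties documented on this page. The helper name configure_disk_cache is hypothetical, and mht is assumed to be a chilkat2.Mht() object.

```python
def configure_disk_cache(mht, roots, levels=0, fetch=True, update=True):
    """Register each cache root and set the related cache properties.

    Returns NumCacheRoots so the caller can confirm how many roots
    the object now knows about.
    """
    for root in roots:
        mht.AddCacheRoot(root)  # one call per cache root
    mht.NumCacheLevels = levels
    mht.FetchFromCache = fetch  # read page parts from the cache when possible
    mht.UpdateCache = update    # store HTTP GET responses in the cache
    return mht.NumCacheRoots
```

Both FetchFromCache and UpdateCache default to False, so registering cache roots alone may not be enough for the cache to be used.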

AddCustomHeader
void AddCustomHeader(string name, string value)

Adds a custom HTTP header to all HTTP requests sent by the MHT component. To add multiple header fields, call this method once for each custom header.

AddExternalStyleSheet
void AddExternalStyleSheet(string url)

(This method rarely needs to be called.) Includes an additional style sheet that would not normally be included with the HTML. This method is provided for cases when style sheet names are constructed and dynamically included in JavaScript such that Chilkat MHT cannot know beforehand which style sheet to embed. By default, Chilkat MHT downloads and embeds all style sheets externally referenced by the HTML.

ClearCustomHeaders
void ClearCustomHeaders()

Removes all custom headers that may have accumulated from previous calls to AddCustomHeader.
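
Because custom headers accumulate across calls, replacing the whole set is a common need. A minimal sketch, assuming mht is a chilkat2.Mht() object; set_custom_headers is a hypothetical helper name.

```python
def set_custom_headers(mht, headers):
    """Replace all custom HTTP headers on `mht` with `headers` (a dict)."""
    mht.ClearCustomHeaders()  # drop headers left over from earlier calls
    for name, value in headers.items():
        mht.AddCustomHeader(name, value)  # one call per header field
```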

ExcludeImagesMatching
void ExcludeImagesMatching(string pattern)

(This method rarely needs to be called.) Tells Chilkat MHT to not embed any images whose URL matches a pattern. Sometimes images can be referenced within style sheets and not actually used when rendering the page. In cases like those, the image will appear as an attachment in the HTML email. This feature allows you to explicitly remove those images from the email so no attachments appear.

GetAndSaveEML
bool GetAndSaveEML(string url_or_htmlFilepath, string emlPath)

Creates an EML file from a web page or HTML file. All external images and style sheets are downloaded and embedded in the EML file.

Returns True for success, False for failure.

GetAndSaveEMLAsync
Task GetAndSaveEMLAsync(string url_or_htmlFilepath, string emlPath)

Creates an asynchronous task to call the GetAndSaveEML method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure
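
The Async methods on this page all return a Task that has to be started before it does any work. The sketch below shows one way to run such a task and wait for it. It relies on the Run method, the Finished property, the SleepMs method, and the LastErrorText property of the Chilkat Task class, which this page does not describe, so treat those names as assumptions to check against the Task reference. The helper name run_task_to_completion is hypothetical.

```python
def run_task_to_completion(task, poll_ms=50):
    """Start a Chilkat Task, poll until it has finished, and return it.

    `task` is the value returned by one of the ...Async methods,
    which is None when the task could not be created.
    """
    if task is None:
        raise RuntimeError("the Async method did not return a task")
    if not task.Run():  # queues the task on a background thread
        raise RuntimeError(task.LastErrorText)
    while not task.Finished:
        task.SleepMs(poll_ms)  # wait a little before checking again
    return task
```

After the task finishes, the result of the underlying method would be read from the task object (for a bool-returning method such as GetAndSaveEML, the Task class provides GetResultBool).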

GetAndSaveMHT
bool GetAndSaveMHT(string url_or_htmlFilepath, string mhtPath)

Creates an MHT file from a web page or local HTML file. All external images, scripts, and style sheets are downloaded and embedded in the MHT file.

Returns True for success, False for failure.
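
A minimal sketch of saving a page as MHT with explicit error handling. save_page_as_mht is a hypothetical helper, mht is assumed to be a chilkat2.Mht() object, and the keyword defaults mirror the values used in this sketch rather than documented defaults, except NoScripts, whose documented default is False.

```python
def save_page_as_mht(mht, url_or_html_path, mht_path,
                     embed_images=True, no_scripts=False):
    """Convert a URL or local HTML file to an MHT file at `mht_path`.

    Raises RuntimeError carrying LastErrorText if the conversion fails.
    """
    mht.EmbedImages = embed_images  # embed images rather than link to them
    mht.NoScripts = no_scripts      # True strips all scripts from the MHT
    if not mht.GetAndSaveMHT(url_or_html_path, mht_path):
        raise RuntimeError(mht.LastErrorText)
    return mht_path
```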

GetAndSaveMHTAsync
Task GetAndSaveMHTAsync(string url_or_htmlFilepath, string mhtPath)

Creates an asynchronous task to call the GetAndSaveMHT method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

GetAndZipEML
bool GetAndZipEML(string url_or_htmlFilepath, string zipEntryFilename, string zipFilename)

Creates an EML file from a web page or HTML file, compresses it, and appends it to a new or existing Zip file. All external images and style sheets are downloaded and embedded in the EML.

Returns True for success, False for failure.

GetAndZipEMLAsync
Task GetAndZipEMLAsync(string url_or_htmlFilepath, string zipEntryFilename, string zipFilename)

Creates an asynchronous task to call the GetAndZipEML method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

GetAndZipMHT
bool GetAndZipMHT(string url_or_htmlFilepath, string zipEntryFilename, string zipFilename)

Creates an MHT file from a web page or HTML file, compresses it, and appends it to a new or existing Zip file. All external images and style sheets are downloaded and embedded in the MHT.

Returns True for success, False for failure.

GetAndZipMHTAsync
Task GetAndZipMHTAsync(string url_or_htmlFilepath, string zipEntryFilename, string zipFilename)

Creates an asynchronous task to call the GetAndZipMHT method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

GetCacheRoot
string GetCacheRoot(int index)

Returns the Nth cache root (indexing begins at 0). Cache roots are set by calling AddCacheRoot one or more times.

Returns None on failure

GetEML
string GetEML(string url_or_htmlFilepath)

Creates EML from a web page or HTML file, and returns the EML (MIME) message data as a string.

Returns None on failure

GetEMLAsync
Task GetEMLAsync(string url_or_htmlFilepath)

Creates an asynchronous task to call the GetEML method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

GetMHT
string GetMHT(string url_or_htmlFilepath)

Creates MHT from a web page or local HTML file, and returns the MHT (MIME) message data as a string.

Returns None on failure

GetMHTAsync
Task GetMHTAsync(string url_or_htmlFilepath)

Creates an asynchronous task to call the GetMHT method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

HtmlToEML
string HtmlToEML(string htmlText)

Creates an in-memory EML string from an in-memory HTML string. All external images and style sheets are downloaded and embedded in the EML string that is returned.

Returns None on failure
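
When the HTML comes from a string rather than a URL, relative links have nothing to resolve against unless BaseUrl is set. The sketch below combines the two. html_to_eml is a hypothetical helper and mht is assumed to be a chilkat2.Mht() object.

```python
def html_to_eml(mht, html_text, base_url=None):
    """Convert an in-memory HTML string to an EML (MIME) string.

    `base_url`, if given, is used to resolve relative HREFs in the HTML.
    Raises RuntimeError carrying LastErrorText if the conversion fails.
    """
    if base_url is not None:
        mht.BaseUrl = base_url
    eml = mht.HtmlToEML(html_text)
    if eml is None:  # the method returns None on failure
        raise RuntimeError(mht.LastErrorText)
    return eml
```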

HtmlToEMLAsync
Task HtmlToEMLAsync(string htmlText)

Creates an asynchronous task to call the HtmlToEML method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

HtmlToEMLFile
bool HtmlToEMLFile(string html, string emlFilename)

Creates an EML file from an in-memory HTML string. All external images and style sheets are downloaded and embedded in the EML file.

Returns True for success, False for failure.

HtmlToEMLFileAsync
Task HtmlToEMLFileAsync(string html, string emlFilename)

Creates an asynchronous task to call the HtmlToEMLFile method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

HtmlToMHT
string HtmlToMHT(string htmlText)

Creates an in-memory MHT web archive from an in-memory HTML string. All external images and style sheets are downloaded and embedded in the MHT string.

Returns None on failure

HtmlToMHTAsync
Task HtmlToMHTAsync(string htmlText)

Creates an asynchronous task to call the HtmlToMHT method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

HtmlToMHTFile
bool HtmlToMHTFile(string html, string mhtFilename)

Creates an MHT file from an in-memory HTML string. All external images and style sheets are downloaded and embedded in the MHT file.

Returns True for success, False for failure.

HtmlToMHTFileAsync
Task HtmlToMHTFileAsync(string html, string mhtFilename)

Creates an asynchronous task to call the HtmlToMHTFile method with the arguments provided. (Async methods are available starting in Chilkat v9.5.0.52.)

Returns None on failure

LoadTaskCaller
bool LoadTaskCaller(Task task)
Introduced in version 9.5.0.80

Loads the caller of the task's async method.

Returns True for success, False for failure.

RemoveCustomHeader
void RemoveCustomHeader(string name)

Removes a custom header by header field name.

RestoreDefaults
void RestoreDefaults()

Restores the default property settings.

UnpackMHT
bool UnpackMHT(string mhtFilename, string unpackDir, string htmlFilename, string partsSubDir)

Unpacks the contents of an MHT file. The destination directory is specified by unpackDir. The name of the HTML file created is specified by htmlFilename, and supporting files (images, JavaScript files, etc.) are created in partsSubDir, which is automatically created if it does not already exist.

Returns True for success, False for failure.
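
A sketch of unpacking an MHT file and getting back the path of the HTML file that was written. unpack_mht is a hypothetical helper, mht is assumed to be a chilkat2.Mht() object, and the default names "index.html" and "parts" are illustrative choices made here.

```python
import os


def unpack_mht(mht, mht_filename, unpack_dir,
               html_filename="index.html", parts_sub_dir="parts",
               direct=False):
    """Unpack `mht_filename` into `unpack_dir` and return the HTML path.

    With direct=True the UnpackDirect property is enabled, in which
    case `parts_sub_dir` is unused by the library.
    """
    mht.UnpackDirect = direct
    ok = mht.UnpackMHT(mht_filename, unpack_dir, html_filename, parts_sub_dir)
    if not ok:
        raise RuntimeError(mht.LastErrorText)
    return os.path.join(unpack_dir, html_filename)
```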

UnpackMHTString
bool UnpackMHTString(string mhtString, string unpackDir, string htmlFilename, string partsSubDir)

Same as UnpackMHT, except the MHT is passed in as an in-memory string.

Returns True for success, False for failure.
