Chilkat Software

Spider Component Features

  • Crawl a web site.
  • Accumulate outbound links for crawling other web sites.
  • Cache pages so future crawls can fetch from cache.
  • Robots.txt compliant.
  • Fetch the HTML content of each page crawled.
  • Able to crawl HTTPS pages.
  • Define wildcard "avoid" patterns to skip matching URLs.
  • Define separate "avoid" patterns to skip matching outbound links.
  • Read and connect timeouts.
  • Maximum URL size to avoid ever-growing URLs.
  • Maximum response size to avoid pages with very large or infinite content.
  • Wind-down count to set a limit on pages spidered per site.
  • Thread safe.
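
To make a few of these features concrete, here is a minimal Python sketch of how avoid patterns, a maximum URL length, a wind-down count, and outbound-link accumulation might fit together in a crawl loop. It is an illustration only and does not use or reproduce the Chilkat Spider API: the names `is_avoided`, `crawl`, `fetch_links`, `max_url_len`, and `wind_down_count` are hypothetical, and wildcard matching is assumed to behave like Python's `fnmatch` (where `?` and `[...]` are also special).

```python
from collections import deque
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def is_avoided(url, avoid_patterns):
    """Return True if the URL matches any wildcard pattern ("*" matches any run of characters)."""
    return any(fnmatchcase(url, pattern) for pattern in avoid_patterns)


def crawl(start_url, fetch_links, avoid_patterns=(), avoid_outbound_patterns=(),
          max_url_len=200, wind_down_count=0):
    """Crawl one site breadth-first and return (crawled_urls, outbound_links).

    fetch_links(url) is a caller-supplied (hypothetical) function that returns
    the links found on a page. In this sketch, wind_down_count == 0 means
    "no limit on pages crawled".
    """
    site = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    crawled, outbound = [], []

    while queue:
        # Wind-down: stop once the per-site page limit has been reached.
        if wind_down_count and len(crawled) >= wind_down_count:
            break
        url = queue.popleft()
        crawled.append(url)
        for link in fetch_links(url):
            # Skip duplicates and overly long (possibly ever-growing) URLs.
            if link in seen or len(link) > max_url_len:
                continue
            seen.add(link)
            if urlparse(link).netloc != site:
                # Outbound link: keep it for crawling other sites later.
                if not is_avoided(link, avoid_outbound_patterns):
                    outbound.append(link)
            elif not is_avoided(link, avoid_patterns):
                queue.append(link)
    return crawled, outbound
```

The sketch takes `fetch_links` as a parameter so the filtering logic can be exercised without network access, for example by passing a lookup into an in-memory dictionary of pages. A real crawler would also need robots.txt handling, timeouts, and a response-size cap, which are omitted here.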


Copyright 2000-2017 Chilkat Software, Inc. All rights reserved.


Send feedback to support@chilkatsoft.com


Software components and libraries for Linux, Mac OS X, iOS, Android™, Solaris, RHEL/CentOS, FreeBSD, MinGW,
Azure, Windows 10, Windows 8, Windows Server 2012, Windows 7, Vista, XP, 2003 Server, 2008 Server, etc.