THttpScan recursively analyzes HTML pages and reports every link it
finds to a text file: HTML, mail, JPG, MPEG, MP3, etc.
THttpScan extracts links from the HTML pages in
the neighborhood of the initial URL. The HTML links it finds are added to a
download queue. THttpScan then downloads each queued page, extracts the
links it contains, and so on.
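The queue-driven scan described above can be sketched in a few lines. This is a generic illustration written in Python, not THttpScan's implementation or API: the `crawl` function, the `fetch` callable, the rule that only links ending in `.html`, `.htm` or `/` are queued, and the `example.test` pages are all assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href/src attribute values from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first scan. `fetch` is a hypothetical callable that
    returns the HTML of a URL, or None when the page is unavailable."""
    queue = deque([start_url])   # pages waiting to be downloaded
    seen = {start_url}           # avoids reporting or queueing a link twice
    found = []                   # every link reported, in discovery order
    while queue:
        page_url = queue.popleft()
        html = fetch(page_url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for raw in parser.links:
            link = urljoin(page_url, raw)  # resolve relative links
            if link in seen:
                continue
            seen.add(link)
            found.append(link)
            # Assumption: only HTML-looking links are queued for scanning.
            if link.endswith((".html", ".htm", "/")):
                queue.append(link)
    return found


# Tiny in-memory "site" standing in for real HTTP downloads.
PAGES = {
    "http://example.test/index.html":
        '<a href="a.html">A</a> <img src="logo.jpg">',
    "http://example.test/a.html":
        '<a href="index.html">home</a> <a href="mailto:me@example.test">mail</a>',
}

links = crawl("http://example.test/index.html", PAGES.get)
```

Every link is reported, whatever its type, but only HTML pages go back into the queue, which is what keeps the scan moving outward from the initial URL.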
Three properties control the scope of the scan:
- the LinkScan property limits scanning to the initial site
or to the initial URL path,
- the LinkReport property restricts reporting to links that belong to the
current site, or to links under the subfolders of the initial link,
- the DepthSearchLevel property limits
how many levels of pages are scanned, counting from the initial page, which
is especially useful when scanning is not limited to a single web site.
By combining the LinkScan and LinkReport
properties with a high DepthSearchLevel value, you can easily scan a whole
site or only one subdirectory of a web site.
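The interplay between a scope restriction and a depth limit can be illustrated with a small filter. This is a conceptual sketch in Python, not the component's code: the function names and the scope values `"site"`, `"path"` and `"any"` are hypothetical and do not correspond to THttpScan's actual property values.

```python
from urllib.parse import urlsplit


def same_site(url, start_url):
    """True when both URLs share the same host (and port, if given)."""
    return urlsplit(url).netloc.lower() == urlsplit(start_url).netloc.lower()


def under_start_path(url, start_url):
    """True when url lies in the start URL's folder or one of its subfolders."""
    if not same_site(url, start_url):
        return False
    # "/docs/index.html" -> "/docs/"
    start_dir = urlsplit(start_url).path.rsplit("/", 1)[0] + "/"
    return urlsplit(url).path.startswith(start_dir)


def should_follow(url, depth, start_url, scope, max_depth):
    """Decide whether a link found at `depth` levels from the start page
    is still inside the scan. `scope` is a hypothetical setting."""
    if depth > max_depth:
        return False
    if scope == "site":
        return same_site(url, start_url)
    if scope == "path":
        return under_start_path(url, start_url)
    return True  # "any": the depth limit is the only restriction


START = "http://example.test/docs/index.html"
```

With a scope of `"site"` or `"path"`, the depth limit can be set high because the scan cannot leave the site. With no scope restriction, the depth limit is the only thing that stops the scan from spreading across the web.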
Events fire for each link found and each page read, returning the
URL, meta tags, document type, referrer, host name...
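As an illustration of the kind of information such events can carry, the sketch below builds a per-link record and collects a page's meta tags. It is written in Python with hypothetical names (`LinkEvent`, `MetaTagParser`, `make_link_event`) and uses the file extension as a stand-in for the document type. It does not reproduce THttpScan's event signatures.

```python
import posixpath
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urlsplit


@dataclass
class LinkEvent:
    """Hypothetical payload passed to a 'link found' handler."""
    url: str
    referrer: str    # page on which the link was found
    host_name: str
    extension: str   # crude stand-in for the document type


class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if a.get("name") and a.get("content") is not None:
                self.meta[a["name"].lower()] = a["content"]


def make_link_event(url, referrer):
    parts = urlsplit(url)
    ext = posixpath.splitext(parts.path)[1].lower()
    return LinkEvent(url, referrer, parts.hostname or "", ext)


# Usage: a handler is just a callable; here it appends to a list.
received = []
on_link_found = received.append
on_link_found(make_link_event("http://example.test/music/song.MP3",
                              "http://example.test/index.html"))

meta_parser = MetaTagParser()
meta_parser.feed('<meta name="Description" content="Demo page">')
```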
Depending on the line speed, thousands of links may be extracted from a
starting URL in a few minutes.
The most common parameters can be set directly from the Object Inspector.
System requirements
- Windows Vista / XP / MCE / 2000 / NT / 98 / 95
- Delphi or C++Builder