Project Description

webStraktor is a programmable World Wide Web data extraction client. It provides a scripting language for collecting, extracting, and storing information available on the Web, including images. The scripting language borrows elements of regular-expression and XPath syntax. The standard webStraktor output format is XML-based and can be encoded in ASCII, UTF-8, or ISO-8859-1 (Latin-1). webStraktor adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through proxy servers. It also provides exhaustive logging and tracing information.
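To illustrate the kind of workflow described above, here is a minimal Python sketch. It is not webStraktor's scripting language or its implementation; it uses only the Python standard library, and the names `allowed_image_links`, `to_xml`, `demo-bot`, and the `example.com` URLs are hypothetical. It shows an XPath-style selection of links, a regular-expression filter for images, a Robots Exclusion Protocol check, and XML output.

```python
import re
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt and page content, inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

PAGE = """\
<html><body>
<a href="/img/cat.png">Cat</a>
<a href="/private/dog.png">Dog</a>
<a href="/about.html">About</a>
</body></html>
"""


def allowed_image_links(page, robots_txt, base="http://example.com", agent="demo-bot"):
    """Return image links from the page that robots.txt permits fetching."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    # XPath-style selection: every <a> element that carries an href attribute.
    root = ET.fromstring(page)
    hrefs = [a.get("href") for a in root.findall(".//a[@href]")]

    # Regular-expression filter: keep only links that look like images.
    images = [h for h in hrefs if re.search(r"\.(png|jpe?g|gif)$", h)]

    # Robots Exclusion Protocol check: drop anything the site disallows.
    return [h for h in images if rules.can_fetch(agent, base + h)]


def to_xml(links):
    """Serialize the extracted links as a small XML document."""
    out = ET.Element("result")
    for link in links:
        ET.SubElement(out, "image").text = link
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(to_xml(allowed_image_links(PAGE, ROBOTS_TXT)))
    # prints <result><image>/img/cat.png</image></result>
```

The page is parsed as well-formed XML here for simplicity; real-world HTML is often not well-formed and would need a more tolerant parser.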
