public interface Handler
Modifier and Type | Method and Description |
---|---|
java.lang.String[] | getContentTypes(): Returns an array of String containing the content-types this ContentHandler can process. |
void | handle(CrawlerAccess crawler, java.io.InputStream in, int depth, java.net.URL page): Processes the given page. |
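As a rough illustration of how the two methods fit together, the sketch below implements a handler for plain-text pages that list one link per line. It is not taken from the Wandora sources. `Handler` and `CrawlerAccess` are redeclared here as stand-ins so the example compiles on its own, and the `CrawlerAccess` method `addToQueue(URL, int)` is a hypothetical name, since this page does not list that interface's methods. `LinkListHandler`, `RecordingCrawler`, and `HandlerSketch` are likewise invented for the example. Each link found is queued with the remaining depth minus one.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Demo entry point: feeds a small in-memory page to the handler.
class HandlerSketch {
    static List<String> demo(String body, int depth) {
        RecordingCrawler crawler = new RecordingCrawler();
        URL page;
        try {
            page = URI.create("http://example.com/index.txt").toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
        InputStream in = new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
        new LinkListHandler().handle(crawler, in, depth, page);
        return crawler.queued;
    }

    public static void main(String[] args) {
        demo("http://example.com/a\n\nb.txt\n", 3).forEach(System.out::println);
    }
}

// Stand-in for the crawler callback; addToQueue is a hypothetical method name.
interface CrawlerAccess {
    void addToQueue(URL url, int depth);
}

// Stand-in mirroring the two methods documented on this page.
interface Handler {
    String[] getContentTypes();
    void handle(CrawlerAccess crawler, InputStream in, int depth, URL page);
}

// Hypothetical crawler that only records what a handler reports.
class RecordingCrawler implements CrawlerAccess {
    final List<String> queued = new ArrayList<>();

    @Override
    public void addToQueue(URL url, int depth) {
        queued.add(url + " @ depth " + depth);
    }
}

// Treats each non-blank line of a text/plain page as a link to crawl.
class LinkListHandler implements Handler {
    @Override
    public String[] getContentTypes() {
        return new String[] { "text/plain" };
    }

    @Override
    public void handle(CrawlerAccess crawler, InputStream in, int depth, URL page) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                try {
                    // Resolve relative links against the URL of the current page.
                    URL link = page.toURI().resolve(line).toURL();
                    // Pages reported to the queue get the remaining depth minus one.
                    crawler.addToQueue(link, depth - 1);
                } catch (URISyntaxException | IllegalArgumentException
                         | MalformedURLException e) {
                    // Not a usable link; skip this line.
                }
            }
        } catch (IOException e) {
            // The sketch simply gives up on the page if reading fails.
        }
    }
}
```

Run as written, `main` should report two pages, `http://example.com/a` and `http://example.com/b.txt`, each at depth 2. A real crawler would presumably match the page's content type against `getContentTypes()` before calling `handle`, but that dispatch logic is outside what this page documents.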
void handle(CrawlerAccess crawler, java.io.InputStream in, int depth, java.net.URL page)
Processes the given page. The InputStream contains the data of an object of a content-type this content handler accepts. The handler may use the given CrawlerAccess object to add further pages to the queue of the WebCrawler that asked it to process the page.
Parameters:
crawler - The callback object for the handler. Any objects built from the content of the page can be sent to it.
in - The InputStream of the page.
depth - The remaining depth. When reporting another page to the queue, the depth of that page should be set to this depth - 1.
page - The URL of the page.

java.lang.String[] getContentTypes()
Returns an array of String containing the content-types this ContentHandler can process.

Copyright 2004-2015 Wandora Team