Home
Muhammad Ali Hassan edited this page Apr 20, 2016 · 7 revisions
crawler4j is an open source web crawler for Java that provides a simple interface for crawling the Web. Using it, you can set up a multi-threaded web crawler in a few minutes.
To use the latest release of crawler4j, add the following snippet to your pom.xml:
<dependency>
  <groupId>edu.uci.ics</groupId>
  <artifactId>crawler4j</artifactId>
  <version>4.2</version>
</dependency>
# Without Maven
crawler4j JARs are available on the release page and at Maven Central.
If you use crawler4j without Maven, be aware that the crawler4j JAR file has several external dependencies. On the release page, you can find a file named crawler4j-X.Y-with-dependencies.jar that bundles crawler4j with all of its dependencies. Download it and add it to your classpath to get all the dependencies covered.
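After adding the bundled JAR, it can help to confirm that the JVM actually sees it. The following is a minimal sketch, not part of crawler4j itself: the class name `ClasspathCheck` and the helper `isOnClasspath` are hypothetical, and the probe assumes that `edu.uci.ics.crawler4j.crawler.CrawlController` is the controller class shipped in the JAR.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be located on the current classpath. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or something it links against, is missing from the classpath
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed fully qualified name of crawler4j's controller class
        String probe = "edu.uci.ics.crawler4j.crawler.CrawlController";
        if (isOnClasspath(probe)) {
            System.out.println("crawler4j found on the classpath");
        } else {
            System.out.println("crawler4j NOT found; check your -cp setting");
        }
    }
}
```

One way to run it is `java -cp .:crawler4j-X.Y-with-dependencies.jar ClasspathCheck` (use `;` instead of `:` as the separator on Windows), substituting the version you downloaded for X.Y.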