Open Source Software
YaCy is a free search engine that anyone can use to build a search portal for their intranet or to help search the public internet. When contributing to the world-wide peer network, the scale of YaCy is limited only by the number of users in the world, and it can index billions of web pages. It is fully decentralized: all users of the search engine network are equal, the network does not store user search requests, and it is not possible for anyone to censor the content of the shared index.
Oracle Technology Network > Java Challenge: Win a Trip to JavaOne 2014
Here you can learn how to write a multithreaded Java application, learn how to write a webcrawler, and, by the way, learn how to write stuff that is object-oriented and reusable, or use the provided webcrawler more or less off-the-shelf. More or less in this case means that you have to be able to make minor adjustments to the Java source code yourself and compile it. How to write a multi-threaded webcrawler in Java
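As a rough illustration of the idea (not the code from the linked tutorial), a multithreaded crawler can be sketched as a breadth-first walk in which each level of pages is fetched in parallel by a thread pool. The class name `MiniCrawler`, the `linkSource` function, and the depth limit are all assumptions made for this sketch; `linkSource` stands in for "download the page and return its links" so that the crawl logic stays reusable and can be tried on an in-memory link graph.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: a level-by-level crawler that fetches each level in parallel.
final class MiniCrawler {
    // Assumption: maps a URL to the links found on that page.
    private final Function<String, List<String>> linkSource;
    private final int threads;

    MiniCrawler(Function<String, List<String>> linkSource, int threads) {
        this.linkSource = linkSource;
        this.threads = threads;
    }

    // Returns every URL discovered within maxDepth link hops of startUrl.
    Set<String> crawl(String startUrl, int maxDepth) {
        // Only the calling thread touches this set, so no synchronization is needed.
        Set<String> visited = new LinkedHashSet<>();
        visited.add(startUrl);
        List<String> frontier = Collections.singletonList(startUrl);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
                // One task per page in the current level; worker threads run them.
                List<Callable<List<String>>> tasks = new ArrayList<>();
                for (String url : frontier) {
                    tasks.add(() -> linkSource.apply(url));
                }
                List<String> next = new ArrayList<>();
                // invokeAll blocks until every task of this level has finished.
                for (Future<List<String>> result : pool.invokeAll(tasks)) {
                    try {
                        for (String link : result.get()) {
                            if (visited.add(link)) {
                                next.add(link); // first time seen: crawl it next level
                            }
                        }
                    } catch (ExecutionException e) {
                        // A single page failed; skip it and keep crawling.
                    }
                }
                frontier = next;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        } finally {
            pool.shutdown();
        }
        return visited;
    }
}
```

Keeping the visited set on the coordinating thread avoids shared mutable state between workers, at the cost of waiting for the slowest page in each level. A real crawler would also need politeness delays and robots.txt handling, which this sketch leaves out.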
BotSpot 2005 ®: the spot for all bots
WebSPHINX (Website-Specific Processors for HTML INformation eXtraction) is a Java class library and interactive development environment for web crawlers. A web crawler (also called a robot or spider) is a program that browses and processes Web pages automatically. WebSPHINX consists of two parts: the Crawler Workbench and the WebSPHINX class library. WebSPHINX: A Personal, Customizable Web Crawler
Java tip: How to get a web page. Technologies: Java 5+. The starting point for building a link checker, web spider, or web page analyzer is, of course, to get the web page from the web server. Java's java.net package includes classes to manage URLs and to open web server connections. This tip shows how to use them to get a text, image, audio, or data file from a web server.
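A minimal sketch of that starting point, using only `java.net.URL` and `java.net.URLConnection` from the JDK. The class name `PageFetcher`, the 10-second timeouts, and the UTF-8 decoding are assumptions of this sketch, not details from the linked tip; it also uses try-with-resources, so it needs Java 7 or later rather than Java 5.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: downloads whatever a URL points at.
final class PageFetcher {
    // Returns the raw bytes, which suits image, audio, or other binary files.
    static byte[] fetchBytes(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(10_000); // assumed limits so a dead server cannot hang us
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        }
    }

    // Decodes the bytes as UTF-8 text; this ignores the charset the server reports.
    static String fetchText(URL url) throws IOException {
        return new String(fetchBytes(url), StandardCharsets.UTF_8);
    }
}
```

Because `URL.openConnection()` also handles `file:` URLs, the same helper can be tried against a local file before pointing it at a web server.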
Capturing Screen in Java,Capture Screen Shot,How to Capture Screen Using Java Swing
HTML Parser - Welcome to the homepage of HTMLParser, a super-fast real-time parser for real-world HTML. What has attracted most developers to HTMLParser has been its simplicity in design, speed and ability to handle streaming real-world HTML. The two fundamental use-cases that are handled by the parser are extraction and transformation (the synthesis use-case, where HTML pages are created from scratch, is better handled by other tools closer to the source of data). While prior versions concentrated on data extraction from web pages, Version 1.4 of the HTMLParser has substantial improvements in the area of transforming web pages, with simplified tag creation and editing, and verbatim toHtml() method output. In general, to use the HTMLParser you will need to be able to write code in the Java programming language.
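To make the extraction use-case concrete, here is a JDK-only sketch that pulls link targets out of a page. It does not use the HTMLParser API; it swaps in a regular expression, which is far less robust than a real parser on messy HTML. The class name `LinkExtractor` and the pattern are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: regex-based link extraction, not the HTMLParser library.
final class LinkExtractor {
    // Matches a quoted href attribute inside an <a ...> tag, in any letter case.
    // Group 1 is the quote character, group 2 is the link target.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the href values in document order, unresolved and unfiltered.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }
}
```

Unquoted attribute values, links inside comments or scripts, and relative-URL resolution are all ignored here, which is the kind of real-world detail a dedicated parser is meant to handle.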
In this article, I guide you through the steps involved in designing a utility to download a Website. This utility downloads only text and image files, but it can easily be extended to download files of any type. At the end of the article I'll provide tips on how you can extend the utility. Download a Website for offline browsing
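The core of such a utility might look like the sketch below, which is not the article's code. It assumes a filter on the reported content type (`text/*` and `image/*` only, matching the description above) and a simple mapping from URL to local file under a mirror directory. The names `SiteDownloader`, `isWanted`, `localPathFor`, and `saveIfWanted` are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: saves text and image resources under a local mirror folder.
final class SiteDownloader {
    // Accepts only text and image content types; extend this to allow more.
    static boolean isWanted(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.toLowerCase();
        return type.startsWith("text/") || type.startsWith("image/");
    }

    // Maps http://host/dir/file to root/host/dir/file; directories get index.html.
    static Path localPathFor(Path root, URL url) {
        String path = url.getPath();
        if (path.isEmpty() || path.endsWith("/")) {
            path = path + "index.html";
        }
        // Drop the leading slash so resolve() treats the path as relative.
        String relative = path.startsWith("/") ? path.substring(1) : path;
        Path target = root.resolve(url.getHost()).resolve(relative).normalize();
        if (!target.startsWith(root.normalize())) {
            throw new IllegalArgumentException("URL escapes the mirror folder: " + url);
        }
        return target;
    }

    // Downloads one resource if its type is wanted; returns whether it was saved.
    static boolean saveIfWanted(URL url, Path root) throws IOException {
        URLConnection conn = url.openConnection();
        if (!isWanted(conn.getContentType())) {
            return false;
        }
        Path target = localPathFor(root, url);
        Files.createDirectories(target.getParent());
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return true;
    }
}
```

Extending the utility to other file types would then come down to relaxing `isWanted`. Query strings are ignored by `localPathFor`, so two URLs that differ only in their query would overwrite each other in this sketch.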
HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility. Version 3.47-27 (09/15/2013): engine fixes (zip_zipWriteInFileInZip_failed/bogus state errors), Unicode fixes.
Open Source Freeware: 400+ free applications and utilities. Extremely useful open source applications and utilities available free under various licenses. Free (but NOT open-source) software is listed separately: I want a Freeware Utility to ... 450+ common problems solved.
This is Vivalogo's list of best free, downloadable, open source social networking software / scripts (kinda hard to say all these words :) ). Unlike some other lists you may find on the net, this one contains only really downloadable and functional software. Note: listed in no particular order. SocialEngine: SocialEngine is social networking software powered by PHP and Zend. The script lets you easily create your own social network or online community.
Screen Capture Tools: 40+ Free Tools and Techniques. Screen capture, or print screen, is perhaps the most efficient way to share whatever appears on your desktop. It helps tech users like us to share and communicate better with friends and peers. Major operating systems today come with basic screen capture and print screen functions, but if these can't fulfill what you need from a screen capture, then you are probably looking for a screen capturing tool. Screen capturing tools do what the basic tools don't. What these tools can do varies, including the ability to add sketches and text, instantly upload images online, capture audio, capture specific dimensions, and more.
Open Source Windows. The promise of open source software is best quality, flexibility and reliability. This is the updated list of the best open source software. The only way to have TRUE "Open Source Windows" is to have all equivalent native Windows programs uninstalled and removed.
Open Source Crawlers in Java - Heritrix
Open Source Crawlers in Java
HTML Screen Scraping Tools Written in Java