
WEBDEV


Using Firebug for scraping - Scrapy 0.24.4 documentation. Introduction: This document explains how to use Firebug (a Firefox add-on) to make the scraping process easier and more fun. For other useful Firefox add-ons, see Useful Firefox add-ons for scraping. There are some caveats with using Firefox add-ons to inspect pages; see Caveats with inspecting the live browser DOM. In this example, we'll show how to use Firebug to scrape data from the Google Directory, which contains the same data as the Open Directory Project used in the tutorial but with a different face.
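To make the goal concrete, here is a minimal sketch of the kind of extraction that inspecting a page's element structure is meant to support. It is not the code from the Scrapy documentation: the markup, the `category` class name and the `extract_categories` helper are all invented for illustration, and it uses Python's standard-library `xml.etree.ElementTree` (which only handles well-formed markup) rather than Scrapy selectors.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup invented for this sketch.
# The real directory page is structured differently.
SAMPLE_HTML = """
<div id="directory">
  <div class="category">
    <a href="/Top/Arts/">Arts</a>
    <ul>
      <li><a href="/Top/Arts/Movies/">Movies</a></li>
      <li><a href="/Top/Arts/Music/">Music</a></li>
    </ul>
  </div>
  <div class="category">
    <a href="/Top/Science/">Science</a>
    <ul>
      <li><a href="/Top/Science/Biology/">Biology</a></li>
    </ul>
  </div>
</div>
"""


def extract_categories(markup):
    """Map each category name to the list of its subcategory names."""
    root = ET.fromstring(markup)
    result = {}
    # Each category block is a <div class="category"> (assumed layout).
    for block in root.findall(".//div[@class='category']"):
        name = block.find("a").text  # first direct <a> child is the category link
        subcategories = [a.text for a in block.findall("./ul/li/a")]
        result[name] = subcategories
    return result


categories = extract_categories(SAMPLE_HTML)
for name, subcategories in categories.items():
    print(name, "->", ", ".join(subcategories))
```

The element paths in the two `findall` calls are exactly the sort of detail an element inspector helps you discover. On a real page they would need to be replaced with whatever structure the inspector actually shows.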

Firebug comes with a very useful feature called Inspect Element, which lets you inspect the HTML code of the different page elements just by hovering your mouse over them. Otherwise you would have to search for the tags manually through the HTML body, which can be a very tedious task. In the following screenshot you can see the Inspect Element tool in action. At first sight, we can see that the directory is divided into categories, which are in turn divided into subcategories.

Breach - A browser for the HTML5 era.

Free Web Analytics Software - Analytics - Piwik - Mozilla FireFox for eBuro.

Collect & visualize your logs with Logstash, Elasticsearch & Redis | Michael Bouvy. Update of December 6th: although Logstash does the job as a log shipper, you might consider replacing it with Lumberjack / Logstash Forwarder, which needs far fewer resources, and keeping Logstash on your indexer to collect, transform and index your log data (into Elasticsearch). Check out my latest blog post on the topic.

Kibana Dashboard

Even if you manage a single Linux server, you probably already know how hard it is to keep an eye on what's going on with your server, and especially to keep track of log data. This becomes even worse when you have several (physical or virtual) servers to administer. Although Munin is very helpful for monitoring various information from my servers / VMs, I felt the need for something more, a bit less static and more interactive. There are three kinds of logs I especially wanted to track: Apache 2 access logs, iptables logs, and syslogs. As you can see, I am using 4 complementary applications, each with its own role. Installation: Redis, Logstash (shippers), Elasticsearch, Kibana.

Node.js: your first web service | Node.js.