Python

Scrapy Tips from the Pros: Part 1

Scrapy is at the heart of Scrapinghub. We use this framework extensively and have accumulated a wide range of shortcuts to get around common problems. We’re launching a series to share these Scrapy tips with you so that you can get the most out of your daily workflow. Each post will feature two to three tips, so stay tuned.

Use Extruct to Extract Microdata from Websites

I am sure each and every developer of web crawlers has had a reason to curse web developers who use messy layouts for their websites. That is why we are so grateful for Schema.org, a collaborative effort to bring semantic markup to the web. For example, AggregateRating is a schema used by online retailers to represent user ratings for their products. This way, search engines can show the ratings for a product alongside the URL in the search results, with no need to write specific spiders for each website. You can also benefit from the semantic markup that some websites use.
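To make the idea of semantic markup concrete, here is a minimal sketch of reading Schema.org microdata from a page. It is not Extruct and does not use Extruct's API. It is a simplified stand-in built only on Python's standard-library html.parser. The sample HTML, the ItempropCollector class and the extract_itemprops function are all hypothetical names invented for this illustration, and the sketch flattens nested items into one dictionary, which a real microdata parser would not do.

```python
from html.parser import HTMLParser

# Hypothetical sample page using Schema.org AggregateRating microdata.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Example Widget</span>
  <div itemprop="aggregateRating" itemscope
       itemtype="http://schema.org/AggregateRating">
    <span itemprop="ratingValue">4.5</span>
    <span itemprop="reviewCount">12</span>
  </div>
</div>
"""


class ItempropCollector(HTMLParser):
    """Collect the text of leaf elements that carry an itemprop attribute.

    Deliberately simplified (an assumption of this sketch): nested
    itemscope containers are skipped and their properties are flattened
    into a single dictionary.
    """

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop name whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Only leaf properties hold text; containers also have itemscope.
        if "itemprop" in attrs and "itemscope" not in attrs:
            self._current = attrs["itemprop"]

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.props[self._current] = text
            self._current = None


def extract_itemprops(html):
    """Return a flat dict mapping itemprop names to their text values."""
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    return parser.props


if __name__ == "__main__":
    print(extract_itemprops(SAMPLE_HTML))
```

Because the rating lives in named properties rather than in a site-specific layout, the same extraction logic can work on any page that follows the schema, which is the benefit the post describes.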

This spider generates items like this:

Wrap Up

Let’s Build A Simple Interpreter. Part 1. - Ruslan's Blog

“If you don’t know how compilers work, then you don’t know how computers work. If you’re not 100% sure whether you know how compilers work, then you don’t know how they work.” - Steve Yegge

There you have it. Think about it. It doesn’t really matter whether you’re a newbie or a seasoned software developer: if you don’t know how compilers and interpreters work, then you don’t know how computers work. It’s that simple. So, do you know how compilers and interpreters work? If you don’t, and you’re really agitated about it, do not worry.

Why would you study interpreters and compilers? To write an interpreter or a compiler you have to have a lot of technical skills that you need to use together.

Okay, but what are interpreters and compilers? The goal of an interpreter or a compiler is to translate a source program in some high-level language into some other form. At this point you may also wonder what the difference is between an interpreter and a compiler. Here is the deal.

Let’s Build A Web Server. Part 1. - Ruslan's Blog

Out for a walk one day, a woman came across a construction site and saw three men working. She asked the first man, “What are you doing?” Annoyed by the question, the first man barked, “Can’t you see that I’m laying bricks?” Not satisfied with the answer, she asked the second man what he was doing. The second man answered, “I’m building a brick wall.” Then, turning his attention to the first man, he said, “Hey, you just passed the end of the wall. ...”

The moral of the story is that when you know the whole system and understand how different pieces fit together (bricks, walls, cathedral), you can identify and fix problems faster (errant brick). What does it have to do with creating your own Web server from scratch? I believe that to become a better developer you MUST get a better understanding of the underlying software systems you use on a daily basis, and that includes programming languages, compilers and interpreters, databases and operating systems, web servers and web frameworks.

00_Legal_stuff
01_Python Basics
10_Python Modules
20_External Modules
30_Guides, Quick References, Tutorials
40_How-tos
50_Algorithms, Functions, Scripts
80_Application's Specific
90_Useful applications, tools, ...
A0_Blogs and Articles