The Anatomy of a Search Engine

Sergey Brin and Lawrence Page
{sergey, page}@cs.stanford.edu
Computer Science Department, Stanford University, Stanford, CA 94305

Abstract
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.

1.
(Note: There are two versions of this paper -- a longer full version and a shorter printed version.

1.1 Web Search Engines -- Scaling Up: 1994 - 2000
Search engine technology has had to scale dramatically to keep up with the growth of the web.

1.2.
Creating a search engine which scales even to today's web presents many challenges. These tasks are becoming increasingly difficult as the Web grows.

1.3 Design Goals

1.3.1 Improved Search Quality
Our main goal is to improve the quality of web search engines.

1.3.2 Academic Search Engine Research

2.

2.1 PageRank: Bringing Order to the Web

References

http://infolab.stanford.edu/~backrub/google.html

Related: machine-learning-course, Recherche sur le web, General Professional Interest, Blog Referencia

HTML Scraping
Web Scraping. Web sites are written using HTML, which means that each web page is a structured document. Sometimes it would be great to obtain some data from them and preserve the structure while we’re at it.

Universities with the Best Free Online Courses
Free online courses are offered by real schools. Learn which courses are available, what topics they cover and which ones lead to real college credit. Online Courses for Credit

Social networking and recommendation systems
Due: at 9pm on Friday, February 1. Submit via this turnin page. When you sign into Facebook, it suggests friends. In this assignment, you will write a program that reads Facebook data and makes friend recommendations.
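As an illustration of the friend-recommendation idea in the assignment above, here is a minimal Python sketch. It assumes one common heuristic, ranking non-friends by how many mutual friends they share with the user; the assignment's actual algorithm and data format are not given here, so the function name `recommend_friends`, the sample `graph`, and the tie-breaking rule are all hypothetical.

```python
from collections import Counter

def recommend_friends(graph, user, top_n=3):
    """Rank people who are not yet friends of `user` by mutual-friend count.

    `graph` maps each person to the set of their friends (an assumed,
    hypothetical representation, not the assignment's data format).
    """
    friends = graph.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone who is already a friend.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Sort by mutual-friend count (descending), then by name so ties are stable.
    ranked = sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]

# Tiny made-up friendship graph for illustration only.
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}

print(recommend_friends(graph, "ann"))
```

In this sample graph, "dan" shares two friends with "ann" and "eve" shares one, so "dan" is ranked first. Other scoring schemes, such as weighting each mutual friend by how few friends they have, would fit the same structure.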

Computer Science 101
UPDATE: we're doing a live, updated MOOC of this course at stanford-online July-2014 (not this Coursera version). See here: CS101 teaches the essential ideas of Computer Science for a zero-prior-experience audience. Computers can appear very complicated, but in reality, computers work within just a few, simple patterns.

Learning a Personalized Homepage
As we've described in our previous blog posts, at Netflix we use personalization extensively and treat every situation as an opportunity to present the right content to each of our over 57 million members. The main way a member interacts with our recommendations is via the homepage, which they see when they log into Netflix on any supported device. The primary function of the homepage is to help each member easily find something to watch that they will enjoy. A problem we face is that our catalog contains many more videos than can be displayed on a single page and each member comes with their own unique set of interests. Thus, a general algorithmic challenge becomes how to best tailor each member's homepage to make it relevant, cover their interests and intents, and still allow for exploration of our catalog. This type of problem is not unique to Netflix; it is faced by others such as news sites, search engines, and online stores.

Have you tested your strategy lately? - McKinsey Quarterly - Strategy - Strategic Thinking
“What’s the next new thing in strategy?” a senior executive recently asked Phil Rosenzweig, a professor at IMD, in Switzerland. His response was surprising for someone whose career is devoted to advancing the state of the art of strategy: “With all respect, I think that’s the wrong question. There’s always new stuff out there, and most of it’s not very good. Rather than looking for the next musing, it’s probably better to be thorough about what we know is true and make sure we do that well.” Let’s face it: the basic principles that make for good strategy often get obscured.

Recommending for the World #AlgorithmsEverywhere
The Netflix experience is driven by a number of Machine Learning algorithms: personalized ranking, page generation, search, similarity, ratings, etc. On the 6th of January, we simultaneously launched Netflix in 130 new countries around the world, which brings the total to over 190 countries. Preparing for such a rapid expansion while ensuring each algorithm was ready to work seamlessly created new challenges for our recommendation and search teams. In this post, we highlight the four most interesting challenges we’ve encountered in making our algorithms operate globally and, most importantly, how this improved our ability to connect members worldwide with stories they'll love.

We’re all marketers now - McKinsey Quarterly - Marketing & Sales - Strategy
For the past decade, marketers have been adjusting to a new era of deep customer engagement. They’ve tacked on new functions, such as social-media management; altered processes to better integrate advertising campaigns online, on television, and in print; and added staff with Web expertise to manage the explosion of digital customer data. Yet in our experience, that’s not enough.

Programmatically Assign Menus To Theme Locations In WordPress
In this tutorial we are going to look at how we can programmatically assign a menu to a menu location. In WordPress you assign a menu to a location by going to the menu page in the admin area and clicking Manage Locations, which takes you to the page /wp-admin/nav-menus.php?action=locations. From here you will see a list of all the locations registered by your theme, each with a dropdown where you can choose which menu to assign to that location. One case where you will need to set this programmatically is when, on plugin activation, you create a number of new menus based on user role and want to assign them to menu locations.

Distributed Time Travel for Feature Generation
We want to make it easy for Netflix members to find great content to fulfill their unique tastes. To do this, we follow a data-driven algorithmic approach based on machine learning, which we have described in past posts and other publications. We aspire to a day when anyone can sit down, turn on Netflix, and the absolute best content for them will automatically start playing.

[Infographic] Google's 200 Ranking Factors
There’s no doubt that Google’s algorithm is more complex, and more volatile, than ever. The days when SEO was all about meta tags and backlinks are long gone. Google now uses at least 200 ranking factors in its algorithm, including social signals, user-interaction signals, and trust. If you’d like a bird’s-eye view of these factors, then this infographic by Entrepreneur.com and Backlinko may give you some much-needed perspective. The piece covers all of the known 200 ranking signals that Google uses to rank sites and pages.

The Eight Pillars of Innovation
The greatest innovations are the ones we take for granted, like light bulbs, refrigeration and penicillin. But in a world where the miraculous very quickly becomes commonplace, how can a company, especially one as big as Google, maintain a spirit of innovation year after year? Nurturing a culture that allows for innovation is the key. As we’ve grown to over 26,000 employees in more than 60 offices, we’ve worked hard to maintain the unique spirit that characterized Google way back when I joined as employee #16. At that time I was Head of Marketing (a group of one), and over the past decade I’ve been lucky enough to work on a wide range of products. Some were big wins, others weren’t.

Digital Inkwell
Well, I should probably rephrase that. Search engine optimizers that are worth their price will have already known about the Penguin 2.0 update and will have implemented strategies to bolster their sites’ PageRank. If...

A Neural Network Playground
Um, What Is a Neural Network? It’s a technique for building a computer program that learns from data. It is based very loosely on how we think the human brain works. First, a collection of software “neurons” is created and connected together, allowing them to send messages to each other. Next, the network is asked to solve a problem, which it attempts to do over and over, each time strengthening the connections that lead to success and diminishing those that lead to failure. For a more detailed introduction to neural networks, Michael Nielsen’s Neural Networks and Deep Learning is a good place to start.
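The "try over and over, adjusting connections" loop described above can be sketched with a single software neuron. The example below is a classic perceptron learning the logical AND function, which is far simpler than the multi-layer networks the playground visualizes; the function names, learning rate, and training data are assumptions chosen for illustration.

```python
def predict(weights, bias, x1, x2):
    """Fire (return 1) if the weighted sum of the inputs is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train one neuron with the perceptron rule (an illustrative sketch)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # error is +1 if the neuron should have fired, -1 if it
            # fired wrongly, and 0 if the answer was already correct.
            error = target - predict(weights, bias, x1, x2)
            # Nudge each connection in the direction that reduces the error.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias

# Truth table for logical AND, used as made-up training data.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
for (x1, x2), target in AND_SAMPLES:
    print((x1, x2), "->", predict(weights, bias, x1, x2), "expected", target)
```

A single neuron like this can only learn linearly separable problems such as AND; functions like XOR need hidden layers, which is where multi-layer networks come in.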

Related: Magisterka, SNA