background preloader

Enterprise search

Facebook Twitter

Spinscale/elasticsearch-suggest-plugin. Language support and linguistics in Lucene, Solr and ElasticSearch, and the eco-system. A new Lucene highlighter is born. Robert has created an exciting new highlighter for Lucene, PostingsHighlighter, our third highlighter implementation (Highlighter and FastVectorHighlighter are the existing ones).

A new Lucene highlighter is born

It will be available starting in the upcoming 4.1 release. Highlighting is crucial functionality in most search applications since it's the first step of the hard-to-solve final inch problem, i.e. of getting the user not only to the best matching documents but getting her to the best spot(s) within each document. The larger your documents are, the more crucial it is that you address the final inch. Ideally, your user interface would let the user click on each highlight snippet to jump to where it occurs in the full document, or at least scroll to the first snippet when the user clicks on the document link. This is in general hard to solve: which application renders the content is dependent on its mime-type (i.e., the browser will render HTML, but will embed Acrobat Reader to render PDF, etc.).

Suchergebnisse für "verbraucher widerrufsrecht nachricht" - Using Elastic Search as a Key Value store. I have in the past used Solr as a key value store.

Using Elastic Search as a Key Value store

Doing that provided me with some useful information: Using Solr as a key value store actually works reasonably well. Tipue. FindZebra - The search engine for difficult medical cases. SmartCat - Allen County Public Library. Okfn/facetview. Natural Language Processing for MediaWiki: First major release of the Semantic Assistants Wiki-NLP Integration. Printer-friendly version Send by email PDF version 1.

Natural Language Processing for MediaWiki: First major release of the Semantic Assistants Wiki-NLP Integration

Introduction We are happy to announce the first major release of our Semantic Assistants Wiki-NLP integration. 2. The current release includes the following features: Light-weight MediaWiki Extension The Wiki-NLP integration is introduced to an existing MediaWiki engine through installing a light-weight extension. NLP Pipeline Independent Architecture. Explain rank in Sharepoint 2013 Search. Default ranking model in Sharepoint 2013 is completely different from what we’ve seen in FS4SP and is definitely a step forward comparing to SP2010.

Explain rank in Sharepoint 2013 Search

In a nutshell it uses multiple features (to take into account query terms, it’s proximity; document click-through, authority, anchor text, url depth etc ) which are mixed with help of neural network as a final step. Details can be found in patent claim Hopefully there’s a way to bring more light into this black magic. I’ve modified default Display Template and added “Explain Rank” link to each item. This link redirects user to ExplainRank page which hosted under {search_center_url}/_layouts/15/ folder. Data at the heart of the enterprise - Antidot. Sense out of Search. Why we chose Solr 4.0 instead of ElasticSearch.

As part of a new project we have to decide which search engine to use.

Why we chose Solr 4.0 instead of ElasticSearch

There are two leading search engines on the market, ElasticSearch(0.19.9) and Apache Solr(4.0BETA). Both were built on the top of Lucene. After evaluating both we opted for Solr, let me explain why. Conditions Our search engine should be highly available & fault tolerant. Pros & Cons Both search engines have great benefits. Decision points. Funnelback Website and Enterprise Search - Home. Panacea Project. Solutions. Lemur Project Home. Gaston Gonzalez - My thoughts on technology, food, running and stuff - Tech Blog - FAST Search Engineer's Guide: An Alternate Approach to SBC Absolute Position Boosting. Search solutions for enterprises - Solr / ElasticSearch.

Enterprise Search. Search, Big Data, Analytics, Natural Language Processing. Chaos, Codings, Comments. Bulk Indexing With ElasticSearch and Hadoop. At Infochimps we recently indexed over 2.5 billion documents for a total of 4TB total indexed size.

Bulk Indexing With ElasticSearch and Hadoop

Fun with elasticsearch's children and nested documents - Space Vatican. When you’re indexing data, the world is rarely as simple as each document existing in isolation.

Fun with elasticsearch's children and nested documents - Space Vatican

Sometimes, you’re better off denormalizing all data into the child documents. For example if you were modelling books, adding an author field to books could be a sensible choice (even if in the database that is your authoritative datasource the data is split into separate authors and books table). It’s simple and you can easily construct queries on both attributes of the book and the author’s name. Denormalizing Maps with Lucene Payloads. Last week, I tried out Lucene's Payload and SpanQuery features to do some position based custom scoring of terms.

Denormalizing Maps with Lucene Payloads

I've been interested in the Payload feature ever since I first read about it, because it looked like something I could use to solve another problem at work... The problem is to to be able to store a mapping of concepts to scores along with a document. Our search uses a medical taxonomy, basically a graph of medical concepts (nodes) and their relationships to each other (edges). During indexing, a document is analyzed and a map of node IDs and scores is created and stored in the index. NGDATA - Lily - Smart Data, at Scale, made Easy. Lily is Smart Data, at Scale, made Easy. Lily is a data management platform combining planet-sized data storage, indexing and search with on-line, real-time usage tracking, audience analytics and content recommendations. It's a one-stop-platform for any organization confronted with Big Data challenges that seeks rapid implementation, rock-solid performance at scale, and efficiency at management.

Lily unifies Apache HBase, Hadoop and Solr into a comprehensively integrated, interactive data platform with easy-to-use access APIs, a high-level data model and schema language, flexible, real-time indexing and the expressive search power of Apache Solr. Application Platform for Enterprise Big Data. Open Source Web Mining Toolkit. Das Magazin für Führungskräfte: Anbieterübersicht. Wer die Wahl hat, hat bekanntlich die Qual.

Das Magazin für Führungskräfte: Anbieterübersicht

So ergeht es auch Führungskräften, die erkannt haben, dass sie für ein gutes Wissensmanagement gut funktionierende technische Lösungen und Dienstleistungen benötigen, aber die Entscheidung aufgrund des unüberschaubaren Marktes hinauszögern. Eine Hilfestellung leistet hier unsere Anbieterübersicht mit Lösungen und ihre Leistungen. Unter der Headline Unternehmen/Produkte finden Sie kurze Unternehmensprofile. Bei den Anbieterübersichten finden Sie thematische Entscheidungstabellen zu unterschiedlichen Bereichen, die aus der angegebenen Heftausgabe stammen und als PDF-Datei zum Download angeboten werden.

RankingAlgorithm. OpenPipe. SMILA - Unified Information Access Architecture. Scaling Solr Indexing with SolrCloud, Hadoop and Behemoth. Emilis/solr-lt. How UIA Completes the Big Data Picture. By Sid Probstein Many people say the Big Data movement is about "unstructured data," but they are missing an important point.

How UIA Completes the Big Data Picture

Log files and click streams are not really unstructured; they are just relatively unfamiliar, and sometimes variable structures. What about the other sources that contain important information about customers and buying habits? These offer a wealth of value that is today largely untapped. Emails, open-ended survey questions, Web forms, call logs, discussion boards, SharePoint and Wiki sites — this is the true "unstructured content" that completes the picture of customer perception. Evolvingweb/ajax-solr. Data Integration. Standards and Infrastructure for Innovation Data Exchange by Laurel L. Haak, David Baker, Donna K. Ginther, Gregg J. Gordon, Matthew A. The Open Source Phasor Data Concentrator. Mining of Massive Datasets. The book has a new Web site This page will no longer be maintained.

Your browser should be automatically redirected to the new site in 10 seconds. The book has now been published by Cambridge University Press. The publisher is offering a 20% discount to anyone who buys the hardcopy Here. By agreement with the publisher, you can still download it free from this page. --- Jure Leskovec, Anand Rajaraman (@anand_raj), and Jeff Ullman Download Version 2.1 The following is the second edition of the book, which we expect to be published soon. There is a revised Chapter 2 that treats map-reduce programming in a manner closer to how it is used in practice, rather than how it was described in the original paper. RDF Aggregates and Full Text Search on Steroids with Solr at Frederick Giasson. Preamble As I explained in my latest blog post, I am now starting to talk about a couple of things I have been working on in the last few months that will lead to a release, by Structured Dynamics, in the coming months.

This blog post is the first step into that path. Enjoy! Introduction. » Custom security filtering in Solr. Dec. 6, 2012: A code update was made for Solr 4.0 (see commented section in below)Yonik recently wrote about “Advanced Filter Caching in Solr” where he talked about expensive and custom filters; it was left as an exercise to the reader on the implementation details. In this post, I’m going to provide a concrete example of custom post filtering for the case of filtering documents based on access control lists.

Recap of Solr’s filtering and caching First let’s review Solr’s filtering and caching capabilities. Queries to Solr involve a full-text, relevancy scored, query (the infamous q parameter). Search Wizards Speak: Jean Ferré of Sinequa. HOME. Software AG · Spezialberatung zu Lucene (Java und .NET) & Solr. VuFind: Home. Suchmaschine BASE (Bielefeld Academic Search Engine): Trefferliste. Loading Error: Cannot Load Popup Box Mobile | A A A | Designing Search (part 1 of 5): Entering the query « Information Interaction. JWNL (Java WordNet Library) Current Version - WordNet - Current Version. The most recent Windows version of WordNet is 2.1, released in March 2005. Version 3.0 for Unix/Linux/Solaris/etc. was released in December, 2006. Version 3.1 is currently availalbe only online. WordNet binaries and source are available for Windows and Unix-like systems (Irix, Solaris, and Linux binaries).

The New SolrCloud: Overview. February 1, 2012 by Rafał Kuć. Cominvent AS - Enterprise search consultants. What is YouSeer? Documentprocessor. Migrating Fast to Solr. OpenPipeline.