
Analysis


Suggesting/disambiguation query. This application is a divisional of U.S. Ser. No. 11/319,928 filed 27 Dec. 2005 (Attorney Docket No. BAYN0001). 1. The invention relates to electronic access to information. 2. For years enterprises have struggled with ineffective search techniques. Enterprise search needs are poorly served. PC (Desktop) Search: PC or desktop search can be compared with finding stuff in your messy garage. Traditional PC search from Microsoft is based on parsing a file at the time of search. Web Search: At the other end of the search spectrum is Web search. Enterprise Search: The enterprise, however, does not behave like the PC or the Web environment. An example of the problem with enterprise search is shown in FIG. 1, which is a flow diagram showing the state of the art in enterprise search.

Traditional enterprise search technology uses inverted index, NLP, and database index approaches (see FIG. 2). Enterprise Search Exhibits a Unique Set of Characteristics Primary Guide User Behavior Freshness and Credibility.

French stemming algorithm. Letters in French include the following accented forms: â à ç ë é ê è ï î ô û ù. The following letters are vowels: a e i o u y â à ë é ê è ï î ô û ù. Assume the word is in lower case. Then put into upper case u or i preceded and followed by a vowel, and y preceded or followed by a vowel. u after q is also put into upper case. (The upper case forms are not then classed as vowels; see the note on vowel marking.)

If the word begins with two vowels, RV is the region after the third letter; otherwise it is the region after the first vowel not at the beginning of the word, or the end of the word if these positions cannot be found. For example (RV shown in brackets): aim[er], ado[rer], vo[ler], tapis (marker truncated).

R1 is the region after the first non-vowel following a vowel, or the end of the word if there is no such non-vowel.

For example, in fameusement, R1 is "eusement" and R2 is "ement". Note that R1 can contain RV (adorer), and RV can contain R1 (voler). Start with step 1. Step 1: Standard suffix removal aux.

French stemmer. Lucene SKOS analyzer. Lexique français.
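The vowel marking and the RV, R1 and R2 regions described above can be sketched in Python. This is a minimal illustration, not the Snowball implementation: the function names are hypothetical, marking decisions are taken against the original lower-case word (one reading of the rule), only the rules quoted in this excerpt are implemented (the full algorithm may add special cases), and R2 is assumed to follow the usual Snowball definition, that is, the R1 rule applied again inside R1.

```python
# Vowels as listed in the text; upper-case U, I, Y are deliberately absent,
# so marked letters are no longer treated as vowels.
VOWELS = set("aeiouyâàëéêèïîôûù")


def mark_vowels(word: str) -> str:
    """Upper-case u/i between vowels, y next to a vowel, and u after q."""
    out = []
    for i, ch in enumerate(word):
        prev_v = i > 0 and word[i - 1] in VOWELS
        next_v = i + 1 < len(word) and word[i + 1] in VOWELS
        if ch in "ui" and prev_v and next_v:
            out.append(ch.upper())
        elif ch == "y" and (prev_v or next_v):
            out.append("Y")
        elif ch == "u" and i > 0 and word[i - 1] == "q":
            out.append("U")
        else:
            out.append(ch)
    return "".join(out)


def rv_start(word: str) -> int:
    """Index where RV begins (len(word) if no such position exists)."""
    if len(word) >= 2 and word[0] in VOWELS and word[1] in VOWELS:
        return min(3, len(word))  # region after the third letter
    for i in range(1, len(word)):  # first vowel not at the beginning
        if word[i] in VOWELS:
            return i + 1
    return len(word)


def r_start(word: str, start: int = 0) -> int:
    """Index after the first non-vowel following a vowel, searching from start."""
    for i in range(start + 1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)


def regions(word: str) -> dict:
    """Return the RV, R1 and R2 substrings of a lower-case word."""
    marked = mark_vowels(word)
    r1 = r_start(marked)
    r2 = r_start(marked, r1)  # assumed: same rule applied within R1
    return {"RV": marked[rv_start(marked):], "R1": marked[r1:], "R2": marked[r2:]}


if __name__ == "__main__":
    for w in ("aimer", "adorer", "voler", "fameusement"):
        print(w, regions(w))
```

Because the regions are plain suffixes of the word, each later suffix-removal step only needs to check whether a candidate suffix lies inside the relevant region before deleting it.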

Lucene + SKOS = Search + thesaurus

Porter Stemming Algorithm. JavaScript Stemmer for French Language : Kasun's Tech Thoughts. About a month ago, I wrote a JavaScript port of the Porter French Stemming Algorithm in Snowball. The algorithm was pretty clear, so that was just a day of work :) I did this port for a requirement of the Google Summer of Code DocBook Webhelp Project, which I worked on over the last few months. If you are not familiar with what a stemmer is, here's a brief introduction :) What a stemmer basically does is extract the root form of a given verb. Stemmers are very useful for search engines: users can enter a search query in any variant, but view the content for the root word, which they probably meant. (Google does this ;) ) The following example shows what a stem is: Playing, Played, Plays, Play ====> Play. As human languages are very complex, it is really difficult to devise an algorithm that extracts the exact root.
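To make the search use case concrete, here is a rough Python sketch of how stemming lets a query in one form match documents containing other forms. The `toy_stem` function is a deliberately naive suffix stripper invented for this illustration; it is not the Porter or Snowball algorithm, and the sample documents are made up.

```python
from collections import defaultdict


def toy_stem(word: str) -> str:
    """Toy suffix stripper for illustration only (NOT Porter/Snowball)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Keep at least three letters so short words like "is" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: dict) -> dict:
    """Map each stem to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token)].add(doc_id)
    return index


def search(index: dict, query: str) -> list:
    """Look up a single-word query by its stem."""
    return sorted(index.get(toy_stem(query), set()))


if __name__ == "__main__":
    docs = {  # hypothetical sample documents
        1: "He plays chess",
        2: "They played outside",
        3: "Reading is fun",
    }
    index = build_index(docs)
    print(search(index, "Playing"))  # matches the documents with "plays" and "played"
```

The same stemmer is applied at indexing time and at query time, which is what makes "Playing", "Played" and "Plays" all resolve to the same index entry.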

This is the Stemmer for French. The Stemmer: Alx2002.free.fr/utilitarism/stemmer/PaiceHuskStemmer.php. Alx2002.free.fr/utilitarism/stemmer/PaiceHuskStemRules_fr.php.