When a document is indexed, its individual fields pass through analyzers and tokenizing filters that can transform and normalize the data in those fields: for example, removing blank spaces, stripping HTML markup, stemming, or replacing one character with another. You may need the same or similar operations at query time as well as at indexing time.
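As a rough illustration of the idea, here is a minimal Python sketch of such an analysis chain. It is not the API of any particular search engine: the function names (`strip_html`, `replace_char`, `naive_stem`, `analyze`) are hypothetical, and the suffix-stripping "stemmer" is a deliberately crude stand-in for a real stemming algorithm. The point it tries to show is that the same chain runs on both the indexed text and the query, so both end up as comparable terms.

```python
import html
import re


def strip_html(text):
    """Replace tags with spaces, then decode entities such as &amp;."""
    return html.unescape(re.sub(r"<[^>]+>", " ", text))


def replace_char(text, old="-", new=" "):
    """Swap one character for another (here: hyphens become spaces)."""
    return text.replace(old, new)


def tokenize(text):
    """Lowercase and split on whitespace; runs of blanks collapse."""
    return text.lower().split()


def naive_stem(token):
    """Toy stemmer (an assumption for illustration): strip one common suffix."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Run the full chain: char filters, then tokenizer, then token filter."""
    text = replace_char(strip_html(text))
    return [naive_stem(token) for token in tokenize(text)]


# Index time: analyze the field content.
indexed_terms = analyze("<b>Searching</b>  indexed   full-text documents")
# Query time: analyze the user's query with the same chain.
query_terms = analyze("searched document")

print(indexed_terms)
print(query_terms)
print(set(query_terms) <= set(indexed_terms))
```

Because both sides go through `analyze`, a query for "searched document" reduces to terms that also appear in the analyzed field. If only the index side were normalized, the raw query words would not match.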
Chances are, if you're like me, you didn't grow up dreaming of better ways to find text and data on a website or a hard drive. Heck, you probably didn't even think about it once you were enrolled in college, even if you were a Computer Science student. Truth is, you are probably working on a project that requires you to search your content, and now you're wondering how to do just that. Or perhaps you already have search working, but your tests or your programming instinct tell you it could be better. Even worse, maybe your boss/QA dept.
The Semantic Vectors Package
SemanticVectors creates semantic WordSpace models from free natural language text. Such models are designed to represent words and documents in terms of underlying concepts. They can be used for many semantic (concept-aware) matching tasks such as automatic thesaurus generation, knowledge representation, and concept matching.
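To make the word-space idea concrete, here is a toy Python sketch. It does not use or reproduce the SemanticVectors package or its algorithms: it simply represents each word as a vector of per-document counts and compares words by cosine similarity. The helper names (`build_word_space`, `cosine`, `nearest`) and the three sample documents are made up for illustration.

```python
import math
from collections import Counter


def build_word_space(docs):
    """Map each word to a vector holding its count in every document."""
    space = {}
    for i, doc in enumerate(docs):
        for word, count in Counter(doc.lower().split()).items():
            space.setdefault(word, [0] * len(docs))[i] = count
    return space


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(space, word, k=3):
    """Return the k words whose vectors are most similar to `word`."""
    target = space[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in space.items()
        if other != word
    ]
    # Highest similarity first; ties broken alphabetically.
    return [w for _, w in sorted(scored, key=lambda s: (-s[0], s[1]))[:k]]


# Hypothetical sample corpus.
docs = ["car engine road", "automobile engine road", "cat mat"]
space = build_word_space(docs)

print(nearest(space, "engine", k=1))
print(cosine(space["engine"], space["cat"]))
```

Words that occur in the same documents ("engine" and "road") get similar vectors, while words that never share a document ("engine" and "cat") score zero. A thesaurus-style lookup is then a nearest-neighbor search in that space.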