Peter’s latest book, Ambient Findability, was published in 2005.
Web indexing (or Internet indexing) refers to various methods for indexing the contents of a website or of the Internet as a whole. Individual websites or intranets may use a back-of-the-book index, while search engines usually use keywords and metadata to provide a more useful vocabulary for Internet or onsite searching. With the increase in the number of periodicals that have articles online, web indexing is also becoming important for periodical websites.
The Universal Decimal Classification (UDC) is a bibliographic and library classification developed by the Belgian bibliographers Paul Otlet and Henri La Fontaine at the end of the 19th century. UDC provides a systematic arrangement of all branches of human knowledge organized as a coherent system in which knowledge fields are related and inter-linked. Originally based on the Dewey Decimal Classification, the UDC was developed as a new analytico-synthetic classification system with a significantly larger vocabulary and syntax that enables very detailed content indexing and information retrieval in large collections.
A faceted classification system allows the assignment of an object to multiple characteristics (attributes), enabling the classification to be ordered in multiple ways, rather than in a single, predetermined, taxonomic order. A facet comprises "clearly defined, mutually exclusive, and collectively exhaustive aspects, properties or characteristics of a class or specific subject". [1] For example, a collection of books might be classified using an author facet, a subject facet, a date facet, etc. Faceted classification is used in faceted search systems that enable a user to navigate information along multiple paths corresponding to different orderings of the facets.
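The book example above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Book` class, the sample records, and the `group_by_facet` helper are hypothetical names invented here, and the three facets (author, subject, year) simply mirror the ones mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """One object described along several independent facets."""
    title: str
    author: str
    subject: str
    year: int


# Hypothetical sample collection, for illustration only.
BOOKS = [
    Book("Title A", "Author Y", "Libraries", 1933),
    Book("Title B", "Author X", "Botany", 2009),
    Book("Title C", "Author X", "Libraries", 1990),
]


def group_by_facet(books, facet):
    """Group book titles by the value each book has for the given facet."""
    groups = {}
    for book in books:
        groups.setdefault(getattr(book, facet), []).append(book.title)
    return groups


# The same collection, ordered two different ways.
by_author = group_by_facet(BOOKS, "author")
by_subject = group_by_facet(BOOKS, "subject")
```

The point of the sketch is that no single ordering is privileged: the same records can be regrouped by any facet on demand, unlike a fixed taxonomic tree.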
Colon classification (CC) is a system of library classification developed by S. R. Ranganathan. It was the first ever faceted (or analytico-synthetic) classification.
Controlled vocabularies provide a way to organize knowledge for subsequent retrieval. They are used in subject indexing schemes, subject headings, thesauri, taxonomies and other forms of knowledge organization systems. Controlled vocabulary schemes mandate the use of predefined, authorised terms that have been preselected by the designer of the vocabulary, in contrast to natural language vocabularies, where there is no restriction on the vocabulary. In library and information science, a controlled vocabulary is a carefully selected list of words and phrases, which are used to tag units of information (document or work) so that they may be more easily retrieved by a search. Controlled vocabularies solve the problems of homographs, synonyms and polysemes by a bijection between concepts and authorized terms.
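As a rough illustration of how a controlled vocabulary handles synonyms and homographs, the sketch below maps free-text variants onto a single authorized term per concept. The `ENTRY_TERMS` table, its terms, and the helper names are hypothetical; real vocabularies are far larger and are maintained by their designers.

```python
# Hypothetical entry vocabulary: every variant maps to exactly one
# authorized term, and each authorized term stands for one concept.
ENTRY_TERMS = {
    "car": "Automobiles",
    "cars": "Automobiles",
    "automobile": "Automobiles",
    "motor car": "Automobiles",
    # Homographs are kept apart by qualifiers.
    "mercury (planet)": "Mercury (Planet)",
    "mercury (element)": "Mercury (Element)",
}


def authorized_term(term):
    """Return the authorized term for a variant, or None if it is not in the vocabulary."""
    key = " ".join(term.lower().split())  # normalize case and whitespace
    return ENTRY_TERMS.get(key)


def tag_document(free_terms):
    """Tag a document using only authorized terms; unknown terms are dropped."""
    tags = {authorized_term(t) for t in free_terms}
    tags.discard(None)
    return sorted(tags)
```

For instance, `tag_document(["car", "Automobile", "bicycle"])` yields a single tag, because both known variants collapse to one authorized term and the unknown term is rejected. An unqualified "mercury" is also rejected, which is one simple way to force the indexer to resolve the homograph.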
Taxonomy (from ancient Greek τάξις taxis, arrangement, and νομία nomia, method) [1] is the academic discipline of defining groups of biological organisms on the basis of shared characteristics and giving names to those groups. Each group is given a rank and groups of a given rank can be aggregated to form a super group of higher rank and thus create a hierarchical classification. [2] [3] The groups created through this process are referred to as taxa (singular taxon). An example of a modern classification is the one published in 2009 by the Angiosperm Phylogeny Group for all living flowering plant families (the APG III system). [4]
Faceted search, also called faceted navigation or faceted browsing, is a technique for accessing information organized according to a faceted classification system, allowing users to explore a collection of information by applying multiple filters. A faceted classification system classifies each information element along multiple explicit dimensions, enabling the classifications to be accessed and ordered in multiple ways rather than in a single, pre-determined, taxonomic order. [1] Facets correspond to properties of the information elements. [2] They are often derived by analysis of the text of an item using entity extraction techniques or from pre-existing fields in a database such as author, descriptor, language, and format.
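A minimal sketch of the filtering idea, assuming each information element is a flat record whose fields serve as facets. The `ITEMS` records and the function names `faceted_search` and `facet_counts` are made up for illustration; production search engines implement this with indexes rather than linear scans.

```python
from collections import Counter

# Hypothetical records; the fields author, language and format act as facets.
ITEMS = [
    {"title": "Doc 1", "author": "Smith", "language": "en", "format": "pdf"},
    {"title": "Doc 2", "author": "Smith", "language": "fr", "format": "html"},
    {"title": "Doc 3", "author": "Jones", "language": "en", "format": "html"},
]


def faceted_search(items, **filters):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in filters.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of a facet (for a navigation sidebar)."""
    return Counter(item[facet] for item in items)
```

A typical interaction narrows the result set one filter at a time, for example `faceted_search(ITEMS, language="en")`, and then recomputes `facet_counts` on the remaining items so the user can see which further filters are still available.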
A taxonomic database is a database created to hold information related to biological taxa (for example, groups of organisms organized by species name or other taxonomic identifier) for efficient data management and information retrieval as required. Today, taxonomic databases are routinely used for the automated construction of biological checklists such as floras and faunas, both for print publication and online; to underpin the operation of web-based species information systems; as a part of biological collection management (for example in museums and herbaria); and, in some cases, to provide the taxon management component of broader science or biology information systems. They are also a fundamental contribution to the discipline of biodiversity informatics.
Categorization is the process in which ideas and objects are recognized, differentiated, and understood. [1] Categorization implies that objects are grouped into categories, usually for some specific purpose. Ideally, a category illuminates a relationship between the subjects and objects of knowledge. Categorization is fundamental in language, prediction, inference, decision making and in all kinds of environmental interaction.
Knowledge representation (KR) is an area of artificial intelligence research aimed at representing knowledge in symbols to facilitate inferencing from those knowledge elements, creating new elements of knowledge. The KR can be made to be independent of the underlying knowledge model or knowledge base system (KBS) such as a semantic network. [1] KR research involves analysis of how to reason accurately and effectively and how best to use a set of symbols to represent a set of facts within a knowledge domain. A symbol vocabulary and a system of logic are combined to enable inferences about elements in the KR to create new KR sentences. Logic is used to supply formal semantics of how reasoning functions should be applied to the symbols in the KR system.
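To make the idea of combining a symbol vocabulary with a logic concrete, here is a small sketch under stated assumptions: facts are hypothetical `(x, y)` pairs read as "x is a y", and the only rule is transitivity. The function name `infer_is_a` and the sample facts are invented for illustration and do not come from any particular KR system.

```python
def infer_is_a(facts):
    """Forward-chain one rule until nothing new appears:
    is_a(x, y) and is_a(y, z)  =>  is_a(x, z).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, y1) in list(known):
            for (y2, z) in list(known):
                if y1 == y2 and (x, z) not in known:
                    known.add((x, z))  # a new KR sentence derived by the rule
                    changed = True
    return known


# Hypothetical starting facts.
FACTS = {("canary", "bird"), ("bird", "animal")}
derived = infer_is_a(FACTS)
```

Here the stored facts never state that a canary is an animal; that sentence is created by applying the rule to the existing symbols, which is the sense of "inference" used above.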
Figure: A comparison of phylogenetic and phenetic concepts.
Biological systematics is the study of the diversification of living forms, both past and present, and the relationships among living things through time. Relationships are visualized as evolutionary trees (synonyms: cladograms, phylogenetic trees, phylogenies). Phylogenies have two components, branching order (showing group relationships) and branch length (showing amount of evolution). Phylogenetic trees of species and higher taxa are used to study the evolution of traits (e.g., anatomical or molecular characteristics) and the distribution of organisms (biogeography). Systematics, in other words, is used to understand the evolutionary history of life on Earth.
Named-entity recognition (NER) (also known as entity identification and entity extraction) is a subtask of information extraction that seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.
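The locate-and-classify task can be illustrated with a deliberately naive sketch that combines a name list (gazetteer) with regular expressions. This is not how practical NER systems generally work (they typically rely on statistical or learned models); the `GAZETTEER` entries, the patterns, and the `recognize` function are assumptions made for this example.

```python
import re

# Hypothetical gazetteer: known names and their categories.
GAZETTEER = {
    "Paul Otlet": "PERSON",
    "Henri La Fontaine": "PERSON",
    "Belgium": "LOCATION",
}

# Simple surface patterns for two numeric categories.
PATTERNS = [
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),
]


def recognize(text):
    """Return (entity text, category) pairs in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.group(), label))
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), label))
    return [(entity, label) for _, entity, label in sorted(found)]
```

Calling `recognize("Paul Otlet raised $1,500, up 12% in Belgium.")` returns each located span together with its predefined category. The sketch cannot handle names outside the list or ambiguous mentions, which is exactly the gap that model-based approaches address.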
From highlighting pens to faceted clipmarks … By Jan Wyllie, Trend Monitor 2.0. I have been a dedicated highlighter and annotator of what I thought were key bits of text for nearly 40 years now. I learned its use from my father who was a prolific under-liner in pencil, before he became a highlighter himself when marker pens came in.
Content analysis or textual analysis is a methodology in the social sciences for studying the content of communication. Earl Babbie defines it as "the study of recorded human communications, such as books, websites, paintings and laws." According to Dr. Farooq Joubish, content analysis is considered a scholarly methodology in the humanities by which texts are studied as to authorship, authenticity, or meaning.