
dbpedia « Griff's Graphs

First of all, thanks very much for the feedback you all gave me on my graph of ideas. I wasn't quite aware of how many people are interested in this sort of stuff. I now have lots of great ideas for new projects, which will keep me busy for a long while. I must say that making the graphs is the easy part – it is obtaining the data which takes time. I've made a note of all of your suggestions and will try to create something out of them soon. If you haven't already, you can submit an idea here.

Housekeeping

There were a great number of comments about my last graph, so I'll try to answer the main questions here.

“It is way too biased towards Western ideas.” – Yes, see point one of the original blog post.

“Where are all of the musicians and artists?”

“The title is very misleading.” – The original post had an asterisk on the word ‘every’ which was meant to highlight the fact that the graph had caveats.

Now that is out of the way, I'd like to present my latest work.

Network: Graph of Ideas vs. Caveats
Running UIMA Engines in Stanbol using the same JVM

Introduction

This post is a follow-up to my previous posts about using UIMA with Apache Stanbol. As I have already explained in those posts, running UIMA engines and Apache Stanbol in the same Java Virtual Machine (JVM) is not the easiest thing to do. The root of the problem is that both systems try to do something similar, but in different ways: they both rely on customized, run-time-controlled class loading (in certain settings). Stanbol's architecture is built on top of Felix, an OSGi implementation. The vision I had at the beginning of this work was that Stanbol admins should be able to get a provided UIMA PEAR package working with the help of an also-provided Stanbol OSGi bundle, without any compilation or bundling on their part.

Bundling a UIMA engine

The first thing we do is convert all the UIMA-related artifacts into a bundle.

Modifying the UIMALocal Bundle
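To make the class-loading conflict concrete, here is a minimal sketch of one common OSGi workaround: temporarily pointing the thread context class loader at the bundle's own class loader while instantiating a UIMA analysis engine from its XML descriptor. The class name, helper method, and descriptor path are illustrative assumptions, not the actual integration code from this post; only the UIMA framework calls (UIMAFramework, XMLInputSource) are standard UIMA API.

```java
import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.resource.ResourceSpecifier;
import org.apache.uima.util.XMLInputSource;

public class UimaEngineLoader {

    // Hypothetical helper: builds an AnalysisEngine inside an OSGi bundle.
    // The descriptor path "desc/MyEngineDescriptor.xml" is a placeholder.
    public static AnalysisEngine loadEngine(ClassLoader bundleClassLoader)
            throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        try {
            // Swap in the bundle's class loader so classes and resources
            // referenced by the engine descriptor are visible; whether
            // this suffices depends on how UIMA resolves resources in
            // the given setup.
            current.setContextClassLoader(bundleClassLoader);

            XMLInputSource descriptor = new XMLInputSource(
                    bundleClassLoader.getResource("desc/MyEngineDescriptor.xml"));
            ResourceSpecifier specifier =
                    UIMAFramework.getXMLParser().parseResourceSpecifier(descriptor);
            return UIMAFramework.produceAnalysisEngine(specifier);
        } finally {
            // Always restore the original loader for the rest of Stanbol.
            current.setContextClassLoader(original);
        }
    }
}
```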
Getting Started with Apache Stanbol Enhancement Engine

Note: This blog conforms to the Stanbol 0.10.0-incubating SNAPSHOT release.

We at FORMCEPT are really excited to be early adopters of Apache Stanbol. We are working on its integration with our Big Data analysis stack. While working with Apache Stanbol we created multiple enhancement engines, and we would like to share our experience of building an enhancement engine.

Overview

Apache Stanbol is built as a modular set of components:

Enhancer and Enhancement Engines: annotate possible entities and link them to public or private entity repositories.
Entityhub: caches and manages entities stored in local indexes of linked-data repositories, including entities specific to a particular domain.
Contenthub: provides a persistent document store on top of Apache Solr.

Apache Stanbol Component Layer (Credit: Rupert)

Getting Started

Clean up your existing Jersey repository and compile again. If it doesn't work, make sure that you are using Maven 3. Go ahead and play around with it!
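For a sense of what building an enhancement engine involves, here is a minimal sketch of the EnhancementEngine interface from org.apache.stanbol.enhancer.servicesapi as it stood around these releases. The class name and the trivial logic are illustrative assumptions, not FORMCEPT's actual engines.

```java
import org.apache.stanbol.enhancer.servicesapi.ContentItem;
import org.apache.stanbol.enhancer.servicesapi.EngineException;
import org.apache.stanbol.enhancer.servicesapi.EnhancementEngine;

// Hypothetical demo engine: the name and logic are placeholders.
public class MyDemoEnhancementEngine implements EnhancementEngine {

    @Override
    public String getName() {
        return "my-demo-engine";
    }

    @Override
    public int canEnhance(ContentItem ci) throws EngineException {
        // A real engine inspects the ContentItem (e.g. its MIME type)
        // before accepting it; this demo accepts everything and asks
        // to be executed synchronously.
        return ENHANCE_SYNCHRONOUS;
    }

    @Override
    public void computeEnhancements(ContentItem ci) throws EngineException {
        // Real engines write their results (e.g. TextAnnotation and
        // EntityAnnotation triples) into the ContentItem's metadata
        // graph here.
    }
}
```

In a real deployment the class would also be registered as an OSGi component so the Enhancer can discover it and place it in an enhancement chain.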
The Semantic Web and the Modern Enterprise For over a decade the Semantic Web has been maligned, misconstrued and misunderstood. It’s been overhyped by its supporters while its critics have hung the albatross of artificial intelligence around its neck. Even its successes have been understated, often coming with little fanfare and without the mindshare and hype surrounding other trends such as Web 2.0, NoSQL or Big Data. So I wouldn’t fault you in the slightest if you were surprised, confused or downright skeptical when I claim that the Semantic Web is emerging as the technology of choice for tackling some of today’s most pressing challenges in enterprise information management. This article is the first in a series that will introduce and explain Semantic Web technologies and their role in enterprise information management today. The World Wide Web Today: A Web of Documents The World Wide Web as we know it today is a Web of linked documents, full of content intended to be displayed for humans. This is a great vision.
Rada Mihalcea: Downloads

[see also the research page for related information]

Various software modules and data sets that are/were used in my research. For any questions regarding the content of this page, please contact Rada Mihalcea, rada at cs.unt.edu.

new Efficient Indexer for the Google Web 1T Ngram corpus
new Wikipedia Interlingual Links Evaluation Dataset
new Sentiment Lexicons in Spanish
Measuring the Semantic Relatedness between Words and Images
Text Mining for Automatic Image Tagging
Learning to Identify Educational Materials (LIEM)
Cross-Lingual Semantic Relatedness (CLSR)
Data for Automatic Short Answer Grading
Multilingual Subjectivity Analysis: Gold Standard and Training Data
GWSD: Graph-based Unsupervised Word Sense Disambiguation
Affective Text: data annotated for emotions and polarity
SenseLearner: all words word sense disambiguation tool
Benchmark for the evaluation of back-of-the-book indexing systems
FrameNet - WordNet verb sense mapping
Resources and Tools for Romanian NLP
TWA sense tagged data set
Weka 3 - Data Mining with Open Source Machine Learning Software in Java

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes. Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature. Weka is open source software issued under the GNU General Public License. Yes, it is possible to apply Weka to big data! Data Mining with Weka is a 5-week MOOC, which was first held in late 2013.
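As an illustration of calling Weka from your own Java code, the following sketch loads an ARFF dataset, trains a J48 decision tree, and cross-validates it. The file path is a placeholder assumption; the classes used (ConverterUtils.DataSource, J48, Evaluation) are part of Weka's standard API.

```java
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

import java.util.Random;

public class WekaQuickStart {
    public static void main(String[] args) throws Exception {
        // Load a dataset; the path is a placeholder.
        Instances data = DataSource.read("data/iris.arff");
        // Tell Weka which attribute is the class label;
        // here we assume it is the last one.
        data.setClassIndex(data.numAttributes() - 1);

        // Train a J48 decision tree (Weka's C4.5 implementation).
        J48 tree = new J48();
        tree.buildClassifier(data);

        // Estimate accuracy with 10-fold cross-validation.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(tree, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}
```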
BabelNet

BabelNet is a multilingual semantic network obtained by integrating WordNet and Wikipedia.

Statistics of BabelNet

As of October 2013, BabelNet (version 2.0) covers 50 languages, including all European languages, most Asian languages, and even Latin. BabelNet 2.0 contains more than 9 million synsets and about 50 million word senses (regardless of their language). Each Babel synset contains 5.5 synonyms, i.e., word senses, on average, in any given language.

Applications

BabelNet has been shown to enable multilingual Natural Language Processing applications.

Semantic Web Case Studies and Use Cases

Case studies include descriptions of systems that have been deployed within an organization and are now being used within a production environment. Use cases include examples where an organization has built a prototype system, but it is not currently being used by business functions. The list is updated regularly as new entries are submitted to W3C. There is also an RSS 1.0 feed that you can use to keep track of new submissions. Please consult the separate submission page if you are interested in submitting a new use case or case study to be added to this list. A short overview of the use cases and case studies is available as a slide presentation in Open Document Format and in PDF formats.
Linked Data Platform Working Group Charter

The mission of the Linked Data Platform (LDP) Working Group is to produce a W3C Recommendation for HTTP-based (RESTful) application integration patterns using read/write Linked Data. This work will benefit both small-scale in-browser applications (WebApps) and large-scale Enterprise Application Integration (EAI) efforts. It will complement SPARQL and will be compatible with standards for publishing Linked Data, bringing the data integration features of RDF to RESTful, data-oriented software development.

Introduction

This group is based on the idea of combining two Web-related concepts to help solve some of the long-standing challenges involved in building and combining software. RDF, the Resource Description Framework, is a W3C-recommended general technique for conveying information. The Linked Data Platform is envisioned as an enterprise-ready collection of standard techniques and services based on RESTful APIs and the W3C Semantic Web stack.

Scope

Technical Issues

Deliverables
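To make the read/write pattern concrete, here is a minimal sketch of the kind of HTTP interaction the LDP work standardizes: POSTing an RDF resource into a container, then reading it back. The endpoint URL and resource name are placeholders; the Slug and Link headers follow LDP conventions, and the client is the standard java.net.http API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LdpSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String container = "http://example.org/ldp/container/"; // placeholder

        // Create a new resource in an LDP container by POSTing Turtle.
        String turtle = "<> a <http://example.org/ns#Thing> .";
        HttpRequest create = HttpRequest.newBuilder(URI.create(container))
                .header("Content-Type", "text/turtle")
                .header("Slug", "thing1") // naming hint for the new resource
                .header("Link", "<http://www.w3.org/ns/ldp#Resource>; rel=\"type\"")
                .POST(HttpRequest.BodyPublishers.ofString(turtle))
                .build();
        HttpResponse<Void> created =
                client.send(create, HttpResponse.BodyHandlers.discarding());
        // On success the server answers 201 Created and puts the new
        // resource's URI in the Location header.
        String location = created.headers()
                .firstValue("Location").orElse(container + "thing1");

        // Read the resource back as Turtle.
        HttpRequest read = HttpRequest.newBuilder(URI.create(location))
                .header("Accept", "text/turtle")
                .GET()
                .build();
        HttpResponse<String> body =
                client.send(read, HttpResponse.BodyHandlers.ofString());
        System.out.println(body.body());
    }
}
```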