Sensing

How to Discover What is Important to Your Clusters. Raw information is useless if you don’t understand what it means, but sometimes there is just so much it’s hard to get a handle on what is going on. One way to better understand your data is through cluster analysis – grouping similar data in “clusters”. At BigML we use centroid-based clustering to group your data with just one click. While this is terribly convenient, it can obscure how the clustering decisions are actually made. When a dataset has dozens of input fields (or more!), how can you tell which ones were actually important in grouping your data? What is important anyway? This is a really big question, but at BigML we specialize in turning big questions into answers. How can we apply this definition of importance to the case of clusters? In fact, there is already a one-click way to grow a tree based on a cluster just this way. Global importance is here! This new BigML script creates not just one, but an ensemble of trees designed to find the importance of your input fields.

Giant Gray. Monitor key internal systems with self-learning, multi-sensor video analytics and SCADA solutions. Power and water facilities are two key components within the nation-wide critical infrastructure system. Operational efficiency in these complex facilities is vital. A failure in just one critical infrastructure mechanism could wreak havoc across an entire nation.

These facilities not only face security threats, such as external attacks and sabotage, but they also face internal system vulnerabilities in which a malfunction could not only cripple the critical infrastructure network but threaten the lives of those working in and around the facility. Giant Gray’s Graydient platform utilizes self-learning, behavioral recognition software for video surveillance that sends real-time alerts when anomalous behavior is detected. Social Media Tools. The Product Types Ontology: Use Wikipedia pages for describing products or services with GoodRelations and schema.org. Data Scraping Studio ™ - Web Scraping & Data Extraction Software. w20922. Discover Relevant Business Information. The Top 10 Machine Learning APIs. By Angela Guess. Janet Wagner recently wrote for Programmable Web, “Machine learning is everywhere these days… Right now, Amazon, Google, IBM, and Microsoft are the biggest players battling to dominate the very fast-growing machine learning cloud services market.

IBM further strengthened its position in the market with the recent acquisition of AlchemyAPI, a leading deep learning-based machine learning services platform… The APIs that made it to our top 10 machine learning APIs list offer a wide range of capabilities including image tagging, face recognition, document classification, speech recognition, predictive modeling, sentiment analysis, and pattern recognition. The APIs also scored well against a diverse set of criteria: Popularity, Potential, Documentation, Ease of use, Functionality.” Read the full list here. Home Page.

Recommendations & Analogies

(LOV) Linked Open Vocabularies. Decision-ontology - An ontology for representing decisions and decision-making. The purpose of Decision Ontology (DO) is to provide a basis for representing, modeling and analyzing decision and decision-making. You can browse the ontology here. Visit the wiki section for more information.

If you have any questions or comments or need help in using this ontology, send an email to: piotrnowara(at)gmail.com. April 2012 - improved annotations and new 'How to' wiki page. October 2011 - new version of Decision Ontology is available - please check the repository and this wiki page. Representing decisions & decision-making: Decision Ontology can be used for archiving information related to decisions and their context. Decision Ontology can be used for storing and analyzing decision patterns, that is, the knowledge associated with certain types of decision, where every aspect could be analyzed hierarchically. Examples: Some of the questions that can be answered using DO: What problem/question initiated the decision-making process? This project is open for collaboration. Requirement-ontology - An ontology for describing requirements, privileges, obligations, norms etc. Linked Open Vocabularies. What is PredictionIO?

Why do developers like it? Developers can get started quickly by downloading one of the many engine templates available in the PredictionIO template gallery. They can further customize each component of an engine for any specific business logic they have in mind. The DASE architecture offers separation of concerns, making it easier for them to re-use and maintain the code. With Event Server and SDKs, developers can integrate PredictionIO easily with applications and platforms of any programming language. PredictionIO manages the deployment of engines with scalability in mind. PredictionIO helps developers build and deploy production-grade predictive engines, faster.

Why do data scientists like it? Data scientists use PredictionIO to evaluate models and keep track of parameter adjustment history. PredictionIO helps data scientists evaluate and tune multiple models in parallel effectively and systematically. Use Cases See some examples of what can be built with PredictionIO. Wolfram Data Drop: Universal Data Accumulator. Introducing Sense - A Next-Generation Platform for Data Science. Today, we're excited to announce the public release of Sense — a next-generation platform for data science.

Sense brings together the most powerful tools — R, Python, Julia, Spark, Impala, Redshift, and more — into a unified platform to accelerate data science from exploration to production. Why are we excited? Because Sense offers what we've always dreamed of as data scientists. Powerful collaboration for teams and enterprises. Sound interesting? Starting today, you can sign up to try Sense for free or email enterprise@sense.io to learn more. We hope you'll like it! Sense allows teams to collaborate around the entire life cycle of data science from exploration to production. When you sign in to Sense, you'll see your projects and projects that have been shared with you. Each project provides a central hub for collaboration. You can add and remove collaborators to projects with permissions you control. Launching a new engine with dedicated CPU and RAM is just one click.

Some interesting new algorithms | Follow the Data. Just wanted to note down some new algorithms that I came across for future reference. Haven’t actually tried any of these yet. LIBFFM, a library for Field-aware Factorization Machines. Developed by a group at National Taiwan University, this technique has been used to win two Kaggle click-through competitions (Criteo, Avazu). Random Bits Regression, a “strong general predictor for big data” (paper). “This method first generates a large number of random binary intermediate/derived features based on the original input matrix, and then performs regularized linear/logistic regression on those intermediate/derived features to predict the outcome.” BIDMach, a CPU and GPU-accelerated machine learning library that shows some amazing benchmark results compared to Spark, Vowpal Wabbit, scikit-learn etc.
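The two-step recipe quoted for Random Bits Regression (random binary derived features, then regularized logistic regression on them) can be sketched in plain Python. This is only an illustration of that quoted description, not the paper's implementation: the way each bit is drawn here (a random weighted sum of two inputs compared with a random threshold), the function names, the toy data and every parameter value are assumptions made for the example.

```python
import math
import random


def make_random_bits(n_inputs, n_bits, rng):
    """Draw n_bits random rules: (input indices, weights, threshold)."""
    rules = []
    for _ in range(n_bits):
        # Assumption: each bit looks at (at most) two randomly chosen inputs.
        idx = rng.sample(range(n_inputs), k=min(2, n_inputs))
        weights = [rng.gauss(0.0, 1.0) for _ in idx]
        threshold = rng.gauss(0.0, 1.0)
        rules.append((idx, weights, threshold))
    return rules


def to_bits(row, rules):
    """Map one input row to its vector of 0/1 derived features."""
    return [
        1.0 if sum(w * row[i] for i, w in zip(idx, weights)) > threshold else 0.0
        for idx, weights, threshold in rules
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, l2=0.01, lr=0.1, epochs=300):
    """L2-regularized logistic regression fitted by batch gradient descent."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        # The L2 penalty shrinks the weights; the bias is left unpenalized.
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(row, rules, w, b):
    """Probability of label 1 for one raw input row."""
    x = to_bits(row, rules)
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data (made up): the label is 1 when both inputs share a sign.
rng = random.Random(0)
grid = [-1.0, -0.5, 0.5, 1.0]
rows = [[a, c] for a in grid for c in grid]
labels = [1.0 if a * c > 0 else 0.0 for a, c in rows]

rules = make_random_bits(n_inputs=2, n_bits=20, rng=rng)
w, b = fit_logistic([to_bits(r, rules) for r in rows], labels)
print([round(predict_proba(r, rules, w, b), 2) for r in rows[:4]])
```

The regression itself stays linear; any non-linearity comes from the thresholded bits, which is why the description stresses generating a large number of them.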

And another one which is not as new, but which I wanted to highlight because of a nice blog post about interactions and generalization by David Chudzicki:

Machine Learning Documentation. Analytics, Data Mining, and Data Science. Cognitive Fingerprints, the new frontier of authentication | Security Affairs. Security plays a crucial role in today’s world. Whether it is a multi-billion organization or a single person with a computer, security is important for all. One of the pillars of cyber security is authentication.

People want an easy way to deal with authentication, but currently the only available technique is to remember and manage long, cumbersome passwords. Even this form of authentication is compromised now and then, because of the growing threat to the security domain as well as humans being the weakest link. DARPA (Defense Advanced Research Projects Agency) has come up with an innovative approach to solve the difficulties faced during authentication with the help of a concept termed Cognitive Fingerprints, also called “Active Authentication.” This is a multi-million dollar contract to produce a new identity verification system based on users’ behavior. “This program focuses on the behavioral traits that can be observed through how we interact with the world.

Pricing. Tint: Display Any Social Feeds Anywhere. Resolve API - Entity Resolution, Data Cleaning, & Mapping. Narrative Generation - Natural Language Generation | Narrative Science. TempoIQ. TextBlob: Simplified Text Processing — TextBlob 0.9.0 documentation. Release v0.8.4. (Changelog) TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
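To give a feel for the simpler end of such text-processing tasks, here is a minimal sketch of tokenization, word frequencies and n-grams using only the Python standard library. It deliberately does not use TextBlob, and the function names below are hypothetical helpers for this example rather than TextBlob's API; a real library handles punctuation, sentences and languages far more carefully.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs."""
    return Counter(tokenize(text))


def ngrams(text, n=2):
    """Return every run of n consecutive tokens as a tuple."""
    tokens = tokenize(text)
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sample = "The cat sat. The cat ran!"
print(tokenize(sample))
print(word_frequencies(sample).most_common(2))
print(ngrams(sample, n=2))
```

Tasks such as part-of-speech tagging or sentiment analysis need trained models or lexicons, which is where a dedicated library earns its keep.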

TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both. Features: noun phrase extraction; part-of-speech tagging; sentiment analysis; classification (Naive Bayes, Decision Tree); language translation and detection powered by Google Translate; tokenization (splitting text into words and sentences); word and phrase frequencies; parsing; n-grams; word inflection (pluralization and singularization) and lemmatization; spelling correction; JSON serialization; add new models or languages through extensions; WordNet integration. Get it now:

$ pip install -U textblob
$ python -m textblob.download_corpora

Ready to dive in? Diction's use in big data projects. Semantic Technology Developer. RelFinder.

Freebase - Google+. When we publicly launched Freebase back in 2007, we thought of it as a "Wikipedia for structured data." So it shouldn't be surprising that we've been closely watching the Wikimedia Foundation's project Wikidata[1] since it launched about two years ago. We believe strongly in a robust community-driven effort to collect and curate structured knowledge about the world, but we now think we can serve that goal best by supporting Wikidata -- they’re growing fast, have an active community, and are better-suited to lead an open collaborative knowledge base.

So we've decided to help transfer the data in Freebase to Wikidata, and in mid-2015 we’ll wind down the Freebase service as a standalone project. Freebase has also supported developer access to the data, so before we retire it, we’ll launch a new API for entity search powered by Google's Knowledge Graph. Here are the important dates to know: The Knowledge Graph team at Google. Applications. GeoNames. Anomaly Detection is the New Black. Ted Dunning, Chief Applications Architect, MapR Technologies. In a smooth-running business, something that stands out from normal usually is not good. But even if it’s a happy accident, you still need to look at it. Sounds simple, but with huge amounts of data this can be challenging, and the volume of incoming data is growing fast. More and more things are being attached to the Internet, and these things are often continuously making measurements to determine how they are working and what is around them.

The answer is to build an automated, self-adaptive anomaly detection system. Who needs anomaly detection? Good anomaly detection provides benefits in many areas, including: Security. And, perhaps most importantly, good anomaly detection allows you to find problems before your CEO does. Let’s look at how to build anomaly detectors. Discover, Don’t Define. Why is probability important? Google's slippery slope: If search giant pays Twitter for content, should it pay all publishers? | VentureBeat | Media | by Chris O'Brien. Toward the end of Bloomberg’s story about a potential deal between Google and Twitter to display tweets in search results, this bit made me sit up: “There’s no advertising revenue involved in the deal between Twitter and Google, one of the people said.

That suggests Twitter will receive data-licensing revenue, which was $41 million in the third quarter, up from $16 million a year earlier.” In other words, Google is going to pay Twitter for better access to its content. Now, we don’t know if that’s actually the case, since neither company has confirmed the deal, let alone the terms. But if that turns out to be accurate, it would seem to have huge implications for content across the Web, particularly for news organizations.

For years, publishers have been arguing to various degrees that Google should be paying for their content. Those snippets have in particular become a hot-button issue in Europe between Google and publishers. “Where’s our check?” Sourcemap. Saffron Technology Home - Saffron Technology. MAPCITE - Location Intelligence Applications For Everyone. Productivity Tip #1: Don’t Read Anything Right Now (Including This) | Hunter Walk. Techmeme. The daily email newsletter to start your day — theSkimm. ★reeder. Pocket. REDEF. Nuzzel - News From Your Friends. Quixey's Deep Mobile Search Will Change The Way You Use Your Phone. Search is broken. That's because just when we figured out how to crawl and organize the infinite pages of the World Wide Web, we switched to a new way of accessing the Internet: apps.

Now, nearly 2 billion smartphones later, humanity has one experience searching the desktop web and quite another when it comes to finding things on the devices we carry with us everywhere. The result? A fractured mess. Quixey, a deep mobile search company, is trying to patch things up by changing the way search works on mobile devices. "We think Google is all the world's information, but it isn't," says Quixey cofounder and CEO Tomer Kagan.

"You can't really get that out of Google, because you can't crawl it effectively," he says. Quixey's solution is to chip away at the walled gardens in which today's apps live and make the contents and functionality of each one easy to crawl, index, and search. Say, for instance, that you search for your favorite Katy Perry song. Secondly, there's app discovery. We'll see. FreshPatents.com: Updated Patent Lists & Free Tools. GNU Collaborative International Dictionary of English. The free dictionary. Automated surveillance of 911 call data for detection of possible water contamination incidents. Ditto. Programmatic Marketing Management Platform. DataXu’s programmatic marketing software enables marketers to build stronger brands in a digital world, leveraging data and analytics to increase the efficiency and effectiveness of their customer acquisition strategies.

Understand your customers. Engage consumers at every point in the customer journey. Optimally manage your marketing investments. Turn Data into Insight, Action and ROI: DataXu helps you market more intelligently by answering tough questions: Who are our most profitable customers? How does their response differ by product, channel and messaging? The platform’s automated learning system evaluates millions of ad impressions, identifying high-performing combinations of consumer attributes, context and creative, to find and engage the best prospects.

Join the Programmatic Marketing Revolution. Successful marketing depends on reaching more customers, more effectively and efficiently. DataXu makes marketing:

Machine Learning For Streaming Data - Algorithms.io. SnapAnalytx. Skimfeed V3 - Tech News Aggregator. Algorithmia - Open Marketplace for Algorithms. The 5 Most Influential Data Visualizations of All Time. Discover 5 powerful, beautiful visualizations that changed how people thought about the world. Data visualization allows us all to see and understand our data more deeply. “The greatest value of a picture is when it forces us to notice what we never expected to see.” – John Tukey, 1977. Without data visualization and data analysis, we are all more prone to misunderstandings and missed opportunities.

This visual presentation will show you 5 powerful, beautiful visualizations that changed how people thought about the world. About Tableau: Tableau Software helps people see and understand data.

Search

Signals. NLP.