
Some Datasets Available on the Web » Data Wrangling Blog


Data Mining Research - www.dataminingblog.com | Data Mining Blogs

I posted an earlier version of this data mining blog list on DMR. Here is an updated version (blogs recently added to the list have the logo “new”). I will keep this version up-to-date.

Abbott Analytics: both industry and research oriented posts covering any topic related to data mining (Will Dwinnell and Dean Abbott).
A Blog by Tim Manns: as defined in its subtitle, this blog deals with “data mining, analysing terabyte data warehouses, using SPSS Clementine, telecommunications, and other stuff” (Tim Manns).
AI, Data mining, Machine learning and other things (Markus Breitenbach): Markus writes about machine learning with a focus on statistics, security and AI.
anuradha@NumbersSpeak: A blog on analytics applications, statistics and data mining (Anuradha Sharma).
Blog by bruno: This blog covers a very large number of topics including web data analysis and data visualization.
Ryan Rosario

UNdata

Data Mining and Predictive Analytics

Download Detailed Files by Election Cycle

Tutorial: Working with Downloadable FEC Data Files Using MS Access.

These files contain the most recent 10 years of data, the current election cycle plus the most recent five (5) election cycles. These files are updated each Sunday and include data entered into the FEC database through that date. Please note that complete entry from each reporting period takes about 30 days.

Committees: The committee master file contains one record for each committee registered with the Federal Election Commission. The file contains basic information about the committees.

Candidates: The candidate master file contains one record for each candidate who has either registered with the Federal Election Commission or appeared on a ballot list prepared by a state elections office.

Individual Contributions: The individual contributions file contains each contribution from an individual to a federal committee if the contribution was at least $200.

Individual Contributions File Updates

Itemized Committee Contributions
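The $200 rule for the individual contributions file can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the column names (`committee_id`, `contributor_name`, `amount`), the pipe delimiter, and the sample rows are all hypothetical and are not taken from the FEC file documentation, which should be consulted for the real layout.

```python
import csv
import io

# Hypothetical sample data. The column names, delimiter, and rows are
# illustrative assumptions, not the real FEC file layout.
SAMPLE = """committee_id|contributor_name|amount
C001|DOE, JANE|250
C001|ROE, RICHARD|199
C002|POE, ANN|200
"""


def itemized_contributions(text, threshold=200):
    """Return the rows whose amount is at least `threshold` dollars."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return [row for row in reader if float(row["amount"]) >= threshold]


def total_by_committee(rows):
    """Sum contribution amounts per committee identifier."""
    totals = {}
    for row in rows:
        key = row["committee_id"]
        totals[key] = totals.get(key, 0.0) + float(row["amount"])
    return totals


rows = itemized_contributions(SAMPLE)
print(total_by_committee(rows))
```

With the sample rows, the $199 contribution is dropped and the remaining two are summed per committee.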

Available KNIME Extensions

KNIME extensions provide additional functionality for KNIME ranging from Excel support, R integration, JFreeChart for advanced visualisations, 100+ nodes wrapping machine learning algorithms from Weka, the Reporting extension, and much more. Below you can find a list of features. How they are installed is described on the update manager help page. In addition, we provide extensions from KNIME Labs and KNIME Community.

R Integration: This feature integrates the statistics project R into KNIME, making it possible to run snippets of R code as one step of the workflow, open R views, or even learn models within R.
R plug-in: contains all nodes but assumes either an existing local R installation or R server.
R binaries: if you do not have a local R installation you can download the binaries (for Windows only).

JFreeChart Integration: This plug-in contains a full set of visualisation nodes in addition to the ones in our base distribution.

Chemistry Base Chemistry Types & Nodes

Chemistry Add-Ons

Summary Campaign Finance Data Files

Files by Election Cycle

Tutorial: Working with Downloadable FEC Data Files Using MS Access.

House and Senate Candidates

Current Campaigns: Candidate Financial Summary Without PAC Breakdown - the Most Current Information Available. These files contain one record for each campaign. When using these summary files you need to be aware of some possible double counting of activity.

All Candidates: The all candidate summary file contains one record including summary financial information for all candidates who raised or spent money during the period no matter when they are up for election. When using the all candidates summary file you need to be aware of some possible double counting of activity.

Political Action Committees

PAC Summary: The most current information available for PACs is a file called webk.

Thank you - OutWit Hub

So, what is OutWit, in a word? OutWit is a Web collection engine for everyone. It runs on your Windows, MacOS or Linux machine and allows you to browse through and easily grab information, images, contacts or files from the Internet, in just a few clicks. Originally conceived for researchers and data managers, the program is bringing Web scraping tools to everyone for both business and personal use. Just browse the Web for pages that include the information you seek.

How to start? Have a look at our quick start examples. Here are a few more things you might enjoy testing, to start exploring OutWit Hub's features: In your search engine, search for any topic and click on the Next in Series arrow once or twice. In the feedback menu, you will find links to the bug report and suggestion pages.

Periodic Table of Social Media Elements

Fellow Ad Age Power 150 member Eyecube, aka Rick Liebling, put out a periodic table of social media elements. While it’s getting a bit of criticism (yes, social media doesn’t have anything to do with the periodic table of elements), that’s not what it is all about. It is a good thing to think in creative terms. I’m a big fan of visualizations, so this was right up my alley. The only frustrating thing about it is trying to figure out what all the symbols mean!

Social Media Behaviours: (These are the positive things you choose to do)
Sh = Share
Mt = Monitor
Fr = Friend
Cv = Converse
Cu = Customize
Li = Listen
En = Engage
Di = Dialogue

Social Media All-Rounders: (These are the people you can find all over the Social Media landscape)
Mc = Mack Collier (The Viral Garden)
To = Todd Defren (PR Squared)
Lo = Lee Odden (Online Marketing Blog)
Dr = Darren Rowse (ProBlogger)
Mj = Mitch Joel (Six Pixels of Separation)
Ds = David Meerman Scott (Web Ink Now)
Pe = Peter Kim (Being Peter Kim)
Cc = C.C.

Surfing Restaurant Inspections with Microsoft Data Explorer and GeoFlow - Microsoft Business Intelligence

Father’s Day is approaching and you might be thinking about a good place to have a nice lunch with your Dad… We would like to show you how Data Explorer and GeoFlow can help you gather some insights to make a good decision. In order to achieve this, we will look at publicly available data about Food Establishment Inspections for the past 7 years and we will also leverage the Yelp API to bring in ratings and reviews for restaurants. For the purpose of this post, we will focus on the King County area (WA) but you can try to find local data about Food Establishment inspections for your area too.

What you will need: Data Explorer.

What you will learn in this post: Import data from the Yelp Web API (JSON) using Data Explorer.

That sounds like too much for a single blog post, but let’s get started and you’ll see that it is easier than you might think.

Import data from Yelp API

The first thing that we will have to do is click the “From Web” button in the Data Explorer ribbon tab.
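Outside Data Explorer, the same idea of turning a JSON response from a web API into table rows can be sketched in Python. This is a rough illustration only: the payload below is invented, and its field names (`businesses`, `rating`, `review_count`, `location`) are assumptions made for the example rather than the documented Yelp response schema. No network call is made.

```python
import json

# Hypothetical payload. The structure and field names are assumptions
# for illustration, not the documented Yelp API response.
SAMPLE_RESPONSE = """
{"businesses": [
  {"name": "Cafe A", "rating": 4.5, "review_count": 120,
   "location": {"city": "Seattle"}},
  {"name": "Diner B", "rating": 3.0, "review_count": 45,
   "location": {"city": "Bellevue"}}
]}
"""


def flatten_businesses(payload):
    """Parse a JSON string and return one flat dict per business."""
    data = json.loads(payload)
    rows = []
    for business in data.get("businesses", []):
        rows.append({
            "name": business.get("name"),
            "rating": business.get("rating"),
            "review_count": business.get("review_count"),
            # Nested objects are lifted into a single flat column.
            "city": business.get("location", {}).get("city"),
        })
    return rows


rows = flatten_businesses(SAMPLE_RESPONSE)
# List the best-rated places first.
for row in sorted(rows, key=lambda r: r["rating"], reverse=True):
    print(row)
```

Flattening nested objects into plain columns is what makes the result easy to join against another table, such as inspection records keyed by restaurant name.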
