Wikipedia - Web Archiving, Aspects of Curation
Web archiving is the process of collecting portions of the World Wide Web to ensure the information is preserved in an archive for future researchers, historians, and the public. Web archivists typically employ web crawlers for automated capture due to the massive size and amount of information on the Web. The largest web archiving organization based on a bulk crawling approach is the Internet Archive, which strives to maintain an archive of the entire Web.

11. Software Repositories - Adding and Managing Package Repositories
As mentioned in the previous chapter, the package manager installs software by fetching packages from software repositories, so the software available for easy installation via the package manager depends on the configured repositories. A software repository is a collection of RPM packages (the openSUSE packaging format) and metadata for the available packages.
Open Data Catalog (Index)
This catalog is created and maintained through the efforts of OpenGeoCode.Org and crowdsourcing. If you'd like to contribute, email your list of open data portals to: firstname.lastname@example.org, or post the submission here: Download Catalog as CSV file (CC0)
Data Portal, Transparency Portal, GIS/Gazetteer, Census/Demographics, Climate, Health, Education, Commerce, Agriculture/Food, All

Search Engines: Research Aid Databases
From Topical Search Wiki
Academic Ranking
Journals characteristics
SHERPA Databases
RoMEO – A database of publishers' policies regarding the self-archiving of journal articles on the web and in Open Access repositories.
JULIET – A database of funders' archiving mandates and guidelines.
CofactorJournalGuide
Genamics JournalSeek – A catalog of research journals including journal description, abbreviation, homepage link, subject category and ISSN.
Reference works
Scholars social networks
CrossRef – An authoritative catalog of primary research publications.
JournalTOCs – A catalog of RSS feeds for academic journals' tables of contents (TOCs).
Kindling: An Introduction to Spark with Cassandra (Part 1)
This is an introduction to the relatively new distributed compute platform Apache Spark. The focus will be on how to get up and running with Spark and Cassandra, with a small example of what can be done with Spark. I chose to make this the focus for one reason: when I was trying to learn Spark two months ago, I had difficulty finding articles on how to set up Spark to use Cassandra. The process is actually not that difficult, but pulling all the steps together required some searching and investigation. The secondary focus is to give you a small taste of what can be done with Spark, so that once you have the Spark cluster up and running there is motivation to keep exploring and experimenting. This is based on a talk I gave at the Chicago Cassandra meetup in late October.
OnCoRe Blueprint: Resources: Repository Information
Current Repositories As of June 30, 2011
The following list was accurate as of June 30, 2011, the date this FIPSE grant was concluded. For an updated listing of repositories, please visit The Orange Grove: Florida's Digital Repository at: The number of open access repositories is very large; this is not a complete listing. It is provided as a guide to assist you in your background research.

Top 20 Open Data Sources
Data is everywhere, created and used by just about anyone. The days when companies or individuals had to pay significant sums of money to access useful and interesting datasets are long gone. Here is our top 20 list of the best free data sources available online.
1. Data.gov.uk: the UK government’s open data portal, including the British National Bibliography – metadata on all UK books and publications since 1950.
2.
Top Thesis & Dissertation References on the Web: OnlinePhDprogram.org
A Master’s Thesis or Doctoral Dissertation is the capstone of many graduate programs. It requires a monumental amount of effort to put together the original research, citations, and sheer writing time to finish. Many students cruise through their master’s and PhD coursework without breaking a sweat, only to be stonewalled when it comes time to write a long, in-depth dissertation that contributes original material to the student’s chosen field.

Writing about D3.js - a glob of nerdishness
On Monday I signed a contract to write a book about D3.js for Manning Publications. If all goes according to schedule, it should be out early next year, with draft chapters available in electronic form for subscribers even sooner. Wow! It's certainly exciting, though I'm still worried over how much work it will be. I do get paid a useful advance at two milestones within the process, but I'll likely need to raise my consulting rates to compensate for my overall lost time. I didn't say "yes" lightly, but I couldn't really say "no" either.
Data repositories
From Open Access Directory
This list is part of the Open Access Directory. This is a list of repositories and databases for open data. Please annotate the entries to indicate the hosting organization, scope, licensing, and usage restrictions (if any).

Row vs. Columnar vs. NoSQL Database Options
I am often asked how to select the right database. The answer is that it depends: no single option fits all situations. How do you choose? There's no easy answer. Choosing the best is difficult because a good developer will want to balance the strength of the project, the availability of commercial support, and the quality of the documentation with the quality of the code. The greatest divergence is in the extras: all will store piles of keys with their values, but the real question is how well they split the load across servers and how well they propagate changes across them.
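
To make the point about splitting load and propagating changes concrete, here is a minimal Python sketch of one common approach: hash each key to pick a primary server, then copy it to the next servers in the list. It is an illustration only and not how any particular database works; the function name `owners`, the `replicas` parameter, and the `node-*` server names are all hypothetical.

```python
import hashlib


def owners(key: str, servers: list[str], replicas: int = 2) -> list[str]:
    """Return the servers that would hold `key` (hypothetical helper).

    The first entry is the primary, chosen by hashing the key; the rest
    are the next servers in list order, which receive copies of changes.
    """
    if not servers:
        raise ValueError("at least one server is required")
    # A stable hash, so every client maps the same key to the same server.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(servers)
    # Never ask for more copies than there are servers.
    count = min(replicas, len(servers))
    return [servers[(start + i) % len(servers)] for i in range(count)]


# Assumed server names, for illustration only.
servers = ["node-a", "node-b", "node-c"]
for key in ("user:1", "user:2", "user:3"):
    print(key, "->", owners(key, servers))
```

Simple modulo placement like this moves most keys whenever a server is added or removed, which is one reason real systems differ so much in how they handle this part.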