While thinking about the hype surrounding “Big Data,” I was reminded of another term that I keep hearing in the patent, health, and library fields: crowdsourcing. The Merriam-Webster Dictionary defines “crowdsourcing” as:
The practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people and especially from the online community rather than from traditional employees or suppliers.
Crowdsourcing has come of age in a digital environment, especially over social media, where individuals or organizations can solicit help from thousands of people instantly. Everyone seems to be hopping on the crowdsourcing bandwagon because of its low cost, its speed of implementation, and the sometimes highly creative responses provided by participants. The main downside to crowdsourcing seems to be the need to sift through a large number of contributions of highly varying quality (which may be an instance where “Big Data” skills and tools can come in handy).
Here are just a few examples of crowdsourcing in the patent, health, and library fields:
- Patent Searching – A few companies, like Patexia and Article One Partners, successfully crowdsource patent searches by offering prizes and rewards to participants who locate and submit highly relevant patent prior art. Fledgling searchers can use free online patent databases, like Google Patents, Espacenet, and WIPO PATENTSCOPE (Patexia also offers free patent search tools), and participants receive guidance and support through online communities created on the company websites.
- Challenges from the National Institutes of Health (NIH) – NIH has offered over a dozen contests where researchers can submit solutions to various challenges, like “A Wearable Alcohol Biosensor” or “Innovation in Breast Cancer Genetics Epidemiology.” Government agencies like NIH can use the Challenge.gov site to post “a problem or question to the public and ‘solvers’ respond and submit solutions. An agency pays only for those solutions that meet the criteria and are chosen as winners.”
- Libraries Embrace Crowdsourcing – I can’t even pick a single example. A multitude of blog posts, opinion pieces, and articles describe how institutional and public libraries use crowdsourcing for a variety of projects:
  - Library of Congress makes catalog corrections and enhancements to photographs in their collection, based on comments from users on Flickr.
  - New York Public Library is asking “citizen volunteers to provide identification, transcription, tagging and more” for a digitized collection of “bond and mortgage records from The Emigrant Savings Bank during the years 1841–1933.”
  - The Biodiversity Heritage Library, a consortium of natural history museums and botanical garden libraries, is “testing the effectiveness of gaming for crowdsourcing OCR text correction” and also “using crowds to verify the accuracy of semantic markup of text that was done by automated algorithms.”
  - The British Library has created a portal called LibCrowds, which lists challenges like “help create a catalogue of Lord Chamberlain’s Plays and Correspondence.”
I haven’t even discussed one of the greatest crowdsourcing achievements, Wikipedia, which has created a vast free online encyclopedia of modern human knowledge.
Crowdsourcing has drawbacks, but it has so many possible applications, yielding brilliant discoveries and new tools that harness the online knowledge base, that I can’t even begin to list them all.