Harnessing Communities of Knowledge: Building an Automated Species Identification Tool

Published November 19, 2018, 16:47
There exist communities of knowledge, and each of these communities is a distributed social network of people. Some academic examples include the medical field, engineering, and the natural sciences; non-academic examples include stamp collectors, car enthusiasts, and fashionistas. An important aspect of these communities is that the knowledge contained in the distributed network of people, as a whole, is greater than the sum of what its individual members know. Modern technology has introduced new components into these communities. The internet has made it faster and easier to communicate, and the type of data that is communicated has become much richer, including images, videos, documents, and code. We also have the ability to store and retrieve all of this data, so these communities now hold both knowledge and vast quantities of data. In addition, the world has become more connected, allowing more people to find and join communities and to start new ones.

What, if anything, can we learn from these communities? Can we learn who knows what, and what their area of focus is? Can we learn how to combine information from multiple people within these communities? And can we distill the distributed knowledge of the community and make it centralized and consolidated so that anyone, anywhere can access it quickly and efficiently?

I explore these questions through the naturalist community via the website iNaturalist. In this talk I will present models that learn the skills of the community members and are capable of combining those skills to predict the species label for an observation. I will discuss building computer vision datasets from data provided by this community and classification results on those datasets, and I will demo a new algorithm that reduces the memory requirement of large classification networks for fast on-device inference.

See more at microsoft.com/en-us/research/v...