In the social sciences, as in other fields, the literature and textual data are growing exponentially, and people are no longer able to cope with them in a systematic manner. In many areas there is a need to catalogue knowledge and phenomena. However, social science concepts and phenomena are complex, and in many cases conflicting definitions are disputed within the field. In this paper we present a method that catalogues the complex and disputed concept of social innovation by applying text mining and machine learning techniques. Social innovations are recognised by decomposing a definition into several more specific criteria (social objectives, social actor interactions, outputs and innovativeness). For each of these criteria, a machine learning-based classifier is created that checks whether a given text satisfies that criterion. The criteria can be classified with F1-scores of 0.83–0.86. The presented method is flexible, since it allows the criteria to be combined at a later stage in order to build and analyse the definition of choice.
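The late combination of criteria described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword sets, the helper names (`make_classifier`, `score_criteria`, `satisfies`) and the `min_matches` parameter are all hypothetical, and simple keyword matching stands in for the trained machine learning classifiers purely so that the example runs on its own.

```python
from typing import Callable, Dict, Iterable, Optional, Set

# The four criteria named in the abstract (short labels are an assumption).
CRITERIA = ("objectives", "actors", "outputs", "innovativeness")

# Hypothetical keyword sets; they stand in for trained per-criterion models.
KEYWORDS: Dict[str, Set[str]] = {
    "objectives": {"poverty", "inclusion", "wellbeing"},
    "actors": {"community", "volunteers", "partnership"},
    "outputs": {"service", "platform", "programme"},
    "innovativeness": {"new", "novel", "first"},
}


def make_classifier(keywords: Set[str]) -> Callable[[str], bool]:
    """Build a toy binary classifier: True if any keyword occurs in the text."""
    def classify(text: str) -> bool:
        return bool(set(text.lower().split()) & keywords)
    return classify


# One independent binary classifier per criterion.
CLASSIFIERS = {name: make_classifier(KEYWORDS[name]) for name in CRITERIA}


def score_criteria(text: str) -> Dict[str, bool]:
    """Run every criterion classifier on the text independently."""
    return {name: clf(text) for name, clf in CLASSIFIERS.items()}


def satisfies(scores: Dict[str, bool], required: Iterable[str],
              min_matches: Optional[int] = None) -> bool:
    """Apply a chosen definition: at least `min_matches` of the `required`
    criteria must hold (all of them when `min_matches` is None)."""
    required = list(required)
    hits = sum(scores[name] for name in required)
    threshold = len(required) if min_matches is None else min_matches
    return hits >= threshold
```

Because each criterion is scored separately, a strict definition (all four criteria) and a looser one (for example, any two criteria) can be evaluated on the same scores without retraining anything:

```python
scores = score_criteria("A new accounting service")
satisfies(scores, CRITERIA)                 # strict definition
satisfies(scores, CRITERIA, min_matches=2)  # looser definition
```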
|Title of host publication||Natural Language Processing and Information Systems (NLDB 2018)|
|Number of pages||12|
|Publication status||Published - 13 Jun 2018|
|Name||Lecture Notes in Computer Science|
- text mining
- natural language processing
- social innovation