- Complex problems such as climate change pose severe challenges to societies worldwide. To overcome these challenges, digital innovation contests have emerged as a promising tool for idea generation. However, assessing idea quality in innovation contests is becoming increasingly problematic in domains where specialized knowledge is needed. Traditionally, expert juries are responsible for idea evaluation in such contests, yet experts are a substantial bottleneck, as they are often scarce and expensive. To assess whether expert juries could be replaced, we consider two approaches: crowdsourcing and a Large Language Model (LLM). Both aggregate collective knowledge and could therefore come close to expert knowledge. We compare expert jury evaluations from innovation contests on climate change with crowdsourced and LLM evaluations and assess performance differences. Results indicate that crowds and LLMs are able to evaluate ideas in this complex problem domain, while contest specialization (the degree to which a contest relates to a knowledge-intensive domain rather than a broad field of interest) inhibits crowd evaluation performance but does not influence the evaluation performance of LLMs. Our contribution lies in demonstrating that crowds and LLMs (as opposed to traditional expert juries) are suitable for idea evaluation, allowing innovation contest operators to integrate the knowledge of crowds and LLMs to reduce the resource bottleneck of expert juries.

