Automated Assistance for Developing Software in Ecosystems of the Future

Software pervades every aspect of human life. Modern civilisation relies on software to an ever-increasing degree, and society needs to adapt itself to become more resilient against the negative sides of the digital era. Software ecosystems form large socio-technical networks of technical and social components that interact with each other on top of common software and hardware platforms. Well-known software ecosystems are operating system distributions like Linux and Android, package dependency networks like npm for JavaScript and CRAN for the popular statistical computing environment R, and the OSCAR digital health ecosystem for Electronic Medical Records. SECO-ASSIST will pave the way by providing novel automated techniques to coordinate the numerous contributors and contributions to such ecosystems as effectively as possible. To do so, SECO-ASSIST will advance the state of the art in software recommendation techniques to meet the inherent challenges of future software ecosystems.

The project consortium is composed of seasoned software engineers who are research authorities in software evolution, software testing, static program analysis and database usage. Their complementary expertise will permit a scientific breakthrough which will nurture the software ecosystems of the future.

Software ecosystems are one of the most promising approaches for organising the software needs of the future. To realise this potential, software ecosystems need to coordinate the contributions of their numerous contributors as effectively as possible. This requires addressing and overcoming four main challenges:

  • Longevity. Ecosystems evolve at a rapid pace and this trend is only aggravated by modern software tools and agile software processes. The need to maintain them for a prolonged period of time implies that they accumulate a wealth of data over their lifespan.
  • Scale. Software ecosystems contain large numbers of interdependent software components developed by thousands of interacting developers. Their evolution history contains huge amounts of structured and unstructured data that needs to be processed.
  • Heterogeneity. Software ecosystems are developed using multiple programming languages; are tightly intertwined with external databases using a variety of data formats; and include many types of interconnected software artefacts.
  • Community. Ecosystem contributors interact frequently and intensively to engineer the software artefacts within an ecosystem. Many ecosystem contributions are voluntary in nature, hence the ecosystems must try to attract and retain contributors.

SECO-ASSIST aims to leverage recent advances in software recommendation systems to support more effective development of software in ecosystems. To realise this goal, SECO-ASSIST adopts a proof-by-construction research strategy formulating four objectives:

  • Optimise library usage: Software ecosystems typically contribute many reusable software libraries, representing units of functionality explicitly intended to be reused. To coordinate the co-evolution of library and system contributions in a software ecosystem, techniques for recommending the most appropriate library should draw on the ecosystem itself as a source of already-proven solutions.
  • Improve database manipulation: Improper use of database technologies is a considerable bottleneck in data-intensive software systems. In order to improve database manipulation, recommendation techniques should analyse good and bad usage in comparable systems within the same ecosystem.
  • Enhance team interaction: To attract newcomers and retain existing contributors in software ecosystems, recommendation techniques should help in identifying and addressing “toxic” team members, as well as in predicting future abandonment and finding adequate replacements.
  • Strengthen software tests to reduce the impact of software defects: To support test-related activities in future ecosystems, recommendation techniques should encourage contributors to focus tests on those parts of the ecosystem with the highest risk.