Computational science uses large-scale computation and data to gain insights into how to solve complex physical problems ranging from the microscopic to the planetary scale. The legitimacy of computational science emerges from a continuous process of hypothesising, experimenting, and recombining code across these datasets. We've asked ourselves how we might recenter science on the numerous people involved in this process, and so propose the following first set of principles to start the conversation. We invite interested people to join our exploration.
Science is a human endeavor shaped by contributions from many individuals, and scientific insights stem from and flow through people. All stakeholders, such as academics, curators, traditional knowledge holders, journalists, reviewers, policymakers, and citizens, should be included in and benefit from the process. Scientific digital infrastructure therefore exists both to support inquisitive exploration and to distribute the benefits of evidence-driven actions across society.
Opening the door to public participation in the scientific discourse means broadening access to data and the skills required to analyse it. Today, large institutions play the important role of bringing together researchers under one roof, but emerging digital infrastructure promises new opportunities for participation across institutional and geographic boundaries. We aim to decouple access to the tools for science from institutional affiliation, and thus reduce the barrier to meaningful participation.
Investments in science yield the greatest return when academics, journalists, and politicians collaborate effectively across their specialisms. We aim to empower creators to disseminate datasets so that reliable evidence accumulates, and to reduce ongoing maintenance costs. By tracking positive outcomes, we seek to incentivize the responsible, ongoing curation of datasets.
Modern computational science should leverage both traditional and community-centered sources of insights and data, and enable such groups to take full advantage of digital resources. We want to develop these in an open and collaborative process that follows the principles above. We invite you to work with us!
"Our vision is a world where everybody can access tools to explore, participate in and benefit from computational science."
We are identifying use cases to help design and iterate on systems that implement our manifesto. The use cases are intended to represent the needs of different kinds of scientists. Many other use cases will bring to light concerns that are not represented here. Please consider contributing yours!
Some large-scale research projects require acquiring, aggregating, manipulating, analyzing, and reporting on a multitude of data sets. Sometimes, the outputs may even connect to real-time systems, such as data dashboards or sensor networks, which need to be configured as a result of the analysis. For example, the IUCN Red List of Threatened Species is derived from many sources of data on species sightings, habitat loss, and climate predictions. Among these data is output from the Centre for Earth Observation, which itself constructs complex models that assimilate data. As another example, a network of sensing buoys may be steered by a control system that sends commands on the basis of weather predictions and mission inputs.
Some inputs must be protected. For example, locations of threatened species may be useful in the analysis but need to be kept private lest poachers destroy those populations. Thus, although this scenario benefits from openness and transparency in general, not everything can be made public.
Investigators: Michael Coblenz, Anil Madhavapeddy, Cyrus Omar
Meaningful participation in important scientific discourses requires specialist knowledge. For example, to evaluate the safety of a plan to deploy machine learning in public infrastructure, some understanding of current techniques and their range of applicability is required; likewise, climate data cannot be meaningfully analysed by someone who lacks prior experience in statistics. Access to specialist training of this kind is presently centralised in the universities and gated behind tuition fees and time barriers.
We will answer the erosion of public trust in the scientific process with a credible invitation: join us, learn our methods, and contribute to the discourse. To that end, we aim to build sustainable infrastructure for a decentralised network of publicly accessible online courseware, textbooks, tutorials, and social media that will serve as the roots of a new and serious partnership between the public and the scientific community.
Investigators: Jonathan Sterling, Anil Madhavapeddy
Many research projects require inputs from other researchers, including those working in different areas. For example:
However, finding other researchers' digital artifacts, and then using or extending them, is challenging today. We have relatively few mechanisms to discover existing digital artifacts, and the ones we have usually rely on serendipitously reading a research paper or website that describes the artifact. We have no mechanisms to ensure that contributions from publicly available artifacts are recognized. And we have almost no good mechanisms for sharing artifacts among a small set of research groups, each of which might have an evolving set of individual participants. We will address these challenges by developing systems that make it easier to discover digital artifacts, make it easier to track the contributions of each artifact's provider, and reduce friction for sharing artifacts between research groups.
Investigators: Michael Coblenz
Programming for the Planet (PROPL) will bring us together again to refine the manifesto and continue designing our systems for decentralised science.
We held the Bellairs research summit on planetary computing, where we launched the first version of the manifesto towards human computational science.
Jon Sterling publishes the Forester 5.0 design for global identity.
Ian Brown contrasts Mastodon and BlueSky in a deep dive into their architectures.
Thomas Gazagnaire launches SpaceOS to perform scientific computation in orbit.
Mark Elvers operates the first capability-based distributed build platform for open source software.
Aurojit Panda rethinks the architecture of edge Internet services to bring back end-to-end simplicity.
Anil Madhavapeddy makes the case for planetary computing for data-driven environmental policy-making to handle the ingestion, transformation, analysis and publication of environmental data products.
Nate Foster considers how programming languages might help to capture property conveyancing, sparking an interest in the legal applications of ownership.
We'd love to have you on board as well. We recognise that we need lots of input and participation from diverse stakeholders across disciplines, and so we invite you to join our exploration and contribute to this open collaboration. Just get in touch with any of the participants above and we can add you to the list. We're also using Matrix to stay connected.