
Rethinking a Toolkit Design for the Messy Reality of Policymaking

16 Feb 2026

Through the ASEAN–UK SAGE programme, we work with the Ministry of Primary and Secondary Education (MoPSE) in Indonesia to develop a standardised, evidence-based policymaking process aligned with the National Administrative Body (LAN) guidelines, known as the Indeks Kualitas Kebijakan (IKK), using a Sandbox approach. 

We learned early on that while theory provides an essential foundation, a standardised toolkit only becomes truly useful when it is grounded in deep contextual understanding, iterative prototyping, and continuous user feedback. Anchored in a Do–Measure–Learn cycle, the Sandbox prioritises early learning through rapid, real-world experimentation, enabling teams to iterate quickly and scale approaches that prove effective.

EdTech Hub and MoPSE are therefore co-designing a toolkit to support long-term institutional ownership. The aim is for MoPSE to lead dissemination across the ministry and apply the approach independently to future policy challenges. Building on Sprint 0, the partnership has progressed from understanding the policymaking context to prototyping and initial testing in Sprint 1, following the iterative process illustrated in the Sandbox diagram below:

The Do–Measure–Learn cycle is co-designed to help us learn as early as possible, so we can iterate quickly and make timely, effective interventions. Each experiment helps us learn what works (and what doesn’t), adapt the design, and scale only what proves useful.

The Learning and Iterating Cycle: 

Sprint 1 builds directly on insights from Sprint 0, which highlighted that the core challenge is not a lack of guidance but the need for relevance, appropriate timing, and professional judgment in high-pressure policymaking environments. Guided by the Do–Measure–Learn cycle, Sprint 1 was scoped as follows:

  • Do (prototype development): We developed an initial version of the toolkit aligned with global practices and Indonesia’s Indeks Kualitas Kebijakan (IKK), emphasising flexible, modular supports suited to high-velocity policymaking contexts.
  • Measure (user testing): We tested the toolkit in real settings, gathering user feedback on what works in practice. This confirmed that policy decisions often cannot wait for “perfect” data.
  • Learn (refining the path): We synthesised the feedback to inform the next iteration, prioritising relevance, usability, and scalability.

As the Do–Measure–Learn cycle unfolded in Sprint 1, patterns emerged that challenged conventional assumptions about evidence and revealed important implications for policy design and implementation.

💡 Sandbox lesson: Testing the toolkit in real decision contexts revealed that relevance often matters more than methodological purity.

1. Redefining Evidence (Do & Measure)

Through our initial design (Do) and subsequent engagement with MoPSE (Measure), we discovered that in high-pressure policymaking, evidence is not synonymous with academic research. As Shaxson (2005) puts it: “Evidence… is any information that helps policymakers make decisions and get results that are concrete, manageable and achievable.”

💡 The toolkit now explicitly encourages multiple types of evidence, contextual and experiential, moving beyond academic studies alone. It helps teams draw on local capacity data and lived experiences gathered through focus groups and interviews to make defensible decisions under time pressure.

2. Navigating the “Non-Linear” Reality (Learn)

Policy is often presented as a tidy, linear cycle, from problem identification to evaluation. In practice, it’s anything but. Policy processes loop back on themselves, stall, jump ahead, and sometimes tie themselves in knots.

This messiness is shaped by the interaction of the three Ps (Kingdon, 2014; Gerston, 2010):

  • Problem: Issues can emerge suddenly from new data or crises and must be framed amid uncertainty.
  • Politics: Power, agendas, and timing can trigger sudden pivots or bring progress to a halt.
  • Policy: Proposed solutions collide with both problems and politics, requiring constant adjustment.

💡 The toolkit focuses on identifying and acting on “evidence windows” within non-linear policy processes: critical moments when the right information can inform timely decisions, even amid uncertainty and shifting political dynamics.

3. Institutional Integration & Thinking over Compliance (Learn)

By reviewing the Indeks Kualitas Kebijakan (IKK)* and discussing it with MoPSE colleagues, we refined our understanding of the institutional ‘point of no return.’ This refers to the stage at which a policy enters formal legislative drafting, where timelines and legal requirements reduce the space for revision or course correction. Once this threshold is crossed, changes become difficult. As a result, the toolkit’s value must lie upstream, supporting reflection, sense-making, and informed judgment before drafting begins, rather than promoting box-ticking compliance after decisions are effectively locked in.

*IKK is designed to help policymakers evaluate their work across four key areas:

  • Policy Planning: Ensuring data is valid and stakeholders are involved.
  • Policy Implementation: Checking coordination and field effectiveness.
  • Evaluation & Sustainability: Measuring real community impact and alignment with other rules.
  • Transparency & Participation: Giving the public access to information and a real way to give feedback.

The toolkit embeds a clear Flex Loop, a built-in pause in the policy process where teams review new data, incorporate feedback, and reflect on what is and is not working. These moments are designed to enable intentional course correction before formal drafting begins, including revisiting problem definitions or revising plans as needed.

What This Means for Us Moving Forward

Taken together, these insights reaffirm some core design choices, while also sharpening where the toolkit must evolve to better reflect the realities of policymaking in practice.

What stays: Modular, not a manual

The non-linear nature of policymaking confirms that a sequential, step-by-step guide would be poorly matched to real decision-making contexts. Priorities shift, disruptions occur, and teams often need to jump ahead, loop back, or focus narrowly on a specific bottleneck. The toolkit therefore remains deliberately modular and ‘plug-and-play,’ with standalone components that can be used independently. Rather than reading from start to finish, users are guided by self-assessment tools to identify their specific friction points, whether at the ‘starting line’ or in accessing usable evidence, and are directed to the most relevant resources.

What needs to change: Optimisation over perfection

Our expanded understanding of evidence and the importance of timing requires a shift away from perfection-driven models of evidence use. Policy windows open and close unpredictably, and waiting for ideal academic data can mean missing the opportunity to act. Going forward, the toolkit prioritises fit-for-purpose evidence that balances rigour with speed. This includes practical ‘hacks’ for generating useful evidence in hours or days rather than months, and rapid testing with small user groups to surface the majority of usability issues before policies are formally launched.

What’s next: Centering people and process in Sprint 2 

As we transition from Sprint 1: Prototyping and Initial Testing to Sprint 2: Development and Refinement, the focus shifts from learning what matters to translating those insights into concrete, usable toolkit components. The modular principles tested in Sprint 1 will now be operationalised into practical tools that can be applied in real policymaking contexts.

In Sprint 2, we will focus on:

  • Developing the ‘plug-and-play’ modules that allow teams to enter, exit, and loop through the toolkit as policy conditions change.
  • Testing these modules in real-time policy scenarios with MoPSE colleagues.
  • Refining the design based on how effectively the tools support reflection and professional judgment. 

As the toolkit takes shape, we will continue to share updates on its development. Further information on EdTech Hub’s Sandbox approach and our work in Southeast Asia is available on our website.

Acknowledgements

The team is grateful for the collaboration, continuous feedback, and support of the Ministry of Primary and Secondary Education in Indonesia, particularly the team at the Centre of Education Standards and Policy (PSKP): Dr Irsyad Zamjani, Mr Andry Rihardika, Esy Andriyani, and team. We also thank the EdTech Hub team, Resiana Rawinda and Aprilia Chrisani, who are leading this work, with support from Gita Luz, Sangay Thinley, Jazzlyne Gunawan, and Alice Carter.
