Landing in Tokyo in November, my first data point wasn’t from a logframe – it was from my feet. They reminded me that you can in fact walk an entire city in one day chasing parallel sessions, coffee, and people you’ve only ever seen on Zoom.

I was at the 5th Asia Pacific Evaluation Association (APEA) Conference and EvalVisionAsia 2025 in Tokyo – four days of talking about institutionalisation of evaluation, capacity building, and why our MEL systems still behave like that colleague who shows up late but insists they were “in the meeting mentally.” If you read my earlier reflections from the German Evaluation Society Conference (#DeGEval25), you know I enjoy conferences that force us to ask uncomfortable questions about our profession. This time, Tokyo delivered those questions with a side of matcha.
In this piece, I share my main takeaways – written for everyone interested in evaluation, whether you’re a policymaker, evaluator, student, or simply wondering why your beautifully designed theory of change keeps misbehaving.

1. From “Do More Evaluation” to “Actually Use It”

The big theme of #APEAConf2025 was not doing more evaluations, it was using them. The CEval Evaluation Globe keynote unpacked how 50 countries are institutionalising evaluation, where it’s stuck, and who’s actually paying attention to the findings. Then APEA did something clever: instead of talking in the abstract, they threw parliamentarians, ministries, academia, VOPEs and civil society into separate working groups and said, “Okay, you, what exactly will you do to strengthen the use of evaluation?”

What struck me was this: Parliamentarians admitted they often receive evaluation reports after decisions are made – a bit like getting a weather forecast once the flood has receded. Ministries and agencies talked about fragmented systems, staff rotations, and the eternal excuse: “We don’t have time to read 120-page reports.” Academia proudly listed evaluation courses, while practitioners quietly wondered why many graduates still arrive not knowing the difference between monitoring and evaluation. VOPEs and civil society described doing evaluations that never quite make it past the “project partners” folder on someone’s laptop.

The message was simple and uncomfortable: if evaluation is not changing behaviour in these institutions, it’s just an expensive documentation exercise. The more hopeful part? Across sessions you could feel a growing consensus that use doesn’t happen by accident. It needs:

Embedded structures, like evidence labs inside ministries that sit with policy teams, not in a lonely “M&E unit” at the end of the corridor.
Clear rules of the game, like clauses that mandate evaluations in laws, budget processes and program cycles.
Public-facing evaluation, more open reports, more media engagement, more citizens who know they’re allowed to ask, “What did we learn?”

In other words: institutionalisation is not a fancy word. It’s just what happens when evaluation moves from “nice-to-have” to “you can’t approve this policy without it.”

A cross-section of participants during the Opening Remarks by Yuriko Minamoto, President of the Japan Evaluation Society

2. Capacity Building for Policymakers: Less PowerPoint, More Politics

If you’ve ever watched a policymaker scroll through their phone during an “M&E capacity building” workshop, you’ll appreciate why this topic got a lot of airtime. From the Asia Pacific Winter School for Young and Emerging Evaluators, to sessions on national evaluation systems and diagnostic tools, there was a clear shift: capacity building is no longer just technical, it’s political.
Across the panels I attended, a few themes kept recurring:

Teach what power actually looks like in an evaluation system. It’s not just indicators and logframes; it’s who decides the questions, the timing, the level of transparency, and whether an uncomfortable recommendation is implemented or quietly buried.

Help policymakers ask better questions, not become part-time statisticians. Most don’t need to run regressions; they need to distinguish between “This program works” and “This program works, but only for urban youth with internet access.”

Co-create, don’t “train-and-run.” The best examples came from long-term partnerships where ministries and evaluation teams jointly developed national evaluation agendas or indices, instead of consultants parachuting in with pre-packaged frameworks.

One panelist framed it beautifully: “If our capacity building doesn’t change how policies are debated in cabinet, then we’ve simply upgraded a few CVs – not the system.” For those of us who design and deliver training, that’s a useful (and slightly painful) KPI.

Evaluators live on coffee; policymakers eat, just like you!

3. Evaluative Thinking as the Bedrock of Transformational Change

If you followed #EvalVisionAsia on X and LinkedIn, you’ll have seen a recurring line: evaluation is more than accountability; it supports learning, innovation and transformation. In Tokyo, this came alive in three ways:

Indigenous and culturally responsive evaluation

Sessions led by EvalIndigenous and partners reminded everyone that evaluative thinking is not something invented in donor logframes; many communities have long-standing traditions of reflection, reciprocity and collective learning. The challenge is to stop treating these as “interesting case studies” and start redesigning our mainstream methods around them.

Evaluation in uncertain times (including AI hallucinations)

The panel on M&E in uncertain times asked what happens to accountability when states are apathetic, fake news travels faster than evaluation reports, and AI generates extremely confident nonsense.
The answer wasn’t “ban AI”; it was to strengthen critical evaluative thinking among citizens, journalists, and students; make evaluation findings more accessible than conspiracy theories; and equip evaluators to interrogate algorithms just as we interrogate data sources.

Global Evaluation Agenda 2.0 and systems thinking

Discussions on the Global Evaluation Agenda 2.0 (GEA 2.0) pushed us to see evaluation as an ecosystem: enabling environment, institutions, capacities, and catalytic actions reinforcing each other.
It’s hard to insist on “transformational change” if our own profession is still organised as a series of disconnected projects.

The big takeaway? Evaluative thinking is no longer a niche competency for M&E officers. It’s a survival skill for societies navigating climate shocks, political instability, and technology that can automate both insights and misinformation.

4. Standardising the M&E Curriculum: Still a Needle in a Haystack

Now to the part that made me both hopeful and slightly exhausted: M&E education. The working group on academia and the later panel on the role of M&E education in countering state apathy, fake news, and AI hallucinations highlighted something many of us have experienced first-hand:

We are swimming in M&E courses, but wading in confusion about what a “competent evaluator” should actually know. Across the region, universities and training institutes are offering everything from one-week certificates to full master’s programmes in evaluation. Some are deeply grounded in theory and methods; others are, let’s say, “strongly inspired” by donor logframes and online templates.

Oludotun Babayemi of Cloneshouse speaking during one of the Education in M&E working group panel sessions

The conference conversations circled around a few recurring tensions:

Breadth vs depth – should core curricula prioritise methods, or equally emphasise ethics, politics, systems thinking, and communication?

Global standards vs local relevance – how do we balance competencies promoted by global networks with the realities of national evaluation systems that are underfunded, politicised, or both?

Formal degrees vs non-formal pathways – young and emerging evaluators are increasingly learning through VOPEs, mentoring, and practice-based programmes, not just universities.

The humorous part is that many of us have now attended conferences on “standardising M&E education” on at least three continents, and still struggle to agree on what should be in a basic “Intro to Evaluation” course. The crucial part is this: without clearer, widely owned competency frameworks, we risk reproducing a two-tier system – a small group with deep grounding and many others left to piece together skills via ad hoc trainings and YouTube.

So, What Do We Do With All This?

Walking out of the closing session – and into a cultural tour generously hosted by Tokyo – I kept coming back to four simple actions for anyone who cares about evaluation, whether in Asia-Pacific, Africa, or elsewhere:

Design for use from day one. Don’t wait until the final report to ask who will use the findings. Put parliamentarians, policymakers, and communities in the room early – and often.

Invest in political and relational capacity, not just technical skills. Our best theories of change are useless if we can’t navigate power, incentives and institutional realities.

Treat evaluative thinking as a public good. Bring evaluation into schools, media, and public debates. If fake news can trend, so can evidence.

Join the messy conversation on M&E curricula. Whether you’re a practitioner, lecturer, or student, your voice is needed to shape what “good enough” looks like for the next generation of evaluators.

If #DeGEval25 nudged us to think harder about AI and evaluation in Europe, #APEAConf2025 and #EvalVisionAsia reminded us that institutionalisation and capacity building are not abstract concepts. They are daily, negotiated practices between actors with very different power, incentives, and worldviews. And that, perhaps, is the biggest takeaway from Tokyo: evaluation will not institutionalise itself. It will take stubborn evaluative thinking, unapologetic advocacy, and a lot of cross-continental learning – preferably with decent coffee and good Wi-Fi.

Until the next conference, you can find us continuing these conversations on the Cloneshouse resource page. Now it’s your turn: what do you think?
