Venue: BlackBox
Friday, June 5
 

09:00 EEST

Discussion Panel
Friday June 5, 2026 09:00 - 10:00 EEST
Join us for a panel discussion with industry experts as they share perspectives on the state of AI, Quality Engineering, and Software Testing in 2026.
BlackBox Kultuurikatel

10:00 EEST

Coffee Break
Friday June 5, 2026 10:00 - 10:30 EEST

BlackBox Kultuurikatel

10:30 EEST

Reinventing The Wheel
Friday June 5, 2026 10:30 - 11:10 EEST
Back in 2016 at trivago, we were building a new Selenium-based test framework with Cucumber, but the standard reporting tool wasn't quite fitting our needs. It showed lots of information, but finding the key details about which scenarios failed and why meant digging through charts and stats that weren't really helpful for our workflow. During a company hackathon, I decided to build something more focused on what we actually needed to see.


I used Cucumber's JSON output and some templating to create Cluecumber—a cleaner way to view test results that puts the important stuff up front. It worked well enough that we open sourced it with company backing, and eight years later it's had about 90 releases and is being used by testing teams around the world. It's been rewarding to see something that started as a weekend project actually help other people solve similar problems.


This talk covers the technical choices behind Cluecumber, but focuses more on what I learned from maintaining an open source project: handling feature requests and common questions, keeping code clean while adding new functionality, and the benefits of company-backed open source for everyone involved. I'll share why sometimes building your own solution makes sense, what works well for creating tools people want to use, and some insights from eight years of project maintenance.


Key takeaways:
  1. Understand why clear and concise reporting of test results benefits all parties in the software development lifecycle
  2. Learn when and where our test reports help in further exploratory testing and bug tracking
  3. See why it can be better to reinvent the wheel instead of using an existing one
Speakers
Benjamin Bischoff

Test Automation Engineer, trivago N.V.
After 15 years of being a software developer and trainer, Benjamin transitioned to test automation in 2016. Currently, he works as a Test Automation Engineer at trivago N.V. in Düsseldorf, Germany. There, he focuses on backend and frontend test technologies and pipelines. Benjamin...
BlackBox Kultuurikatel

11:10 EEST

Testing Agentic AI Applications: Beyond Traditional QA
Friday June 5, 2026 11:10 - 11:50 EEST
Traditional software testing assumes deterministic behaviour: predictable inputs produce expected outputs. Agentic AI systems shatter this assumption. These autonomous agents make independent decisions, learn from interactions, and exhibit emergent behaviours that render traditional unit and integration testing insufficient.

This talk examines critical testing challenges through three real-world case studies:
  • Voice AI Agent: Deployed across 20+ corporate environments, this system processes natural speech, maintains conversational context, and autonomously decides what additional information to provide. Traditional testing covered individual components but missed integration issues where the agent would correctly understand "Q3 sales figures" but autonomously add irrelevant market trend analysis.
  • Phone Caller Agent: Handling 5,000+ patient interactions for healthcare appointment scheduling and reminders. Standard integration tests passed, but the agent failed in production when encountering background noise, elderly patients requiring slower conversations, or unexpected human responses that weren't in test scenarios.
  • Chat Agent: Processing 100+ daily customer service conversations with multi-session context retention. While individual NLP components performed well, the integrated agent exhibited unexpected behaviours during complex, multi-issue conversations that spanned several sessions.

These case studies reveal five critical testing gaps:
  1. Non-deterministic behavior validation – the same inputs can produce different valid outputs
  2. Contextual decision testing – validating autonomous choices about escalation, information depth, and communication style
  3. Multi-modal integration complexity – components work individually but fail in integrated agent workflows
  4. Continuous learning validation – ensuring agent improvements don't introduce biases or degrade existing capabilities
  5. Real-world variability simulation – testing across acoustic environments, human communication patterns, and infrastructure variations

The presentation introduces a practical testing framework specifically designed for agentic systems: Behavioural Goal Testing (testing achievement rather than outputs), Probabilistic Validation (acceptable outcome ranges vs. exact matches), Adversarial Scenario Generation (systematic edge case creation), and Contextual Journey Simulation (multi-session user interactions).


Key takeaways:
  1. How to test non-deterministic AI systems with confidence. Participants will learn how to move beyond exact assertions and design test oracles based on intent, semantics, and properties, enabling reliable validation of probabilistic LLM and agent outputs.
  2. Practical frameworks for validating LLMs and multi-agent architectures. Attendees will gain hands-on experience testing AI systems across layers, including orchestration, inference, and inter-agent communication, using structured frameworks and real-world scenarios.
  3. Actionable tools to operationalize AI quality in production. The workshop equips participants with Python-based evaluators, red teaming techniques, and automated quality metrics that can be integrated into CI/CD pipelines and governance strategies immediately.

Speakers
Srinivasan Sekar

Director of Engineering, LambdaTest
Srini.codes
Sai Krishna

Director of Engineering, TestMu AI
I am a Director of Engineering at LambdaTest with a decade of experience in testing mobile applications and building automation frameworks. As an active contributor to Appium and a member of the Appium organization, I am deeply involved in the open-source community. I am passionate...
BlackBox Kultuurikatel

11:50 EEST

Running A Thousand End-To-End Cypress Tests Every Day
Friday June 5, 2026 11:50 - 12:30 EEST
In this talk, I show how we run a lot of full end-to-end Cypress web application tests every day. In addition to running the full data set, we do separate feature test runs based on test tags. We also allow everyone from all teams to trigger the tests right from the GitHub Actions UI. This lets every group quickly test their feature before merging into the main branch. For pull requests, we employ source code analysis based on data test IDs to run the affected tests first for quicker feedback. The software automation team uses the flaky test information to chase the sources of the underlying errors and minimize noise, so that every passing test run gives us confidence in the released code and every failing test run helps us quickly diagnose the real underlying issue. The presentation covers test writing, test organization, selecting tests to run based on the source code changes, and running tests in different resolutions. I also look into making the tests faster by employing data creation and caching, as well as using API calls to bypass the user interface in some places. Finally, making the tests robust and flake-free and triaging the failed runs is an ongoing activity for the automation team.


Key takeaways:
  1. How to run 1000 end-to-end tests quickly
  2. Which tests to run on a pull request
  3. How AI is helping us pick tests to run
Speakers
Gleb Bahmutov

Sr Director of Engineering, Mercari US
Gleb Bahmutov is a JavaScript ninja, image processing expert, and software quality fanatic. During the day Gleb is making the engineers more productive at Mercari US in his position as the Senior Director of Engineering. At night he is fighting software bugs and blogs about it at...
BlackBox Kultuurikatel

12:30 EEST

Lunch
Friday June 5, 2026 12:30 - 13:30 EEST

BlackBox Kultuurikatel

13:30 EEST

When Life Gives You Lemons… Are You Counting Them Or Making Lemonade?
Friday June 5, 2026 13:30 - 14:10 EEST
Teams often rely on test cases executed, bugs reported, and pass rates to measure success. These numbers might look impressive, but do they truly reflect software quality? Vanity metrics can mislead teams, encourage the wrong behaviours, and create a false sense of progress.

This talk introduces a 7-step framework to move beyond superficial KPIs and focus on metrics that drive real value. Inspired by analytical approaches in competitive sports, this model helps teams make better decisions, align testing efforts with business goals, and ensure that data supports meaningful improvements.


Key takeaways:
  1. The risks of vanity metrics and how they can mislead decision-making.
  2. How to design KPIs that focus on value, not just activity.
  3. A practical framework to ensure testing metrics drive meaningful change.

Speakers
Chris Armstrong

DevRel Test Advocate, SmartBear
Chris (he/him) is a strategic and context-informed quality engineering leader with nearly two decades of experience helping organisations improve their quality practices. Specialising in strategic test leadership, Chris excels at cross-functional leadership, working across QA, Development...
BlackBox Kultuurikatel

14:10 EEST

Lessons Learned From AI-Powered Visual Reasoning Feedback
Friday June 5, 2026 14:10 - 14:50 EEST
Visual testing is supposed to protect QA teams from the familiar “it looks wrong” bug, yet traditional pixel-diff approaches only show that something changed, not whether that change actually matters. As modern interfaces grow more dynamic and design systems become more complex, teams need smarter ways to detect meaningful visual regressions.

This talk presents a practical approach to automated visual bug detection using multimodal LLMs. Drawing on a real-world implementation, it shows how AI models from providers such as OpenAI, Anthropic, and Google can be orchestrated to analyze screenshots and identify issues that pixel-based tools often cannot interpret on their own. These include layout breaks, missing elements, accessibility concerns, color contrast problems, and platform-specific guideline violations.

The session explores how AI-driven visual analysis can move beyond pixel-perfect comparison toward semantic understanding, helping teams distinguish intentional UI changes from genuine defects. It also addresses one of the biggest challenges in visual testing at scale: false positives, demonstrating how agent-based review systems can reduce noise while still surfacing critical issues.

Attendees will leave with practical ideas for using multimodal AI to strengthen visual testing workflows and make automated UI validation more accurate, scalable, and useful.

Key Takeaways:
  1. How to evolve from “pixel diffs” to impact-based automated visual feedback
  2. Patterns that turn image feedback into structured results (what changed, where, severity, why it matters)
  3. Tips for integrating automated LLM-powered visual feedback into existing automated UI test frameworks
Speakers
Risko Ruus

Principal QA Engineer, Rush Street Interactive
I am a software quality enthusiast with over 20 years of experience in various companies and software projects. I enjoy both developing software and testing it (including test automation). Example applications I have worked on include Nokia smartphones, Skype, and mobile betting...
BlackBox Kultuurikatel

14:50 EEST

From Chaos To Confidence: Building Rock-Solid Stability In Mobile E2E Testing
Friday June 5, 2026 14:50 - 15:30 EEST
99.59%. That’s not uptime, not code coverage - it’s our yearly stability rate for mobile end-to-end test runs. It sounds almost impossible, especially if you’ve ever managed a growing Slack thread titled #iHateMobile.

For three years, we fought the usual suspects of mobile automation: Appium timeouts, vanishing selectors, and flaky infrastructure. This talk condenses that journey into a survival guide for anyone who has ever wanted to throw their test phone across the room.

In this fast-paced session, we will bypass the basics and dive straight into the specific architecture decisions that turned chaos into trust. We will look at how we moved beyond standard WebdriverIO implementations to build a system that is fast, predictable, and relied upon by the entire engineering organization.

We will cover the "Big Three" that solved our flakiness:
  • The Framework: How small, low-level fixes in element interaction and strict state management snowballed into massive stability gains.
  • The Shortcuts: Why we killed UI-based setup in favor of API data seeding and custom app states to drastically reduce execution time.
  • The Orchestration: Introducing our homemade "device-thread balancer" and CI triggers that made testing "one-click" easy.

Finally, we’ll touch on the human element: how stable builds transformed our culture, turning skeptics into believers and making "just run the tests" the team's favorite phrase.


Key takeaways:
  1. Root Cause Analysis: Techniques for diagnosing the real source of mobile flakiness (it's not always the device).
  2. Speed vs. Stability: How to use API seeding and backend shortcuts to stabilize frontend tests.
  3. DevOps Integration: Blueprints for a "device-thread balancer" that optimizes cost and speed.
  4. Culture: How to build trust with developers so they treat E2E tests as an asset, not a blocker.

Speakers
Dawid Pacia

QA Consultant, PathcingIT
QA and Test Automation Manager as well as mentor and trainer. Tech freak following all the newest technologies (and implementing them on his own). Fan of the Agile approach to project management and products. Supporting companies in transformations toward better quality. Actively...
BlackBox Kultuurikatel

15:30 EEST

Coffee Break
Friday June 5, 2026 15:30 - 16:00 EEST

BlackBox Kultuurikatel

16:00 EEST

KEYNOTE: Testing =/= Fun?
Friday June 5, 2026 16:00 - 17:00 EEST
Testing is serious business. Fun is not. Or is it?

We use games to teach testing, gamification to motivate work, playful exercises to build skills, and “fun” as a selling point for products. But fun is slippery. It is personal, contextual, fragile, and surprisingly easy to ruin by trying to measure it.

In this keynote, Kristjan will explore what fun means in testing. Can testing be fun? Should learning testing be fun? Does gamification actually help, or does it simply decorate boring work with badges and points? And when a product is meant to be enjoyable, how can testers investigate that without reducing the experience to a lifeless checklist?

Through painful personal examples, testing games, teaching experiences, and a dangerous amount of theory, Kristjan will break fun into smaller pieces: challenge, surprise, flow, social interaction, and quality-of-life features we often confuse with fun.

You will leave with an urge to analyze your own enjoyment - and may never look at your relaxing hobbies the same way again.

Key Takeaways:
  • How to think about fun as something observable, discussable, and testable.
  • Why gamification, testing games, and playful learning can help - but can also fail badly.
  • Why fun should be treated as a serious quality attribute, not a vague bonus.
Speakers
Kristjan Uba

Head of Developer Experience, Betsson Group
Testing has always been fun for Kristjan - the thrill of finding important issues, nailing the sequence for that intermittent bug and, in a sense, besting the developers. It soon emerged that he needs to learn a lot to keep up but the only training courses that struck a chord with...
BlackBox Kultuurikatel
 