“We just want to see how you think about solving this.”
Over the past two decades, the process of hiring software engineers has undergone a radical transformation. What began as informal discussions has evolved into a standardized, multi-stage marathon. Yet as the number of interview rounds grows and AI takes over an ever larger share of the hiring process, a critical question remains: are we measuring engineering proficiency, or have we merely perfected the art of measuring interview endurance?
Over ten years at four distinct organizations, progressing from intern to principal engineer and conducting more than 85 technical interviews, I watched a pattern of divergence emerge: the skills required to secure the role have become almost entirely decoupled from the skills required to excel in it.
Competitive programming is a mind sport where participants write code to solve specific algorithmic puzzles within strict constraints. The competitors must design and implement algorithms that not only produce the correct output, but do so under tight time and memory limits.
In the early stages of my career, I followed the community-endorsed policies of large corporations. These guides prioritized syntax knowledge, programming paradigms, and algorithmic mastery. Having engaged in competitive programming for five years prior to my professional career, I found this format played to my strengths. Under extreme time pressure, recall from memory was the primary driver of success.
The underlying assumption of the policies I followed was that high performance in competitive programming served as a proxy for General Mental Ability (GMA). In “The Validity and Utility of Selection Methods in Personnel Psychology,” Schmidt and Hunter showed that GMA strongly predicts job performance. However, this hypothesis falls victim to Goodhart’s Law:
"When a measure becomes a target, it ceases to be a good measure."
Because the algorithmic puzzle became the universal target, candidates began optimizing for the test rather than the competency. My hypothesis was that we as interviewers were no longer measuring raw problem-solving ability. Instead, we were measuring the volume of already-implemented patterns a candidate could successfully cache in short-term memory.
I wrote this article to give you my perspective on this hypothesis. It covers my progression as an interviewer, as well as what I observed about the rules, the systems and the people I’ve worked with.
In my first 20 interview sessions, the technical evaluation was compressed into a 30-minute window. This time-box created a high-pressure environment that bore no resemblance to my day-to-day engineering activities. In professional practice, complex challenges are rarely solved in a single, uninterrupted sprint without documentation or peer review.
By requiring an "optimal" solution within this frame, the format shifted the assessment away from maintainability, correctness, and architecture toward a test of sheer speed. A study by Behroozi et al. (2020), "Does Stress Impact Technical Interview Performance?", found that traditional technical interviews induce a "social-evaluative threat" that raises cognitive load and impairs performance. The performance captured is more reflective of a candidate’s ability to handle public performance anxiety than their ability to solve business problems.
In "Why Nations Fail", economists Daron Acemoglu and James A. Robinson propose a framework for understanding why some systems thrive while others stagnate. They distinguish between "inclusive" institutions, which create a level playing field and encourage the participation of the many to generate prosperity - and "extractive" institutions, which are designed to concentrate power and resources in the hands of a select few. While inclusive systems incentivize innovation and share rewards broadly, extractive ones rely on gatekeeping, opaque rules, and the exploitation of the majority to maintain the status quo for the elite.
While some companies market their interview process as a simulation of a working relationship, the evaluation criteria often penalize the behaviors that define professional success. This has led to interviews becoming extractive rather than inclusive.
For instance, during hiring debriefs I attended, hinting was often viewed with skepticism. A candidate’s need for a nudge was interpreted as a lack of autonomy. I caught myself suppressing my engineering and social instincts. Instead of acting as a collaborator, I became a supervisor, withholding the partnership I would instinctively offer a teammate while still demanding results.
This led to what I call "half-hint" loops. To prevent a candidate from staying stuck for the full 30 minutes, which would yield zero data, I would provide the absolute minimum intervention. A single pass through this loop typically consumed 5 to 10 minutes, which in a 30-minute session was catastrophic. By discouraging collaboration, I inadvertently filtered for individualistic solvers and overlooked the analytical communicators who would bring the most business value through disambiguation and stakeholder alignment. I was extractive rather than inclusive.
Extractive interview tactics fail because they stifle the expressive power of the engineer. By stripping away the candidate's right to reach the solution at their own pace and treating the session as a data-mining exercise rather than an inclusive collaboration, the organizations I’ve been part of built fragile hiring pipelines that prioritized administrative throughput over long-term engineering excellence.
In the organizations I was an interviewer for, the competitive programming acceptance criteria were generally grouped into two pillars:
Problem solving via algorithms: The candidate's ability to identify and implement a specific computational procedure (e.g. sorting, searching, or graph traversal) to solve a defined problem.
Data Structures: The candidate's proficiency in selecting and manipulating organizational formats (such as HashMaps, Linked Lists, or Trees) to manage data efficiently.
Despite these being the official metrics, the reality of the 30-minute format transformed these pillars into a test of low-level memorized fluency, which I came to call the "core competencies". Instead of evaluating how an engineer approached problems, the criteria measured how many industry-standard solutions they had memorized. If a candidate couldn't immediately recall the specific syntax for a heap implementation or a library-specific method such as removing an item from a doubly linked list, their algorithms and problem-solving scores would plummet.
Even when the candidates I interviewed designed a correct algorithm for the problem, they had no chance of finishing on time unless they had memorized the industry-standard building blocks that would speed up their implementation. In other words, I evaluated candidates on their ability to assemble memorized code blocks, not on their ability to solve new problems. Imagine memorizing the sizes and shapes of dozens of Lego pieces before the interview, then scrambling to pick the ones that fit the figurine the interviewer hands you, instead of being allowed to build the fitting piece yourself.
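To make this concrete, here is a minimal sketch in Python (my own illustration, not a question I actually asked) of the kind of block candidates were expected to have cached: unlinking a node from a doubly linked list.

```python
# A minimal sketch of the kind of snippet candidates memorize: O(1) removal
# of a known node from a doubly linked list by rewiring its neighbours.

class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def remove(node):
    """Unlink `node` from its list; the node keeps its value."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None
```

The logic is trivial once seen, yet under a ticking clock the candidates who had rehearsed exactly this block pasted it from memory, while those reasoning it out from first principles burned minutes they did not have.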
By demanding that a candidate reproduce these standard algorithms from memory, I was testing their ability to perform a high-speed reenactment of computer science history. The table below highlights the staggering gap between the time these concepts took to mature and the 30-minute window I allotted for their optimal implementation in an interview:
Binary Search - ~15 years to mature into a correct form (1962)
Quicksort - ~2 years to mature
Dijkstra’s Shortest Path (1956) - ~3 years to mature
Heap Sort (1964) - ~1 year to mature
By asking candidates to reproduce memorized recipes, I was testing recall from memory rather than problem-solving.
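Binary search shows how unforgiving that recall test is: the boundary handling that took the field years to settle is exactly where a from-memory version breaks under pressure. A minimal sketch, again my own illustration rather than a question I used verbatim:

```python
# Iterative binary search. The loop condition and the mid-point update are the
# classic places where from-memory versions go wrong: off-by-one bounds,
# non-terminating loops, and (in fixed-width integer languages) overflow
# when computing the midpoint as (low + high) // 2.

def binary_search(items, target):
    """Return the index of `target` in sorted `items`, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = low + (high - low) // 2  # overflow-safe form of the midpoint
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```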
Another non-negotiable requirement of the internal policies was that candidates master a programming language’s syntax without the aid of auto-completion or documentation. I observed talented engineers stumble, not because they lacked logic, but because they couldn't recall a specific library method under pressure. In my day-to-day work, I was never completely denied access to language specifications, StackOverflow-like platforms, approved AI assistants or peer collaboration. By stripping away these tools during the interview, I wasn’t measuring an engineer's ability to build systems; I was measuring their ability to function as a human compiler, rather than their ability to bring value to the business on top of what compilers already do.
Conclusions published by Fedorenko et al. (2024) in "Language is primarily a tool for communication rather than thought" provide a compelling scientific lens for this. The study argues that in modern humans, language is a tool for communication, contrary to the prominent view that we use language for thinking. The evidence shows that language does not appear to be a prerequisite for complex thought, including symbolic or logical reasoning. Instead, language is a powerful tool for the transmission of knowledge that only reflects, rather than gives rise to, the signature sophistication of human cognition.
Together, the copy-cat and human-compiler mechanisms created systematic construct underrepresentation: the interviews failed to capture the full scope of critical job-role responsibilities such as disambiguating requirements, writing test suites, reviewing code, performing operational tasks and working collaboratively.
Combined with the extractive interview format, they stripped away the tools and cognitive processes that would be most likely to prove the candidate’s fit to the job role, nullifying a large part of their career deliverables and active knowledge.
I realized at that point that I wasn’t getting from the candidates a complete picture of what they were capable of, so I decided to switch my approach to be predominantly inclusive rather than extractive.
Moving from the first 20 interviews into the next 65, I decided to test a new hypothesis: if language is primarily a tool for communicating logic rather than the source of the logic itself, then an interview that prioritizes analysis and disambiguation over syntax and boilerplate should predict a candidate’s performance far better.
I replaced abstract algorithmic puzzles with scenarios that replicated my day-to-day operations. I presented candidates with ambiguous business requirements, much like the tasks I handled myself at work, while calibrating the level of ambiguity to the seniority of the role. A junior candidate would get clear guardrails; a senior candidate was met with significant situational ambiguity, requiring them to lead a disambiguation phase before writing any code.
Since I still had to obey organizational requirements for conducting technical interviews, I continued to verify programming ability, but I shifted the focus toward extensible code structure and layout, creating novel data structures on top of language-level ones, using a language's control mechanisms in a language-agnostic way, and probing which programming paradigm the candidate would choose for the situation at hand. I continued to probe for fundamental algorithmic knowledge, such as the options for sorting and search efficiency, but I never again asked candidates to write those algorithms from memory. Instead, I asked them to use them as building blocks in solving novel problems.
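As a rough illustration of what "creating novel data structures on top of language-level ones" looked like in practice, here is the kind of small composition I would expect a candidate to reason their way to rather than recall (the class name and scenario are my own, not a verbatim prompt):

```python
# Illustrative only: a bounded "most recently seen" tracker composed from a
# language-level OrderedDict, instead of a memorized textbook structure.

from collections import OrderedDict

class RecentlySeen:
    """Remembers the last `capacity` distinct keys, evicting the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def touch(self, key):
        # Re-inserting the key moves it to the "most recent" end.
        self._items.pop(key, None)
        self._items[key] = True
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently seen key

    def __contains__(self, key):
        return key in self._items
```

What mattered in the discussion was not the final class but the trade-offs the candidate surfaced along the way: the eviction policy, what "seen" means for duplicates, and how the structure should behave as requirements change.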
Finally, I stopped expecting candidates to finish the problem or reach an optimal solution within the 30-minute window. Instead, I focused on their analytical skills and rigor in getting to a solution, and on their ability to argue for its correctness and efficiency. I watched how candidates handled the problem, how they transitioned through implementation phases and how they managed the trade-offs between speed and maintainability.
The results were immediate and measurable:
Deeper engagement both ways: Because the cognitive load of syntax memorization was removed, candidates could dedicate their full attention to analyzing the problem and actively working with me to figure out the requirements. What they didn’t know was that I hadn’t covered the exhaustive set of requirements and edge cases myself, so I was on the same ride, collaborating with them towards one of the possible solutions.
Enhanced data quality: I received more data about the candidates’ engineering competence - how they handled edge cases and communicated trade-offs - rather than about their ability to act as a human compiler or recall de-facto standard solutions.
Predictive accuracy: My feedback became more granular and evidence-based. By observing the thought before the expression, I could differentiate between an engineer who had memorized a solution and one who could navigate the requirements of the job role when faced with new situations.
This transition confirmed what neuroscience suggests: a software engineer’s value doesn't lie in their ability to know the intricacies of a specific language, but in the sophisticated, non-linguistic reasoning they use to solve problems, including recognizing problems about the (programming) language itself. The focus shifted from the finished product to the process behind delivering the product. I was now looking at the thought process itself, not its shadow.
A successful software engineer needs the ability to reason about and solve problems while going through endless cycles of requirements → disambiguation → collaboration → implementation → communication → operation → deprecation. In the work environments I was part of, this took a lot of time: from months to years per product. There were relaxed and critical phases, good and bad decisions, periods of reflection and periods of unknowns.
Looking back at my whole career, I realize that all of these define its very fabric. I am not who I think or say I am, but what I did in all of those scenarios, and my influence is embedded in the results. I had my income, happiness, social and professional status and career growth on the line. Since I am not special in any way, this must have been the case for a large share of my software engineering peers. It should follow that this experience - the fabric of a software engineer’s career - would be a strong predictor of future performance in the same role and would be accounted for in the hiring process. But the reality of the candidate selection process I saw was the exact opposite: none of this mattered in the hiring decision.
Despite candidates providing detailed resumes and artifacts of their careers, these were ignored in favor of the immediate interview performance. In fact, many of my peers viewed browsing a candidate's CV or linked deliverables as a surprising or optional step. In psychology, this resembles the availability heuristic and a form of selection bias: evaluators prefer a fresh, controlled experiment (the interview) over broad data (the career). When faced with a complex history of deliverables, the human brain prefers the snapshot - the 30-minute interview impression - because it seems more "real" and is easier to process and compare against other candidates or the company policies. The result is the virtual nullification of the candidate's career. The vivid performance in the room overwrites the candidate's abstract and complicated history of ups and downs.
In any other engineering discipline, a system without a feedback loop is called "open-loop" and cannot correct its own errors. Yet the software hiring processes I’ve been part of operated almost entirely in open-loop mode. I observed a startling lack of corrective mechanisms for those making the hiring decisions. When a candidate I interviewed was turned down, by me or by the panel I was part of, that was the end of the story. There was no retrospective to see whether that same individual was later hired by a competitor and became a top performer, which would have signaled that our criteria were too narrow or flawed. Even more critical was the mishire scenario. I witnessed cases where candidates were hired, joined a team, and ultimately proved inadequate for the role, destabilizing team cohesion. Yet the data from these failures never traveled back to the original interview panel. The individuals who gave the thumbs-up were never informed of the outcome, depriving them of the opportunity to calibrate their judgment against reality.
Instead of optimizing for predictive accuracy related to the candidate’s performance, the organizations I’ve been an interviewer for prioritized throughput and administrative speed. I saw the pressure to be faster manifest in two distinct ways:
In some organizations, the hiring manager was under pressure to fill positions quickly because their own performance evaluation was tied to time-to-hire metrics. It is easy to see why: there were tasks to deliver, and the later the hire, the later the project would finish. This created a bias toward rushing a candidate through the stages to avoid the time-consuming work of taking other potential candidates through the full loop. The goal was no longer to find the engineer who fit the job role description, but to stop the search as soon as someone the hiring manager deemed sufficient was found.
In other organizations, being an interviewer was a mandatory part of the job evaluation for software engineers. Here, speed meant writing feedback as quickly as possible just to tick the box that the interview had been performed. When I asked other software engineers how long it took them to write feedback compared with the time it took me to write similar feedback, some would proudly say it took them at least 30 minutes less, and that with experience I would gradually become like them, so interviewing wouldn’t be such a big "distraction" from my day-to-day activities. As expected, when the primary incentive was to finish the paperwork rather than to provide a high-fidelity, scientific analysis of a candidate's reasoning, the quality of the data collected plummeted. The focus on throughput created a recursive (pun intended) problem in which the path to becoming an experienced interviewer or a mentor was purely transactional: once a person reached a minimum threshold of interviews conducted, they were deemed experienced enough to guide and onboard others as interviewers.
When the feedback loop is open and the primary incentive is speed, the integrity of the interview panel is often compromised and the hiring decisions are no longer based on objective criteria.
I observed a profound imbalance of power, where the partnership between the interviewer and the candidate was revealed to be a mere formality. While organizations often broadcast a culture of equality and collaboration, the reality was a lack of rights for the candidate. I saw candidates enter high-stakes environments where their professional status, income, and future growth were on the line, yet they were granted zero transparency into the black box of the decision-making process. The interviewers held all the leverage: they set the format, the acceptance criteria and the decision criteria. They chose the abstract puzzles that nullified the candidate’s experience, and they faced no consequences for choices that negatively affected the company.
This imbalance extended into the digital trail the interviews left behind. Under frameworks like the EU General Data Protection Regulation (GDPR), candidates theoretically had rights to the data collected about them, including the right to access and the right to rectification. However, in practice, there was a grey area regarding derived data: the subjective scores, interviewer notes and internal justifications. While the candidates would still be able to request a copy of their raw application, the logic used to reject them remained shielded by corporate claims of intellectual property or trade secrets.
This lack of data transparency meant that the candidate was the only one in the room acting with full professional accountability. While the interviewer might have been rushing to "tick a box" or a manager might have been overriding technical signals to meet a headcount quota, the candidate was the only one whose life was materially altered by the outcome. There was no right to appeal, no access to the detailed evidence used for or against the candidate, and no mechanism to challenge a verdict when the interview feedback itself rested on mistaken criteria.
There is a biting irony in the software engineering interviews I’ve been part of: it has likely taken you longer to read and reflect on this analysis of the system than the system would give you to prove your technical worth as an engineer.
In the time it took to digest these scientific arguments and historical contexts, a candidate in a standard interview would already be expected to have parsed a problem, disambiguated the requirements, selected an optimal data structure, and be halfway through a bug-free implementation of a complex algorithm. By analogy, it’s as if we asked pilots to build an engine from scratch in 30 minutes, then failed them for stopping to ask why they aren't just using the perfectly good plane parked at the gate.
Too many of the software engineering interviews I’ve been part of were high-stakes theatrical performances rather than rigorous engineering assessments. By prioritizing the 30-minute sprint over the marathon of professional experience, the industry has built a filter that rewards preparation over capability and recall from memory over reasoning. I found most interview formats and questions neurologically misaligned with how humans think, professionally disconnected from how engineers work, and systemically blind to their own failures.
When we treat candidates as human compilers and ignore the career fabric they have spent years weaving, we don't just lose out on talent; we degrade the very definition of what it means to be a software engineer.
Until the hiring process moves from an extractive open-loop exercise to a closed-loop system of inclusive collaboration, it will remain a broken mirror, reflecting only the shadow of an engineer's true value while the substance remains unseen.
"While we were impressed with your attention dedicated to reading this article, we have decided to move
forward with other candidates whose experience more closely aligns with our current needs.
We encourage you to keep an eye on our careers page for new opportunities."