We sat down with Pavel Kostikov to talk about building cross-functional fintech teams that work in constrained environments. Pavel has a background as an Engineering Manager and Team Lead, with extensive experience on platforms shaped by legal boundaries, technical debt, and real-world payment systems.
What unique challenges did you face when growing the fintech division’s market share and how did you overcome them?
Pavel Kostikov: Expanding our FinTech offerings at Yandex.Market demanded careful coordination of speed and stability. We had to be fast: every extra day to get a feature out had a calculable negative effect on our bottom line. Yet, we had to also ensure that our breakneck development pace didn’t impinge on the reliability of the services we offer to our customers.
A factor in our success was the clarity of our product goals. The product team did an excellent job of defining each change so that it was both well-justified and tied directly to improvements in key metrics, such as the share of financial products and overall product impact. For us, execution speed was paramount, but not at the cost of leaving potential pitfalls unaddressed. We tackled every new feature or refactor through well-structured stages: clear formulation, thorough design, development, experimentation, and resolving technical debt.
Cross-functional teams are key. In a large marketplace ecosystem where many products and teams evolve in parallel, the basic problem isn’t just writing the code; it’s aligning schedules, coordinating responsibilities, and preventing conflicts with the work of other teams. Given the scale of our plans, we regularly found ourselves collaborating with external developers. This introduced onboarding hurdles, which over time we turned into an advantage by investing in onboarding and documentation. Today, those processes are as inviting as any in the business.
Naturally, in such a rapidly changing field, there are bound to be setbacks. In one instance, payment processing on the platform stopped for an hour. Incidents like that, though expensive, are vital learning opportunities. Through discussion and retrospectives, we applied what we learned to our development and collaboration practices so we could avoid similar failures in the future.
How do you approach integrating legal and compliance requirements into agile development processes, especially in fintech?
Pavel Kostikov: Legal and regulatory constraints are part of any product development effort, but in FinTech they tend to be more complex, more costly to implement, and more likely to change over the product’s lifetime. To address that, we keep developers and legal experts in active discussion throughout development and the rest of the product lifespan. We build on a base of detailed documentation that translates the legal mandates into the technical requirements we must satisfy to stay aligned with them.
Take, for instance, our extensive work with the concept of “banking secrecy.” Certain data, like credit card balances, was governed by strict handling rules that varied depending on whether it belonged to Yandex.Bank or Yandex.Market. Some systems had legal permission to store this data; others could process but not store it; and still others could do neither. Because these rules were open to interpretation and often updated, we held regular syncs with product managers and legal experts to stay current and avoid “unpleasant surprises” late in the project.
Compliance isn’t a project you can stop once you reach the finish line; in fact, it isn’t a project in the conventional sense at all. You can’t launch something and then forget it while you move on to the next shiny object. We ran regular audits to make sure we stayed compliant and protected from liability, including checks that we weren’t logging or sharing sensitive information. Tight, unambiguous documentation of our requirements ensured that everyone was on the same page and understood what was at stake.
In the end, regulatory limits are a big and persistent factor in FinTech development. To a large extent they shape our project timelines, architectural decisions, and the overall cost of building financial features. But we treat legal requirements as part of our design constraints, along with performance and security. Just as we would not fund or launch an underperforming or insecure product, we only release features with solid compliance into the real world.
Could you elaborate on the Super Split project that resulted in substantial GMV growth and the technical challenges involved?
Pavel Kostikov: The evolution of our original “Standard Split” BNPL solution at Yandex led to the introduction of Super Split. While Standard Split was suited for shorter repayment terms, required no formal credit checks, and boasted an impressively low default rate, it was unregulated. Thus, we were unable to access favorable interbank financing, which made longer-term extensions (think 12-month installments) either too expensive or too risky.
One alternative was a standard loan model where users applied for credit each time they made a purchase. But that meant many applications and many approvals, which created friction and cost us orders. It also required integrating with a number of banks, which meant a lot of painful, time-consuming development.
Super Split addressed these challenges by creating an approval process that mimics a credit card. Users fill out one application and receive a defined credit limit, which they can use repeatedly without undergoing more credit checks. We decided to work with a single bank to keep things simple and reliable. The whole setup fits easily into our checkout flow, so customers see their available credit early on and don’t feel like they’re financing.
The credit card mechanism seamlessly integrated with our existing infrastructure. We only needed to introduce the card issuance process within a separate UI flow. This user journey could be embedded anywhere, but initially, we placed it at checkout before the purchase. Once users completed the flow and received their credit card, the payment proceeded through our established card processing system.
This approach worked well, but we went further. To simplify the process, we didn’t emphasize card issuance. Instead, we presented it as an “upgrade” to their Split. Users could get better terms, like lower interest rates and longer payment periods, by providing additional information and becoming a bank client. All necessary documents were signed, but for most users, it felt seamless.
By keeping the system technically simple, we implemented the project quickly. Since approval was required only once, the product gained traction fast and drove significant additional sales for the marketplace.
How do you balance technical debt management with the rapid delivery needs of a fintech team?
Pavel Kostikov: Technical debt is hard to navigate in the fast-moving space of fintech, where we work under what I like to call “reliability demands on steroids.” We have to move fast, break things, and fix them again. It’s an iterative, grueling process, and so is paying down technical debt. The questions are always the same: when is the right time to do it, and how do we keep shipping features while making sure the underlying structure doesn’t collapse under their weight?
One of the key practices is dealing with bugs in production. We have a clear process for identifying, prioritizing, and fixing these issues. When critical bugs happen, we treat them as our top priority and fix them before going back to other projects, because we don’t tolerate critical bugs in production. For lower-severity bugs, we have a defined budget and prioritize them according to their impact on the business. This ensures that we’re not just reacting to problems but handling them in a manageable, predictable way.
Besides taking care of bugs, we also manage large-scale production incidents, such as when web BNPL payments fail. We’ve built a strong incident management process to deal with these situations quickly. Every component has designated on-call engineers, and our incident response is highly automated: setting up a Zoom room for the incident team, pulling up the graphs we need, and initiating the incident protocol all take very little effort. We address every incident as soon as it arises, and it gets our full attention.
A technical status board that is clear and easy to understand provides health metrics for our systems. It is a dashboard view, primarily for engineers and product managers, of the current health of the system. It should be enough to align all the parties involved and to maintain a balance between rapid development and the quality needed to succeed in the fintech sector. When it isn’t enough, we ask why.
Alongside dealing with bugs and production incidents, we also deal with refactoring and technical restructuring. We treat these initiatives as separate projects with clear objectives; we set measurable metrics to track their success. When we plan, these refactoring projects compete with product feature development for resources and prioritization.
One example is consolidating the BNPL behavior management logic, which was formerly spread out across multiple pages on the marketplace, into a single service. With that logic in one place, we have been able to iterate on and improve the BNPL features much more quickly and easily.
We can’t measure the time savings directly, for instance by running an A/B test. But it’s easy to see that with less duplicated business logic, we can do more in the same amount of time. The technical metrics we improved point the same way: a more efficient development process and faster delivery overall.
What monitoring and analytics systems did you implement to ensure the reliability of financial transactions?
Pavel Kostikov: Alongside the monitoring systems discussed before, we instituted standard technical metrics for each service in our infrastructure. These metrics tell us how the system is performing at a basic level. They are fairly universal and very telling when something goes wrong. For instance, we monitor CPU, RAM, network usage, and disk consumption, which helps us gauge performance at the system level.
We also measure response times at several percentiles, which lets us spot performance bottlenecks in different parts of the system. These standard measurements form the bedrock of our microservices architecture and underpin the real-time performance tracking and issue detection we do with this setup.
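To illustrate why several percentiles are tracked rather than a single average, here is a minimal Python sketch. The function name `latency_percentiles`, the chosen percentiles, and the sample timings are all assumptions made for illustration; they are not taken from Yandex.Market's monitoring stack.

```python
import statistics


def latency_percentiles(samples_ms, points=(50, 95, 99)):
    """Return the requested percentiles of a list of response times (ms)."""
    # With n=100, quantiles() returns 99 cut points; index p-1 is the p-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in points}


# Made-up response times: most requests are fast, two are very slow.
samples = [12, 15, 14, 13, 250, 16, 14, 15, 13, 900]
result = latency_percentiles(samples)
print(result)
```

In this made-up sample the median stays close to a typical request, while the 95th and 99th percentiles are dominated by the two slow requests. That tail is what a plain average would blur.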
Basic technical monitoring combined with transaction-specific anomaly detection lets us maintain the reliability of our financial transactions and address issues swiftly, minimizing their impact on the user experience and revenue. Beyond monitoring systems and technical metrics, a defining characteristic of fintech engineering teams, particularly at Yandex.Market, is heavy reliance on other components, like anti-fraud systems, the payment core, the checkout page, and the infrastructure of Yandex Bank. Because of these dependencies, we need a well-developed and extensive set of high-level product metrics.
We established these metrics in different slices, permitting us to dissect performance and transaction data along several axes. These axes included payment type, issuing bank, platform (iOS, Android, or web), platform version, application (whether the purchase was made through the Yandex.Market app or Yandex.Go), and user flow (payment vs. payment with additional confirmation). This segmentation yielded a clear view of how several components and user behaviors affected the overall transaction process.
We had sensitive thresholds in place to monitor any deviations from the standard transaction flow, which is a complex thing in itself to monitor. That allowed us to keep a close watch on issues that might otherwise go unnoticed. A real-world example is when we observed a 20% drop in transactions for Yandex Bank card payments on weekends, lasting for about 30 minutes. This represented a tiny blip, less than 1% of the total transaction flow, but it was still an anomaly we couldn’t ignore.
We found that the problem stemmed from failures with delivery of SMS messages. Yandex Bank relies on a certain mobile operator to send SMS messages for 3DS authorization, and this operator was having serious delivery problems. As soon as we identified the root cause, we began working on a fix. We had to ensure that the payment flow did not get interrupted while we worked on the fix.
This degree of oversight and the attention to detail were essential in guaranteeing that our system was both technically sound and financially secure. When problems cropped up, they allowed us to respond quickly and to prevent any significant transaction or revenue losses.
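The Yandex Bank card example shows why thresholds are applied per slice rather than to the total flow: a 20% drop in a small slice can be less than 1% of all transactions. The sketch below works through that arithmetic. The slice names, the counts, and the 10% threshold are hypothetical values chosen for illustration, not real data or the team's actual configuration.

```python
def relative_drop(baseline: float, current: float) -> float:
    """Fraction by which `current` fell below `baseline` (0.2 means a 20% drop)."""
    if baseline <= 0:
        return 0.0
    return (baseline - current) / baseline


def slices_in_alarm(baseline: dict, current: dict, threshold: float) -> dict:
    """Return every slice whose drop meets or exceeds the threshold."""
    drops = {
        name: relative_drop(baseline[name], current.get(name, 0))
        for name in baseline
    }
    return {name: drop for name, drop in drops.items() if drop >= threshold}


# Hypothetical successful-payment counts per slice for one time window.
baseline = {"yandex_bank_card": 400, "other_cards": 7000, "split": 2600}
current = {"yandex_bank_card": 320, "other_cards": 7000, "split": 2600}

# Drop in the total flow versus drops per slice.
total_drop = relative_drop(sum(baseline.values()), sum(current.values()))
alarms = slices_in_alarm(baseline, current, threshold=0.10)

print(f"total drop: {total_drop:.1%}")
print(f"slices in alarm: {alarms}")
```

With these toy numbers the total flow moves by under 1%, which an aggregate alert would likely miss, while the affected slice clearly crosses its threshold.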
How did you approach the integration of new payment methods from both technical and business perspectives?
Pavel Kostikov: Integrating a new payment method is always a massive undertaking from a technical standpoint. Although our current systems can handle a wide variety of payment methods, the specific business requirements for each one can vary markedly, so much so that we have to customize our work for each type. New types of cards, for instance, might necessitate changes to our discount logic, while some methods, like paying in installments, might force us to develop a new relationship with the banks.
For post-payment methods like delayed payments, the software that supports our order pickup points or our courier systems may require changes. These kinds of payment add complexity and variability that we must account for in our plans.
From a business perspective, one of the key metrics we concentrate on is the conversion rate: how many users complete their purchases after they click the “buy” button. Different payment methods perform very differently in this regard. For instance, well-established and stable card payments tend to have higher conversion rates, whereas newer payment methods, such as token-based payments through the national Fast Payments System from the National Payment Card System, often start out with lower conversion rates.
From a product standpoint, the most important lever we possess in controlling payments in the marketplace is the arrangement of payment methods on the checkout page. We’ve conducted numerous experiments in this domain aimed at optimizing not only conversion rates but also revenue (GMV). By placing payment methods with the best conversion rates in the most prominent spots, we are in effect guiding users toward the methods that deliver the optimal combination of user experience and business outcomes.
What approach did you take to A/B testing financial features, and how did you measure success?
Pavel Kostikov: At Yandex.Market, we are a company rooted in data and strong evidence, and A/B testing is an integral part of our decision-making process. Almost every launch goes through this testing to gather solid, data-backed evidence justifying its release. Financial services are no exception. What makes testing financial features different from testing other product features is the high stakes involved. Even small technical changes can have far-reaching consequences. Most of the time, we see statistically significant upticks in key business metrics, particularly GMV, after our projects roll out. But the flipside is that a small mistake in these experiments could lead to substantial financial losses.
We are much less concerned with proxy metrics, which are often employed when the business impact is relatively minor. When it comes to financial features, we care much more about the actual revenue, cost, and user behavior changes that our metrics reflect.
Although we usually deal with relatively small changes, which we can measure readily via A/B tests, a more complex case was understanding the long-term effect of the Split feature on the platform’s economics. We wanted to determine whether we were simply earning commissions from the feature or whether it was also motivating users to purchase more often on the platform.
A distinguishing feature of marketplaces like ours at that time was that users purchased infrequently, averaging about one purchase per user per month. This made traditional A/B testing difficult. If we simply turned off the Split feature for a subset of users, the test would need to run for several months to yield significant results, and it would be expensive, since a large fraction of users wouldn’t make any purchase during the experiment.
To tackle this problem, we used historical records to conduct a more advanced analysis. We compared two user cohorts matched on variables like age, gender, and other demographics that are relevant to our business. These cohorts were matched so closely that they were basically identical. The only distinction was that one cohort had very recently made its first purchase using the Split feature, while the other had not. This let us see the long-term impact of using Split without waiting through months of traditional A/B testing.
The findings were remarkable: users who began using the Split feature completed 30-40% more purchases than those who had never used it. This showed how strongly the Split payment option pushed users toward more frequent purchasing. The evidence was clear: this feature was not only helping us generate more commission, but it was also spurring users to interact with the marketplace in a more meaningful way.
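The cohort comparison described above can be sketched roughly as follows. This is a simplified illustration that assumes exact matching on two attributes and 1:1 pairing within each matching group; the interview does not say which matching technique was actually used, and the field names and user records are invented toy data.

```python
from collections import defaultdict
from statistics import mean


def matched_uplift(users, keys=("age_band", "gender")):
    """Compare purchases of Split adopters with matched non-adopters.

    Users are grouped by the attributes in `keys`; inside each group,
    adopters and non-adopters are paired 1:1 and unpaired users are dropped.
    Returns the relative difference in mean purchases, or None if no pairs exist.
    """
    groups = defaultdict(lambda: {"split": [], "control": []})
    for user in users:
        profile = tuple(user[k] for k in keys)
        side = "split" if user["used_split"] else "control"
        groups[profile][side].append(user["purchases"])

    treated, control = [], []
    for group in groups.values():
        pairs = min(len(group["split"]), len(group["control"]))
        treated += group["split"][:pairs]
        control += group["control"][:pairs]

    if not control or mean(control) == 0:
        return None
    return mean(treated) / mean(control) - 1


# Toy records, invented for illustration only.
users = [
    {"age_band": "25-34", "gender": "f", "used_split": True, "purchases": 6},
    {"age_band": "25-34", "gender": "f", "used_split": False, "purchases": 5},
    {"age_band": "35-44", "gender": "m", "used_split": True, "purchases": 4},
    {"age_band": "35-44", "gender": "m", "used_split": False, "purchases": 3},
    # No matching non-adopter exists, so this user is excluded.
    {"age_band": "45-54", "gender": "m", "used_split": True, "purchases": 9},
]
uplift = matched_uplift(users)
print(uplift)
```

Because the analysis runs on historical records, it needs no holdout group, but it only controls for the attributes that are matched on, so the result is weaker evidence than a randomized experiment.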
How did you handle incident response for financial systems, and what processes did you implement to minimize financial impact?
Pavel Kostikov: The stakes are higher in financial systems: even small problems can have a direct effect on the bottom line and damage user trust. All sorts of components can fail (payment core, risk engine, checkout infrastructure), but our fintech team is ultimately responsible for the end-to-end stability of the financial product. This means our incident management strategies have to be both responsive and proactive.
On the reactive side, we have built a deep monitoring system that tracks every slice of the user journey: platforms, payment methods, and more. As soon as a metric for any part of that journey, including payment, dips by even 10% from its baseline, we sound the alarm. Tracking every step of the payment flow lets us detect and resolve issues quickly, often before they escalate into more serious incidents. Since many issues arise in systems owned by other teams, we also put great emphasis on cross-team communication. We keep hotline chats for quick escalations and strongly encourage our partner teams to subscribe to them, so that everyone is in the loop when critical incidents occur.
We invest in structured communication with neighboring teams. When another team’s system has a recurring error that is causing us to lose money, we make it clear to them just how much it’s affecting our mutual marketplace. We help them figure out on-call rotations and how to achieve stable releases. And when it comes to our own system, we’re very rigorous about testing the payment scenarios. We do manual and semi-automated tests of our payments, and we cover most core functionality with those two levels of testing. Also, we look at the payment function from a lot of different angles.
This combination of finely tuned monitoring, near-instant alerts, cross-team collaboration, and rigorous testing keeps our financial systems reliable and responsive, even when the stakes are high.
How did you approach the architectural challenges of building a universal chatbot platform from scratch at Yandex.SupportAI?
Pavel Kostikov: SupportAI is a chatbot platform, a startup within a large company. It started as an automation project for Yandex.Taxi’s customer support and showed strong results. The solution was then expanded to other services and use cases. I joined the project after the first external sale, when our prototype was deployed in an online store.
At that time, each chatbot was built from scratch with almost no reusable components. Given our goal of rapidly expanding our client base, speed was the key factor. To address this, I focused on making the platform universal.
The first step was defining a common API for all chatbots. Previously, every bot required a separate API agreement with the client, which slowed down development. Introducing a unified API eliminated this step in 95% of cases since the necessary data was already standardized. At the same time, the API included customization points to accommodate specific client needs.
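A unified API with customization points could look roughly like the sketch below. The `BotRequest` schema, its field names, and the `extras` mechanism are hypothetical and only illustrate the idea of standard fields plus an optional client-specific extension; the real SupportAI API is not described in the interview.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class BotRequest:
    """Fields every bot receives, regardless of client (hypothetical schema)."""
    client_id: str
    user_id: str
    message: str
    # Customization point: client-specific data the standard fields do not cover.
    extras: Dict[str, Any] = field(default_factory=dict)


def summarize(request: BotRequest) -> str:
    """Shared code relies on the standard fields and treats extras as optional."""
    locale = request.extras.get("locale", "en")
    return f"{request.client_id}/{request.user_id} [{locale}]: {request.message}"


# Most clients need only the standard fields; a few pass extra data.
plain = BotRequest("shop", "u1", "Where is my order?")
custom = BotRequest("clinic", "u2", "Book a visit", extras={"locale": "ru"})
print(summarize(plain))
print(summarize(custom))
```

The design choice shown here is that shared code never requires anything from `extras`, so a new client can be onboarded without a new API agreement unless it truly needs the extension.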
As our client base grew, the demand for integrations increased. Each chatbot automated customer support workflows that were already in place, meaning they had to work with existing CRMs, knowledge bases, and booking systems. Customizing integrations for each client consumed a lot of time. To solve this, we built a no-code integration platform, allowing users to set up simple HTTP-based interactions with third-party services or adjust parameters for existing standard integrations.
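One way to picture a simple HTTP-based integration set up without code is as a declarative description that the platform turns into a request. The sketch below is an assumption about how such a description might look: the config keys, the `crm.example.com` URL, and the helper names are invented for illustration and no request is actually sent.

```python
from string import Template
from urllib.parse import quote

# Hypothetical integration description, the kind of thing a user might fill in
# through a form instead of writing code.
order_status_integration = {
    "name": "order_status",
    "method": "GET",
    "url": "https://crm.example.com/orders/$order_id",
    "response_field": "status",
}


def build_request(config: dict, params: dict) -> dict:
    """Fill the URL template with URL-encoded parameters from the conversation."""
    safe_params = {key: quote(str(value), safe="") for key, value in params.items()}
    url = Template(config["url"]).substitute(safe_params)
    return {"method": config["method"], "url": url}


def extract_answer(config: dict, payload: dict):
    """Pick the configured field out of the third-party service's JSON reply."""
    return payload.get(config["response_field"])


request = build_request(order_status_integration, {"order_id": "A 42"})
answer = extract_answer(order_status_integration, {"status": "shipped", "eta": "2d"})
print(request)
print(answer)
```

Adding another integration then means adding another description, not another code path.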
Initially, each new bot required a full data analysis cycle: collecting user queries, training an ML model for intent recognition, and then building the bot’s logic. This process was time-consuming. However, we noticed a strong overlap in user queries across different bots. For example, the question “Where is my order?” is common across all e-commerce chatbots. Recognizing this, we introduced cross-bot ML model reuse, enabling us to launch new bot prototypes without training a new model from scratch.
Through large-scale standardization and automation, we transformed SupportAI into a flexible yet highly scalable platform. It could meet the needs of any client while allowing new bots to be deployed quickly—often without developer involvement.
How did your experience transitioning from individual contributor to team leadership shape your management philosophy?
Pavel Kostikov: I began my professional journey as an engineer, and my sole focus was and still is to create the best product I can, working to the highest standard possible. With time, I realized that building a great product is not only about technical excellence; it’s also about getting the dynamics of team collaboration right.
From my experience, I know that processes really matter. Even the most talented engineers aren’t efficient if they’re working without well-structured workflows and effective processes. And what about team composition? Motivated, well-balanced teams achieve far more than groups of star individuals who aren’t working cohesively. And, of course, setting the right goals is essential. If your effort isn’t aligned toward clear and precise objectives, much of it is wasted.
This understanding ultimately led me to the position of development lead at SupportAI. In that role, I became more and more convinced that managing the broader scope of product development (team dynamics, process optimization, and goal alignment) is what really excites me, that I have some skill in this area, and that this could be a path for me. So the next step for me is to share what I’ve learned.
I know how vital it is to understand the technical problems the team faces and to avoid smothering their decisions with too much oversight. I am good at spotting and removing blockers, and that skill became even more essential when I moved into a leadership role. Grasping the technical challenges enables me to set achievable expectations and keeps me from becoming a source of frustration for the team.
This brings us to the idea of shared ownership. For engineers to truly own their work, they have to grasp not just its immediate implications but also the overall vision it serves. I only see this happening, where engineers take genuine ownership of their work, when there’s a culture of transparency in the organization.
