We’re delighted to welcome Nikita Melnikov, the VP of Engineering at Atlantic Money, for an insightful conversation about his journey in the fintech world. With a career spanning over a decade in software development and engineering leadership, Nikita has played pivotal roles in shaping technological strategies at prominent financial institutions. His trajectory from a Senior Scala Developer at Tinkoff Bank to Vice CTO of Tinkoff Invest, and now his current position at Atlantic Money, offers a unique perspective on the evolving landscape of financial technology.
Nikita’s expertise in high-load systems, team building, and architectural design has been instrumental in driving innovation and operational excellence in the companies he’s worked for. Today, we’ll delve into his experiences, the challenges of international money transfers, and his vision for the future of fintech.
At Atlantic Money, you maintain a hands-on approach by contributing to coding. How do you balance this with your leadership responsibilities as VP of Engineering?
As a startup, we prioritize efficiency and lean operations. Our approach involves maintaining a small, agile team while simultaneously investing in the automation of routine tasks and duties. This strategy allows us to maximize our productivity and resource allocation, ensuring that our limited manpower is focused on high-impact activities. By automating repetitive processes, we not only save time but also reduce the likelihood of human error, thereby enhancing overall operational quality. This efficiency-driven model enables us to remain competitive and responsive in the fast-paced startup environment, allowing us to punch above our weight class in terms of output and innovation.
As our startup grew, we naturally expanded our team. In the early stages, I split my time evenly between developing core functionalities and handling administrative tasks like hiring and policy-making. This included designing critical components, while also shaping our overall product design. As we’ve evolved, my role has shifted. Now, I spend about 80% of my time on hands-on coding and infrastructure management. This deep technical involvement helps me maintain a thorough understanding of our systems and actively contribute to their ongoing development and optimization.
As a fintech startup, we are subject to a range of critical annual processes that demand our attention and resources. These include comprehensive security audits to ensure the integrity and safety of our systems and detailed regulatory reports to maintain compliance with financial industry standards. Additionally, we engage in other essential processes that are integral to our business operations and growth strategy. During these intensive periods, which typically occur on an annual basis, my direct contribution to product development necessarily decreases, typically to around 15% of my time. This reduction is a natural consequence of the increased focus on these crucial compliance and evaluation activities.
However, I firmly believe in the importance of maintaining a strong connection to the technical aspects of our product, even as a manager. This hands-on approach allows me to effectively manage and mitigate technical debt, which is crucial for the long-term sustainability and scalability of our systems. By staying closely involved with the codebase, I can maintain a deep understanding of how our system functions at a granular level. This insight is invaluable in identifying and addressing both the strengths and weaknesses of our code and overall architecture. Such knowledge enables me to make more informed decisions, guide the team more effectively, and ensure that our technical strategy aligns with our business objectives. It also allows me to better appreciate the challenges faced by our development team and to provide more meaningful support and leadership.
At Tinkoff Investments, you increased platform stability, reducing incidents from 1-4 per day to 1 per quarter. What strategies did you implement to achieve this remarkable improvement?
Any meaningful improvement in system stability begins with a fundamental and probing inquiry into the root causes of issues. By asking “why” repeatedly, we can peel back the layers of complexity and identify the underlying factors contributing to instability. This process of deep investigation and analysis forms the foundation for developing effective strategies to enhance platform reliability and reduce incidents. It enables us to move beyond surface-level fixes and address the core problems that impact our system’s performance.
In my case, the major issues stemmed from inefficient caching and traffic routing design. We initially used embedded Hazelcast as a cache store. Combined with a round-robin balancing algorithm, the serialization of objects in the cache store consumed about 40% of our CPU resources.
We encountered an amusing situation where we observed CPU and latency spikes simply from monitoring and real-time updating of the Hazelcast cluster metrics.
To address these issues, we implemented two key solutions. First, we migrated to a consistent hashing algorithm in Nginx. Second, we developed a highly efficient custom in-memory store. These changes significantly improved our system’s performance, reducing latency by approximately 60% and eliminating cluster-specific failures.
Of course, we implemented many other optimizations. We fine-tuned HTTP clients and servers, split our monolith into multiple systems, and replaced Oracle with more specialized databases such as Cassandra and Couchbase. I could go on about the features we improved, but the list is extensive.
You’ve worked extensively with Scala throughout your career. How has your experience with this language shaped your approach to building fintech systems?
I still love Scala, despite its declining adoption. Functional programming significantly simplifies code—you can often grasp a function’s purpose just by examining its signature. Scala remains a powerful, high-performance language that excels in complex, wide-domain systems like trading platforms.
However, Scala still has multiple “ecosystems”—standard library adherents, Cats Effect enthusiasts, and ZIO fans—which can widen the onboarding gap for engineers unfamiliar with your specific technology stack.
At Atlantic Money, we opted for Golang for several reasons:
- It’s easier to quickly hire skilled candidates
- It has low memory requirements
- It’s extremely fast with excellent concurrency control primitives
- It typically offers only one way to accomplish a given task
Atlantic Money offers a flat fixed-fee of £3 for transfers up to £1,000,000. What unique engineering challenges does this business model present?
Fintech systems, particularly those handling large-scale monetary transactions, often encounter a set of common yet complex challenges. These challenges are multifaceted and require robust engineering solutions to ensure the system’s reliability, security, and efficiency. One of the primary concerns is managing concurrency effectively to prevent double transactions, which could potentially lead to significant financial discrepancies. This is closely tied to the challenge of maintaining a consistent state across the system, ensuring that all components have an accurate and up-to-date view of transactions and account balances.
Another critical aspect is ensuring high availability of the system. In the financial sector, even brief periods of downtime can result in substantial losses and damage to reputation. Therefore, engineering robust, fault-tolerant systems that can handle high loads and recover quickly from failures is paramount. Additionally, the ability to deliver real-time data is crucial in today’s fast-paced financial landscape, where users expect instant updates on their transactions and account statuses.
While there are various approaches to addressing these challenges, particularly in terms of managing concurrency, we at Atlantic Money have adopted a comprehensive strategy centered around the actor model. We’ve implemented this model across all major components of our product, which has proven to be a game-changer in terms of system design and performance. The actor model provides a conceptual framework for designing concurrent systems, where actors serve as the fundamental units of computation.
This approach has significantly streamlined our codebase, making it more modular and easier to reason about. One of the most significant benefits we’ve observed is the near-elimination of race condition concerns. Race conditions, which occur when multiple processes access shared data concurrently, can lead to unpredictable behavior and are notoriously difficult to debug. By encapsulating state within actors and communicating through message passing, we’ve drastically reduced the potential for such issues.
The adoption of the actor model has allowed us to write code more efficiently and, crucially, more securely. In a fintech environment where security is paramount, this added layer of protection against concurrency-related vulnerabilities is invaluable. It has enabled our development team to focus more on implementing business logic and less on wrestling with complex synchronization issues.
For those interested in delving deeper into this topic, I’ve authored an article that provides a comprehensive explanation of the actor model and its specific benefits for our system: https://dzone.com/articles/concurrency-in-financial-transaction-systems
Additionally, working in fintech involves navigating a complex domain with numerous partner-specific integrations. This requires a deep understanding of underlying systems and industry-specific terminology. It typically demands knowledge of trading protocols and terms, as well as regulatory requirements. These requirements are often region-specific and involve partner-specific details for their respective countries and systems.
Your role at Atlantic Money involves overseeing every facet of the engineering department. Can you walk us through a typical week in your position?
As mentioned earlier, we strive to maintain a small team that delivers valuable products to our customers. This approach means our engineering team is lean, with each member taking on multiple responsibilities. Everyone participates in on-call rotations, day and night duty shifts, system monitoring, and resolving internal alerts for both our product and infrastructure.
My typical week starts with a two-week overview. I go through my notes, reminders, and calendar to update my personal inbox because not all tasks fit into Jira. Every Monday, we have a planning session with our CPO to set goals for the sprint. I’m not a Scrum/Kanban enthusiast, so let’s just call it “Weekly Goals and Roadmap Updates.” It works well at our team’s scale, and there are no pain points in the process for now.
We also have a daily 15–30 minute call for engineering team sync. It serves as a point of synchronization for the team because all members are located in different regions and time zones.
Additionally, I have weekly manager and all-hands meetings. Every other week, I have one-on-one calls with team members.
That’s all!
Our process is fully remote, and we don’t have many calls because we believe focus time is more important than getting instant answers in Slack DMs or “jumping on a quick call.”
It’s easier to post a message in Jira, make a comment in Confluence (yes, I’m an Atlassian fan), or drop an update in Slack than to disturb people from their current work.
Summarizing all of the above, I have plenty of focus time for writing code, making improvements to the product, designing new features, deploying new releases, and resolving any issues related to processes, integrations, or anything else that comes up.
At Tinkoff, you managed high-load systems with 99.95% uptime for internal ERP systems. How do you apply this experience to ensure reliability at Atlantic Money?
Experience inevitably shapes our decision-making processes and approaches to problem-solving. During my tenure at Tinkoff, we faced a myriad of challenges related to the integration of internal systems and partner platforms, as well as dealing with system failures. These experiences provided invaluable insights into the complexities of maintaining high-performance, reliable systems in a fintech environment.
At Atlantic Money, we’ve leveraged this wealth of experience to implement a comprehensive suite of best practices. Our approach is multi-faceted, encompassing various strategies to ensure system reliability and resilience. We’ve implemented robust fallback mechanisms to maintain service continuity even when primary systems fail. Our caching strategies are designed to optimize performance and reduce load on critical systems. We’ve also adopted a principle of state isolation to minimize the impact of failures in one part of the system on others.
Central to our philosophy is the “don’t trust, verify” principle. This approach involves rigorous validation of all data and processes, both internal and external. We’ve also invested heavily in comprehensive monitoring and alerting systems, allowing us to proactively identify and address potential issues before they escalate. Our fail-fast mechanisms are designed to quickly detect and respond to failures, preventing cascading issues. We’ve also implemented various resilience strategies to ensure our systems can withstand and recover from unexpected challenges.
Perhaps most importantly, we’ve cultivated a culture of critical thinking and continuous improvement. We consistently question our assumptions, rigorously examining every aspect of our systems. This includes regular audits, stress tests, and scenario planning to identify potential vulnerabilities and areas for enhancement. By applying these lessons and best practices, we strive to maintain the highest standards of reliability and performance at Atlantic Money, ensuring our customers can trust in the stability and security of our financial services.
As examples of our best practices, I can share three interesting facts:
- We always use the full name “production” for our production environment rather than the abbreviated “prod”. This gives us more time to think while typing, especially when executing frequent commands in the terminal.
- Our policy mandates using different color schemes in our IDEs for production and development environments. For instance, I typically use a dark theme for coding, but I always switch to a light theme for my production environment.
- All database migrations must pass through multiple stages of review: a merge request, a pre-release review, and finally, manual application in the production database. We acknowledge that we’re all human and can make mistakes. This multi-step review process allows more time to catch and handle potential errors.
You’ve been involved in hiring and team building throughout your career. What qualities do you look for when expanding an engineering team in the fintech sector?
I once heard a great description for company stages: “Era of heroes, era of leaders, and era of the crowd.” This aptly describes hiring tactics and questions for new candidates.
In the era of heroes, you can build a highly customizable process to hire new talent. In the era of the crowd, companies usually have a strict multi-staged hiring process with LeetCode-like interviews, a separate system-design interview, and finally, a culture fit assessment.
At Atlantic Money, we’re still in the heroes era, albeit with a mix of structured interviews.
We prioritize three primary traits in any candidate: proactivity, self-motivation, and result-orientation.
We don’t place much emphasis on a candidate’s previous language background. We believe a good engineer isn’t defined by being a “Golang engineer with 10 years of experience,” but rather someone who deeply understands computer science fundamentals, system design, and best practices. For instance, we hired someone in the pre-AI era who had never written a single line of Golang. Yet, they completed our test assignment—which required a strong understanding of concurrency issues and distributed systems—in just 9 hours using Golang. They’re still with the company and performing exceptionally well.
Your experience includes developing a new billing system for binary option trading. How has this background influenced your work on international money transfers at Atlantic Money?
You’ve worked on various internal frameworks and ERP systems. How do you approach building internal tools to support Atlantic Money’s mission?
Based on my personal experience, I would say that the perfect choice for a backoffice application is something that doesn’t require much code and has rich CRUD-specific features. The best frameworks for this are Laravel, Yii, Ruby on Rails, and Django.
However, at Atlantic Money, we decided to eliminate the backoffice backend application entirely. We leverage Retool (a low-code solution) to build a dashboard with read-access to all databases and specific API endpoints to run processes. This helps us avoid writing new endpoints to show data to, for example, the support team. It also restricts modifications from Retool to control exactly how the data should be changed for a given process.
Low-code solutions still have many drawbacks, such as poor IDE and versioning support, limited deployment options and migration support, and, to be honest, messy custom JavaScript code in Retool. The mess builds up because there’s no good way to structure the code and no TypeScript support, among other reasons.
Even considering these drawbacks, we’re still using Retool as our primary system for backoffice operations because it significantly reduces time-to-market for new features.
Given your experience across different companies and countries, how do you see the future of fintech evolving, particularly in the realm of international money transfers?
The future of fintech, particularly in the realm of international money transfers, holds immense potential for improvement across various aspects. However, I’d like to focus on one critical area that I believe is ripe for disruption: the cost of money transfers.
Traditionally, the financial industry has operated on a model where larger transfers incur higher fees, often justified by the perceived higher risk or value. However, this approach is fundamentally unfair and outdated in today’s digital age. The reality is that for a modern, efficient transfer system, the cost of processing a $10 transfer is essentially the same as processing a $100,000 transfer.
At Atlantic Money, we’re challenging this status quo. We believe that people should be aware that the actual cost to transfer money doesn’t significantly change with the amount being sent. The practice of taking a high margin on larger transfers simply because “that’s how the market operates” is no longer justifiable.
Looking ahead, I envision a future where flat-fee structures become the norm for international money transfers, regardless of the amount being sent. This shift will not only bring more transparency to the process but also democratize access to financial services, making it more equitable for all users, whether they’re sending small remittances or large business transactions.
This evolution in pricing models will likely force traditional financial institutions to reassess their fee structures and potentially lead to increased competition and innovation in the sector. Ultimately, this transformation will benefit consumers, fostering a fairer and more efficient global financial ecosystem.