RFQ growth and DORA compel firms to focus on system capacity, high performance, complex workflow automation, and disaster recovery (DR).
The fixed income market structure is changing, which is driving automated trading. As you build trading systems you need to balance the requirements of complexity and high performance with the need for resiliency.
Dealers need to be able to stream millions of axes a day, quote list RFQs with hundreds of items, and manage increased volumes at month end. They need to do this while also meeting regulatory requirements, including DORA (the Digital Operational Resilience Act), which states that systems must meet stringent requirements for disaster recovery and be tested regularly.
Tom McKee, co-founder at TransFICC, sat down with The DESK to discuss the engineering approach to managing these competing requirements.
Why is system capacity so important now?
Growth in all-to-all trading, primarily driven by increased usage of algo execution tools on the buy side, means that banks often receive 40,000+ RFQs per day across rates and credit. At month end the problem becomes more concentrated, with banks receiving thousands of RFQs per second. Dealers also need to distribute axes and prices to all available venues. Limiting distribution to only the largest two venues in government or corporate bonds means missing out on potential customers and liquidity.
In addition, the regulatory environment is about to become significantly more stringent with the implementation of DORA in January 2025. The new regulatory standards will compel financial institutions to set up and maintain a dedicated ICT third-party risk strategy, implement comprehensive business continuity policies, and establish a management process to monitor ICT-related incidents, all of which will need to be periodically tested.
What are the workflow complexity issues and how do you help customers manage these?
While it’s relatively easy to write code that auto-responds to a single RFQ, the reality is that it’s not as simple as having one type of RFQ per venue. For example, we support 12 workflows on Tradeweb and 10 on Bloomberg just for IRS. Corporate bonds trade on several venues and you can match on price, spread, yield, or discount margin, before getting to the various benchmark spotting models.
Auto response systems require a full understanding of negotiation stages and the allowed transitions at each negotiation point. Therefore, to support this complexity you first need to build an automated testing framework.
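To make the idea of negotiation stages and allowed transitions concrete, here is a minimal sketch in Python. The stage names and the transition table are hypothetical and chosen only for illustration; they are not TransFICC's model, and real venue workflows differ by protocol. An automated test framework can then assert that every recorded negotiation follows a legal path.

```python
from enum import Enum, auto


class Stage(Enum):
    # Hypothetical stage names, for illustration only.
    RFQ_RECEIVED = auto()
    QUOTED = auto()
    CLIENT_ACCEPTED = auto()
    DONE = auto()
    REJECTED = auto()
    EXPIRED = auto()


# Assumed transition table: each stage maps to the stages it may move to.
ALLOWED = {
    Stage.RFQ_RECEIVED: {Stage.QUOTED, Stage.REJECTED, Stage.EXPIRED},
    Stage.QUOTED: {Stage.QUOTED, Stage.CLIENT_ACCEPTED, Stage.REJECTED, Stage.EXPIRED},
    Stage.CLIENT_ACCEPTED: {Stage.DONE, Stage.REJECTED},
    Stage.DONE: set(),
    Stage.REJECTED: set(),
    Stage.EXPIRED: set(),
}


class Negotiation:
    """Tracks one RFQ negotiation and refuses illegal transitions."""

    def __init__(self):
        self.stage = Stage.RFQ_RECEIVED
        self.history = [self.stage]

    def advance(self, new_stage):
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(
                f"illegal transition {self.stage.name} -> {new_stage.name}"
            )
        self.stage = new_stage
        self.history.append(new_stage)


def is_valid_path(stages):
    """Return True if every consecutive pair of stages is an allowed transition."""
    return all(b in ALLOWED[a] for a, b in zip(stages, stages[1:]))
```

Keeping the transitions in a single table means a new workflow is added as data rather than as branching code, which makes it easier to re-test every path whenever a venue changes its API.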
As we add new venues and workflows, and existing venues make API changes, we automatically re-test all our code and a new release goes to our UAT environment daily. This also allows our customers to test their systems quickly.
A good example of workflow complexity is the trade-at-close protocol. Dealers may negotiate several of these deals during the day but what if a server or a circuit goes down and connectivity with the venue is lost before the end of day price is received? We have written failover tests to check that all trade-at-close deals are replicated to a standby server and are fully available on re-connection.
With TransACT, our auto negotiation trading service, we provide a simpler API that deals only with the instrument pricing request, while we handle the negotiation. This reduces the complexity of building automated customer quoting systems. It also comes with a testing simulator, allowing customers to test their quoting systems by volume and inquiry type.
How do dealers manage high performance requirements and how can they test this?
Dealers need to be able to handle both high message throughput and the ability to respond quickly.
Most venue APIs for streaming prices or axes have throughput limits. For example, if you send one venue more than 2,000 messages per second the venue API will auto disconnect you. For a different venue the limit is 5,000 messages per second. While this sounds like a lot of messages, it is easy to hit these limits if you quote on thousands of instruments and support tiered pricing.
To help dealers, we provide a coalescing protocol to manage throughput. For example, if the dealer sends 20,000 messages in a second, the coalescing protocol will select the most up-to-date quote per instrument by timestamp and keep the flow below the venue throughput limit, ensuring the latest price makes it to the market. The dealer is notified of any rejected quotes.
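The coalescing idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual TransFICC protocol: the `Quote` type and `coalesce` function are hypothetical, the batch is treated as one second's worth of messages, and quotes that exceed the cap are reported as dropped rather than queued for the next window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    instrument: str
    price: float
    timestamp: float  # seconds; a larger value means a newer quote


def coalesce(quotes, max_messages):
    """Keep the newest quote per instrument, capped at max_messages.

    Returns (to_send, dropped). Both the superseded quotes and any
    overflow beyond the cap end up in dropped, so the caller can
    notify the dealer about them.
    """
    latest = {}
    dropped = []
    for quote in quotes:
        current = latest.get(quote.instrument)
        if current is None:
            latest[quote.instrument] = quote
        elif quote.timestamp >= current.timestamp:
            dropped.append(current)  # superseded by a newer quote
            latest[quote.instrument] = quote
        else:
            dropped.append(quote)  # arrived late and is already stale
    # Assumption: if there are still too many, prefer the most recent updates.
    ordered = sorted(latest.values(), key=lambda q: q.timestamp, reverse=True)
    to_send = ordered[:max_messages]
    dropped.extend(ordered[max_messages:])
    return to_send, dropped
```

For example, calling `coalesce(batch, 2000)` on one second of outbound quotes would return at most 2,000 messages to forward to the venue, plus the list of quotes to report back to the dealer.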
With our One API product, we provide a test stub which dealers can use for testing RFQ throughput. The dealer can set their own throughput rate for in-bound RFQs (like 5,000, 10,000, 50,000, or 100,000 per second) through the TransFICC API to test their own applications. This has been very helpful for customers wanting compliance sign-off, as most of the trading venues do not simulate large traffic loads.
Are you getting more questions about DR Failover?
Yes, because this is a key requirement of DORA. At TransFICC, we take this very seriously and ensure we have automated testing that simulates failover scenarios for our One API, Trader Desktop, TransACT, and hosting offerings.
We run our services in multi-node clusters to make them tolerant of hardware failures. In addition, we allow our customers to run in a “Hot Hot” configuration, where they can have multiple active sessions with us for a given venue from different regions. If they have an outage (for example, when a datacentre goes down), their secondary connectivity takes over.
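As a rough illustration of the “Hot Hot” idea, the Python sketch below routes each outbound message over the first session that is still connected. The `VenueSession` and `HotHotRouter` names are hypothetical, and the region labels are placeholders; a real deployment would also need heartbeats and state replication, which are left out here.

```python
from dataclasses import dataclass


@dataclass
class VenueSession:
    region: str            # placeholder label, e.g. "london"
    connected: bool = True


class HotHotRouter:
    """Sends each message over the first active session that is still connected."""

    def __init__(self, sessions):
        self.sessions = list(sessions)  # kept in order of preference

    def route(self, message):
        for session in self.sessions:
            if session.connected:
                return session.region, message
        raise ConnectionError("no active session available for this venue")
```

Because every session is already active, failing over amounts to skipping the session that has dropped, with no cold start on the secondary side. A failover test can flip `connected` to `False` on one session and assert that traffic moves to the other.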
©Markets Media Europe 2024