ICE: Optimizing the data pipeline for buy-side trading desks

February 29, 2024


Buy-side bond traders can enhance execution through a flexible data offering.

The DESK spoke with Mark Heckert, ICE’s Chief Operating Officer of Fixed Income and Data Services. As buy-side firms move on from a paradigm of data warehouses and data at rest/data in motion, their demands on data processing have changed. While the big data pipeline is a priority for many asset managers, delivering it requires a broad approach to match client needs.

How is the growth of systematic investing and trading changing the demands placed on the data pipeline?
The buy-side trading function has a tremendous diversity of technological capability. Among traditional institutions, a small handful of shops have great technology capabilities. A broad collection of others have less robust technological capabilities and look for third-party assistance. Consequently, an interesting industry has developed in our community to try and service those firms. Recent examples include technology providers such as execution management system providers. However, this has been hard because both the buy-side firms themselves and their software service providers have challenges with connectivity and data pipelines, specifically getting connectivity to the relevant data that they need to accomplish a task.

How are data pipelines evolving to support them?
First, by having the right technological ability and connectivity to the platforms they need, be that execution venues, or counterparties. That basic building block has not been well tackled by many of the service providers or the institutions themselves. Absence of connectivity to relevant data and analytics in the world of trading has slowed down the evolution of systematic trading. There has been the need to build the data pipelines, to have data sources bringing correlated data into the execution process. We see some players who are very highly evolved, getting into the secondary and tertiary sources of data, but a long list of firms who can’t readily synthesize and assess highly relevant data when they execute. There is tremendous diversity and our industry is well-placed to take advantage of that.

How are you developing situations and offering solutions to such a breadth of firms?
What a broad-based data vendor needs to do in this somewhat challenging environment is be everywhere, for everyone. We need to explore the latest and greatest technologies they’re using, from cloud-hosted database technologies to low-latency feeds and the mechanisms they need to take in data. Do I need to give them an environment where they can have data scientists running Python scripts across live data? How diverse are the options available to them? We’re looking at technologies now where we can deploy models to our customers rather than just data. If they can’t build the models themselves, do we deploy the model? That’s now possible with some of the cloud providers out there today, in a way that wasn’t possible or would have been deeply challenging with an on-site solution in the past.

What limits has cloud-based delivery removed?
Our capacity to constantly update and maintain a model lets us deliver a capability that clients would not have been able to maintain historically on-site. Equally, we can be more hands-off, and deliver it in the right position at the right time via an API, if the user has complete control over what they’re doing. We support both and everything in between. What’s interesting is the cloud service providers deliberately stepping in, noting that this is an area where there are not enough tools, so they’re building for our industry.

What sort of models are you developing and providing to clients?
We provide risk models to our clients, and we have trading models in-house where clients give us a query and we give them an answer. We realize, though, that for several activities, clients want to merge more of their data with our data to come up with new insights. In the energy space, clients have a lot of data on their own upstream activities that they don’t want to share with us, but they want to merge it with market data to help make decisions. Within fixed income trading, deciding where to trade is an active decision that requires your own data. We can deliver high-level information on venues, bid-ask spreads, and activity on those venues, but execution quality received from a particular dealer is bespoke to a buy-side firm and must be in-house. Those are the areas where the joining of proprietary data from customers with our own data is increasingly seeing new technology deployed to help that process.
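As a rough illustration of the kind of join described here, the sketch below combines vendor-level venue spreads with a firm's own fill records to produce a per-dealer view of execution quality. It is a minimal sketch under stated assumptions: the venue and dealer names, the `Fill` record, the slippage measure and the `execution_quality` function are all hypothetical and do not represent any ICE product or API.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical vendor-side data: quoted bid-ask spread (basis points) per venue.
vendor_venue_spreads = {
    "VENUE_A": 4.0,
    "VENUE_B": 6.0,
}


@dataclass(frozen=True)
class Fill:
    """Hypothetical proprietary record kept in-house by the buy-side firm."""
    venue: str
    dealer: str
    slippage_bps: float  # execution price versus arrival price, in basis points


# Hypothetical in-house fills; these never leave the firm.
firm_fills = [
    Fill("VENUE_A", "DEALER_X", 1.0),
    Fill("VENUE_A", "DEALER_X", 3.0),
    Fill("VENUE_A", "DEALER_Y", 5.0),
    Fill("VENUE_B", "DEALER_X", 2.0),
]


def execution_quality(fills, venue_spreads):
    """Join in-house fills to vendor venue data and summarise per (venue, dealer)."""
    grouped = {}
    for fill in fills:
        if fill.venue not in venue_spreads:
            continue  # no vendor data for this venue, so it cannot be joined
        grouped.setdefault((fill.venue, fill.dealer), []).append(fill.slippage_bps)

    report = {}
    for (venue, dealer), slippages in grouped.items():
        avg = mean(slippages)
        report[(venue, dealer)] = {
            "avg_slippage_bps": avg,
            "quoted_spread_bps": venue_spreads[venue],
            # Slippage relative to the venue's quoted spread.
            "slippage_to_spread": avg / venue_spreads[venue],
        }
    return report


report = execution_quality(firm_fills, vendor_venue_spreads)
```

The join runs on the firm's side, so the proprietary fills stay in-house while the vendor data supplies the venue-level context.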

What are the challenges around providing data at different speeds and different formats?
In the background, if we buy a new data set, integrating data collection, methodologies and calculation processes is critical because you want consistent answers; I don’t want my user interface to give a different answer than the API gets. Data consistency is critical but adds complexity. Being cognizant of that – intentionally building infrastructure that allows you to offer consistency despite the upstream challenges – is critical, so the same data warehouse powers our APIs and our user interfaces, because it only makes sense to get the same answer from the same place.
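The idea of one warehouse serving both the API and the user interface can be sketched as a single calculation function that every delivery channel calls. This is an illustrative sketch only: the `WAREHOUSE` dictionary, the instrument identifier and the function names are assumptions made for the example, not a description of ICE's infrastructure.

```python
import json

# Hypothetical in-memory stand-in for the shared data warehouse.
WAREHOUSE = {"BOND_1": {"bid": 99.50, "ask": 99.75}}


def mid_price(instrument_id):
    """Single calculation path: every delivery channel calls this function."""
    row = WAREHOUSE[instrument_id]
    return round((row["bid"] + row["ask"]) / 2, 4)


def api_response(instrument_id):
    """Machine-readable delivery, as an API might return it."""
    return json.dumps({"id": instrument_id, "mid": mid_price(instrument_id)})


def ui_label(instrument_id):
    """Human-readable delivery, as a screen might display it."""
    return f"{instrument_id} mid: {mid_price(instrument_id):.3f}"
```

Because both channels call the same `mid_price` function against the same store, they cannot drift apart; only the presentation differs.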

What key changes are you expecting over the next 12 months?
We’re providing linkages to connect underlying mortgages to mortgage-backed securities. That is a big data exercise that is foundational. We have data assets all the way from origination of a mortgage to servicing and capital markets with RMBS. The ability to link and serve data, particularly around areas like prepayments, is compelling. If you have a better understanding of which pool would prepay or not, we think that’s very important to know. It’s a big data exercise – mortgage pools have hundreds of fields – and linking that to individual mortgages with a high degree of certainty is a powerful tool. We’re launching that this year.
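A toy version of the loan-to-pool linkage might look like the sketch below, which attaches loan records to known pools, sets aside loans that cannot be linked, and computes a simple pool-level prepayment measure. All field names, identifiers and figures are hypothetical, and real linkage across hundreds of fields would need far more careful matching than a single key lookup.

```python
from collections import defaultdict

# Hypothetical loan-level records; real data would carry many more fields.
loans = [
    {"loan_id": "L1", "pool_id": "P1", "balance": 200_000.0, "prepaid": True},
    {"loan_id": "L2", "pool_id": "P1", "balance": 300_000.0, "prepaid": False},
    {"loan_id": "L3", "pool_id": "P2", "balance": 100_000.0, "prepaid": False},
    {"loan_id": "L4", "pool_id": None, "balance": 150_000.0, "prepaid": True},
]

# Hypothetical mapping from pool to the security it backs.
pools = {"P1": {"security": "RMBS_1"}, "P2": {"security": "RMBS_2"}}


def link_loans_to_pools(loans, pools):
    """Attach each loan to a known pool; collect the ids of loans that cannot be linked."""
    linked = defaultdict(list)
    unlinked = []
    for loan in loans:
        if loan["pool_id"] in pools:
            linked[loan["pool_id"]].append(loan)
        else:
            unlinked.append(loan["loan_id"])
    return dict(linked), unlinked


def prepaid_balance_share(pool_loans):
    """Share of a pool's linked balance that has prepaid."""
    total = sum(loan["balance"] for loan in pool_loans)
    prepaid = sum(loan["balance"] for loan in pool_loans if loan["prepaid"])
    return prepaid / total if total else 0.0


linked, unlinked = link_loans_to_pools(loans, pools)
shares = {pool_id: prepaid_balance_share(rows) for pool_id, rows in linked.items()}
```

Tracking the unlinked loans explicitly matters here, since a pool-level figure is only as reliable as the share of loans that were confidently matched.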