Cover Story: Glitch!

Industry Ponders Remedies as Software Malfunctions Multiply

Is trading out of control?

Given the spate of high-profile blow-ups this year, industry professionals and regulators are starting to wonder. All the snafus were caused by software glitches, leading trading experts to question whether brokers and exchanges know what they're doing when building complex automated systems.

First came the Ronin Capital debacle on Feb. 24, when the options market maker disrupted trading at NYSE Amex with more than 30,000 wildly mispriced quotes. Then came the failed public offering of BATS Global Markets by BATS itself on March 23. That was followed by the botched public offering of Facebook by Nasdaq OMX Group on May 18, resulting in millions of dollars in losses for market makers. Finally, on Aug. 1, Knight Capital Group lost $440 million due to a software malfunction that flooded the New York Stock Exchange with a series of bad orders.

Taken together, the perception is the industry is losing control. "The complexity of some systems overcomes the best efforts of designers to keep them under control," says Harish Devarajan, chief executive of Deep Value, a developer of trading algorithms used at the New York Stock Exchange and elsewhere. "All systems start off as things that do our bidding. But some rise in complexity to the point where we masters become the servants of the system."

New Rules 

A week after the Knight debacle, the Securities and Exchange Commission announced it would convene a roundtable of trading technologists in Washington on Sept. 14 in an effort to determine if brokers and exchanges are in control of their trading systems. The SEC has billed the session as an information-gathering workshop, but Chairman Mary Schapiro has also stated that her agency intends to roll out new rules.

At a minimum, the SEC is targeting the exchanges and other market centers with a rule requiring them to make sure their systems function properly. The regulator has had a so-called "Automation Review Policy" for more than 20 years that requires exchanges to make sure their systems are working properly and to report any outages that occur. This is a statement of policy, however, and not a set of rules. Schapiro said last year ARP should become mandatory.

In the wake of the Knight crisis, at least one influential lobbying group is pushing the SEC to adopt new rules. The Managed Funds Association, which represents hedge funds, sent a letter to the regulator recommending several rule ideas. Chief among them were mandatory risk checks of all a broker’s orders; new rules requiring systems testing; and the requirement that firms designate an individual to stand watch over the broker’s trading systems armed with a "kill switch."
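The MFA's proposals lend themselves to a short sketch. Everything below is hypothetical: the class names, limit values, and the shape of the kill switch are invented for illustration and are not drawn from the MFA letter. The idea is simply that every order must clear pre-trade size and notional checks, and that a designated supervisor can flip a switch that blocks all further flow.

```python
from dataclasses import dataclass

@dataclass
class Order:
    symbol: str
    side: str        # "buy" or "sell"
    qty: int
    price: float

@dataclass
class RiskGate:
    """Hypothetical pre-trade risk gate with a manual kill switch."""
    max_order_qty: int = 10_000        # invented per-order share limit
    max_notional: float = 1_000_000.0  # invented per-order dollar limit
    killed: bool = False               # flipped by the designated supervisor

    def kill(self) -> None:
        """The 'kill switch': block all subsequent orders."""
        self.killed = True

    def check(self, order: Order) -> bool:
        """Return True only if the order passes every pre-trade check."""
        if self.killed:
            return False
        if order.qty > self.max_order_qty:
            return False
        if order.qty * order.price > self.max_notional:
            return False
        return True

gate = RiskGate()
ok = gate.check(Order("XYZ", "buy", 500, 10.0))          # within both limits
too_big = gate.check(Order("XYZ", "buy", 50_000, 10.0))  # breaches share limit
gate.kill()                                              # supervisor halts all flow
blocked = gate.check(Order("XYZ", "buy", 500, 10.0))     # rejected outright
```

The point of the sketch is the ordering: the kill switch is checked before anything else, so one person can stop the firm's entire flow regardless of what the algorithms are doing.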

Whether or not regulators heap new rules on the industry, they are certainly not shy about using existing ones to punish firms for writing bad code. Last October, for example, the SEC reprimanded exchange operator Direct Edge for deploying untested code that resulted in $2 million in losses for the company. A month after that, CME Group, which regulates trading on its futures exchanges, fined market maker Infinium Capital Management $850,000 for failing to prevent its trading algorithms from going berserk in the E-minis and oil futures markets.

In penalizing Infinium, CME took into consideration that the trading house did take steps to clean up its act. After its algos got it into trouble in 2009, Infinium adopted some of the recommendations of the Futures Industry Association’s Principal Traders Group (PTG) regarding quality assurance controls and risk protections, according to CME Group.

PTG recommendations have become de facto standards for some firms. Over the past few years, the PTG has written a number of reports offering specific guidance to electronic trading firms regarding risk management and best practices. Earlier this year, the organization published a number of specific tests and controls that trading firms should consider whenever they make changes to their systems.

Headlands Technologies, a quantitative trading firm based in San Francisco and Chicago, is one firm that has adopted the guidelines. "We're not going to eliminate mistakes," Matt Andresen, co-chief executive of Headlands, said. "But we can try to minimize them by ensuring best practices. That means mandating procedures and policies for how someone interacts with the marketplace."

While calls for best practices might be expected in the current climate, it is the issue of accountability or responsibility that generates the most debate. Who or what should be responsible if trading programs run amok? If a broker's algo disrupts the market, is it the broker's responsibility to shut it down, or the venue's?

On Aug. 1, Knight's algos-gone-wild traded about $7 billion in notional value during the first 40 minutes of the day, driving volume in such names as Molycorp and Lithia Motors to unprecedented levels. All in all, the NYSE investigated trades in about 140 stocks, opting to bust those in six. The big market maker mistakenly bought at the offer and sold at the bid, losing the spread on every trade. Despite the havoc, the NYSE chose not to cut Knight off.
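The mechanic behind "losing the spread on every trade" is simple arithmetic. With invented numbers (these are not Knight's actual quotes or trade counts), a sketch:

```python
# Invented quote: bid $20.00, offer $20.02, so the spread is 2 cents.
bid, offer = 20.00, 20.02
shares_per_round_trip = 100

# Buying at the offer and then selling at the bid surrenders the spread:
loss_per_round_trip = (offer - bid) * shares_per_round_trip  # about $2.00

# Repeated at machine speed, small per-trade losses compound quickly:
round_trips = 50_000
total_loss = loss_per_round_trip * round_trips               # about $100,000
```

Each individual trade looks harmless; it is the automated repetition, thousands of times in minutes, that turns a two-cent spread into a material loss.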

NYSE declined to comment. But other sources noted that exchanges are typically reluctant to unilaterally prevent trades from occurring. That’s because the trades may not be erroneous, so blocking them might open the exchange up to liability.

In any event, it was the second time this year an exchange run by NYSE Euronext decided not to intervene when an algo went wild in its market. In February, options market maker Ronin mistakenly transmitted over 30,000 mispriced quotes to NYSE Amex Options. The quotes were traded against and the firm reportedly lost somewhere between $500,000 and a few million dollars, according to sources.

Amex does offer market makers risk management tools to protect themselves against unwanted executions, but usage is voluntary. Amex does not take unilateral action to shut off a market maker, whatever concerns it might have. It does not second-guess a dealer.

Unless instructed otherwise, the exchange assumes any quotes it receives from dealers are appropriate. "Market makers can send us quotes at any price they desire," Steve Crutchfield, Amex chief executive, said at a press briefing in May, "and we don’t reject them. I mean, who are we to say the price is wrong?" 

Sea Change

Still, the executive is uncomfortable with that policy. In April, Crutchfield called on other options exchanges to work with the SEC to produce a policy that would allow exchanges to reject orders or quotes they deem harmful to the marketplace. "It would be a sea change," Crutchfield's colleague Amy Farnstrom, co-CEO of NYSE Arca Options, told reporters.

At least one brokerage executive believes such steps are necessary. Jim Michuda, chief executive of Wolverine Execution Services, notes that exchanges offer risk mitigation tools to their members, but they cost money and usage is optional. Michuda argues that exchanges should unilaterally be willing to cut a participant off if the firm’s trading appears out of control.

Michuda says exchanges should closely monitor the size and frequency of members' orders and compare that data against the volume characteristics of the security. If an exchange observes a participant trading unusually heavy volume in a short period, it should take action. "The exchanges should have reasonable limits in place," Michuda said. "They need to have safeguards to stop or slow down some of this stuff. The last stop is the exchange. They are the arbiter of sanity."
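Michuda's idea of comparing a member's recent order flow against a security's typical volume can be sketched as a rolling-window check. The class name, limit multiple, and window below are assumptions made for illustration, not any exchange's actual safeguard:

```python
from collections import deque

class FlowMonitor:
    """Hypothetical exchange-side safeguard: flag a participant whose
    recent volume in a symbol far exceeds the symbol's typical rate."""

    def __init__(self, avg_volume_per_min: float, multiple: float = 10.0,
                 window_sec: float = 60.0):
        self.limit = avg_volume_per_min * multiple  # shares allowed per window
        self.window = window_sec
        self.events = deque()                       # (timestamp, shares) pairs

    def record(self, shares: int, now: float) -> bool:
        """Record an execution; return True if the member should be throttled."""
        self.events.append((now, shares))
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()                   # drop stale activity
        return sum(s for _, s in self.events) > self.limit

# A symbol that typically trades 5,000 shares a minute:
monitor = FlowMonitor(avg_volume_per_min=5_000)     # limit: 50,000 per window
calm = monitor.record(10_000, now=0.0)              # normal activity
burst = monitor.record(45_000, now=10.0)            # 55,000 shares in 10 seconds
later = monitor.record(1_000, now=120.0)            # earlier burst has aged out
```

The design choice mirrors Michuda's argument: the threshold is relative to the security's own characteristics, so a burst that is routine in a liquid name still trips the check in a thin one.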

And what about inside the errant organization? Who should be on the hook if an algo goes wild? Software developers build the algorithms, but traders or those on the business side use them. If they malfunction, who should be held accountable?

After Nasdaq’s Facebook snafu [see sidebar], the blame was being placed on the technology side of the company, according to Wall Street Journal reports. At a conference at Stanford University, Nasdaq chief executive Bob Greifeld told attendees the business side relied too much on assurances from the technology side. The Journal also reported Nasdaq was considering a restructuring of its technology and operations departments, possibly firing Anna Ewing, Nasdaq executive vice president in charge of technology.

At least one veteran trading technologist argues the blame should fall on the business side, not the technology division. The idea of firing Anna Ewing is "ridiculous," he said, adding that top brass on the business side should be accountable. "There is a trader who is responsible for those trading systems. Somebody has to take responsibility and it’s not the programmer."

The SEC may have other ideas. In December 2010, at a conference sponsored by the Investment Company Institute, David Shillman, an associate director in the agency’s Division of Trading and Markets, told reporters the SEC was mulling the idea of imposing "baseline qualifications" on algo developers and users. The idea was to prevent unqualified individuals from designing algorithms.

That idea has appeal for one executive on the business side of algo trading. In the pages of this magazine, Dan Mathisson, who runs Credit Suisse's U.S. equity trading department, came out in favor of such a move. "I suggest it's time for a new license, 'Registered Algorithm Developer,' for people who are directly involved in coding or supervising algorithmic trading code, who do not already have a Series 7 license," Mathisson wrote in January 2011.

Unstable Marketplace

Shillman and Mathisson made their comments in the aftermath of the ‘flash crash’ of May 6, 2010, when the market abruptly plummeted and rebounded in a matter of minutes. Software glitches were not blamed for the drawdown, but electronic trading was. According to the SEC, the quickie crash occurred after heavy algorithmic selling of E-mini contracts led to a liquidity crunch that was facilitated by speedy electronic markets.

Ever since, the event has cast a shadow over the U.S. stock market. The public perception is that the automation of stock trading has created an unstable marketplace. Incidents such as the Knight disaster or the IPO debacles only serve to validate that view.

The truth, however, is that trading mishaps stemming from software glitches have been part of the trading landscape for at least 15 years, ever since markets and brokers began automating their systems in the 1990s.

Ironically, the last time a huge market blow-up occurred, it was also Knight that caused it. In 2004, the firm, then known as Knight Trading Group, put its financial health in jeopardy by trading millions of shares' worth of options in the QQQs at injurious prices. Due to a glitch in reading an incoming market data feed, Knight's systems mispriced QQQ options representing nearly $1 billion in notional value. The trades, which took place on the old Pacific Exchange, covered about 300 million shares at about $30 per share.

Unlike the most recent blow-up, where relatively new erroneous trade rules prevented Knight from busting or adjusting most of its trades, in 2004 error trade rules were much looser. The P-Coast stepped aside and let Knight work with individual market making firms and other brokers to bust the trades, saving Knight millions. Still, Knight decided it had had enough of options trading. In August 2004, it sold the options market making group, which is run out of Minnetonka, Minn., to Citigroup.

The options industry as a whole had significant teething problems in the 1990s and early 2000s, as the exchanges and their members began to automate. Exchanges and brokers experienced problems almost daily during this period, sources tell Traders Magazine. Exchanges launched new auto-ex systems and the broker-dealers had to build their own systems to work with them.

Capacity was another issue. In 2003, when options volume started its multi-year rise, exchanges began to have concerns over message traffic. At the Pacific, "when OPRA feeds started to go over 25,000 messages per second every so often, we’d go into panic mode," said one ex-P-Coast executive. "If it lasted more than three or four minutes, we were going to have problems."

Software glitches plagued cash equities as well, shutting down both Nasdaq and the New York Stock Exchange several times in the 1990s and 2000s. In March 2007, shortly after the NYSE switched over to its hybrid system, it developed a glitch in its DOT system that forced it to temporarily retreat to manual trading. Two years after that, in July 2009, NYSE had to extend a trading day by 15 minutes after a replacement for DOT developed problems.

Nasdaq has not been immune. In 2000, Nasdaq had to halt or slow trading twice due to software glitches. In April of that year, problems with the old SelectNet system halted trading for over an hour. In December, problems with SOES halted trading for 11 minutes. In modern times, the Facebook blunder was not the first incident to raise the ire of market makers. In April 2011, a glitch in new software that allowed Nasdaq to post quotes on behalf of market makers led to losses for them. Nasdaq had to pony up $3 million as recompense.

Longtime observers say exchange reliability is actually greatly improved. "There’s no question," Andresen said. "It’s like any business that matures, you learn from your mistakes."

Andresen suggests that the IPO problems at BATS and Nasdaq were actually not part of their core businesses. The exchange operators are in the business of continuous matching sessions, not point-in-time auctions. "Running a one-off Dutch auction is not the core business of an electronic market," he said. "That’s a different functionality. It’s not something that is done every day. At BATS, it had never been done."

And that’s the rub. Most software glitches occur in new programs. A broker writes a new trading algorithm or updates an old one. An exchange adds new functionality or updates old functionality. Then…kablooey! The development process is supposed to involve supervision and testing. That doesn’t always happen.



At Direct Edge, for instance, programmers created new software to comply with changes to the SEC’s Regulation SHO. According to the SEC, the exchange operator never bothered to test the software. It subsequently went haywire, delivering excessive positions to three hapless brokers.

At Credit Suisse, in 2007, programmers working for one of the firm's proprietary trading desks added a new feature to an old algo that disrupted trading at the New York Stock Exchange. The NYSE fined Credit Suisse $150,000, charging the broker with failing to supervise the development process and even to monitor the algorithm's performance.

Technology executives tell Traders Magazine incidents such as these are not surprising as software development processes and procedures are often haphazard. Pressure to rush a new feature to market can override the need to get it right, they say.

"There’s a lot of pressure to speed up the development life cycle when it comes to trading-related applications," explained Michael Chin, chief executive at Mantara, a vendor of risk management technology.

Chin adds that banks have always claimed that software development is not a core competency, "yet at the same time trading is turning into software development," he said. "Do they have all the checks and balances in place? The proper quality assurance processes in place to roll something out like an IBM or a Microsoft or an Apple does? One could argue that ‘no’ they don’t necessarily have those best practices in place because it’s not their core competency."

Longtime trading technology executive Bill Harts, who designed one of the first program trading systems in the early 1990s, has had firsthand experience with the time pressures inherent in software development. In the middle of the last decade, Harts was working for Bank of America and was responsible for the firm's NYSE specialist operation. At the time, the NYSE was converting to its hybrid market and was urging its specialists to get ready. Bank of America was taking longer than the rest, and was getting bad press in the New York Post for its tardiness, but Harts would not be rushed.

"We had a lot of code we had to get into place," Harts said. "We were under a lot of pressure from the exchange to get it done. But I insisted on making sure our testing was done. We wanted to be 100 percent ready. We were late, but it worked."

Harts maintains it is impossible to eliminate every bug, especially as trading systems have grown more complex over the years. "There’s more that can go wrong," he said. "Today, you have layers upon layers of algorithms, each with the capability of interacting in unforeseen ways. A typical trading system may have hundreds of thousands of lines of code. The opportunities for problems to arise get greater and greater."

Others agree, saying incidents like the Knight debacle will only increase. According to Mike Gualtieri, an analyst with Forrester Research, part of the problem is due to shortened development cycles. "The process is getting sloppy," Gualtieri said recently on Bloomberg Television. "Part of it is because of the speed (of development). Part of it is because the software is inter-related with other software. So sometimes there are unintended consequences."



An executive with a vendor that builds systems that help others build algorithms says software development is often balkanized and uncoordinated. When different groups, such as developers, quality assurance, quantitative analysts, and business types, work independently, the result is poorly written code.

"People get tunnel vision," says Richard Tibbets, a co-founder and chief technology officer at Streambase. "And then they build problems into software that only come to light later when the system is used or modified. It is important to make sure the whole team is responsible for the quality of the software."

So, what’s to be done? More testing, supervision and monitoring are the typical responses. And for some that means more and better human involvement. A key aspect of the Managed Funds Association’s proposal is to require a registered principal to be "on duty" whenever a firm is trading. The principal has the authority to turn off all or part of a given trading algorithm.

Deep Value’s Devarajan agrees that more human involvement is necessary. "The complexity is such that you don’t know who makes the call and the people who make the call don’t have the information they need at the time that these disasters are unfolding to make the call," he explained.

The exec added: "I don’t have a precise prescription, but there should be some level of required human overlay exercising judgment on top of these automated programs. Human overlay is the only response to a future where machines are going to get even faster, the overall system is going to get even more complex, and specializations are going to get even more narrow."



Nasdaq’s Face Plant

So what actually happened during the initial public offering of Facebook on May 18? Two things, according to a Nasdaq regulatory filing. First, glitches prevented Nasdaq's IPO Cross system from working properly. Second, glitches delayed the dissemination of Cross transaction reports for more than two hours.

Due to problems with the Cross itself, the opening trade was delayed from 11:05 a.m. to 11:30 a.m. At 11:05 a.m., Nasdaq reported an indicative opening price of $42 for Facebook, based on volume of about 72 million shares. At that time, Nasdaq attempted to execute the Cross and print the opening trade. After the initial calculation, but before the trade was printed, Nasdaq's system accepted several new orders as well as cancel-and-replace orders. That led to a recalculation of the price. Still more order changes entered the system. Because the system kept accepting new modifications, it was unable to print the trade.

Nasdaq eventually fixed the glitch and, at 11:30 a.m., completed the auction and printed the trade. The stock opened at $42, based on volume of approximately 76 million shares, and continuous trading began. At this point, Nasdaq assumed that all eligible orders had participated in the Cross and that trade confirmations would be sent out immediately. This was not the case. In fact, only orders received prior to 11:11 a.m. participated in the Cross. For those orders that did participate, confirmations did not go out until 1:50 p.m.
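The failure mode Nasdaq describes resembles a livelock: the cross price is recalculated whenever order changes arrive before the print, so a steady stream of changes keeps the print from ever happening. A minimal sketch of that dynamic, with a cutoff as one illustrative remedy (all names and logic here are hypothetical, not Nasdaq's actual system):

```python
def run_cross(incoming, max_recalcs):
    """Attempt to print an opening cross. `incoming` yields batches of
    order changes that arrive while the price is being recalculated.
    Returns the number of recalculations, or None if the trade never
    prints within `max_recalcs` attempts (the livelock)."""
    recalcs = 0
    for batch in incoming:
        recalcs += 1
        if recalcs > max_recalcs:
            return None          # still recalculating: trade never prints
        if not batch:
            return recalcs       # no new changes: price is final, print it
    return recalcs

# While modifications keep arriving, the cross never prints:
livelock = run_cross(iter([["order_change"]] * 100), max_recalcs=10)

# One remedy: stop accepting changes after a cutoff, then print:
def with_cutoff(batches, cutoff):
    for i, batch in enumerate(batches):
        yield batch if i < cutoff else []   # changes after cutoff are ignored

printed = run_cross(with_cutoff([["order_change"]] * 100, cutoff=3),
                    max_recalcs=10)
```

In the first run the cross loops indefinitely because every recalculation admits fresh changes; in the second, freezing the order book after a cutoff lets the price converge and the trade print, which is broadly the kind of fix a cutoff-based design implies.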




Algos Gone Wild: A Recent History

August 2012

Knight Capital Group loses $440 million after a software glitch causes unintentional trades at the New York Stock Exchange


May 2012

Nasdaq OMX Group fumbles Facebook’s initial public offering after software glitches delay opening cross and transmission of confirmations


March 2012

BATS Global Markets pulls its own IPO after software glitches mar trading


February 2012

Ronin Capital reportedly loses between $500,000 and a few million dollars after flooding NYSE Amex Options with over 30,000 mispriced quotes


April 2011

Nasdaq reimburses market makers $3 million after quote update software malfunctions


November 2010

Direct Edge loses $2 million trading out of positions caused by a glitch related to untested software installed to comply with Regulation SHO


October 2009

Infinium Capital Management algorithm malfunctions, trading about 7,000 E-mini contracts in 7 seconds; CME Group fines Infinium $850,000 for this and other problems


November 2007

Credit Suisse prop desk algo runs amok at New York Stock Exchange due to a design flaw; NYSE fines the broker $150,000



Gatekeeper Turnover

"Technology is very complex and requires consistent focus and updates. The past several years have seen a lot of turnover in the financial industry, which is partly to blame, as new gatekeepers take time to get up to speed. Glitches can and often do happen during these personnel disruptions. Another issue is the increasing complexity of trading-centric technology in data centers. It is quite common for several hundred servers to be needed to manage one aspect of the trading food chain."

-Keith Ducker, Chief Investment Officer, Tora Trading Services

(c) 2012 Traders Magazine and SourceMedia, Inc. All Rights Reserved.