Wednesday, June 30, 2010

I'm baaack

After nearly a 2-year hiatus from this blog, I'm back. I've been working hard on the trading system, and things are going well.

Going to be posting some results soon.

Monday, April 06, 2009

IPv4 Exhaustion

Friday, June 13, 2008

Market fragmentation

It's fascinating how the US equity markets have evolved as electronic trading has come into its own. One of the areas of particular interest right now is market fragmentation. First, a little bit of introduction on the topic for the uninitiated.

In the late 90's (before decimalization), there was a scandal that had to do with market makers not protecting their customer orders. In other words, a market maker would be making a market and then a customer order arrived that would better their quote. The market makers at times would fail to put the customer order in front of their own quote. The problem went even deeper than that, with market makers actually only quoting in even-eighths (i.e., they kept spreads artificially wide by having a gentleman's agreement not to quote in odd-eighths). Needless to say, when the SEC discovered this they were not amused.

I'm greatly simplifying things here, but what happened was that the SEC came down with various order-handling rules (informally called the "Manning Rule": http://en.wikipedia.org/wiki/Manning_rule). Chief among them were limit order protection, limit order handling, and limit order display; their intent was to ensure that all orders are handled fairly when in the hands of a market maker.

So how did the market makers respond? Well, they started dumping all of their "nuisance" orders (these are the customer orders that they now had to protect) into ECNs. ECNs, being designed to electronically handle massive amounts of orders, were ideally suited to carry out the market makers' obligations and make sure that the order was protected. What happened, though, was that the ECNs grew to the point of being larger than many of the market makers. A lot of this had to do with day-trading... With all of the resident liquidity from the market makers, all an ECN needed was liquidity takers (day traders) and they had a nice bit of crossing volume. We all know the story behind Island, Archipelago, etc. Keep in mind that the ECNs did not participate directly in price discovery... this was still the province of the market maker (or exchange specialist). More on this in a moment.

So when decimalization came in, the scene changed drastically... Volumes skyrocketed, trade sizes decreased, and in general the market truly started to fragment. On one side of the fence, you had old-school institutions and brokers who both lamented the decrease in size (and spread!). They suddenly had to work for a living and actually add value on their respective trading desks. You can still bump into these guys, who fervently cling to the idea that decimalization was a Bad Idea. They'll even try to explain how it ultimately hurts the retail investor because mutual funds have to pass the higher trading costs on to them. These guys are the same guys who are mad that algorithmic trading is replacing them on the trading desk for certain functions. 'Nuff said.

So I just mentioned algorithmic trading; this is certainly a big buzzword these days. In a nutshell, algorithmic trading was developed to cope with the fast pace of the post-decimalized world. It started off with smart routing, evolved into basic algorithms (VWAP, Implementation Shortfall), and continues to evolve to this day. It was a way to block trade in decimalized markets that had no more blocks. But there was another solution to this problem: Dark-pools. A dark-pool allowed traders to put large blocks of liquidity into non-publicly displayed markets, and then put additional restrictions on them (minimum execution size, etc). This was a way of un-fragmenting the liquidity in the market; the other side of the same coin as algorithms.

But what is happening now should cause everyone to sit up and take notice. The dark-pools (there are more than 40 now, I've lost count) have grown to the point where they (collectively) have significant liquidity. And NONE of this liquidity is protected by Reg NMS, nor does it participate in price discovery on exchanges. Doesn't this sound like a step backwards?

Adding fuel to the fire are the industry participants who want to perpetuate (and further) this lopsided market structure.
Take this recent article, for example: here.

Seth Merrin, undeniably an industry pioneer, comes up with an incredibly wrong argument to make his point that Exchanges should route flow to ATSs. First, he draws an analogy to the GAP buying jeans from a guy on the corner. We all know that this is not how things work, because it would be very inefficient. But he uses this analogy to justify the argument that exchanges should route their retail flow to ATSs (oh, and he just happens to run an ATS by the way). First off, Seth, you need to go pick up an economics textbook and realize that in the tangible goods markets, there are costs that do not exist in electronically traded markets: Storage costs, shipping costs, and decay costs, for example. These and other reasons are why the GAP doesn't buy from a guy on the corner. With electronically traded equity shares, those costs do not exist, or they are quite minimal because of the efficiencies of electronic marketplaces.

So, what is the answer? I firmly believe in free-market forces. We need to abandon the notion that retail and institutional liquidity should be treated differently. We need to stop removing large amounts of liquidity from the price discovery process (like LiquidNet and other dark pools do). That being said, I am VERY much in favor of competition, and so I am not advocating a Central Limit Order Book (CLOB). I do like Reg NMS, and all that needs to change is for ALL pools of liquidity to protect quotes ala Reg NMS. Also, all venues should be forced to display a quote (and thus participate in price discovery). Finally, note that these restrictions do NOT preclude venues from creating alternative trading models. For example, mid-point matching is still a very viable model, as is call-market auctioning (ala POSIT).

BTW you might be wondering why I am a fan of POSIT and not other parasitically-priced dark pools; it's simple: POSIT doesn't let you cancel or replace your order for a certain amount of time before the match, and they also start the match at a random time somewhat close to the published time. This greatly reduces gaming. Also, POSIT lets anyone in; buy-side, sell-side, and even retail (if they are big enough). LiquidNet, on the other hand, only allows buy-side shops in.

Ah, I'm done. Drop me a line and let me know what you think.

Sunday, March 02, 2008

The probable improbable

OK, as promised to my 1 riveted reader(s), (and no, that was not generated by a script... just my attempt to be funny), I am resuming the blog.

So, after a long trek down math lane, I've got three strategies that are operating now. And hell yeah, I was RIGHT to code my own infrastructure. First off, I am easily able to do things that (from inside knowledge) are not so simple with "professional" packages. And it didn't cost me anything (unless you count two years of my life). Assuming I make $0 an hour, that is a good bargain.

Don't think too much about that last statement. Please.

Anyhoo, I've learned several things (not all at once... this stuff takes months (that means 15-18, not 1 or 2) of live trading to learn):

1. Shit happens, so be ready for it. The part of your code that is commented with, "I'll code this later -- very improbable" will get run more often than you think.
a. You can get fills before ACKs and many other seemingly incorrect behaviors... be ready for them.
b. You can have orders "hanging" with no ACK or REJ for quite some time (minutes if not longer!)
c. Be ready for odd lots. Plenty of them, mein froind. If you're trading less than round lots, I would (seriously) recommend doing at least 100 shares. Dealers LOVE you if you do odd-lots with them, and market structure rules don't protect you as much.
d. Don't assume that you get good fills from your broker. You often get fills waaaay outside of the market, and you have to phone them and have a FIX log pissing contest. Be a turd, because it's your money. 9 times out of 10 you are right.
e. Quotes frequently get stale. I could have a whole blog on data sources, scrubbing, filtering, and sampling.
f. Trades get printed late (as per the 90-second rule). Don't use trade volume or price as an "immediate" indicator.
g. Symbols change names. They get acquired. They delist. Other symbols take their place (oh, that was fun when that one happened).
h. Corporate actions. Learn it, live it, love it. "g" is really a corollary to this.

2. Your model will break. Then it will SEEM to unbreak. Then it will break. Trust me, it was broken to begin with. Go back and see what was going on. Many times you have overfitted (i.e., biased your model to the data). To get rid of this, you can do out-of-sample tests. You may also have a model that is not explanatory. Add more indicators but only enough such that you don't overfit. I would recommend either the AIC (Akaike Information Criterion) or BIC (Bayesian Information Criterion) to make sure that the new indicators are worth it.
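
For reference, the two criteria are usually written as below. This is standard textbook notation rather than anything taken from the trading system: k is the number of fitted parameters (indicators plus any constants), n is the number of observations, and L-hat is the maximized likelihood of the model. Lower is better for both, so a new indicator is only worth keeping if the score drops after adding it. BIC penalizes extra parameters more heavily than AIC once ln(n) exceeds 2, i.e. for n of 8 or more.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```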

3. Don't count on all the academic papers/theories to be right. They're brilliant works to be sure, but they are very academic. It's good to use as a way of viewing a problem, not always as a way of solving a problem. Kind of like how all that trig you took is an interesting way of viewing triangles, but not a great way to build a pyramid.

Here are some more tidbits.

First, trading is a LOT of psychology. Seriously. Listen to the guys like Jim Cramer with his rules. Another one I like is George Sleezak (futures trader, used to play with a guy who founded the band "Chicago". Very cool). This stuff is pretty sound advice. One of my favorites that a buddy (Chris H.) came up with:

"If you have a position, imagine yourself on the opposite side. If you feel comfortable, then get out NOW." This does have psychological underpinnings because we tend to see potential gains differently than potential loss in terms of risk.

Another one (very important, even for a guy like myself):

Reverse entry points are NOT exit points. Every trade should have a separate piece of expectational logic to decide when to ENTER and when to EXIT. Translation: If your model goes long a name, then generates a sell signal in the same name, you are going to get hurt if you use that ENTRY signal (in the reverse direction) as an exit point. Worse yet, as a reverse point.

Each trade is its own journey. You exit the desert because you run out of water.... you don't exit Atlantis for the same reason. OK I was never that good at literature but hopefully you get my analogy.

OK now for the geeky part of things:

1. Trading is about (duh) maximizing gains while minimizing risk. What the heck does that MEAN though? Well, financial engineers will tell you it's about maximizing the Sharpe (or for some, Sortino) ratios. Simply not true unless you are trading for a mutual fund. More on this in a moment. Remember, these are marketing terms and mutual funds use them to raise money because they make a ton of dough on MAINTENANCE fees.

2. Why is it that GAINS and LOSSES are so easily quantified in terms of dollars, whereas RISK is such a slippery pig? BARRA will have one definition, Northfield will have another, there are many models out there. One thing is true though: If you completely eliminate risk from your portfolio, then you should (theoretically) have returns like a T-Bill but with more transaction costs and tax implications. Instead of using other people's models, how about this notion: What does risk mean to YOU? Think upon that and come up with an answer before you hit the button.

3. Diversification. Wow. Markowitz was a very smart dude, but there are a few key assumptions that we should ALL take a closer look at (disclaimer: he has a Nobel Prize and I don't. But I'm hopeful.) :

a. Most of modern economics assumes rational investors. This is why they (when talking about Pareto efficiency) speak of purchasing a basket of GOODS. Who would buy a basket of BADS? In truth, some investors are NOT rational.

b. Also realize that people either "can" or "should" do rational things. If you "should" do something but for some reason "can't", well then you have an inefficiency.

More on the next blog, I hope you enjoyed my humor and content (I think I'd prefer it, in fact, if you enjoyed them in that order).

Thursday, February 21, 2008

No time like the present

I've not blogged in a long while. I shall resume presently. Stay tuned. Many new updates.... just need to get them all down on e-paper.

Sunday, June 03, 2007

Timer chains

My last post dealt with how I added deep subscriptions and I hinted at timers. Well, I've just about gotten the timers working, and it was a pretty challenging task.

For starters, I am creating a distributed system (i.e., timers firing over a network). This means I have to create a thread to listen for timer requests. When a timer request comes in, I have to notify the timer thread (this is the one that blocks) of the new timer.

The tricky part is the timer thread itself. I used pthread_cond_timedwait() as the main blocking function; it basically lets me wait on either a signal from another thread OR a timer. Keep in mind that since that thread can only block on one timer at a time, each new timer added to the system must "chain" itself in an in-memory structure. For example, if I have a timer that is set to expire in 5 seconds and then another thread requests a timer that will go off in 2 seconds, I have to block for 2 seconds (the new timer) and then block for 3 seconds (the old timer, minus the 2 seconds that have already passed). This isn't too difficult to code. The main thing to watch out for is timers that expire at the same time.
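
Here is a minimal sketch of one way such a chain could look. It is not my actual code: the struct and the function names (timer_node, chain_insert, chain_next_delay, chain_fire_due) are made up for illustration, and I'm assuming absolute expiry times in milliseconds so that "minus the time that has already passed" falls out of a simple subtraction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical timer record; names and layout are illustrative only. */
typedef struct timer_node {
    int64_t expiry_ms;              /* absolute expiry time, in milliseconds */
    int id;                         /* caller-chosen identifier */
    struct timer_node *next;
} timer_node;

/* Insert a timer, keeping the chain sorted by absolute expiry.
 * Timers with equal expiry keep their insertion order.
 * Returns 0 on success, -1 if allocation fails. */
int chain_insert(timer_node **head, int id, int64_t expiry_ms)
{
    assert(head != NULL);
    timer_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->id = id;
    n->expiry_ms = expiry_ms;
    /* Walk past every timer that expires at or before the new one. */
    while (*head != NULL && (*head)->expiry_ms <= expiry_ms)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
    return 0;
}

/* How long to block before the earliest timer is due.
 * Returns -1 if the chain is empty, 0 if a timer is already due. */
int64_t chain_next_delay(const timer_node *head, int64_t now_ms)
{
    if (head == NULL)
        return -1;
    return head->expiry_ms > now_ms ? head->expiry_ms - now_ms : 0;
}

/* Pop every timer due at or before now_ms, writing up to max ids to out.
 * Timers that expire at the same time all fire in this one call.
 * Returns the number of timers fired. */
int chain_fire_due(timer_node **head, int64_t now_ms, int *out, int max)
{
    assert(head != NULL && out != NULL);
    int fired = 0;
    while (*head != NULL && (*head)->expiry_ms <= now_ms && fired < max) {
        timer_node *n = *head;
        *head = n->next;
        out[fired++] = n->id;
        free(n);
    }
    return fired;
}
```

With a 5-second timer already in the chain and a 2-second timer added at time zero, chain_next_delay() reports 2000 ms; after the 2-second timer fires it reports the remaining 3000 ms. Firing in a loop on "expiry <= now" is what takes care of timers that expire at the same time.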

One interesting thing I found on multiprocessor systems with pthread_cond_timedwait() is that you can have spurious signals and timeouts (actually, one doc said you can get only spurious signals, not spurious timeouts, but I assumed that either could be spurious). Therefore, you can't assume that everything's cool when the call unblocks.
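
A hedged sketch of the defensive pattern: ignore *why* the call unblocked and re-check both the request flag and the clock yourself. The function and enum names are hypothetical, and I'm assuming the condition variable was created with default attributes, so its timeout is measured against CLOCK_REALTIME.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical result codes for this sketch. */
typedef enum { WAKE_REQUEST, WAKE_DEADLINE, WAKE_ERROR } wake_reason;

/* Returns nonzero once CLOCK_REALTIME has reached the deadline. */
static int deadline_passed(const struct timespec *deadline)
{
    struct timespec now;
    clock_gettime(CLOCK_REALTIME, &now);
    if (now.tv_sec != deadline->tv_sec)
        return now.tv_sec > deadline->tv_sec;
    return now.tv_nsec >= deadline->tv_nsec;
}

/* Block until *pending is nonzero or the absolute deadline has passed.
 * The caller must hold *mu, and *pending must only be changed under *mu. */
wake_reason wait_for_request_or_deadline(pthread_mutex_t *mu,
                                         pthread_cond_t *cv,
                                         const int *pending,
                                         const struct timespec *deadline)
{
    assert(mu != NULL && cv != NULL && pending != NULL && deadline != NULL);
    for (;;) {
        /* Never trust the wakeup itself: verify both conditions directly. */
        if (*pending)
            return WAKE_REQUEST;
        if (deadline_passed(deadline))
            return WAKE_DEADLINE;
        int rc = pthread_cond_timedwait(cv, mu, deadline);
        if (rc != 0 && rc != ETIMEDOUT)
            return WAKE_ERROR;      /* e.g. EINVAL for a malformed timespec */
    }
}
```

A spurious wakeup just costs one extra trip around the loop. A real request that lands at the same moment as the deadline is reported as a request, and the caller can then check the timer chain for anything that is already due.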

Lastly, timers are not exact. You can have a signal that arrives JUST before a timeout, but because it's possible that slightly more time elapsed than the timer was set for, you also have a timeout that you have to simulate. Fun.

Sunday, May 27, 2007

Timers and deep subscriptions

OK, the next Big Project is to add timers to the DB. Why do I need this? Well, for starters, the order placement strategy needs them. I need to be able to create an order that will change its characteristics (limit price) or cancel itself after a certain amount of time.

The trick about timers is that there may be more than one timer per stock. I won't get into the details as this is part of my strategy, but it creates an interesting architectural problem. To see why, let's review how my tables and subscriptions work.

A table has 2 descriptive characteristics:
TableNum: This is a unique integer that is the "name" of each table
NumFields: Each table has a nonzero integer number of fields.

In SQL databases, you would also have a type for each field (e.g., string, float, etc) but my DB only supports "double" as the type for a field.

In DB, when you read or subscribe to a table, you get ALL rows (i.e., there is no filtering like with the WHERE clause in SQL). This has worked well so far, because each stock has its own tables for bids, asks, and trades. This means that for 100 stocks, I have 300 tables for quotes (each stock has 3 tables: bids, asks, trades).

This makes things easy from a programming point of view, but to add more stocks to the system, I have to change the DB schema, which is a pain. The right way to do things is to create what I will call "deep subscriptions". This means that every subscription will have a MASK that specifies which rows are interesting to the caller.

For example, suppose that I have a "trades" table that has 3 fields:
Time
Price
Size

Remember, the number of the table "encodes" which stock I'm looking at.

OK, so if I create a mask that looks like this:

*, *, 10000

then I am interested in all trades that have an exact size of 10000 shares. The asterisk is a wildcard and allows all values to pass through.

Well, being that I have limited the DB to only doubles, how can I specify a wildcard and yet not use up any values that I might want to filter on? Clearly I cannot simply use 0 or -1, as these are values that I might want to filter on. Well, there's a simple answer: NaN (for the uninitiated, a NaN is a special value in floating-point land that indicates that the number is Not-A-Number. This is different from infinity. You would get NaN if you take the square root of a negative number. You would get inf when you divide a nonzero real number by zero). For information on this somewhat arcane subject, here ya go:

http://en.wikipedia.org/wiki/NaN

In the standard math.h file, there is a function called "nan()" that provides me with a "quiet-nan". By "quiet", it means that the FPU hardware will not generate an exception when this number is operated upon (e.g., a compare operation). There is also a macro "isnan()" that I can use to see if a number is a NaN.
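
To make the mask idea concrete, here is a minimal sketch of the row test. It is an illustration, not the DB's real code: mask_matches() and MASK_ANY are hypothetical names, and I use the C99 NAN macro as the wildcard (nan("") from math.h yields a quiet NaN as well).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical wildcard value for a mask field: a quiet NaN. */
#define MASK_ANY NAN

/* Returns 1 if the row passes the mask, 0 otherwise.
 * A NaN mask field matches any value; every other mask field
 * must compare exactly equal to the corresponding row field. */
int mask_matches(const double *mask, const double *row, size_t nfields)
{
    assert(mask != NULL && row != NULL);
    for (size_t i = 0; i < nfields; i++) {
        /* NaN never compares equal to anything, itself included,
         * so the wildcard has to be detected with isnan(), not ==. */
        if (isnan(mask[i]))
            continue;
        if (mask[i] != row[i])
            return 0;
    }
    return 1;
}
```

For the trades table (Time, Price, Size), the mask { MASK_ANY, MASK_ANY, 10000.0 } passes every 10000-share print and rejects the rest. Exact equality on doubles is fine for share sizes, but it would be fragile for computed prices.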

Now, by simply altering the TableSubscribe() function to support a mask, I have "deep subscriptions" that are effectively like a WHERE clause in SQL. Note that I am limited in my comparisons.... In SQL you can say:

SELECT * FROM table1 WHERE field1 > 5

Here, the "greater-than" operator is used. I can only support the equality operator, like this:

SELECT * FROM table1 WHERE field1 = 5

But that's fine for now..... Baby steps.

So getting back to timers, I need to create a TIMER_REQUESTS table and a TIMER_ACTIVITY table which allow me to implement timers in tables yet selectively subscribe to only the timers that I care about.

With the subscription masks, this is now a piece of cake. The deep subscription feature also lets me compress my DB table structure to something much more maintainable and scalable.

I'll leave it to the reader to figure out exactly how I plan to implement timers. The trick here is that they need to provide the correct firings regardless of whether I'm running in realtime or when backtesting. I'll give you a hint: Check out the dynticks feature in the new Linux kernel. Same idea, sort of.