I thought I'd revisit this topic now that I've taken some time to examine the different offerings, and lay out my thoughts and decision process in case anyone else comes along in the future with similar questions. I want to give back and all.

In the end my choice came down to Tradestation and Interactive Brokers.

Tradestation's historical data offering is exceptional for the price. I'm in the market for a broker as well as data, so I have no problem signing up with them as my brokerage. The issue comes down to software/API. While I'm given to understand that Tradestation (the .exe) is solid software for trading, there is no external API. To script it for automated trading, backtesting strategies, etc., you have to use their internal language, EasyLanguage. That's simply a non-starter for me. I'm a software engineer, but I would rather spend my time writing and testing models, not reimplementing basic machine learning algorithms in a proprietary scripting language that likely isn't even using optimized libs like MKL/ACML under the hood. There is an HTTP API for trading, but using it to gather historical data is explicitly not supported. I don't want to copy/paste historical data whenever I need it; I want a real API. If someone from Tradestation ever reads this, please open up your software with an API in a real programming language, like Python or anything .NET!

IB's historical data offering is total crap compared to Tradestation's (I would only be able to go back up to a year in history), with awful throttling limits. However, they have an API that supports actually using historical data. Their platform is very well understood too, so I can ignore their own software and use stuff like MultiCharts and RightEdge with it. Also, their commissions are a lot better. Except for that $1.50 charge to cancel a limit order!

Ultimately I think that IB gives me, or anyone in the same situation, more room to grow.
If I start working with algorithmic trading and come to depend heavily on data going back farther than a year, so that I can backtest properly across different market conditions, I can (I think) stop buying IB's data stream and purchase an offering like ActiveTick or IQfeed, while still trading through IB's brokerage and API.

Other places I considered were Lightspeed and Cobra Trading. For both of those I'd need to use RealTick, which has a base cost of $250/mo (waived at Cobra Trading at a volume that I wouldn't come close to right now), plus an extra $100/mo for API access. Those prices are all before the cost of any data feeds. They start big, rather than starting small and giving me room to grow.

So in summary: it looks like IB is the best offering, unless you don't care about API access or programming your own algorithms, in which case Tradestation would probably work well for you.
I'm getting much more than a year's worth of data with IB by making multiple requests for different periods, but it might depend on how much you're paying in commissions.
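For anyone wondering what "multiple requests for different periods" might look like in practice, here is a rough Python sketch of the general idea: split a long date range into smaller windows and fetch them one at a time, pausing between requests to stay under the throttling limits. Everything named here is an assumption for illustration. In particular, `fetch_window` is a hypothetical callable standing in for whatever historical data call your client library exposes (it is not part of IB's API), and the window size and pause length are placeholders, since the actual limits aren't stated in this thread.

```python
from datetime import datetime, timedelta
import time


def split_into_windows(start, end, window):
    """Yield (window_start, window_end) pairs covering [start, end), oldest first."""
    if window <= timedelta(0):
        raise ValueError("window must be a positive duration")
    cursor = start
    while cursor < end:
        window_end = min(end, cursor + window)
        yield cursor, window_end
        cursor = window_end


def fetch_history(start, end, window, fetch_window, pause_seconds=0.0):
    """Fetch a long range as a series of smaller requests.

    fetch_window is a hypothetical callable (window_start, window_end) -> list
    of bars; swap in your own client call. pause_seconds is a placeholder for
    whatever delay keeps you within your provider's pacing rules.
    """
    bars = []
    for window_start, window_end in split_into_windows(start, end, window):
        bars.extend(fetch_window(window_start, window_end))
        time.sleep(pause_seconds)
    return bars


# Stand-in for a real data request: it just echoes the window it was asked for.
def fake_fetch(window_start, window_end):
    return [(window_start, window_end)]


history = fetch_history(
    datetime(2020, 1, 1),
    datetime(2020, 1, 10),
    timedelta(days=4),
    fake_fetch,
)
```

With a real client you'd replace `fake_fetch` and pick the window and pause to match whatever limits apply to your account. Whether this gets you past a year of history may still depend on your account, as noted above.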