Home > Tools of the Trade > Trading Software > VBA question

VBA question

  1. Hi guys,
    I'm writing a macro that would look up firms in a database by ticker-date or name-date combination and then spit out the stuff I need. The ticker-date part was easy. With the names being written in a myriad of possible ways, what do you recommend is the best way to program the look up to minimize the error rate? It seems like something that there should be plenty of ready code for on the net, but can't seem to find anything.

    Thanks a bunch in advance.

  2. Depends on how you've got the source data recorded, flexibility you have in pre-processing the source data before entry into the repository, size of repository you're scanning (if it's large - an interpreted environment like VBA isn't your best tool), and frequency of searches, expected size of result set, etc.

    For example in the search mechanism, there are a variety of techniques you could consider depending on the specific situation like Soundex-based filtering, partial key search, fuzzy recognition, etc.

  3. Hi ArchAngel,
    Thanks for getting back to me so fast. The data files aren't very large. The file I'm looking for the info in is about 45000 rows and about 10 columns. I could pre-process it, but I'd rather have it in the original format, b/c it'd likely be updated/revised periodically with info in similar layout with maybe different firms.
    Basically, several times a month, a user would provide a column of firm names and dates (not more than a thousand usually). The program needs to locate either exact matches or closest matches in the bigger spreadsheet and return a bunch of values. It seems like something that could be done with vlookup, but apart from precise matches, it does very poorely.
    Say I specify "DELL" as the user, but the larger file has "DELL INC" or it has "D E L L" etc, I'd like to have it flexible enough to find such occurences.
    The techniques you mentioned sound interesting. I'd appreciate it if you could let me know which one(s) might be best suited so I could follow up on them (hopefully something a VBA dabbler like me can program relatively easy).
  4. In excel you can simply write a vba function to search using the various rules you mention - pattern, soundex, character match etc. Just open up the visual basic editor and use the built-in help. There are also examples at Microsoft.com

    I would start thinking about moving this into a database: updating the database with the fresh data and using the more powerful built in functions that come along with modern database systems. Even though its not a lot of data, you are probably losing a lot of time with the inflexibility of being bound to excel for this work. If the data and its integrity is important to your operation all the more reason to start down this path.
  5. If you're saving the data as a spreadsheet, you've got no indexing - so you'd be doing a lateral scan of all 45K+ rows in the spreadsheet which could take a while.

    Probably not worth creating a computed column in the spreadsheet unless you're going to run a lot of searches in a row while you have the sheet loaded. You can build a VB function that Soundexs the company name and then search the Soundex values comparing against the Soundex of the search company name. That'll give you a subset of potential hits that you could do a little more of a search on (e.g., convert the company names to all caps and remove all spaces and then do an Instr for the search company name).

    Better approach would be to process the data into a small Access database (process any new data into the database as you get it). As you process the data into the DB you could compute Soundex and compressed/cap fields too - with the Soundex field indexed. You could then easily and quickly (with very little overhead) initiate a search along the Soundex index and then just check the returned subset of rows.
  7. Thanks for the tips guys!
  8. A match metric as you describe is OK, although probably not even as good a filter as a Soundex.

    Example: While you'd match DELL to "D E L L Inc.", you'd also match it to suff like:

    Hugo DarnELL, Inc.
    stuDio EL Loco
    aDriana Enterprises LLc

    Good luck.
  9. Hi Vald

    I had the same problem a while ago. I found the solution in the form of a VBA macro that does a fuzzy match between 2 strings. The code is listed on the mrexcel.com site, as part of the monthly challenge. Have a look at http://www.mrexcel.com/pc07.shtml and ammend the code to what you need.


  10. Hi All,

    Archangel and CalTrader did a great job in their posts. As the thread keeps going, I thought to add or confirm something that he already said.

    I have seen so many cases over the last 10 years of overuse of Excel. Excel of course is a terrific tool. People get started with it and pick up skills. If you get more ambitious, many people can't get rid of it anymore. Like Archangel said, they keep torturing themselves to get things done that would be child's play in other environments.

    As mentioned, Access is a great tool. You have to learn a bit to get started, like you had to with Excel, but a whole new world will open for you. If you really needed Access, you will very quickly leave Excel for what it's good at.

    One more thing. I myself thought VBA only really interesting for Word. It enables you to do many neat things in text processing. For Excel, I always had my doubts. If you need VBA there, it's time to look further.

    Be good,

  11. Well, I beg to differ. I have found VBA and Excel the single most important drivers in my trading profitability. They are excellent tools to develop strategies and evaluate data. Access is not really suited to develop customized trading strategies (incl order management) and Tradestation, Wealthlab simply are a joke to what you can achieve writing your own code (using VBA, VB or whatever else you chose).


  12. You are quite right. But relative to the length of the string, the match would be closer for D E L L inc. Of course, there might be some other name that matches better, but D E L L Inc would likely be one of the top three hits. ... I hope :)
    Thanks a lot for your insights!

  13. I partially agree with you that it's not the best tool for many tasks. I also agree with chs245 that you are not giving it enough credit. Under the present circumstances, I'm writing a macro which would be simple enough for junior staff to use - hence, simper platform is better :D
  14. Thanks OLiver! Will check it out in a sec. Sounds promising - I may end up using a combination of the techniques!
  15. I don't really want to argue with you about this. You are certainly right to love the tool that helped you grow rich. However, name me one thing that your VBA/Excel combination can do better than the Acces programming environment (Is also VBA in fact).

    You certainly cannot have been very seriously engaged in "developing strategies" based on "evaluating data". The torturing that Excel virtuosi impose on themselves by ignoring and refusing to learn about elementary database support is deplorable.

    I will not convince you with this, however I felt obliged to correct your view which goes against reason. I simply tried my best to help.

    May you continue on your beloved path of masochism. :D
  16. ...with pleasure :p

    I do not know Access well enough, but since it supports VBA, it must be able to do similar things as Excel.
    For what I'm doing, Excel is the best solution in terms of development time versus trading reward. In the end, I'm a trader not a hard core programmer, although I do enjoy these challenges such as the one described by vlad.

  17. Both sides in the debate are correct: A lot can be done - for not a lot of upfront money - using only word, excel and access.

    However things like SQL server or oracle are more specialized and tailored for manipulating data and multidimensional data products like the MS OLAP services offer even more specialized services for handling data.

    When to switch ? When you can calculate - with reasonable certainty - that you will incur a cost savings by doing so. If you cant calculate a cost savings in advance and your business is not paralized or put in jepardy of failure by the lack of an immediate solution then just stick with the solution that works or can be made to work well enough....
  18. Hi CalTrader,

    I believe you raised the ante a bit, but that makes it more interesting. I thought that up till now we were talking about Excel/VBA versus a simple db like Access.

    However, you are right that when you grow, you are likely to outgrow Access sometime. I run SQLserver for years now. I'm still using Access often because it meshes very nicely with SQLserver but it's amazing how easily I keep bumping in the 2Gb db size limit with Access.

    FWIW, in my struggle for "freedom" you may have read about, I first set up a migration plan from SQLserver to PostgreSQL. As of late l am on MySQL which runs on both M$ and Linux. With the latest development announcments for MySQL and given its popularity this looks like the best for me. My "bread and butter" stuff still goes on SQLserver though. I'd love to hear about any comments.

    Right now for a starter, skipping Access and moving straight to MySQL becomes more attractive as they seem to have much better support/management tools.

    Be good,

  19. OLiver,

    I checked out that link, very interesting - where were you yesterday! Kidding :) It turns out the macro I have now outputs identical results - it does exactly what that challenge was asking for. Whether it's as efficient as the one they have - I don't know yet, I'll check. Mine seems a bit longer code wise, I think...
    Thank a lot though!
  20. A symbol lookup database isn't going to grow significantly, so Access or MySQL is probably fine. Personally, I'd use SQL Server or MSDE, but I already know those tools. It's up to you if you want to learn but Access/MySQL is easier and cheaper and you can always upgrade later.

    Rather than program a bunch of complex searching methods based on messy data, just clean up the data. Dell vs. D E L L?? Fix the data as it's imported. Makes searching much easier and faster and simplifies your code.
  21. Not sure about that. If this was a one time deal, yes, you are undeniably right. But this is something that will be done over and over. The way I have it now is rather flexible, it will find the best three matches and if what you were looking for was there, odds are very high it'll be one of them. the number of various way you would have to "clean" the data is uncountable and doing it every time would be time consuming. Also, having this program means you can be less precise entering the name - a couple of typos won't matter usually.
  22. Well, I was thinking a program would be performing the import and clean it every time data was updated or added. The program would be updated with specific rules as time went on. Then add soundex for fuzzier searches. Clean data is critical as you grow into unanticipated needs.

    That's a programmer's perspective though. Your approach will work, but I wouldn't hire you as my DBA. :D
  23. nononsense, you couldn't be more mistaken about Excel and VBA if you tried. Go to Borders and read a few of the books on it.
  24. Actually he's not mistaken (and no I don't need to go to Borders to get a book - I've been working with all the aforementioned software for a VERY long time).

    I believe his point was that while you can make Excel do a lot of things, it's fundamentally a spreadsheet - not a database, not a programming environment, etc.

    Can you drive nails with a pair of pliers? Sure. You could put in screws with a hammer too. It's not nearly as efficient or easy, and the results wouldn't be as good as using a more appropriate tool, but you COULD do it.

    If you were so inclined, you could VBA program Word to do this company name lookup exercise too. You could even Javascript or VBscript it for local execution in a browser.

    The overall point wasn't whether you CAN make Excel do a whole spectrum of non-spreadsheety things - it was whether it made sense to always try to force every problem into an Excel-based solution.

    For one off hacks - who really cares. But for something that'd be used repeatedly over a long period - at least a little extra consideration is due.

    Many things that I've seen "Excel gurus" hack around and kludge Excel into doing (for a recurring application) are in fact quite trivial to have done in Access or straight VB. I've certainly used Excel to do a bunch of non-spreadsheet things for quicky one-time needs, but it's not the first or only tool that comes to mind to solve every problem - although admittedly I have more than one tool in my toolbox which isn't true of everyone.

    And if a person's not (or is only marginally) familiar with an alternate tool (e.g., Access), they really don't have a clue what can actually be done quite easily with it.

    To the guy who only knows how to use a hammer, every problem looks like a nail.
  25. That's the best way to put it! :D
  26. Well, if you consider using IB/Excel for arb monitoring, ES fair value calculation, audio - visual pace feedback, and assorted other goodies a 'one off hack' then I fully agree.

    Who am I to argue with experience. My 25 years programming is nothing, so I wouldn't pretend to butt up against my betters :p
  27. I agree with everyone on this debate.

    Excel is the most over used application in the financial world. It takes a few turns for people with an IQ over a hundred to get the hang and start running with it. Visual Basic for Applications requires a little bit more patience and thus weeds out a fair share of those who tinker with Excel. A precious few people have actually delved into the world of VBA/Excel as far as I have, so I feel the right, and the need to step up and make a few points here.

    First of all, Excel/VBA and Access make an excellent combination toolset for the serious VBA coder. VBA is not an object oriented program that you would use for enterprise systems, but it sure as hell can get you very far and is quite power and succinct. The multitude of object libraries which Microsoft allows VBA to run really bring Excel into the Big league through its back end. I have created Applications which have seamlessly interfaced with Access, Oracle, the internet, other Office products, XML, etc.

    What I have seen is a lot of prejudice from the OOP world against Excel, usually because of the feeling that it threatens there livelihood more so than anything else. Typically ignorance runs wild in these debates - the pros feel they know it all but they really never have looked at Excel for what it really can do, and that is give the user an environment where he/she feels they have control and familiarity but in the right hands can also really rock in the background.

    My $.02 - and that folks, is a bargain compared to what companies pay to have Excel on the desktops of the analysts who work for them.


  28. Hi dbTrader,

    I'm all with you. If I may say, I don't think Excel / VBA needed any defense, it hadn't been attacked. I have used VB a lot and still do (as I didn't liberate myself 100% from M$'s stranglehold :D ).

    I think the point that was not well understood by all is that if you become serious about "backtesting" and so, out of necessity you will be dealing with a lot of data. This is where "torturing yourself" has been correctly used by some, describing the fate of the ignorant-about-db guys.

    You dbTrader don't belong in this category as you state: "Excel/VBA and Access make an excellent combination toolset for the serious VBA coder". A couple of people didn't think a db package like Access to be of great use.

    Of course you could become a lot more ambitious, but this was not the point in this thread.

    Be good,

  29. I'm always using Excel and VBA to first prototype, then I asked a programer to port to a development platform, but sometimes it's too costfull so it stays in Excel and VBA until I'm fed up :) (at the moment I have still a gigantic excel spreadsheet but have decided to eliminate it for it consumes much memory, takes a lot of time for calculation and is difficult to maintain). But without Excel it would be more costfull and difficult.

  30. Maybe you database folks have a point. I have found Excel to be a fantastic front end for my trading but I have indeed experienced problems using Excel as a database. Usually because i have more than 64000 records in my spreadsheet.

    I have not been willing to invest time to learn SQL. I guess this thread has given me the impulse to start getting familiar with the possibilities that a SQL database gives you.

    I would like to ask all SQL-experts to share some of their resources:
    - what is the best tutorial site to learn SQL on the net ?
    - what kind of software does one need to get started ?

    Thanks in advance,

  31. Hi Oliver,

    Of course, learning something new is always a bit difficult. If you're handling lots of data, you'll never regret looking at db.

    A first question is where do you want to end up? I myself, a long time user (and prisoner) of M$ am now very seriously looking at MySQL as replacement for my SQLserver. MySQL is also a server/client db, runs on Windoz and on Linux and is free. It has nice books (O'Reilley).

    If you're very much M$ based, Excel/VBA, Access is perhaps the best way to go. Lots of books, M$ tutorials, etc. Access has a VBA tightly integrated with it. Access is not client/server which is probably not required if you're in standalone and a newbie. Another feature of Access is that it gives you an optional MSDE database which is in fact identical with SQLServer. This makes later upwards migration easier. Of course Access supports SQL.

    The main thing to get started is to install it. Once you start going with some of the examples everything should fall in place very quickly. Studying without hands on experience is almost futile I believe.

    Be good,

  32. I do not mind ending up in the M$ camp. I have heard a lot of good things about VB.Net and in that respect, i probably would like to go with a SQL db from MS. Problem is that they offer 7 edition of SQL server! Which one to choose ?

    If i take the plunge, I want to end up with a long term solution - seems like a true SQL db is the way to go.

    By the way, are there any sample databases out there which one could study in order to understand architecture and design ?

    Thanks for any advice,

  33. Oliver - take a look at MS Access - it comes with sample DBs and tutorials and you'll find a load of books at Borders and local libraries on it
  34. Excel is not meant to be a database. Period.

    At some point - you need to decide for your situation - Excel will be a money losing solution for handling data. The point at which it goes negative depends upon the quantity of data and the demands of the calcualtions you are basing within the product and the attributes of the data - is it critical data that would bring down your operation were it not available ?

    Other solutions are SQl server which is really the best solution if you are willing to be locked into MS software - typically the most expensive solution, although less than Oracle. The product help file is usually enough to get most people up and running. If you dont know the SQL language and T-SQL (Microsofts version with extensions) then you will need to learn it: it is not difficult.

    MySQL is the most cost effective solution if your demands are modest but exceed excel, and particularly if you wish the freedom to not be bound to a microsoft only solution: My recommendation to most midsize clients these days is to not get locked into MS unless there begins to be a more cost effective licensing agenda put forth by MS and even then based upon past behaviour it is best to hedge your costs and have multiple choices of vendor.

    Acccess is a great alternative to excel but it does not provide the level of programming flexibility on the database side nor does it have all the tools that SQL server has like backup/restore, DTS, and other useful services. It is not the choice for critical time-sensitive data repositories in most cases.

    To learn SQL I would recommend a beginner get the books by Joe Celko - We still use his as part of our training for junior DBA's.
  35. Of course Cal. A pity that for many we may have to tell them later: "I told you so!" :D