VBA question

Discussion in 'Trading Software' started by vladiator, Feb 26, 2004.

  1. Hi guys,
    I'm writing a macro that would look up firms in a database by ticker-date or name-date combination and then spit out the stuff I need. The ticker-date part was easy. With the names being written in a myriad of possible ways, what do you recommend is the best way to program the look up to minimize the error rate? It seems like something that there should be plenty of ready code for on the net, but can't seem to find anything.

    Thanks a bunch in advance.

    V.
     
  2. Depends on how you've got the source data recorded, flexibility you have in pre-processing the source data before entry into the repository, size of repository you're scanning (if it's large - an interpreted environment like VBA isn't your best tool), and frequency of searches, expected size of result set, etc.

    For example in the search mechanism, there are a variety of techniques you could consider depending on the specific situation like Soundex-based filtering, partial key search, fuzzy recognition, etc.
     

  3. Hi ArchAngel,
    Thanks for getting back to me so fast. The data files aren't very large. The file I'm looking for the info in is about 45000 rows and about 10 columns. I could pre-process it, but I'd rather have it in the original format, b/c it'd likely be updated/revised periodically with info in similar layout with maybe different firms.
    Basically, several times a month, a user would provide a column of firm names and dates (not more than a thousand usually). The program needs to locate either exact matches or closest matches in the bigger spreadsheet and return a bunch of values. It seems like something that could be done with vlookup, but apart from precise matches, it does very poorely.
    Say I specify "DELL" as the user, but the larger file has "DELL INC" or it has "D E L L" etc, I'd like to have it flexible enough to find such occurences.
    The techniques you mentioned sound interesting. I'd appreciate it if you could let me know which one(s) might be best suited so I could follow up on them (hopefully something a VBA dabbler like me can program relatively easy).
    Thanks!!!
    V.
     
  4. CalTrader

    CalTrader Guest

    In excel you can simply write a vba function to search using the various rules you mention - pattern, soundex, character match etc. Just open up the visual basic editor and use the built-in help. There are also examples at Microsoft.com

    I would start thinking about moving this into a database: updating the database with the fresh data and using the more powerful built in functions that come along with modern database systems. Even though its not a lot of data, you are probably losing a lot of time with the inflexibility of being bound to excel for this work. If the data and its integrity is important to your operation all the more reason to start down this path.
     
  5. If you're saving the data as a spreadsheet, you've got no indexing - so you'd be doing a lateral scan of all 45K+ rows in the spreadsheet which could take a while.

    Probably not worth creating a computed column in the spreadsheet unless you're going to run a lot of searches in a row while you have the sheet loaded. You can build a VB function that Soundexs the company name and then search the Soundex values comparing against the Soundex of the search company name. That'll give you a subset of potential hits that you could do a little more of a search on (e.g., convert the company names to all caps and remove all spaces and then do an Instr for the search company name).

    Better approach would be to process the data into a small Access database (process any new data into the database as you get it). As you process the data into the DB you could compute Soundex and compressed/cap fields too - with the Soundex field indexed. You could then easily and quickly (with very little overhead) initiate a search along the Soundex index and then just check the returned subset of rows.
     
  6.  
  7. Thanks for the tips guys!
     
  8. A match metric as you describe is OK, although probably not even as good a filter as a Soundex.

    Example: While you'd match DELL to "D E L L Inc.", you'd also match it to suff like:

    Hugo DarnELL, Inc.
    stuDio EL Loco
    aDriana Enterprises LLc
    etc.

    Good luck.
     
    #10     Feb 27, 2004