Python - Read and split lines from text file into indexes.

Discussion in 'App Development' started by OTM-Options, Apr 28, 2015.

  1. Python - Read and split lines from text file into indexes.



    I have a text file with hundreds of lines and 10 columns of data separated by commas. I want to split each line at the commas into 10 fields and access each field by its index. The code below only works for the first index, items[0], which prints the first column of every row. If I change it to items[1], it crashes.

    Can anyone come up with a solution? Thanks. :)


    Code:
    with open(csvIn, "r") as ins:
       for line in ins:
         items = line.split(",")
    
         print items[0]
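
One possible cause of the items[1] crash, assuming the file contains a blank or comma-free line (the thread does not confirm this): splitting such a line gives a one-element list, so index 1 raises IndexError. A minimal Python 3 sketch with made-up sample lines:

```python
# Hypothetical sample: two data lines followed by a blank line.
sample_lines = ["a,b,c\n", "d,e,f\n", "\n"]

second_column = []
for line in sample_lines:
    items = line.rstrip("\n").split(",")
    # "".split(",") returns [''], so items[1] would raise IndexError here.
    if len(items) < 2:
        continue  # skip blank or malformed lines
    second_column.append(items[1])

print(second_column)
```

Checking len(items) before indexing keeps one short line from stopping the whole loop.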
     
  2. i960

  3. This is nicer, but probably still won't work:

    Code:
    # Read every line into a list first, then split each one.
    with open(csvIn, "r") as ins:
        lines_list = ins.readlines()
    for line in lines_list:
        items = line.split(",")
    
    As another poster suggested, you can use the csv module:

    Code:
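    import csv
    # 'rb' is the Python 2 idiom; on Python 3 use open(csvIn, newline='') instead.
    with open(csvIn, 'rb') as f:
        reader = csv.reader(f)
        items = list(reader)  # one inner list per row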
    
    
    Which will give you a nested list.
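
To illustrate, the nested list can be indexed as rows[row][column]. A small Python 3 sketch, using io.StringIO with made-up data in place of a real file (the names here are illustrative only):

```python
import csv
import io

# Made-up data standing in for the file; not taken from the thread.
fake_file = io.StringIO("date,action,qty\n2014-05-09,Sell,-3\n2014-05-08,Buy,3\n")

rows = list(csv.reader(fake_file))  # nested list: one inner list per line

first_data_row = rows[1]            # rows[0] is the header line
action_column = [row[1] for row in rows[1:]]

print(first_data_row)
print(action_column)
```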

    If the built-in Python csv parser doesn't work, you could also try the pandas (http://pandas.pydata.org/) CSV parser. If you're going to do some data analysis on this data, it might be worth working with pandas DataFrames rather than nested lists.

    If neither works then upload the .csv file you're trying to read.
     
  4. southall

    The libraries are general-purpose routines.
    You can often optimize for your specific field requirements and get roughly a 2x improvement in performance. OK, you won't get 10x faster, but 2x faster can still be worth a few hours of coding if you are processing millions of lines of data.
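
Whether hand-rolled splitting actually beats the csv module depends on the data and the Python version, and the 2x figure is not measured here. A rough Python 3 sketch for timing both on synthetic rows (the function names and the sample line are assumptions for illustration):

```python
import csv
import timeit

# Synthetic input: 10,000 simple lines with no quotes or embedded commas.
lines = ["2014-05-09,Sell,-3,0.16,USD"] * 10000

def split_rows(data):
    # Plain str.split: only safe when fields never contain commas or quotes.
    return [line.split(",") for line in data]

def csv_rows(data):
    # csv.reader accepts any iterable of strings, not just file objects.
    return list(csv.reader(data))

split_time = timeit.timeit(lambda: split_rows(lines), number=20)
csv_time = timeit.timeit(lambda: csv_rows(lines), number=20)
print("split: %.3fs  csv: %.3fs" % (split_time, csv_time))
```

Timing on a real file before hand-optimizing shows whether parsing is a bottleneck at all.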
     
  5. i960

    He doesn't even have code that works yet.

    Make it work -> Make it right -> THEN make it fast.

    Additionally, I highly doubt CSV processing is going to be a major bottleneck here, and the second the data involves anything non-trivial such as embedded commas or quotes, a hand-rolled parser is going to go off the rails.

    This is standard "don't reinvent this shit 101" type stuff.
     
  6. OK, thanks for the replies. I'm new to Python and have made some progress on this project. Coming from PHP, the Python csv module seems unnecessary. With PHP you can easily read a file line by line, split each line at any character into an array, manipulate that array, and write it back to a file. No special module required.

    Another headache is Python math: PHP automatically converts numeric strings to floats and integers, while Python appears not to.



    :)
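
On the math point: fields read from a text file are always strings in Python, so the conversion has to be written out with float() or int(). A minimal Python 3 sketch; the to_float helper and its default are hypothetical, added here only to handle empty fields:

```python
# Fields from split() or csv.reader are always strings.
price_text = "0.16"
quantity_text = "-3"

price = float(price_text)      # explicit conversion to float
quantity = int(quantity_text)  # explicit conversion to int

def to_float(text, default=0.0):
    # Hypothetical helper: empty or non-numeric fields become a default value.
    try:
        return float(text)
    except ValueError:
        return default

product = price * quantity
print(product)
print(to_float(""))
```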
     
  7. i960

    You're approaching the problem wrong. You're thinking of the CSV format literally when you should be thinking like a library user and using a pre-provided API so you don't waste time on these micro-details *and* can stand on the shoulders of the library writers who've already done all the hard work of the details for you. You're also completely ignoring the possibility of embedded/escaped commas which I highly doubt you're ready to write a grammar for.
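
To make the embedded-comma point concrete, here is a short Python 3 sketch comparing str.split with csv.reader on a made-up line whose second field is quoted because it contains a comma (the line is an assumption, not data from the thread):

```python
import csv

# Made-up line: the second field contains a comma and is therefore quoted.
line = '2014-05-09,"KEURIG GREEN MOUNTAIN, INC",Sell,-1'

naive = line.split(",")
parsed = next(csv.reader([line]))

print(len(naive))   # 5: the quoted field was cut in two
print(len(parsed))  # 4: the quoted field stays intact
```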
     
  8. 2rosy

    This is highly advanced, so you might need to study the theory behind what's going on here.

    Download Python from https://store.continuum.io/cshop/anaconda/
    and use the pandas library:

    Code:
    import pandas

    # .values.tolist() gives plain nested lists, one inner list per row.
    listOflists = pandas.read_csv("FILE.csv").values.tolist()
    
     

  9. I think of a CSV file as:

    • A limited number of "columns" separated by a character, in this case a comma.
    • An unlimited number of "rows", each ending at the newline character.
    • Together these give a spreadsheet-like format with columns and rows.
    • You should not need a special CSV module to manipulate data in a CSV file that uses commas as the separator.




    Any embedded/escaped commas do not concern me since I will not be working with a file like that.




    :)
     
  10. Thanks for all the replies. I have included the actual CSV data I am using to give you a better idea how it is formatted. My objective is:

    • Read the file line by line.
    • Break each line at the commas into its own columns or indexes: index[0], index[1], index[2], ... index[9].
    • Perform basic math on some of the columns and put the results into a new column.
    • Only use rows that match certain criteria, such as index[2] being Buy, Sell, or Expired.
    • Output only the columns I want to a new CSV file, also comma separated.

    I have completed all of the above for a single line hard-coded into my script. Now I'm working on reading an entire file, line by line, and then writing to a new file.



    CSV FILE
    Code:
    Transaction Type=All,Product Type=All,Symbol=,From=2014-05-09,To=2014-12-31
    Transaction Date,Settlement Date,Activity Description,Description,Symbol,Quantity,Price,Currency,Total Amount,Currency
    ----------------,---------------,--------------------,-----------,------,--------,-----,--------,------------,--------
    2014-05-09,2014-05-12,Sell,POWERSHARES QQQ TR SR 1,CALL QQQ 2014MAY09 86.50,-3,0.16,USD,34.29,USD
    2014-05-09,2014-05-12,Sell,APPLE,CALL AAPL 2014MAY09 610.00,-2,0.01,USD,0.00,USD
    2014-05-08,2014-05-09,Buy,POWERSHARES QQQ TR SR 1,CALL QQQ 2014MAY09 86.50,3,0.30,USD,-103.70,USD
    2014-05-08,2014-05-09,Sell,TESLA MOTORS,CALL TSLA 2014MAY09 240.00,-1,0.01,USD,0.00,USD
    2014-05-07,2014-05-08,Buy,TESLA MOTORS,CALL TSLA 2014MAY09 240.00,1,0.67,USD,-78.20,USD
    2014-05-06,2014-05-07,Buy,APPLE,CALL AAPL 2014MAY09 610.00,1,0.98,USD,-159.40,USD
    2014-05-06,2014-05-07,Buy,APPLE,CALL AAPL 2014MAY09 610.00,1,0.39,USD,0.00,USD
    2014-05-02,2014-05-05,Sell,KEURIG GREEN MOUNTAIN COM,PUT GMCR 2014MAY02 92.00,-1,0.37,USD,25.79,USD
    2014-04-30,2014-05-01,Buy,KEURIG GREEN MOUNTAIN COM,PUT GMCR 2014MAY02 92.00,1,0.68,USD,-79.20,USD
    2014-04-29,2014-04-29,Expired,BAIDU SPONSORED ADR REPSTG ORD,CALL BIDU 2014APR25 170.00,1,,,0.00,USD
    2014-04-28,2014-04-29,Sell,APPLE,CALL AAPL 2014MAY02 590.00,-1,5.88,USD,576.78,USD
    2014-04-28,2014-04-28,Expired,BAIDU SPONSORED ADR REPSTG ORD,CALL BIDU 2014APR25 170.00,-1,,,0.00,USD
    2014-04-28,2014-04-28,Expired,SPDR DOW JONES INDL AVERAGE ET,CALL DIA 2014APR25 165.00,-2,,,0.00,USD
    2014-04-28,2014-04-28,Foreign exchange,SELL USD @ 1.0870,,,,,163.05,CAD
    2014-04-28,2014-04-28,Foreign exchange,SELL USD @ 1.0870,,,,,-150.00,USD
    2014-04-25,2014-04-25,Foreign exchange,SELL USD @ 1.0860,,,,,-500.00,USD
    2014-04-25,2014-04-28,Buy,APPLE,CALL AAPL 2014MAY02 590.00,1,0.72,USD,-83.20,USD
    2014-04-25,2014-04-25,Foreign exchange,SELL USD @ 1.0860,,,,,543.00,CAD
    2014-04-25,2014-04-28,Sell,BAIDU SPONSORED ADR REPSTG ORD,CALL BIDU 2014APR25 170.00,-1,0.04,USD,0.00,USD
    2014-04-24,2014-04-25,Buy,SPDR DOW JONES INDL AVERAGE ET,CALL DIA 2014APR25 165.00,2,0.24,USD,-60.45,USD
    2014-04-24,2014-04-25,Sell,APPLE,CALL AAPL 2014APR25 560.00,-1,8.13,USD,801.78,USD
    2014-04-24,2014-04-25,Buy,BAIDU SPONSORED ADR REPSTG ORD,CALL BIDU 2014APR25 170.00,1,1.06,USD,-117.20,USD
    2014-04-22,2014-04-23,Buy,APPLE,CALL AAPL 2014APR25 560.00,1,1.09,USD,-120.20,USD
    


    :)
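
The objectives listed in this post could be sketched roughly as follows in Python 3. This is only one way to do it, under stated assumptions: io.StringIO stands in for the real input and output files, the new column (quantity times price) is a placeholder for whatever math is actually wanted, and empty Price fields are treated as 0.0.

```python
import csv
import io

# A few lines copied from the sample file above; io.StringIO stands in for open().
sample = """Transaction Type=All,Product Type=All,Symbol=,From=2014-05-09,To=2014-12-31
Transaction Date,Settlement Date,Activity Description,Description,Symbol,Quantity,Price,Currency,Total Amount,Currency
----------------,---------------,--------------------,-----------,------,--------,-----,--------,------------,--------
2014-05-09,2014-05-12,Sell,POWERSHARES QQQ TR SR 1,CALL QQQ 2014MAY09 86.50,-3,0.16,USD,34.29,USD
2014-04-29,2014-04-29,Expired,BAIDU SPONSORED ADR REPSTG ORD,CALL BIDU 2014APR25 170.00,1,,,0.00,USD
2014-04-28,2014-04-28,Foreign exchange,SELL USD @ 1.0870,,,,,163.05,CAD
"""

WANTED = {"Buy", "Sell", "Expired"}

def to_float(text):
    # Assumption: empty fields (e.g. Price on Expired rows) count as 0.0.
    return float(text) if text else 0.0

infile = io.StringIO(sample)
outfile = io.StringIO()
reader = csv.reader(infile)
writer = csv.writer(outfile, lineterminator="\n")

for row in reader:
    # The filter line, header and dashes line fail this check and are skipped.
    if len(row) != 10 or row[2] not in WANTED:
        continue
    quantity = to_float(row[5])
    price = to_float(row[6])
    value = quantity * price  # placeholder for the "basic math" step
    # Keep only the columns of interest, plus the new computed column.
    writer.writerow([row[0], row[2], row[4], row[5], row[6], "%.2f" % value])

result = outfile.getvalue()
print(result)
```

With real files, the two StringIO objects would be replaced by open(..., newline='') handles for reading and writing.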
     
    #10     Apr 29, 2015