Data sorting help: Nordic ITCH data

evira · Jan 6, 2013

My data consists of about 20 million rows like this:

T33013
M000
D 431630
X 431629 1000
M003
D 431571
A 431665S 100 67272 1834000
M006
A 431666S 2600 1027 1176000
D 430996

Which program could I use to rearrange it so that it looks like this:

33013000 D 431630
33013000 X 431629 1000
33013003 D 431571
33013003 A 431665S 100 67272 1834000
33013006 A 431666S 2600 1027 1176000
33013006 D 430996

So that every action row (A, B, C, D) would get a column with the preceding T and M numbers, and I could then sort out different rows.

Please help if you can thanks!

Kevin Schmit · Jan 6, 2013

Quote from evira:

Which program could I use to rearrange it so that it looks like this:

33013000 D 431630
33013000 X 431629 1000

Perl, AWK, Python... the list goes on. Which do you prefer? I will write you a script.

evira · Jan 7, 2013

Quote from Kevin Schmit:

Perl, AWK, Python.... the list goes on. Which do you prefer? I will write you a script.

Maybe Python, if it's easier? Can you also make the script so that all the rows starting with S, O, R, H, B and Q are deleted?

Thanks in advance!

Kevin Schmit · Jan 11, 2013

Code:

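# Written for Python 2 (uses the print statement).
import sys

# Row types that are never printed: T and M only carry timestamps,
# and S, O, R, H, B, Q rows are dropped as requested.
skipLn = ['T','M','S','O','R','H','B','Q']
bigTS = milliTS = ''   # stay empty until the first T / M row is seen
fin = open(sys.argv[1], 'r')
for line in fin:
  line = line.rstrip()
  if line[:1] == 'T': bigTS = line[1:]      # value from the latest T row
  if line[:1] == 'M': milliTS = line[1:]    # value from the latest M row
  if line and line[:1] not in skipLn:
    print '%s%s %s' % (bigTS, milliTS, line)
fin.close()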

evira · Jan 11, 2013

Quote from Kevin Schmit:

Code:

import sys
fin = open(sys.argv[1], 'r')
skipLn = ['T','M','S','O','R','H','B','Q']
while 1:
  line = fin.readline()
  if not line: break;
  if line[:1] == 'T': bigTS = line[1:].rstrip()
  if line[:1] == 'M': milliTS = line[1:].rstrip()
  if line[:1] not in skipLn:
    print '%s%s %s' % (bigTS, milliTS, line[:].rstrip())
fin.close()

I'm getting an error. It says "invalid syntax", with the %s' highlighted in red. Can you help?

If my file is called: test.txt how should I open it in Python and run the script?

Thanks!

Kevin Schmit · Jan 11, 2013

Quote from evira:
I'm getting an error. It says "invalid syntax", with the %s' highlighted in red. Can you help?


What version of Python are you running, under what operating system? I tested it on Python 2.6.8 under Cygwin/Win7.
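
If it turns out to be Python 3, the likely cause is the print statement: in Python 3 `print` is a function and needs parentheses, so `print '%s...'` is reported as invalid syntax. Below is a minimal sketch of the same logic that should run on Python 3. The function names `prefix_timestamps` and `main` are only illustrative and are not part of the script above; skipping blank lines is also an assumption about what is wanted.

```python
import sys

# Rows whose first character is in this set are never written out:
# T and M only carry timestamps; S, O, R, H, B, Q are dropped on request.
SKIP = set("TMSORHBQ")


def prefix_timestamps(lines):
    """Yield each kept row prefixed with the latest T and M values."""
    big_ts = milli_ts = ""
    for raw in lines:
        line = raw.rstrip()
        if not line:
            continue  # assumption: blank lines carry no data
        kind = line[0]
        if kind == "T":
            big_ts = line[1:]
        elif kind == "M":
            milli_ts = line[1:]
        if kind not in SKIP:
            yield "%s%s %s" % (big_ts, milli_ts, line)


def main(argv):
    if len(argv) != 2:
        sys.stderr.write("usage: python test.py test.txt\n")
        return 1
    # Reading line by line keeps memory use small even for millions of rows.
    with open(argv[1], "r") as fin:
        for row in prefix_timestamps(fin):
            print(row)
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

To save the result instead of printing it to the screen, redirect the output on the command line, e.g. `python test.py test.txt > out.txt` (the name out.txt is arbitrary).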

Quote from evira:
If my file is called: test.txt how should I open it in Python and run the script?


Copy the script to a file with the extension ".py" e.g. test.py
Then call it from the command line like this:

python test.py test.txt

See the attached gif (python_test_itch_parser.gif) for an example of how to do this.

evira · Jan 13, 2013

Quote from Kevin Schmit:

What version of Python are you running, under what operating system? I tested it on Python 2.6.8 under Cygwin/Win7.

Copy the script to a file with the extension ".py" e.g. test.py
Then call it from the command line like this:

python test.py test.txt

See the attached gif for an example of how to do this.

I'm a total beginner in programming. I just need to rearrange my files into that new order.

I'm running Python 3.3.0 under Win32/Win7. I made test.py from the script using Python.
I don't know how to run it on a file. For example, where should the file be: C:/Python33/test.txt?

I appreciate your help

evira · Jan 13, 2013

Quote from Kevin Schmit:

What version of Python are you running, under what operating system? I tested it on Python 2.6.8 under Cygwin/Win7.

Copy the script to a file with the extension ".py" e.g. test.py
Then call it from the command line like this:

python test.py test.txt

See the attached gif for an example of how to do this.

I got your script working! Thanks a lot!

evira · Jan 17, 2013

Quote from Kevin Schmit:

What version of Python are you running, under what operating system? I tested it on Python 2.6.8 under Cygwin/Win7.

Copy the script to a file with the extension ".py" e.g. test.py
Then call it from the command line like this:

python test.py test.txt

See the attached gif for an example of how to do this.

Thanks, it worked! I really appreciate your help!

Would it be possible to extend that script so that it deletes rows that don't contain, for example, the number "24311"? Or to make another simple script for the new txt file?

And a harder one: is it possible to make a script that deletes all rows whose order number doesn't match the order number of any row containing A and 24311?

Like this:
46138100 A 4568834S 35000 24311 24280
46139111 A 4569028S 2000 24311 24520
46138350 X 4568834
32823978 X 4480746
46140239 D 4569028
32823978 D 324847

So it would delete the rows that don't share an order number with those rows, leaving:

46138100 A 4568834S 35000 24311 24280
46139111 A 4569028S 2000 24311 24520
46138350 X 4568834
46140239 D 4569028
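
One possible way to approach both requests, as a rough Python 3 sketch rather than a tested solution for real ITCH files. It assumes the rows have already been rearranged as above (timestamp, row type, then the order number in the third column), that the trailing letter on an order number such as 4568834S is not part of the number, and that "containing 24311" means one of the later columns equals 24311 exactly. The script name filter.py, the input name sorted.txt, and all function names are made up for illustration.

```python
import string
import sys


def rows_with(lines, number):
    """Simple filter: keep rows where some column equals `number`."""
    for line in lines:
        if number in line.split():
            yield line.rstrip()


def order_id(fields):
    """Third column with trailing letters (e.g. the S in 4568834S) removed."""
    return fields[2].rstrip(string.ascii_letters)


def wanted_ids(lines, number):
    """First pass: collect order numbers of A rows that mention `number`."""
    ids = set()
    for line in lines:
        fields = line.split()
        if len(fields) > 3 and fields[1] == "A" and number in fields[3:]:
            ids.add(order_id(fields))
    return ids


def keep_matching(lines, ids):
    """Second pass: keep every row whose order number was collected."""
    for line in lines:
        fields = line.split()
        if len(fields) > 2 and order_id(fields) in ids:
            yield line.rstrip()


def main(argv):
    if len(argv) != 3:
        sys.stderr.write("usage: python filter.py sorted.txt 24311\n")
        return 1
    path, number = argv[1], argv[2]
    # The file is read twice so that only the set of order numbers,
    # not the whole file, has to be held in memory.
    with open(path) as fin:
        ids = wanted_ids(fin, number)
    with open(path) as fin:
        for row in keep_matching(fin, ids):
            print(row)
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Because the order numbers are collected in a first pass, an X or D row is kept even if it appears in the file before the matching A row.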
