Ticks (Last Traded Price) ... float or double?

hftvol · Apr 2, 2013

Well, structs in C# are "almost" always a bad choice; you can look that up on SO too.

Quote from hft_boy:

Don't know how C# works. But in C you want to order the struct members from largest to smallest in terms of size to avoid extra padding.

http://stackoverflow.com/questions/2748995/c-struct-memory-layout
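To make the padding point concrete, here is a minimal sketch using Python's ctypes, which lays out structure fields with the same native alignment rules as the platform's C compiler. The TickPadded and TickOrdered names and their fields are made up for illustration, and the exact sizes depend on the platform ABI.

```python
import ctypes

class TickPadded(ctypes.Structure):
    # Hypothetical layout: one-byte fields on either side of an 8-byte double
    _fields_ = [
        ("side", ctypes.c_char),
        ("price", ctypes.c_double),
        ("flag", ctypes.c_char),
    ]

class TickOrdered(ctypes.Structure):
    # Same fields, ordered from largest to smallest
    _fields_ = [
        ("price", ctypes.c_double),
        ("side", ctypes.c_char),
        ("flag", ctypes.c_char),
    ]

padded_size = ctypes.sizeof(TickPadded)
ordered_size = ctypes.sizeof(TickOrdered)
print(padded_size, ordered_size)  # typically 24 and 16 on 64-bit platforms
```

Both structures hold the same 10 bytes of data; the difference is only the padding the compiler inserts to keep the double aligned.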

ddecker · Apr 6, 2013

Quote from vincegata:

Why use int instead of float? Both int and float are 4 bytes, at least that's how C++ stores them. Stock prices use 2 digits after the decimal point, currencies use 5 digits (except usd/yen, which uses 2 digits). A float with 7 significant digits can comfortably store those prices. However, when doing calculations they need to be converted to double so as not to lose precision.

Because in any currency you're going to be using, prices are all whole multiples of 10^-n for some small n. They can therefore be represented exactly with scaled ints/longs.

Floating-point representations (float/double), while very precise, are not exact, because they use a base-2 representation. They therefore cannot represent certain values exactly: try setting a float to 0.1 and printing its value.

This may not matter in some applications, but the errors do add up.
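A small sketch of both claims, in Python (whose float is a 64-bit IEEE double). The struct round-trip is used only to force 0.1 into a 32-bit float; the choice of integer cents as the scaled unit is an assumption for illustration.

```python
import struct

# Round-trip 0.1 through a 32-bit IEEE float to see what is actually stored.
as_float32 = struct.unpack("<f", struct.pack("<f", 0.1))[0]
print(as_float32)  # 0.10000000149011612

# Adding 0.1 ten times in double precision does not land exactly on 1.0.
float_total = 0.0
for _ in range(10):
    float_total += 0.1
print(float_total == 1.0)  # False

# The same sum in integer cents (assumed scale: 1 unit = 0.01) is exact.
cent_total = 0
for _ in range(10):
    cent_total += 10
print(cent_total == 100)  # True
```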

vincegata · Apr 6, 2013

Quote from ddecker:

Because in any currency you're going to be using, prices are all whole multiples of 10^-n for some small n. They can therefore be represented exactly with scaled ints/longs.

Floating-point representations (float/double), while very precise, are not exact, because they use a base-2 representation. They therefore cannot represent certain values exactly: try setting a float to 0.1 and printing its value.

This may not matter in some applications, but the errors do add up.

Right, I know what you're talking about -- thanks.

PocketChange · Apr 6, 2013

Integers avoid rounding issues...
they are clean, reliable and faster
64 bit integers can also be set as db primary index keys
Consider storing timestamps as milliepoch integers.

Quote from hftvol:

that is highly inefficient, given that you trade memory (cheap and available in large enough quantities in terms of system limit) for precious CPU power. The conversion has to be performed each time you want to operate on such variables. If forced to make a compromise I would always free up precious computational power vs memory (this obviously applies to 64bit code bases)

Another, and possibly the largest, drawback of your suggestion is the sheer number of places where it can introduce hard-to-debug errors.

hftvol · Apr 7, 2013

So you are arguing that conversions to int, including the rounding errors that, no question, have to be accepted and dealt with, are more precise than a double or float? --> ;-) <--

(that is the politest and most respectful way I can express my amusement).

Quote from ddecker:

Because in any currency you're going to be using, prices are all whole multiples of 10^-n for some small n. They can therefore be represented exactly with scaled ints/longs.

Floating-point representations (float/double), while very precise, are not exact, because they use a base-2 representation. They therefore cannot represent certain values exactly: try setting a float to 0.1 and printing its value.

This may not matter in some applications, but the errors do add up.

hftvol · Apr 7, 2013

What? Of course you deal with rounding issues; are you joking? Not only that, but you also have to carry around another int or char that tells you exactly how you converted to int (a multiplier, so to speak). How efficient is that?

Quote from PocketChange:

Integers avoid rounding issues...
they are clean, reliable and faster
64 bit integers can also be set as db primary index keys
Consider storing timestamps as milliepoch integers.

PocketChange · Apr 7, 2013

In binary, there is no way to write 9.95 in a finite number of bits. The closest you can get to 9.95 in a 64-bit IEEE float is 9.949999999999999289457264239899814128875732421875. So when you type "9.95", the machine understands the number to be the much longer value shown above. And that value rounds down.

This kind of problem comes up all the time when dealing with floating point binary numbers. The general rule to remember is that most fractional numbers that have a finite representation in decimal (a.k.a "base-10") do not have a finite representation in binary (a.k.a "base-2"). And so they are approximated using the closest binary number available. That approximation is usually very close, but it will be slightly off and in some cases can cause your results to be a little different from what you might expect.
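This can be checked directly. A minimal sketch using Python's standard decimal module: converting a float to Decimal exposes the exact binary value that was stored, and rounding it half-up to one decimal place shows the "rounds down" effect. The choice of one decimal place is just for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Decimal(float) converts the stored binary value exactly, digit for digit.
stored = Decimal(9.95)
print(stored)  # 9.949999999999999289457264239899814128875732421875

tenth = Decimal("0.1")
# Rounding the stored double to one decimal place goes down ...
from_float = stored.quantize(tenth, rounding=ROUND_HALF_UP)
# ... while rounding the decimal text "9.95" goes up.
from_text = Decimal("9.95").quantize(tenth, rounding=ROUND_HALF_UP)
print(from_float, from_text)  # 9.9 10.0
```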

Consider following FINRA's lead and standardizing all price values as integers with 6 decimals of precision.

9.95 = 9950000
2012-09-27 18:47:18.250 = 1348771638250
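The two conversions above could be sketched as follows. This is only an illustration: the function names are hypothetical, the 6-decimal scale is a single constant shared by every price rather than a per-value multiplier, and the timestamp is assumed to be in UTC (which is what reproduces the value shown).

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

PRICE_SCALE = 10**6  # assumed fixed scale: 6 decimal places for every price
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def price_to_int(text: str) -> int:
    """Parse a decimal price string into an integer count of millionths."""
    scaled = Decimal(text) * PRICE_SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{text} has more than 6 decimal places")
    return int(scaled)

def int_to_price(units: int) -> str:
    """Format millionths back into a decimal string without using floats."""
    sign = "-" if units < 0 else ""
    whole, frac = divmod(abs(units), PRICE_SCALE)
    return f"{sign}{whole}.{frac:06d}"

def to_epoch_millis(stamp: datetime) -> int:
    """Milliseconds since 1970-01-01 UTC, computed with integer arithmetic."""
    return (stamp - EPOCH) // timedelta(milliseconds=1)

print(price_to_int("9.95"))    # 9950000
print(int_to_price(9950000))   # 9.950000
stamp = datetime(2012, 9, 27, 18, 47, 18, 250000, tzinfo=timezone.utc)
print(to_epoch_millis(stamp))  # 1348771638250
```

Parsing from the original text rather than from a float is the design choice that matters here: the price never passes through a binary approximation on its way in.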

Use integer prices and timestamps internally; these integers can be converted to a time string or to a single- or double-precision floating-point value when needed. In absolutely performance-critical code, they allow integer math to be performed and typically require less storage. If you do any type of database work, you'll see significant performance benefits from using integer timestamps as primary keys.

Throwing more hardware at the issue doesn't fix the small approximation errors introduced that will inevitably bite you somewhere down the chain.

Quote from hftvol:

What? Of course you deal with rounding issues; are you joking? Not only that, but you also have to carry around another int or char that tells you exactly how you converted to int (a multiplier, so to speak). How efficient is that?

abattia · Apr 8, 2013

Quote from PocketChange:

In binary, there is no way to write 9.95 in a finite number of bits. The closest you can get to 9.95 in a 64-bit IEEE float is 9.949999999999999289457264239899814128875732421875. So when you type "9.95", the machine understands the number to be the much longer value shown above. And that value rounds down.

This kind of problem comes up all the time when dealing with floating point binary numbers. The general rule to remember is that most fractional numbers that have a finite representation in decimal (a.k.a "base-10") do not have a finite representation in binary (a.k.a "base-2"). And so they are approximated using the closest binary number available. That approximation is usually very close, but it will be slightly off and in some cases can cause your results to be a little different from what you might expect.

Consider following FINRA's lead and standardizing all price values as integers with 6 decimals of precision.

9.95 = 9950000
2012-09-27 18:47:18.250 = 1348771638250

Use integer prices and timestamps internally; these integers can be converted to a time string or to a single- or double-precision floating-point value when needed. In absolutely performance-critical code, they allow integer math to be performed and typically require less storage. If you do any type of database work, you'll see significant performance benefits from using integer timestamps as primary keys.

Throwing more hardware at the issue doesn't fix the small approximation errors introduced that will inevitably bite you somewhere down the chain.

many thanks!

hftvol · Apr 8, 2013

OK, care to show how it's done correctly with your approach?

Let's say you have a loop and divide within the loop by 1000.123456789. You not only need to carry around the multiplier/char (which indicates how to convert back and forth between your int and double/float), but you have to do so 10 times and hold potentially 10 different such multipliers, or at least make 10 separate adjustments to that factor.

So, it really depends, but for almost all financial applications floating-point representation is more than precise enough, given the correct choice of variable type (double vs. float, ...).

Your approach wastes not only memory but also computational power. Imagine you do that in a loop that iterates over tick data on the order of millions of observations, and not just for one price point but for bid/ask/last/...: your code will run several orders of magnitude slower than just using the correct floating-point type, which is more than precise enough for almost all calculations pertaining to finance.
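For reference, here is one way the division in question could be written against scaled integers. It is a sketch under stated assumptions, not anyone's production method: prices are held in millionths under a single fixed scale, the divisor is written as an exact integer fraction, and the rounding mode (half up, non-negative inputs only) is an explicit choice. It makes no claim about speed either way.

```python
PRICE_SCALE = 10**6              # prices held as integer millionths (assumed)
DIVISOR_NUM = 1_000_123_456_789  # 1000.123456789 written as an exact fraction
DIVISOR_DEN = 10**9

def divide_fixed(price_units: int) -> int:
    """Divide a non-negative scaled price by 1000.123456789, rounding half up."""
    numerator = price_units * DIVISOR_DEN
    return (numerator + DIVISOR_NUM // 2) // DIVISOR_NUM

def divide_float(price: float) -> float:
    """The same division carried out directly in double precision."""
    return price / 1000.123456789

fixed = divide_fixed(9_950_000)  # 9.95 expressed in millionths
floating = divide_float(9.95)
print(fixed)                     # 9949, i.e. 0.009949
print(floating * PRICE_SCALE)    # close to 9948.77 before any rounding
```

The integer version does round, as the post says, but it rounds once, at a known place, to a known number of decimals; the double keeps more digits of the quotient and leaves the rounding decision to whoever prints or compares the value.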

Quote from ddecker:

Because in any currency you're going to be using, prices are all whole multiples of 10^-n for some small n. They can therefore be represented exactly with scaled ints/longs.

Floating-point representations (float/double), while very precise, are not exact, because they use a base-2 representation. They therefore cannot represent certain values exactly: try setting a float to 0.1 and printing its value.

This may not matter in some applications, but the errors do add up.

hftvol · Apr 8, 2013

Your post is wrong on so many levels:

1) Please show me a financial application where a double or decimal variable type would not be sufficient, precision-wise, and compare the results against your integer approach.

2) Doubles take up the same memory as a 64-bit integer (long). The only floating-type variables that consume more memory are decimals.

3) FINRA's recommendation? Sorry, but with all due respect, in that regard I go by how it's done in pretty much all sell-side investment banking algo teams and hedge fund programming teams.

4) You conveniently omit HOW you convert the types to int and back, and what memory consumption and computational power is required to do so. I pointed that out in my previous post.

Why don't you walk us through a full calculation: loop through a division 100,000 times and show precisely how you perform the transformations, how and where you store the multipliers, how much time each loop takes, and how much memory it consumes. You would be stunned by the results.
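On the memory question in point 2, the element sizes are easy to check. A minimal sketch using Python's struct and array modules; the one-million-tick arrays and the 9.95 sample price are illustrative only.

```python
import struct
from array import array

# Standard sizes in bytes ("<" selects the fixed, platform-independent sizes).
sizes = {
    "float32": struct.calcsize("<f"),
    "int32": struct.calcsize("<i"),
    "float64": struct.calcsize("<d"),
    "int64": struct.calcsize("<q"),
}
print(sizes)

# One million ticks stored as doubles versus as 64-bit scaled integers.
as_double = array("d", [9.95]) * 1_000_000
as_int64 = array("q", [9_950_000]) * 1_000_000
double_bytes = len(as_double) * as_double.itemsize
int64_bytes = len(as_int64) * as_int64.itemsize
print(double_bytes, int64_bytes)
```

A double and a 64-bit integer occupy the same space per element, so the storage argument only favors integers when a 32-bit scaled int is wide enough for the price range.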

Quote from PocketChange:

In binary, there is no way to write 9.95 in a finite number of bits. The closest you can get to 9.95 in a 64-bit IEEE float is 9.949999999999999289457264239899814128875732421875. So when you type "9.95", the machine understands the number to be the much longer value shown above. And that value rounds down.

This kind of problem comes up all the time when dealing with floating point binary numbers. The general rule to remember is that most fractional numbers that have a finite representation in decimal (a.k.a "base-10") do not have a finite representation in binary (a.k.a "base-2"). And so they are approximated using the closest binary number available. That approximation is usually very close, but it will be slightly off and in some cases can cause your results to be a little different from what you might expect.

Consider following FINRA's lead and standardizing all price values as integers with 6 decimals of precision.

9.95 = 9950000
2012-09-27 18:47:18.250 = 1348771638250

Use integer prices and timestamps internally; these integers can be converted to a time string or to a single- or double-precision floating-point value when needed. In absolutely performance-critical code, they allow integer math to be performed and typically require less storage. If you do any type of database work, you'll see significant performance benefits from using integer timestamps as primary keys.

Throwing more hardware at the issue doesn't fix the small approximation errors introduced that will inevitably bite you somewhere down the chain.