Sunday, February 12, 2012

Bitwise operators

Is there such a thing in MS SQL? In MySQL you can use shift operators (<< >>)
and comparison operators such as &.
This is very useful for storing multiple values in a single number. For
example, the number 3 containts two values of power 2.
2 ^ 0 and 2 ^ 1
Because:
2 ^ 0 = 1
and
2 ^ 1 = 2
as a result
1 + 2 = 3
Useing the & operator, we can determine this in a comparison. For example,
SELECT 3 & 1;
this would return 1, indicating that 2^0 is in there. If a zero was
returned, we know that the value is not there.
Anyone have any suggestions?
TSQL has & and |; what would you need bit shifting for?
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
"stooky" <stooky@.discussions.microsoft.com> wrote in message
news:CC3F21DA-7333-4E4E-9A10-F4269F249282@.microsoft.com...
> Is there such a thing in MS SQL? In MySQL you can use shift operators (<<
> and comparison operators such as &.
> This is very useful for storing multiple values in a single number. For
> example, the number 3 containts two values of power 2.
> 2 ^ 0 and 2 ^ 1
> Because:
> 2 ^ 0 = 1
> and
> 2 ^ 1 = 2
> as a result
> 1 + 2 = 3
> Useing the & operator, we can determine this in a comparison. For example,
> SELECT 3 & 1;
> this would return 1, indicating that 2^0 is in there. If a zero was
> returned, we know that the value is not there.
> Anyone have any suggestions?
>
|||Just to add to what Adam said, where @.i is an integer, bigint, etc. type:
@.i / 2 = integer shift left one bit
@.i * 2 = integer shift right one bit
Depending on what you're trying to achieve with your bitwise operators,
there may be better ways to do it... Most of the times I've seen bitwise
operators used in SQL, it's to save space by completely denormalizing data
(99.99999% of the time unnecessarily...), which saves a little space but can
make queries and maintenance unnecessarily convoluted.

"stooky" <stooky@.discussions.microsoft.com> wrote in message
news:CC3F21DA-7333-4E4E-9A10-F4269F249282@.microsoft.com...
> Is there such a thing in MS SQL? In MySQL you can use shift operators (<<
> and comparison operators such as &.
> This is very useful for storing multiple values in a single number. For
> example, the number 3 containts two values of power 2.
> 2 ^ 0 and 2 ^ 1
> Because:
> 2 ^ 0 = 1
> and
> 2 ^ 1 = 2
> as a result
> 1 + 2 = 3
> Useing the & operator, we can determine this in a comparison. For example,
> SELECT 3 & 1;
> this would return 1, indicating that 2^0 is in there. If a zero was
> returned, we know that the value is not there.
> Anyone have any suggestions?
>
|||By the way, I wrote a series of blog posts on this subject (which I still
need to write the final two installments of)... in case anyone is
interested:
http://sqljunkies.com/WebLog/amachan...1/25/6897.aspx
http://sqljunkies.com/WebLog/amachan...1/26/6929.aspx
http://sqljunkies.com/WebLog/amachan...1/28/6972.aspx
http://sqljunkies.com/WebLog/amachan...2/04/7345.aspx
http://sqljunkies.com/WebLog/amachan...2/10/7651.aspx
... eventually I'll write the installment on math operators and a bonus
installment that I had in mind on one other topic -- that I can't remember
right now!
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
"Michael C#" <xyz@.abcdef.com> wrote in message
news:wdI2e.9988$Zz2.1370@.fe12.lga...
> Just to add to what Adam said, where @.i is an integer, bigint, etc. type:
> @.i / 2 = integer shift left one bit
> @.i * 2 = integer shift right one bit
> Depending on what you're trying to achieve with your bitwise operators,
> there may be better ways to do it... Most of the times I've seen bitwise
> operators used in SQL, it's to save space by completely denormalizing data
> (99.99999% of the time unnecessarily...), which saves a little space but
can[vbcol=seagreen]
> make queries and maintenance unnecessarily convoluted.
>
>
> "stooky" <stooky@.discussions.microsoft.com> wrote in message
> news:CC3F21DA-7333-4E4E-9A10-F4269F249282@.microsoft.com...
(<<[vbcol=seagreen]
example,
>
|||"Adam Machanic" wrote:

> TSQL has & and |; what would you need bit shifting for?
>
Well...we are being sent data (we have no choice in the matter) that is
setup as 0s and 1s. Basically, it's marketing data and the data gatherer
doesn't care about querying the data so much as storing it. So, we have the
possibility of 4000+ fields and each field has a value of 0 or 1. Here is an
example of how bad this setup is. There is a section of data, let's call it
"Year of Car You Own". There are a series of fields from 1900-2005. If the
field 2004 is 1, that means you own a 2004 vehicle. If 2004 and 1972 are both
1, that means you have 2 vehicles , one from each year respectively. This
kind of info can be stored in a bitvector to speed things up.
It is also possible to store the 1s and 0s in strings and use substring
functions but the increased data size along going to strings introduces a
slowness factor of about 3.
|||It seems like this is a data integration problem that you should solve when
you load the data. Load it into a normalized schema rather than have
bitmapped columns just because that's how the customer supplied it. Bitmaps
are a lousy way to store data in SQL.
David Portas
SQL Server MVP
"stooky" wrote:

>
> "Adam Machanic" wrote:
>
> Well...we are being sent data (we have no choice in the matter) that is
> setup as 0s and 1s. Basically, it's marketing data and the data gatherer
> doesn't care about querying the data so much as storing it. So, we have the
> possibility of 4000+ fields and each field has a value of 0 or 1. Here is an
> example of how bad this setup is. There is a section of data, let's call it
> "Year of Car You Own". There are a series of fields from 1900-2005. If the
> field 2004 is 1, that means you own a 2004 vehicle. If 2004 and 1972 are both
> 1, that means you have 2 vehicles , one from each year respectively. This
> kind of info can be stored in a bitvector to speed things up.
> It is also possible to store the 1s and 0s in strings and use substring
> functions but the increased data size along going to strings introduces a
> slowness factor of about 3.
>
|||Storing this info denormalized in a single int value can speed up the load
process - although depending on how many records, it might not be a
significant savings. Presumably you'll be doing this a lot less than you'll
be querying the data. However, it will complicate and slow down your
queries, since you have to perform all that processing on the individual
bits. I would recommend that when you load up the data, normalize it.
i.e.,
Person Table
+--+--+
|PersonID | Name |
+--+--+
| 1 | George |
| 2 | Louis |
+--+--+
Car Table
+--+--+
|PersonID | YearOfCar |
+--+--+
| 1 | 1972 |
| 1 | 2004 |
| 2 | 2001 |
+--+--+
(Note, if the samples above come are illegible, copy and paste into NotePad
with a fixed-width font).
In this example, when you load the data, based on the bits that are set, we
put 2 entries in the CarTable for George (he owns a 1972 car and a 2004
car), and we put one entry in for Louis (he owns a 2001 car). Querying for
this info then just requires a simple join [with no bitwise operators],
which SQL is optimized for. The storage space might increase (I say might,
because you don't need to include irrelevant info - notice that we didn't
include years that individuals *don't* own cars for in the table above), but
SQL is optimized to query normalized, properly indexed tables. You can't
create an index on just bits 2, 3, 6 and 13 of an integer column.
Just a thought...
"stooky" <stooky@.discussions.microsoft.com> wrote in message
news:8E631902-984F-4175-862C-E59DD2757DFA@.microsoft.com...
>
> "Adam Machanic" wrote:
>
> Well...we are being sent data (we have no choice in the matter) that is
> setup as 0s and 1s. Basically, it's marketing data and the data gatherer
> doesn't care about querying the data so much as storing it. So, we have
> the
> possibility of 4000+ fields and each field has a value of 0 or 1. Here is
> an
> example of how bad this setup is. There is a section of data, let's call
> it
> "Year of Car You Own". There are a series of fields from 1900-2005. If the
> field 2004 is 1, that means you own a 2004 vehicle. If 2004 and 1972 are
> both
> 1, that means you have 2 vehicles , one from each year respectively. This
> kind of info can be stored in a bitvector to speed things up.
> It is also possible to store the 1s and 0s in strings and use substring
> functions but the increased data size along going to strings introduces a
> slowness factor of about 3.
>

No comments:

Post a Comment