Saturday, February 25, 2012
Blobs in SQL
We are looking at scanning certain paperwork and storing it with the
appropriate records in our database. The two options are to store the images
as blobs or to store them as file system objects with a referral pointer in
the db.
I have heard a number of different people say that storing blobs is
inherently evil but they haven't explained why that is.
Any help will be appreciated.
This is one of those topics that has been debated countless times, and there
is no universal truth.
Storing in the database
-transactional integrity
-maintenance simplicity
Storing in the filesystem
-performance
-efficiency
-accessibility
"Jawallaby" wrote:
> Can anyone tell me what the pros/cons are of using blobs in sql?
> We are looking at scanning certain paperwork and storing it with the
> appropriate records in our database. The two options are to store the images
> as blobs or to store them as file system objects with a referral pointer in
> the db.
> I have heard a number of different people say that storing blobs is
> inherently evil but they haven't explained why that is.
> Any help will be appreciated.
>
|||Thanks Dennis.
When you list performance as a pro of using the filesystem what are the
factors that increase performance? What is the performance hit when storing
in the db?
Most if not all of these files will be less than 1mb. There is a potential
for roughly 20 different files to be stored for each person/record in the db
and there are roughly 46000 records active.
I am trying to decide if this is worth worrying about or not and just don't
know enough about blobs in sql.
"Dennis Forbes" wrote:
[vbcol=seagreen]
> This is one of those topics that has been debated countless times, and there
> is no universal truth.
> Storing in the database
> -transactional integrity
> -maintenance simplicity
> Storing in the filesystem
> -performance
> -efficiency
> -accessibility
> "Jawallaby" wrote:
|||Dennis,
We've been scanning and storing document images as 'blobs' in our Informix
db
basically for years. We opted for the transactional integrity &
maintenance simplicity
over the performance and efficiency. Most of our scanned .pdf documents
are
fairly small and average around 100k.
We're not a highly transactional environment and we may scan (insert or
update)
from 50 to 100 documents a day.
Using this method we totally eliminate any file naming concerns, pathing,
directory
file limit issues, etc. (since I've never tried to maintain a pointer /
file based I can't
really speak to the up and down side of the method)
We're currently in the process of migrating to Sql Server and I've verified
that the
Sql Server Image column works in a very similar manner to the Informix Byte
(blob)
columns. The up side of going to Sql Server is that we can now manage our
image table insert, updates & deletes using stored procedures. We don't
have that
ability with Informix using oledb connectivity.
You can see the system in action at the following urls by clicking the blue
permit
or certificate numbers at
http://apps.wrd.state.or.us/apps/wr/...ID&snpid=64734
The downside can be database backups ... we currently have around 80G of
scanned imagesand a full db backup takes about 3 to 4 hrs to DLT tape.
Hope this helps.
Barry
in Oregon
"Jawallaby" <Jawallaby@.discussions.microsoft.com> wrote in message
news:24797C1E-595B-4C48-A4C7-AE264E8423BA@.microsoft.com...[vbcol=seagreen]
> Thanks Dennis.
> When you list performance as a pro of using the filesystem what are the
> factors that increase performance? What is the performance hit when
> storing
> in the db?
> Most if not all of these files will be less than 1mb. There is a potential
> for roughly 20 different files to be stored for each person/record in the
> db
> and there are roughly 46000 records active.
> I am trying to decide if this is worth worrying about or not and just
> don't
> know enough about blobs in sql.
>
> "Dennis Forbes" wrote:
Blobs in SQL
We are looking at scanning certain paperwork and storing it with the
appropriate records in our database. The two options are to store the images
as blobs or to store them as file system objects with a referral pointer in
the db.
I have heard a number of different people say that storing blobs is
inherently evil but they haven't explained why that is.
Any help will be appreciated.This is one of those topics that has been debated countless times, and there
is no universal truth.
Storing in the database
-transactional integrity
-maintenance simplicity
Storing in the filesystem
-performance
-efficiency
-accessibility
"Jawallaby" wrote:
> Can anyone tell me what the pros/cons are of using blobs in sql?
> We are looking at scanning certain paperwork and storing it with the
> appropriate records in our database. The two options are to store the imag
es
> as blobs or to store them as file system objects with a referral pointer i
n
> the db.
> I have heard a number of different people say that storing blobs is
> inherently evil but they haven't explained why that is.
> Any help will be appreciated.
>
Blobs in SQL
We are looking at scanning certain paperwork and storing it with the
appropriate records in our database. The two options are to store the images
as blobs or to store them as file system objects with a referral pointer in
the db.
I have heard a number of different people say that storing blobs is
inherently evil but they haven't explained why that is.
Any help will be appreciated.This is one of those topics that has been debated countless times, and there
is no universal truth.
Storing in the database
-transactional integrity
-maintenance simplicity
Storing in the filesystem
-performance
-efficiency
-accessibility
"Jawallaby" wrote:
> Can anyone tell me what the pros/cons are of using blobs in sql?
> We are looking at scanning certain paperwork and storing it with the
> appropriate records in our database. The two options are to store the images
> as blobs or to store them as file system objects with a referral pointer in
> the db.
> I have heard a number of different people say that storing blobs is
> inherently evil but they haven't explained why that is.
> Any help will be appreciated.
>|||Thanks Dennis.
When you list performance as a pro of using the filesystem what are the
factors that increase performance? What is the performance hit when storing
in the db?
Most if not all of these files will be less than 1mb. There is a potential
for roughly 20 different files to be stored for each person/record in the db
and there are roughly 46000 records active.
I am trying to decide if this is worth worrying about or not and just don't
know enough about blobs in sql.
"Dennis Forbes" wrote:
> This is one of those topics that has been debated countless times, and there
> is no universal truth.
> Storing in the database
> -transactional integrity
> -maintenance simplicity
> Storing in the filesystem
> -performance
> -efficiency
> -accessibility
> "Jawallaby" wrote:
> > Can anyone tell me what the pros/cons are of using blobs in sql?
> >
> > We are looking at scanning certain paperwork and storing it with the
> > appropriate records in our database. The two options are to store the images
> > as blobs or to store them as file system objects with a referral pointer in
> > the db.
> >
> > I have heard a number of different people say that storing blobs is
> > inherently evil but they haven't explained why that is.
> >
> > Any help will be appreciated.
> >
> >|||Dennis,
We've been scanning and storing document images as 'blobs' in our Informix
db
basically for years. We opted for the transactional integrity &
maintenance simplicity
over the performance and efficiency. Most of our scanned .pdf documents
are
fairly small and average around 100k.
We're not a highly transactional environment and we may scan (insert or
update)
from 50 to 100 documents a day.
Using this method we totally eliminate any file naming concerns, pathing,
directory
file limit issues, etc. (since I've never tried to maintain a pointer /
file based I can't
really speak to the up and down side of the method)
We're currently in the process of migrating to Sql Server and I've verified
that the
Sql Server Image column works in a very similar manner to the Informix Byte
(blob)
columns. The up side of going to Sql Server is that we can now manage our
image table insert, updates & deletes using stored procedures. We don't
have that
ability with Informix using oledb connectivity.
You can see the system in action at the following urls by clicking the blue
permit
or certificate numbers at
http://apps.wrd.state.or.us/apps/wr/wrinfo/wrinfo.php?search_type=SNPID&snpid=64734
The downside can be database backups ... we currently have around 80G of
scanned imagesand a full db backup takes about 3 to 4 hrs to DLT tape.
Hope this helps.
Barry
in Oregon
"Jawallaby" <Jawallaby@.discussions.microsoft.com> wrote in message
news:24797C1E-595B-4C48-A4C7-AE264E8423BA@.microsoft.com...
> Thanks Dennis.
> When you list performance as a pro of using the filesystem what are the
> factors that increase performance? What is the performance hit when
> storing
> in the db?
> Most if not all of these files will be less than 1mb. There is a potential
> for roughly 20 different files to be stored for each person/record in the
> db
> and there are roughly 46000 records active.
> I am trying to decide if this is worth worrying about or not and just
> don't
> know enough about blobs in sql.
>
> "Dennis Forbes" wrote:
>> This is one of those topics that has been debated countless times, and
>> there
>> is no universal truth.
>> Storing in the database
>> -transactional integrity
>> -maintenance simplicity
>> Storing in the filesystem
>> -performance
>> -efficiency
>> -accessibility
>> "Jawallaby" wrote:
>> > Can anyone tell me what the pros/cons are of using blobs in sql?
>> >
>> > We are looking at scanning certain paperwork and storing it with the
>> > appropriate records in our database. The two options are to store the
>> > images
>> > as blobs or to store them as file system objects with a referral
>> > pointer in
>> > the db.
>> >
>> > I have heard a number of different people say that storing blobs is
>> > inherently evil but they haven't explained why that is.
>> >
>> > Any help will be appreciated.
>> >
>> >
Friday, February 10, 2012
Bit or Char?
Generally, unless you have a number of bits in a tables, either would be OK. I often use Y/N and haven't had issues yet. Have a look at the following.
http://www.codecomments.com/archive352-2005-11-712750.html
|||There is a major difference when you are using BIT because it is Boolean in SQL Server and from SQL Server 7.0 and above BIT is three valued in SQL Server because Boolean in ANSI SQL is three valued True/False/NULL(unknown). So you cannot compare the use of BIT to Char in SQL Server 7.0 and above. Hope this helps.|||If the value you are storing is truly a true/false or yes/no, then use a bit.
Bits are smaller (1/8th of a byte, and when combined with multiple bits in a row, this can add up significantly).
They are also faster (Although probably insignificantly faster).
Using a bit also restricts the values to 0 or 1 (or null if nullable).
Let's assume that you have a table in which you are storing whether an item has been sold. The column is named IsSold, and you want to mark a particular item as being sold. What do you put in the IsSold column?
What if it is defined as a bit? The answer is 1.
What if it is defined as char? Hmm...do I put '1'? Maybe 'Y', or 'T', oh... Maybe 'y' or 't'? What happens if I put in 'Q'?
|||For boolean-type values I always prefer to use Bit datatype. One advantage is that you can directly convert an SqlServer bit datatype to a .NET Boolean using Convert.ToBoolean()
eg.
Boolean isApproved = Convert.ToBoolean(myReader["IsApproved"]);
Obviously you have to make sure your column has a not null constraint for this to be guaranteed to work!|||
Motley wrote:
If the value you are storing is truly a true/false or yes/no, then use a bit.
Bits are smaller (1/8th of a byte, and when combined with multiple bits in a row, this can add up significantly).
They are also faster (Although probably insignificantly faster).
Using a bit also restricts the values to 0 or 1 (or null if nullable).
Let's assume that you have a table in which you are storing whether an item has been sold. The column is named IsSold, and you want to mark a particular item as being sold. What do you put in the IsSold column?
What if it is defined as a bit? The answer is 1.
What if it is defined as char? Hmm...do I put '1'? Maybe 'Y', or 'T', oh... Maybe 'y' or 't'? What happens if I put in 'Q'?
No BIT is Microsoft proprietry implementation of ANSI SQL Boolean but it is defined as INT which means it is four bytes. This is because Boolean in ANSI SQL is three valued so ANSI SQL will not define a data type for it. Hope this helps.
|||If i use a Y/N field, I set a rule that these are the only options.
Another valid point is that Y or N might more closely relate to the business data definition, and be more welcome by business data users who might have to work closely with the database.
Most databases are set up (the default) as case-independant, so case is not usually an issue - but if have already set Y or N in the rule - it's covered.
There's no right or wrong here, its all a matter of taste
|||If Binary Sort is enabled SQL Server must be case sensitive. Hope this helps.|||
Caddre wrote:
Motley wrote:
If the value you are storing is truly a true/false or yes/no, then use a bit.
Bits are smaller (1/8th of a byte, and when combined with multiple bits in a row, this can add up significantly).
They are also faster (Although probably insignificantly faster).
Using a bit also restricts the values to 0 or 1 (or null if nullable).
Let's assume that you have a table in which you are storing whether an item has been sold. The column is named IsSold, and you want to mark a particular item as being sold. What do you put in the IsSold column?
What if it is defined as a bit? The answer is 1.
What if it is defined as char? Hmm...do I put '1'? Maybe 'Y', or 'T', oh... Maybe 'y' or 't'? What happens if I put in 'Q'?
No BIT is Microsoft proprietry implementation of ANSI SQL Boolean but it is defined as INT which means it is four bytes. This is because Boolean in ANSI SQL is three valued so ANSI SQL will not define a data type for it. Hope this helps.








1. A proprietor.2. A group of proprietors.3. Ownership; proprietorship.4. A proprietary medicine.5. One granted ownership of a proprietary colony.




No, it is not.
ISO/ANSI defined the bit datatype in SQL-99. See ISO/IEC document 9075-1:1999.
The majority of the large RDMSs support the bit datatype. Off the top of my head, that'd be MSSQL, MySQL, Postgress, Mimer, Frontbase, and SyBase.
Maybe you think the word "proprietry" (not spelled correctly by the way) means cool or useful, in that case I'd agree.
|||EDIT
Try the link below for answer from Joe Celko about BIT and Joe Celko is a member of ANSI SQL and wrote to date the best ANSI SQL book. And BIT in SQL Server is not a data type because it is defined as INT. There is no data type defined for Boolean in ANSI SQL because it is three valued and I have ANSI SQL 2003.
http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=9&messageid=213719#bm214191
Sorry permissions required to access the link, the text below was posted by Joe Celko.
(There is no Boolean data type in SQL because of the 3VL. BIT is a horrible, proprietary, non-relational thing that you should avoid. Instead define a location type code that can hold all of the various kinds of locations you will need. Make 0= "fiscal" in the code.
CREATE TABLE ClientLocations
(location_id
REFERENCES Locations(location_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
client_id INTEGER NOT NULL
REFERENCES Clients(client_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
location_type INTEGER DEFAULT 0 NOT NULL
CHECK (location_type >= 0),
PRIMARY KEY (location_id, client_id, location_type));)
|||
I think I'm done here. There really isn't anything else I can do but keep repeating what I have already stated.
BIT is not defined as INT - As proven, and provable by anyone else here with MS SQL 2000. BIT fields do not take 4 bytes of data storage as a INT would. They act like I said.
The BIT datatype is defined in SQL-99, however you are correct, it appears to be removed in SQL-2003.
The BIT datatype is not proprietary, and your (or others) claiming it is, simply does not make it so.
Just on a side note, Joe Celko may have written a good book, I don't know, I didn't read it, however, it is apparent that he tends to make claims that are based on his opinion and state them as fact when they are not facts. A bit is not a "horrible, proprietary, non-relational thing". That statement is absurd. A bit is more relational, and more normalized than a CHAR field with checks. And I've already disproven the proprietary part, so all we are left with is the truth. Joe thinks bits are horrible. Great, that's his OPINION, one that most people well versed in database design, structure, and standardization would disagree with. However, it is just an opinion.
|||If you are through with the thread you will not post again and your opinion compared to one of the most brilliant minds in ANSI SQL?
( A bit is more relational, and more normalized than a CHAR field with checks)
(So you cannot compare the use of BIT to Char in SQL Server 7.0 and above)
I covered that a long time ago.