
ADO.NET - PART I






History of Data Access
Over the years, many APIs have been released, all of which work toward the goal of providing universal data access. Universal data access is the concept of having a single code base for accessing data from any source, from any language.
Having universal data access is important for four reasons: First, developers can easily work on applications targeting different data stores without needing to become experts on each one.
Second, developers can have a common framework for data access when switching between programming languages, making the transition to new languages easier. This is especially important in the .NET Framework, in which developers are expected to be able to easily switch between VB.NET and C#.
Third, it enables developers to more easily write a single application that can be deployed against multiple data stores. Finally, it provides a level of abstraction between the application and direct communication to the database to simplify the code the average developer needs to write.
Microsoft has conducted surveys to determine which key factors companies are looking for in a data access layer. They came back with four main points, which they have tried to implement in their databases and data access components:
High performance: As any developer knows, performance can make or break almost any application. No matter how much a data access layer may simplify accessing the data, it absolutely must perform nearly as well or better than the alternatives before it becomes a viable solution for the majority of applications.
High reliability: If a component consumed by an application is buggy or occasionally stops working, it is perceived by the users as an error in that application. In addition to being a liability and annoyance to the company that implemented the application, it also reflects very poorly on the developer(s) who wrote the application.
Any issues, such as memory leaks, that cause unreliable results are unacceptable to the development community. It’s also very important to the support personnel that it be fairly maintenance-free. No one wants to have to reboot a server on a regular basis or constantly apply patches just to keep an application running.
Vendor commitment: Without the widespread buy-in of vendors to build drivers/providers for their products, any universal data access model wouldn’t be universal. Microsoft could provide the drivers for some of the most common vendor products, but it really takes an open, easily extensible model in order to gain widespread acceptance.
No matter how much companies try to avoid it, almost all of them become “locked-in” to at least a handful of vendors. Switching to a vendor that supports the latest data access components is not really an option, so without widespread buy-in from vendors, a data access model cannot succeed.
Broad industry support: This factor is along the same lines as vendor commitment, but includes a wider arena. It takes more than the data access model to be able to easily create good applications with it; it also requires good tools that can work with the data access model.
Furthermore, it requires backing by several big players in the industry to reassure the masses. It also requires highly skilled people available to offer training. Finally, of course, it requires willing adoption by the development community so employers can find employees with experience.
Steady progress has been made, improving databases and universal data access over the last few decades. As with any field, it’s important to know where we’ve come from in database and data access technologies in order to understand where the fields are heading. The following section looks at some early achievements.
Data Access in the Early Days
In the 1950s and early 1960s, data access and storage was relatively simple for most people. While more advanced projects were under development and in use by a limited number of people, the majority of developers still stored data in flat text files.
These were usually fixed-width files, and accessing them required no more than the capability to read and write files. Although this was a very simple technique for storing data, it didn’t take too long to realize it wasn’t the most efficient method in most cases.
CODASYL
As with the Internet, databases as we know them today began with the U.S. Department of Defense. In 1957, the U.S. Department of Defense founded the Conference on Data Systems Languages, commonly known as CODASYL, to develop computer programming languages.
CODASYL is most famous for the creation of the COBOL programming language, but many people don’t know that CODASYL is also responsible for the creation of the first modern database. On June 10, 1963, two divisions of the U.S. Department of Defense held a conference titled “Development and Management of a Computer-Centered Data Base.” At this conference, the term database was coined and defined as follows:
A set of files (tables), where a file is an ordered collection of entries (rows) and an entry consists of a key or keys and data.
Two years later, in 1965, CODASYL formed a group called the List Processing Task Force, which later became the Data Base Task Group. The Data Base Task Group released an important report in 1971 outlining the Network Data Model, also known as the CODASYL Data Model or DBTG Data Model. This data model defined several key concepts of a database, including the following:
• A syntax for defining a schema
• A syntax for defining a subschema
• A data manipulation language
These concepts were later incorporated into the COBOL programming language. They also served as a base design for many subsequent data storage systems.
IMS
During the same period CODASYL was creating the Network Data Model, another effort was under way to create the first hierarchical database. During the space race, North American Rockwell won the contract to launch the first spacecraft to the moon.
In 1966, members of IBM, North American Rockwell, and Caterpillar Tractor came together to begin the design and development of the Information Control System (ICS) and Data Language/I (DL/I). This system was designed to assist in tracking materials needed for the construction of the spacecraft.
The ICS portion of this system was the database portion responsible for storing and retrieving the data, while the DL/I portion was the query language needed to interface with it. In 1968, the IBM portion of this system (ICS) was renamed to Information Management System, or IMS.
Over time, the DL/I portion was enhanced to provide features such as message queuing, and eventually became the transaction manager portion of IMS. IMS continued to evolve and was adopted by numerous major organizations, many of which still use it today.
Relational Databases
Both the Network Data Model from CODASYL and IMS from IBM were major steps forward because they marked the paradigm shift of separating data from application code, and they laid the framework for what a database should look like.
However, they both had an annoying drawback: They expected programmers to navigate around the dataset to find what they wanted—thus, they are sometimes called navigational databases.
In 1970, Edgar Codd, a British computer scientist working for IBM, released an important paper called “A Relational Model of Data for Large Shared Data Banks” in which he introduced the relational model.
In this model, Codd emphasized the importance of separating the raw, generic data types from the machine-specific data types, and exposing a simple, high-level query language for accessing this data. This shift in thinking would enable developers to perform operations against an entire data set at once instead of working with a single row at a time.
Within a few years, two systems were developed based on Codd’s ideas. The first was an IBM project known as System R; the other was Ingres from the University of California at Berkeley. During the course of development for IBM’s System R, a new query language known as Structured Query Language (SQL) was born.
While System R was a great success for proving the relational database concept and creating SQL, it was never a commercial success for IBM. They did, however, release SQL/DS in 1980, which was a huge commercial success (and largely based on System R).
The Ingres project was backed by several U.S. military research agencies and was very similar to System R in many ways, although it ran on a different platform. One key advantage that Ingres had over System R that led to its longevity was the fact that the Ingres source code was publicly available, although it was later commercialized and released by Computer Associates in the 1980s.
Over the next couple of decades, databases continued to evolve. Modern databases such as Oracle, Microsoft SQL Server, MySQL, and LDAP are all highly influenced by these first few databases. They have improved greatly over time to handle very high transaction volume, to work with large amounts of data, and to offer high scalability and reliability.

History of Data Access
Birth of Universal Data Access
1. ODBC
Open Database Connectivity (ODBC) helped address the problem of needing to know the details of each DBMS used. ODBC provides a single interface for accessing a number of database systems. To accomplish this, ODBC provides a driver model for accessing data. Any database provider can write a driver for ODBC to access data from their database system.
This enables developers to access that database through the ODBC drivers instead of talking directly to the database system. For data sources such as files, the ODBC driver plays the role of the engine, providing direct access to the data source. In cases where the ODBC driver needs to connect to a database server, the ODBC driver typically acts as a wrapper around the API exposed by the database server.
With this model, developers can move from one DBMS to another and reuse many of the skills they have already acquired. Perhaps more important, a developer can write an application that doesn’t target a specific database system.
This is especially beneficial for vendors who write applications to be consumed by multiple customers. It gives customers the capability to choose the back-end database system they want to use, without requiring vendors to create several versions of their applications.
ODBC was a huge leap forward and helped to greatly simplify database-driven application development. It does have some shortfalls, though. First, it is only capable of supporting relational data. If you need to access a hierarchical data source such as LDAP, or semi-structured data, ODBC can’t help you. Second, it can only handle SQL statements, and the result must be representable in the form of rows and columns. Overall, ODBC was a huge success, considering what the previous environment was like.
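Although ODBC predates .NET by a decade, the driver model it introduced is easy to see from managed code through the System.Data.Odbc provider. Here is a minimal sketch; the DSN name and the query are assumptions for illustration, and the driver registered behind the DSN does the database-specific work:
C#
using System;
using System.Data.Odbc;

class OdbcSketch
{
    static void Main()
    {
        // "NorthwindDSN" is a hypothetical DSN; any registered ODBC driver works the same way.
        using (OdbcConnection conn = new OdbcConnection("DSN=NorthwindDSN"))
        {
            conn.Open();
            OdbcCommand cmd = new OdbcCommand("SELECT CompanyName FROM Customers", conn);
            using (OdbcDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine(reader.GetString(0));
            }
        }
    }
}
The application code never changes when the DSN is repointed at a different database; only the driver behind it does.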
2. OLE-DB
Object Linking and Embedding Database (OLE-DB) was the next big step forward in data providers, and it is still widely used today. With OLE-DB, Microsoft applied the knowledge learned from developing ODBC to provide a better data access model.
OLE-DB marked Microsoft’s move to a COM-based API, which made it easily consumable by most programming languages, and coincided with the migration to a 32-bit OS with the release of Windows 95.
As with any code, ODBC became bulky through multiple revisions. The OLE-DB API is much cleaner and provides more efficient data access than ODBC. Oddly enough, the only provider offered with its initial release was the ODBC provider.
It was just a wrapper of the ODBC provider and offered no performance gain. The point was to get developers used to the new API while making it possible to access any existing database system they were currently accessing through ODBC. Later, more efficient providers were written to access databases such as MS SQL Server directly, without going through ODBC.
2-1. OLE-DB Providers
OLE-DB is also much less dependent upon the physical structure of the database. It supports both relational and hierarchical data sources, and does not require the query against these data sources to follow a SQL structure.
As with ODBC, vendors can create custom providers to expose access to their database system. Most people wouldn’t argue with the belief that it is far easier to write an OLE-DB provider than an ODBC driver. A provider needs to perform only four basic steps:
1. Open the session.
2. Process the command.
3. Access the data.
4. Prepare a rowset.
2-2. OLE-DB Consumers
The other half of the OLE-DB framework is the OLE-DB consumer. The consumer is the layer that speaks directly to the OLE-DB providers, and it performs the following steps:
1. Identify the data source.
2. Establish a session.
3. Issue the command.
4. Return a rowset.
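Those four steps map almost one-to-one onto the managed System.Data.OleDb classes. A minimal consumer sketch, with the connection string and table name as assumptions:
C#
using System;
using System.Data.OleDb;

class OleDbConsumerSketch
{
    static void Main()
    {
        // 1. Identify the data source and 2. establish a session (connection string is assumed).
        string connStr = "Provider=SQLOLEDB;Data Source=localhost;" +
                         "Initial Catalog=Northwind;Integrated Security=SSPI;";
        using (OleDbConnection conn = new OleDbConnection(connStr))
        {
            conn.Open();
            // 3. Issue the command.
            OleDbCommand cmd = new OleDbCommand("SELECT CompanyName FROM Customers", conn);
            // 4. Return a rowset and read through it.
            using (OleDbDataReader rowset = cmd.ExecuteReader())
            {
                while (rowset.Read())
                    Console.WriteLine(rowset["CompanyName"]);
            }
        }
    }
}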

History of Data Access
Data Access Consumers
Developers who use languages that support pointers, such as C, C++, VJ++, and so on, can speak directly to the ODBC and OLE-DB APIs. However, developers using a language such as Visual Basic need another layer. This is where data access consumers such as DAO, RDO, ADO, and ADO.NET come into play.
DAO
With the release of Visual Basic 2.0, developers were introduced to a new method for accessing data, known as Data Access Objects (DAO). This was Microsoft’s first attempt to create a data consumer API.
Although it had very humble beginnings, and when first released only supported forward-only operations against ODBC data sources, it was the beginning of a series of libraries that would lead developers closer to the ideal of Universal Data Access.
It also helped developers using higher-level languages such as Visual Basic to take advantage of the power of ODBC that developers using lower-level languages such as C were beginning to take for granted.
DAO was based on the JET engine, which was largely designed to help developers take advantage of the desktop database application Microsoft was about to release, Microsoft Access. It served to provide another layer of abstraction between the application and data access, making the developer’s task simpler.
Although the initial, unnamed release with Visual Basic 2.0 only supported ODBC connections, the release of Microsoft Access 1.0 marked the official release of DAO 1.0, which supported direct communication with Microsoft Access databases without using ODBC.
Figure 1-2 shows this relationship. DAO 2.0 was expanded to support OLE-DB connections and the advantages that come along with it. It also provided a much more robust set of functionality for accessing ODBC data stores through the JET engine. Later, versions 2.5 and 3.0 were released to provide support for ODBC 2.0 and the 32-bit OS introduced with Windows 95.
[Figure 1-2]
RDO
Remote Data Objects (RDO) was Microsoft’s solution to the slow performance created by DAO. For talking to databases other than Microsoft Access, RDO did not use the JET engine like DAO; instead, it communicated directly with the ODBC layer. Figure 1-3 shows this relationship.
Removing the JET engine from the call stack greatly improved performance to ODBC data sources. The JET engine was only used when accessing a Microsoft Access database.
In addition, RDO had the capability to use client-side cursors to navigate the records, as opposed to the server-side cursor requirements of DAO. This greatly reduced the load on the database server, enabling not only the application to perform better, but also the databases on which that application was dependent.
RDO was primarily targeted toward larger, commercial customers, many of whom avoided DAO due to the performance issues. Instead of RDO replacing DAO, the two largely co-existed.
This happened for several reasons: First, users who developed smaller applications, where performance wasn’t as critical, didn’t want to take the time to switch over to the new API. Second, RDO was originally only released with the Enterprise Edition of Visual Basic, so some developers didn’t have a choice.
Third, with the release of ODBCDirect, a DAO add-on that routed the ODBC requests through RDO instead of the JET engine, the performance gap between the two became much smaller. Finally, it wasn’t long after the release of RDO that Microsoft’s next universal access API was released.
[Figure 1-3]
ADO
Microsoft introduced ActiveX Data Objects (ADO) primarily to provide a higher-level API for working with OLE-DB. With this release, Microsoft took many of the lessons from the past to build a lighter, more efficient, and more universal data access API. Unlike RDO, ADO was initially promoted as a replacement for both DAO and RDO.
At the time of its release, it (along with OLE-DB) was widely believed to be a universal solution for accessing any type of data from databases to e-mail, flat text files, and spreadsheets.
ADO represented a major shift from previous methods of data access. With DAO and RDO, developers were expected to navigate a tree of objects in order to build and execute queries. For example, to execute a simple insert query in RDO, developers couldn’t just create an rdoQuery object and execute it.
Instead, they first needed to create the rdoEngine object, then the rdoEnvironment as a child of it, then an rdoConnection, and finally the rdoQuery. It was a very similar situation with DAO. With ADO, however, this sequence was much simpler. Developers could just create a command object directly, passing in the connection information and executing it.
For simplicity and best practice, most developers would still create a separate command object, but for the first time the object could stand alone. As stated before, ADO was primarily released to complement OLE-DB; however, ADO was not limited to just communicating with OLE-DB data sources.
ADO introduced the provider model, which enabled software vendors to create their own providers relatively easily, which could then be used by ADO to communicate with a given vendor’s data source and implement many of the optimizations specific to that data source.
The ODBC provider that shipped with ADO was one example of this. When a developer connected to an ODBC data source, ADO would communicate through the ODBC provider instead of through OLE-DB. More direct communication to the data source resulted in better performance and an easily extensible framework. Figure 1-4 shows this relationship.
[Figure 1-4]
ADO.NET
With the release of the .NET Framework, Microsoft introduced a new data access model, called ADO.NET. The ActiveX Data Object acronym was no longer relevant, as ADO.NET was not ActiveX, but Microsoft kept the acronym due to the huge success of ADO. In reality, it’s an entirely new data access model written in the .NET Framework.
ADO.NET supports communication to data sources through both ODBC and OLE-DB, but it also offers another option of using database-specific data providers. These data providers offer greater performance by being able to take advantage of data-source-specific optimizations.
By using custom code for the data source instead of the generic ODBC and OLE-DB code, some of the overhead is also avoided. The original release of ADO.NET included a SQL provider and an OLE-DB provider, with the ODBC and Oracle providers being introduced later. Many vendors have also written providers for their databases since. Figure 1.5 shows the connection options available with ADO.NET.
[Figure 1.5]
With ADO.NET, the days of the recordset and cursor are gone. The model is entirely new, and consists of five new objects:
Connection: The Connection object is responsible for establishing and maintaining the connection to the data source, along with any connection-specific information.
Command: The Command object stores the query that is to be sent to the data source, along with any applicable parameters.
DataReader: The DataReader object provides fast, forward-only reading capabilities to quickly loop through the records.
DataSet: The DataSet object, along with its child objects, is what really makes ADO.NET unique. It provides a storage mechanism for disconnected data. The DataSet never communicates with any data source and is totally unaware of the source of the data used to populate it. The best way to think of it is as an in-memory repository to store data that has been retrieved.
DataAdapter: The DataAdapter object is what bridges the gap between the DataSet and the data source. The DataAdapter is responsible for retrieving the data from the command object and populating the DataSet with the data returned. The DataAdapter is also responsible for persisting changes to the DataSet back to the data source.
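To see how the five objects fit together, here is a minimal sketch. It assumes a local SQL Server with a Northwind-style Customers table; the connection string and names are illustrative only:
C#
using System;
using System.Data;
using System.Data.SqlClient;

class AdoNetObjectsSketch
{
    static void Main()
    {
        // Connection string, table, and column names are assumptions for illustration.
        string connStr = "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;";

        // Connection + Command + DataReader: connected, fast, forward-only access.
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            SqlCommand cmd = new SqlCommand("SELECT CustomerID, CompanyName FROM Customers", conn);
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine("{0}: {1}", reader["CustomerID"], reader["CompanyName"]);
            }
        }

        // DataAdapter + DataSet: disconnected access; Fill opens and closes the connection itself.
        SqlDataAdapter adapter =
            new SqlDataAdapter("SELECT CustomerID, CompanyName FROM Customers", connStr);
        DataSet ds = new DataSet();
        adapter.Fill(ds, "Customers");
        Console.WriteLine("Cached {0} rows.", ds.Tables["Customers"].Rows.Count);
    }
}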
ADO.NET 2.0
Data access technologies have come a long way, but even with ADO.NET, there is still room to grow. The transition to ADO.NET 2.0 is not a drastic one. In the 2.0 Framework, the basic design is the same, but several new features have been added to make common tasks easier, which is very good for backward compatibility. ADO.NET 2.0 should be 100% backward compatible with any ADO.NET 1.0 code you have written.
With any 2.0 product, the primary design goal is almost always to improve performance. ADO.NET 1.0 does not perform poorly by any means, but a few areas could use improvement, including XML serialization and connection pooling, which have been reworked to provide greater performance.
In the 2.0 Framework, Microsoft has also been able to improve performance by introducing several new features to reduce the number of queries that need to be run and to make it easier to run multiple queries at once.
For example, the bulk insert feature provides the capability to add multiple rows to a database with a single operation, instead of the previous method of inserting one row at a time. This can greatly reduce the amount of time it takes to insert a large number of rows.
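In the SQL Server provider, this capability surfaces as the SqlBulkCopy class. A minimal sketch, with the connection string and destination table assumed:
C#
using System.Data;
using System.Data.SqlClient;

class BulkInsertSketch
{
    static void CopyRows(DataTable sourceRows)
    {
        // Connection string and destination table name are assumptions for illustration.
        string connStr = "Data Source=localhost;Initial Catalog=Sales;Integrated Security=SSPI;";
        using (SqlBulkCopy bulk = new SqlBulkCopy(connStr))
        {
            bulk.DestinationTableName = "Orders";
            // One bulk operation instead of one INSERT statement per row.
            bulk.WriteToServer(sourceRows);
        }
    }
}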
The introduction of Multiple Active Result Sets (MARS) provides the capability to execute multiple queries at once and receive a series of results. Removing the back-and-forth communication that is required by executing one query at a time and waiting for the results greatly improves the performance of an application that needs this functionality.
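MARS is switched on through the connection string; once enabled, a single connection can carry more than one open reader. A sketch (names assumed; MARS requires SQL Server 2005 or later):
C#
using System.Data.SqlClient;

class MarsSketch
{
    static void Main()
    {
        // MultipleActiveResultSets=True enables MARS on this connection (other names are assumed).
        string connStr = "Data Source=localhost;Initial Catalog=Northwind;" +
                         "Integrated Security=SSPI;MultipleActiveResultSets=True;";
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            SqlCommand customers = new SqlCommand("SELECT CustomerID FROM Customers", conn);
            SqlCommand orders = new SqlCommand("SELECT OrderID FROM Orders", conn);

            using (SqlDataReader r1 = customers.ExecuteReader())
            using (SqlDataReader r2 = orders.ExecuteReader()) // second active result set, same connection
            {
                r1.Read();
                r2.Read();
            }
        }
    }
}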
If you prefer to do other work while waiting for your data to return, you also have the option of firing an asynchronous command. This has been greatly simplified in the 2.0 Framework.
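In the SQL Server provider, this takes the shape of a Begin/End pair on the command object; note that the 2.0-era connection string must include Asynchronous Processing=true. A sketch with assumed names:
C#
using System;
using System.Data.SqlClient;

class AsyncCommandSketch
{
    static void Main()
    {
        // Asynchronous Processing=true is required for Begin/End execution in ADO.NET 2.0.
        string connStr = "Data Source=localhost;Initial Catalog=Northwind;" +
                         "Integrated Security=SSPI;Asynchronous Processing=true;";
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            SqlCommand cmd = new SqlCommand("SELECT COUNT(*) FROM Orders", conn);
            IAsyncResult handle = cmd.BeginExecuteReader();

            // ... do other work here while the query runs on the server ...

            using (SqlDataReader reader = cmd.EndExecuteReader(handle))
            {
                if (reader.Read())
                    Console.WriteLine("Orders: {0}", reader.GetInt32(0));
            }
        }
    }
}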
Another major design goal is to reduce the amount of code necessary to perform common tasks. The claim we all heard with the release of the .NET Framework 1.0 was "70 percent less code" than previous methods. The goal with .NET 2.0 is the same: to reduce the amount of code needed for common tasks by 70 percent over .NET 1.0.
The rest of the enhancements are primarily new features. For example, there is now a database discovery API for browsing the schema of a database. Also offered is the option of writing provider-independent database access code. This is very beneficial if you sell an application to customers who want to run it against numerous data sources. Keep in mind that the queries you write must still match that provider’s syntax.
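Both features are visible in a few lines: DbProviderFactories hands back provider-independent objects, and GetSchema is the discovery API. The provider name and connection string below are assumptions; swapping them is all that ties the code to a particular database:
C#
using System;
using System.Data;
using System.Data.Common;

class ProviderIndependentSketch
{
    static void Main()
    {
        // Provider invariant name and connection string are assumptions for illustration.
        DbProviderFactory factory = DbProviderFactories.GetFactory("System.Data.SqlClient");
        using (DbConnection conn = factory.CreateConnection())
        {
            conn.ConnectionString =
                "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;";
            conn.Open();

            // Database discovery: list the tables the database exposes.
            DataTable tables = conn.GetSchema("Tables");
            foreach (DataRow row in tables.Rows)
                Console.WriteLine(row["TABLE_NAME"]);
        }
    }
}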

Introduction to ADO.NET
Overview of ADO.NET
What is ADO.Net?
ADO.NET is the latest data-access API from Microsoft. As its name suggests, it is an evolution of ActiveX Data Objects (ADO), built as part of the Microsoft .NET Framework. ADO.NET presents a very different programming model from Microsoft’s previous data-access APIs.
Gone is the old connected model, in which a client program maintained a connection with the server for the duration of the application; the new paradigm underpinning ADO.NET is a disconnected model, where clients hold connections only for the minimum time necessary.
The .NET runtime and frameworks are also available for non-Windows platforms as a result of the Mono open-source project from Novell, which is committed to providing a compatible environment for developing .NET applications for cross-platform use.
Why use ADO.Net?
As an integral component of the .NET Framework, it is the standard data-access method for developing database-centric (SQL or XML) Windows Forms applications, Web Services, and other data-consuming .NET components. The Mono project also includes a full implementation of the ADO.NET framework that extends the data access benefits of ADO.NET to platforms such as Linux, Mac OS X, Solaris, and HP-UX.

Introduction to ADO.NET
ADO.NET Architecture
The ADO.NET architecture can be divided into two logical pieces: command execution and caching. Command execution requires features like connectivity, execution, and reading of results. These features are enabled with .NET data providers. The DataSet handles caching of results.
The provider enables connectivity and command execution against underlying data sources. Note that these data sources do not have to be relational databases. Once a command has been executed, the results can be read using a DataReader. A DataReader provides efficient, forward-only, stream-level access to the results. In addition, results can be used to fill a DataSet via a DataAdapter. This is typically called "filling the DataSet."
The DataSet object represents a disconnected cache of data. This cache is made up of DataTables and DataRelations that represent the results of the command. The DataSet tracks changes to the underlying data in the cache. Changes can be submitted back to the original data source by using the DataAdapter and applicable Insert, Update, or Delete commands. The DataSet also provides a direct XML view of the underlying data in the cache. This XML can be manipulated with XML standards like XPath and XSLT.
Figure 1: Architecture of ADO.NET
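The XML view mentioned above is directly accessible from code. A small self-contained sketch (the table, column, and file names are arbitrary):
C#
using System;
using System.Data;

class DataSetXmlSketch
{
    static void Main()
    {
        // Build a tiny cache by hand; normally a DataAdapter would fill it.
        DataSet ds = new DataSet("Catalog");
        DataTable products = ds.Tables.Add("Products");
        products.Columns.Add("Name", typeof(string));
        products.Rows.Add("Widget");

        Console.WriteLine(ds.GetXml()); // the direct XML view of the cached data
        ds.WriteXml("catalog.xml");     // persist the cache as XML
    }
}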
I will describe the individual ADO.NET objects that provide the functionality of the ADO Recordset.
DataReader: Provides a forward-only, read-only stream of data from a data source. The DataReader is similar to a Recordset with CursorType = adOpenForwardOnly and LockType = adLockReadOnly.
DataSet: Provides in-memory access to relational data. The DataSet is independent of any specific data source and can therefore be populated from multiple and differing data sources, including relational databases and XML, or with data local to the application. Data is stored in a collection of one or more tables and can be accessed non-sequentially and without limits to availability, unlike ADO, in which data must be accessed a single row at a time. A DataSet can contain relationships between tables, similar to the ADO Recordset in which a single result set is created from a JOIN. A DataSet can also contain unique, primary key, and foreign key constraints on its tables. The DataSet is similar to a Recordset with CursorLocation = adUseClient, CursorType = adOpenStatic, and LockType = adLockOptimistic; however, the DataSet has extended capabilities over the Recordset for managing application data.
DataAdapter: Populates a DataSet with data from a relational database and resolves changes in the DataSet back to the data source. The DataAdapter enables the user to explicitly specify behavior that the Recordset performs implicitly.
Cursors: In ADO, within a common Recordset object, it is possible to request multiple and differing cursor types (dynamic, keyset, static, and forward-only) with different properties that define how the cursor behaves; for example, whether the cursor is updateable or read-only, or whether it is implemented on the client or on the server.
In ADO.NET, however, different classes are exposed that give greater control over each type of interaction. The DataReader provides an extremely fast, forward-only, read-only cursor on the server side that enables the user to retrieve a stream of results from the database.
The DataSet provides a completely disconnected "client" cursor, through which the user can scroll and update, equivalent to the static cursor in ADO. These objects, along with the DataAdapter, enable the user to move data between the DataSet and the database, and provide the user with optimal access methods for the most common types of data interaction.
Version 1.0 of ADO.NET does not expose a scrollable, updateable server-side cursor. Applications that require scrolling and positioned updates on the client side generally involve user interaction.
Because server-side cursors require state to be held on the server, an application that holds these valuable resources while users interact with the data on the client side will not be robust or scalable. Most applications that currently use scrollable server-side cursors can be written much more efficiently according to one of the following designs:
• Use stored procedures to handle custom logic, so it runs on the server instead of the client.
• Use a forward-only, read-only cursor to return data from the server, and execute commands to process any updates.

• Populate a DataSet with results, modify the data locally, and then propagate those changes back to the server.
In the next sections, I will examine each of these objects in detail.
NOTE: This article is for purely educational purposes. This article is entirely original, unless specified. Any resemblance to other material is an unintentional coincidence and should not be misconstrued as malicious, slanderous, or anything else.

Introduction to ADO.NET
Benefits of ADO.NET
ADO.NET offers several advantages over previous versions of ADO and over other data access components. These benefits fall into the following categories:
Interoperability
ADO.NET applications can take advantage of the flexibility and broad acceptance of XML. Because XML is the format for transmitting datasets across the network, any component that can read the XML format can process data.
In fact, the receiving component need not be an ADO.NET component at all: The transmitting component can simply transmit the dataset to its destination without regard to how the receiving component is implemented. The destination component might be a Visual Studio application or any other application implemented with any tool whatsoever.
The only requirement is that the receiving component be able to read XML. As an industry standard, XML was designed with exactly this kind of interoperability in mind.
Maintainability
In the life of a deployed system, modest changes are possible, but substantial, architectural changes are rarely attempted because they are so difficult. That is unfortunate, because in a natural course of events, such substantial changes can become necessary.
For example, as a deployed application becomes popular with users, the increased performance load might require architectural changes. As the performance load on a deployed application server grows, system resources can become scarce and response time or throughput can suffer.
Faced with this problem, software architects can choose to divide the server's business-logic processing and user-interface processing onto separate tiers on separate machines. In effect, the application server tier is replaced with two tiers, alleviating the shortage of system resources.
The problem is not designing a three-tiered application. Rather, it is increasing the number of tiers after an application is deployed. If the original application is implemented in ADO.NET using datasets, this transformation is made easier.
Remember, when you replace a single tier with two tiers, you arrange for those two tiers to trade information. Because the tiers can transmit data through XML-formatted datasets, the communication is relatively easy.
Programmability
ADO.NET data components in Visual Studio encapsulate data access functionality in various ways that help you program more quickly and with fewer mistakes. For example, data commands abstract the task of building and executing SQL statements or stored procedures.
Similarly, ADO.NET data classes generated by the designer tools result in typed datasets. This in turn allows you to access data through typed programming. For example, consider the following line of code that accesses a data member in an untyped dataset:
C#
if (totalCost >
(double)dataSet1.Tables["Customers"].Rows[n]["AvailableCredit"])
The equivalent line to access a data member in a typed dataset looks like the following:
C#
if (totalCost > dataSet2.Customers[n].AvailableCredit)
The code for the typed dataset is easier to read. It is also easier to write, because statement completion is provided. For example, "AvailableCredit" is among the list of choices for completing the following statement:
if (totalCost > dataSet2.Customers[n].
Finally, the code for the typed dataset is safer, because it provides for the compile-time checking of types. For example, suppose that AvailableCredit is expressed as a currency value.
If the programmer erroneously assigns a string value to AvailableCredit, the environment would report the error to the programmer during compile time. When working with untyped datasets, the programmer would not learn of the error until run time.
Performance
For disconnected applications, ADO.NET datasets offer performance advantages over ADO disconnected recordsets. When using COM marshalling to transmit a disconnected recordset among tiers, a significant processing cost can result from converting the values in the recordset to data types recognized by COM. In ADO.NET, such data-type conversion is not necessary.
Scalability
Because the Web can vastly increase the demands on your data, scalability has become critical. Internet applications have a limitless supply of potential users. Although an application might serve a dozen users well, it might not serve hundreds, or hundreds of thousands, equally well.
An application that consumes resources such as database locks and database connections will not serve high numbers of users well, because the user demand for those limited resources will eventually exceed their supply.
ADO.NET accommodates scalability by encouraging programmers to conserve limited resources. Because any ADO.NET application employs disconnected access to data, it does not retain database locks or active database connections for long durations.

Introduction to ADO.NET
What is the main difference between ADO and ADO.NET?
ADO and ADO.NET are different in several ways:
• ADO works with connected data. This means that when you access data, such as viewing and updating data, it is real-time, with a connection being used all the time. This is barring, of course, your programming special routines to pull all your data into temporary tables.
ADO.NET uses data in a disconnected fashion. When you access data, ADO.NET makes a copy of the data using XML. ADO.NET only holds the connection open long enough to either pull down the data or to make any requested updates. This makes ADO.NET efficient to use for Web applications. It's also decent for desktop applications.
• ADO has one main object that is used to reference data, called the Recordset object. This object basically gives you a single table view of your data, although you can join tables to create a new set of records. With ADO.NET, you have various objects that allow you to access data in various ways. The DataSet object will actually allow you to store the relational model of your database. This allows you to pull up customers and their orders, accessing/updating the data in each related table individually.
• ADO allows you to create client-side cursors only, whereas ADO.NET gives you the choice of either using client-side or server-side cursors. In ADO.NET, classes actually handle the work of cursors. This allows the developer to decide which is best. For Internet development, this is crucial in creating efficient applications.
• Whereas ADO allows you to persist records in XML format, ADO.NET allows you to manipulate your data using XML as the primary means. This is nice when you are working with other business applications, and it also helps when you are working with firewalls, because the data is passed as XML over HTTP.
ADO.NET is a stateless mechanism. You can treat ADO.NET as a separate in-memory database, wherein you can use relationships between tables and perform selects, inserts, and updates against it, then apply the changes to the actual database as a batch.
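Here is a sketch of that batch-style workflow: fill a DataSet, modify it while disconnected, then push all changes back with a single Update call. The connection string and table are assumptions, and SqlCommandBuilder is just one convenient way to generate the update commands:
C#
using System.Data;
using System.Data.SqlClient;

class BatchUpdateSketch
{
    static void Main()
    {
        // Connection string and table are assumptions; the SELECT must include the primary key.
        string connStr = "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;";
        SqlDataAdapter adapter =
            new SqlDataAdapter("SELECT CustomerID, CompanyName FROM Customers", connStr);
        SqlCommandBuilder builder = new SqlCommandBuilder(adapter); // derives INSERT/UPDATE/DELETE

        DataSet ds = new DataSet();
        adapter.Fill(ds, "Customers");   // connect briefly, then disconnect

        // Work entirely in memory.
        ds.Tables["Customers"].Rows[0]["CompanyName"] = "Renamed Co.";

        adapter.Update(ds, "Customers"); // reconnect briefly and apply all changes as a batch
    }
}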

Standardized Database Objects and Design
Creating Databases
Database design is probably one of the most misunderstood areas of database work. It’s also one of the most essential parts. In this chapter, we will cover the basics of maintainable, normalized design, and offer general guidelines, including useful tips and tricks.
In an ideal world, every database would be fully normalized, optimized for speed, and designed to make security integral to the structure. Of course, we don’t live in an ideal world. The key to keeping your life simple is to do the work up front. Because the database is usually the most important part of any application, it’s important to set it up correctly now – to avoid heartache later.
Trying to make changes to a long-standing database is incredibly difficult and usually results in breaking other systems. Once a database is in production use, it becomes very difficult to change. In other words, any mistakes made during design will be there weeks, months, and even years down the line, which doesn’t do much for the original developer’s reputation.
Before you start work on a database, make sure you possess all of the facts regarding the applications that will be using it. The more information you can gather about the uses for the database, the better you can design it to suit those needs. Here are some of the questions you should ask before proceeding:
• Do you understand all the logical units (objects) of the application using the database?
• What are the ways in which people will want to query/manage the data now?
• Does the data structure support all of the functionality needed in your applications?
• Where are the applications going in their next versions, and do you need to make provisions for that now?
Once you have the answers to these questions, you will be nearly ready to jump in and run some CREATE commands against your database server. First, though, you should lay out all the logical units (objects) of your solution on paper to show how they will be represented as objects in your applications and tables in your database.
By first creating the design on paper, you will be able to identify and solve numerous problems and challenges. Then, after you have the design, run through the preceding questions again to ensure that you have covered all the bases.

Standardized Database Objects and Design
Naming Conventions
Just as important as a solid database design is the naming of your tables, views, and stored procedures. Misnaming or poorly naming your database objects can result in a lot of heartache, both for you and for anyone who has to maintain your application later.
While choosing a naming convention is a personal decision, we will show you the conventions we use and explain why we use them. That way, you can make an informed decision about which convention to adopt.
Keep in mind, however, that the most important rule in naming conventions is consistency. In the following sections, we will go into detail about naming tables and stored procedures; for now, however, here are a few general rules regarding all database objects:
• Do use Pascal Case

• Don’t let the name get too long. Remember: you will have to read it and type it.

• Don’t use the Hungarian notation – in other words, don’t prefix objects such as tables with “tbl”.

• Don’t abbreviate or use acronyms.
Tables
Naming your tables can be very difficult, and if it’s not done correctly, it can result in much confusion down the line. Always use Pascal Case when naming your tables. This means that the first letter of each word is capitalized (e.g., CustomerOrders and InternetUsers). This is the best way to differentiate between SQL keywords such as SELECT, UPDATE, and DELETE in your SQL statements and your table names, which will always be in Pascal Case, and it makes all your queries very easy to understand at a glance.
Hungarian notation should not be used when naming your tables. It’s easy to discover what type an object represents in your database server – e.g. a table can only be a table, so why bother to name it as such? Tables should be named with plurals, such as Orders instead of Order.
Treat each row of a table as an individual thing, such as an order. The table is the bucket for all these individual rows, so it’s named plurally. When a table has multiple words, only the last word should be plural. For example, OrderItems is preferable to OrdersItems, as the table contains a list of Order Items, not a list of Orders Items.
All tables should be named in relation to their scope. This is especially important if the tables are located in a shared database. Name your tables so they relate to the application in which they will be used or to the functionality that they control. For example, a table of users for an intranet should be named IntranetUsers.
Table names should never contain numbers. The wrong way to do this would be to name the tables Sales2006, Sales2007, and so on. Instead, a column called Year should be added to a generic Sales table, and the values 2006 and 2007 placed in the relevant records.
Ensure that no underscores or spaces find their way into your table names. Using Pascal Case for your tables negates the need for underscores in table names, and spaces are only supported by some database servers, so they should be avoided at all costs. Observing these rules will enable you to easily move your entire database schema among many different relational database management servers.
The following sections will walk you through naming conventions for every part of a table’s structure and its associated objects. Just to clarify what we are talking about, here is the CREATE script for a table in SQL Server:
CREATE TABLE [Customers] (
[CustomerID] [int] IDENTITY (1, 1) NOT FOR REPLICATION NOT NULL , 
[CustomerName] [nvarchar] (50) 
COLLATE SQL_Latin1_General_CP1_CI_AS NULL, 
[CustomerAddress] [nvarchar] (50) 
COLLATE SQL_Latin1_General_CP1_CI_AS NULL, 
    CONSTRAINT [PK_CustomersFirstForm] PRIMARY KEY CLUSTERED
    (
       [CustomerID]
     )  ON [PRIMARY]
)  ON [PRIMARY]
GO
Columns
When naming the columns in your tables, keep in mind that the columns already belong to a table, so it is not necessary to include the table name within the column names.
That said, the primary key in any table should be the only exception to the preceding rule of not including the table name in a column. If your table is IntranetUsers, then the primary key column should be named IntranetUsersID. This helps avoid any ambiguity in queries written subsequently. The ID in your column name can appear at either the beginning or the end of the name, so IDIntranetUsers would also be acceptable. Use whichever you prefer – just remember to be consistent throughout your entire database schema.
Foreign keys should exactly match the column they reference, so if you had a column in your IntranetUserSettings table that referred to the primary key IntranetUsersID in the IntranetUsers table, then you would also name it IntranetUsersID. Carefully consider the naming of other columns that store only data, not keys.
Any Boolean fields should pose a question, such as IsPhotographed or HasOwnTeeth, to which true or false provides a clear answer. (We will ignore NULL because that’s just awkward.)
DateTime fields should contain the word DateTime, so a field for storing the Created DateTime for a row should be called CreatedDateTime. If a column is only storing a Time, then it should be named appropriately (e.g. CreatedTime).
It is not necessary to use the word “number” in columns of type integer and other numeric columns, as their data types should show this. This rule can be ignored if the names seem ambiguous within the scope of your table. In addition, string columns should not have “string” or “text” in their name.
Columns storing ambiguous data, such as time periods or speeds, should also contain within the name the measurement used for the units, such as PriceUSDollars, SpeedMilesPerHour, or LeaveRemainingInDays.
Triggers
Triggers should always have a prefix to distinguish them from stored procedures and tables. Choose a self-explanatory prefix you are comfortable with, such as Trig.
All trigger names should include both the table name they are referencing and the events on which they are fired. For example, a trigger on the IntranetUsers table that needs to be fired on both an INSERT and a DELETE would be called TrigIntranetUsersInsertDelete:
CREATE TRIGGER TrigIntranetUsersInsertDelete
ON IntranetUsers
FOR INSERT, DELETE
AS
     EXEC master..xp_sendmail 'Security Monkey',
	'Make sure the new users have been added to the right roles!'
GO
Here is a reference table you can use to check your triggers for conformance to the naming conventions.
Table         | Insert                  | Update                  | UpdateInsert
Customers     | TrigCustomersInsert     | TrigCustomersUpdate     | TrigCustomersUpdateInsert
IntranetUsers | TrigIntranetUsersInsert | TrigIntranetUsersUpdate | TrigIntranetUsersUpdateInsert
Stored Procedures
Everyone likes to do things their own way, and the practice of naming stored procedures is no different. Still, there are some things to keep in mind when naming stored procedures. Use the following questions to create the best possible stored procedure names:
Q. Will the name be easy to find within the database, both now and when there are a lot more procedures?
If the procedure is specific to the application that’s using it, then it’s in the right place and doesn’t need to be named specifically. However, if the procedure is in a general or shared database, then it should be named with respect to the application it’s related to by prefixing the procedure with the name of the application, such as ReportingSuite, EcommerceWebSite, or Intranet.
Q. Does the name relate to the object on which the actions are being performed?
The scope of the procedure is the most vital part of its name. If the procedure is adding customers to a table, then it should contain the word Customer in its name. If the procedure is referring to invoices, then it would include the name Invoice.
Q. Has the procedure been named in a way in which its action can be identified?
Whether the stored procedure is performing a simple SELECT, INSERT, UPDATE, or DELETE, or whether it’s performing a more complicated task, you need to pick a name for the action it’s performing.
For example, if you are inserting rows into the Customer table, you would use, say, Add or Insert. However, if the procedure is performing a more complicated task, such as validating a username and password, then it would include the word Validate in its name.
A procedure that would insert a new record into the Customers table via the Intranet application should be called IntranetCustomerAdd or CustomerAdd depending on whether it’s inside the Intranet database or in a shared / generic database. The procedure to validate the username and password of an intranet user should be called IntranetUserValidate.
A procedure that’s selecting a specific customer from the intranet should be called IntranetCustomerSelect or IntranetCustomerGet, depending on your preferences.
If you were to write a procedure for the Accounting application that needed to return a report of all the invoices for a certain customer, it should be called IntranetCustomerInvoiceGet, as shown in the following example:
CREATE PROC [IntranetCustomerInvoiceGet]
(
	@CustomerID  Int
)	
AS
SELECT * FROM CustomerInvoices
WHERE CustomerID = @CustomerID
GO
If you are working in a multicompany environment, it can also be a good idea to prefix all of your stored procedures with the name of your company, such as eBIZ_IntranetCustomerAdd (this is one of the few circumstances in which underscores could be used).
Note: If you are using SQL Server, do not prefix your stored procedures with “sp_” or “xp_” as this is what SQL Server uses for its internal stored procedures. Not only will this make it difficult to differentiate your custom stored procedures from the database-generated ones, but it will also slow down your applications, as SQL Server checks inside the “Master” database for anything prefixed with “sp_” or “xp_” before looking inside the specified database. If you are using another database server, make sure your procedure names will not clash with any system-specific names.
The following table provides a quick reference for the naming conventions of stored procedures:
Table         | Select          | Insert          | Delete             | Update             | Custom
Customers     | CustomerGet     | CustomerAdd     | CustomerDelete     | CustomerUpdate     | CustomerCustom
IntranetUsers | IntranetUserGet | IntranetUserAdd | IntranetUserDelete | IntranetUserUpdate | IntranetUserCustom

Standardized Database Objects and Design
Primary Keys
Every table has a primary key (or at least should have one). A primary key enables each row to be uniquely identified by a column or combination of columns.
A primary key identifies a row of data in a table, but it does more than that. It also enforces constraints upon the table, enabling checks to be made by the database server to ensure that the data in a row is unique among the other rows in the table by having a different primary key.
The primary key can be defined on just one column or across several, and can be set on different data types. Primary keys are usually assigned a numeric data type, although some people use unique identifiers such as GUIDs. To create a primary key, take a look at the following code sample:
CREATE TABLE jobs
(
	job_id smallint
		IDENTITY (1, 1)
		PRIMARY KEY CLUSTERED,
	job_desc	varchar(50)		NOT NULL
		DEFAULT 'New Position - title not formalized yet',
	min_lvl tinyint NOT NULL
		CHECK ( min_lvl >= 10 ),
	max_lvl tinyint NOT NULL
		CHECK ( max_lvl <= 250 )
)
Here’s a sample for creating a GUID primary key on a table:
CREATE TABLE Globally_Unique_Data
(
	guid uniqueidentifier
		CONSTRAINT Guid_Default
		DEFAULT NEWID(),
	Employee_Name 	varchar(60),
	CONSTRAINT Guid_PK PRIMARY KEY (guid)
)

If the primary key is only on one column, then things are quite straightforward. It’s not possible to have rows with the same value within the primary key column.
If the primary key is made up of multiple columns, it’s the combination of values in the columns defined as the primary key that makes up the unique key. It’s possible for values to be repeated within a single primary key column, but the combination of values across the entire primary key has to be unique.
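For example, a composite primary key for a hypothetical OrderItems table could be declared like this:
CREATE TABLE OrderItems
(
	OrderID int NOT NULL,
	ProductID int NOT NULL,
	Quantity int NOT NULL,
	CONSTRAINT PK_OrderItems PRIMARY KEY (OrderID, ProductID)
)
Here OrderID and ProductID may each repeat on their own, but the (OrderID, ProductID) pair must be unique in every row.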
