
XML - PART III


XML
The eXtensible Markup Language (XML) is a general-purpose markup language. Its primary purpose is to facilitate the sharing of data across different information systems, particularly via the Internet. It is a simplified subset of the Standard Generalized Markup Language (SGML), and is designed to be relatively human-legible.
By adding semantic constraints, application languages can be implemented in XML. These include XHTML, RSS, MathML, GraphML, Scalable Vector Graphics (SVG), MusicXML, and thousands of others.
Moreover, XML is sometimes used as the specification language for such application languages.
What makes XML truly powerful is its broad acceptance and the hard work done by people who work with databases, programming, office applications, and so on.
Because of that work, tools exist on virtually every platform to convert data into standardized XML, or to convert XML into the format used by that platform.


XML Advanced
Namespaces
Namespaces are a method for distinguishing XML elements and attributes that have the same name but different meanings. For example, ADDRESS is a tag that can be used to identify totally different data elements such as "street address" and "IP address."
An XML Namespace is a Prefix
An XML namespace is identified by a URI (usually a URL), which is bound to a short prefix placed in front of the "local name." The combination of namespace URL and local name makes the element or attribute name unique. The URL is used only as a way to create a unique identifier and does not have to resolve to a real page on the Internet. However, a document that provides information about the namespace may be stored where the URL points.
Name Conflicts
Since element names in XML are not predefined, a name conflict can occur when two different documents use the same element name for different things.
This XML document carries details about an associate:
<?xml version="1.0" encoding="ISO-8859-1"?>
<ebiz>
<details>
    <name>Dharmedra Das</name>
    <address>D 210, sector 13</address>
    <city>Noida</city>
    <state>UP</state>
    <pin>201301</pin>
</details>
</ebiz>
This XML document carries details about a book:
<?xml version="1.0" encoding="ISO-8859-1"?>
<ebiz>
<details>
    <title>XML Tutorial</title>
    <author>Ravish Tiwari</author>
    <date>29/04/2007</date>
    <ISBN>1-23343-235-2</ISBN>
    <publisher>eBIZ.com Publication</publisher>
</details>
</ebiz>
If these two XML documents were added together, there would be an element name conflict because both documents contain a <details> element with different content and definition.
Namespace prefixes resolve the conflict. This XML document carries details about an associate, with every element qualified by the associate prefix:
<?xml version="1.0" encoding="ISO-8859-1"?>
<associate:ebiz xmlns:associate="http://www.ebizel.com/">
<associate:details>
    <associate:name>Dharmedra Das</associate:name>
    <associate:address>D 210, sector 13</associate:address>
    <associate:city>Noida</associate:city>
    <associate:state>UP</associate:state>
    <associate:pin>201301</associate:pin>
</associate:details>
</associate:ebiz>
This XML document carries details about a book:
<?xml version="1.0" encoding="ISO-8859-1"?>
<tutorial:ebiz xmlns:tutorial="http://education.ebizel.com/">
<tutorial:details>
    <tutorial:title>XML Tutorial</tutorial:title>
    <tutorial:author>Ravish Tiwari</tutorial:author>
    <tutorial:date>29/04/2007</tutorial:date>
    <tutorial:ISBN>1-23343-235-2</tutorial:ISBN>
    <tutorial:publisher>eBIZ.com Publication</tutorial:publisher>
</tutorial:details>
</tutorial:ebiz>
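To see why the prefixes help, here is a minimal sketch in Python (standard library only) that puts both kinds of <details> element into a single document and reads them back separately. The combined document, the NS mapping and the read_combined function are illustrative assumptions, not part of the original files; only the two namespace URLs come from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical combined document: both <details> elements live side by side,
# each qualified by the namespace URL used in the examples above.
COMBINED = """<ebiz xmlns:associate="http://www.ebizel.com/"
      xmlns:tutorial="http://education.ebizel.com/">
  <associate:details>
    <associate:name>Dharmedra Das</associate:name>
  </associate:details>
  <tutorial:details>
    <tutorial:title>XML Tutorial</tutorial:title>
  </tutorial:details>
</ebiz>"""

# Prefix-to-URL mapping used only for the queries below.
NS = {
    "associate": "http://www.ebizel.com/",
    "tutorial": "http://education.ebizel.com/",
}


def read_combined(xml_text):
    """Return the associate name, the book title and the child tag names."""
    root = ET.fromstring(xml_text)
    name = root.find("associate:details/associate:name", NS).text
    title = root.find("tutorial:details/tutorial:title", NS).text
    # ElementTree stores every tag as "{namespace-URL}local-name".
    tags = [child.tag for child in root]
    return name, title, tags


name, title, tags = read_combined(COMBINED)
print(name, "|", title)
print(tags)
```

Both children have the local name details, but their full names differ because the namespace URL is part of the name, so the parser never confuses them.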

CDATA
CDATA sections are a convenience for including large blocks of text that contain special characters without having to use entity references all the time. They are used to escape blocks of text containing characters that would otherwise be recognized as markup. The XML parser does not interpret anything written inside a CDATA section: tags and entity references are not recognized there, and the XML processor treats them just like any other character data. CDATA stands for Character Data.
Remember: the parser does not look for markup inside a CDATA section; everything in it is passed through as character data.
Uses of CDATA sections
Character data is character data, regardless of whether it is expressed via a CDATA section or ordinary markup. New authors of XML documents often misunderstand the purpose of a CDATA section, mistakenly believing that its purpose is to "protect" data from being treated as ordinary character data during processing. Some APIs for working with XML documents do offer options for independent access to CDATA sections, but such options exist above and beyond the normal requirements of XML processing systems, and still do not change the implicit meaning of the data. CDATA is not meant for ordinary text; it should be used where your data contains special characters, for example when you want to enclose JavaScript code in your XML file. Left unescaped, this kind of data is bound to cause errors during parsing, so use a CDATA section in such cases.
Remember, a CDATA section cannot contain the string "]]>" and therefore it is not possible for a CDATA section to contain nested CDATA sections. CDATA sections are useful for writing XML code as text data within an XML document. For example, if one wishes to typeset a book with XSL explaining the use of an XML application, the XML markup to appear in the book itself will be written in the source file in a CDATA section.
To encode "]]>" itself, one would split it across two CDATA sections: <![CDATA[]]]]><![CDATA[>]]>. A more common use is to wrap a script, so that characters such as "<" are not read as markup:
<script>
<![CDATA[
var salary;
function showSalary(amount)
{
    if (amount<=5000)
    {
        salary=amount+(amount*.10);
    }
    else
    {
        salary=amount+(amount*.08);
    }
}
]]> </script>
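The point that CDATA is only a different way of writing the same character data can be checked with a short Python sketch (standard library only). The three tiny documents below are made up for illustration; they are not the author's files.

```python
import xml.etree.ElementTree as ET

# The same script text written twice: once in a CDATA section,
# once with entity references.
cdata_doc = "<script><![CDATA[if (amount<=5000) { a = b && c; }]]></script>"
escaped_doc = "<script>if (amount&lt;=5000) { a = b &amp;&amp; c; }</script>"

cdata_text = ET.fromstring(cdata_doc).text
escaped_text = ET.fromstring(escaped_doc).text

# "]]>" cannot appear inside one CDATA section, so it is split across two:
# the first section holds "]]", the second holds ">".
split_doc = "<data><![CDATA[]]]]><![CDATA[>]]></data>"
split_text = ET.fromstring(split_doc).text

print(cdata_text == escaped_text)
print(split_text)
```

After parsing, an application sees identical text in both cases; the CDATA wrapper only spares the author from escaping every "<" and "&" by hand.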

PCDATA
PCDATA stands for Parsed Character Data. Unlike the content of a CDATA section, this data is parsed by the XML parser, so special characters cannot be used in it unescaped.
"#PCDATA" is a token used in an element declaration to declare the element as having mixed content (character data, or character data mixed with other elements). The content of the element is parsed; '&' and '<' have special meaning and must be escaped if they aren't the start of markup. This means that if you use an unescaped special character such as & in PCDATA, the parser will report an error when the XML file is parsed.
<script>
<![CDATA[
var salary;
function showSalary(amount)
{
    if (amount<=5000)
    {
        salary=amount+(amount*.10);
    }
    else
    {
        salary=amount+(amount*.08);
    }
}
]]> </script>
Remember: if this script were written inside an XML element without the CDATA wrapper, the parser (or browser) would report an error at the "<" in "amount<=5000" while parsing the data.
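The difference between parsed and unparsed text is easy to reproduce. The following Python sketch (standard library only) feeds the same script fragment to a parser unescaped and escaped; the fragment and the helper name is_well_formed are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "if (amount<=5000 && bonus) { pay(); }"

# Unescaped '<' and '&' inside parsed character data are rejected.
unescaped_ok = is_well_formed("<script>" + raw + "</script>")

# escape() replaces &, < and > with entity references.
escaped = escape(raw)
escaped_ok = is_well_formed("<script>" + escaped + "</script>")

# After parsing, the application gets the original text back.
round_trip = ET.fromstring("<script>" + escaped + "</script>").text

print(unescaped_ok, escaped_ok)
print(escaped)
```

Escaping changes only how the text is written in the file; the parsed value is the original string again.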

XML on Work
XML and Server
Generating XML with JSP
XML can be generated on a server without any installed XML software.
To generate an XML response from the server, write the following code, save it as a JSP file on the Tomcat web server, and create a System DSN named "simple" that will be used to connect the JSP to the database.
The Associate table contains the following fields:
Field name    Data type
Name          Varchar(30)
Age           Integer/int
Address       Varchar(30)
City          Varchar(30)
State         Varchar(2)
<%
// Declare the JDBC objects up front so the finally block can clean up.
java.sql.Connection con = null;
java.sql.PreparedStatement pst = null;
java.sql.ResultSet rs = null;
response.setContentType("text/xml");
out.println("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>");
out.println("<ebiz>");
out.println("<details>");
try
{
    // Load the JDBC-ODBC bridge driver (removed in Java 8; on newer JDKs
    // use the database vendor's own JDBC driver and URL instead)
    Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
    // "simple" is the name of the DSN used to connect to the desired database
    con = java.sql.DriverManager.getConnection("jdbc:odbc:simple");
    // "associate" is the name of the database table
    pst = con.prepareStatement("select * from associate");
    rs = pst.executeQuery();
    // Only the first row is written to the response.
    // Note: values are written as-is; a value containing & or < would need escaping.
    if (rs.next())
    {
        out.println("<name>" + rs.getString("name") + "</name>");
        out.println("<age>" + rs.getInt("age") + "</age>");
        out.println("<address_str>" + rs.getString("address") + "</address_str>");
        out.println("<city>" + rs.getString("city") + "</city>");
        out.println("<state>" + rs.getString("state") + "</state>");
    }
}
catch (Exception e)
{
    out.println("<name>Unable to retrieve details from server</name>");
}
finally
{
    out.println("</details></ebiz>");
    // con is still null if the connection was never opened
    if (con != null)
    {
        try { con.close(); } catch (java.sql.SQLException ignored) { }
    }
}
%>
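The same idea, reading one row and returning it as XML, can be sketched outside JSP as well. The Python version below is only an illustration under stated assumptions: an in-memory SQLite database stands in for the "simple" DSN, the sample row is invented, and associate_xml is a hypothetical helper name. It builds the response with an XML library instead of string concatenation, so special characters in the data are escaped automatically.

```python
import sqlite3
import xml.etree.ElementTree as ET


def associate_xml(con):
    """Return the first row of the associate table as an XML string."""
    root = ET.Element("ebiz")
    details = ET.SubElement(root, "details")
    row = con.execute(
        "select name, age, address, city, state from associate"
    ).fetchone()
    if row is not None:
        tags = ("name", "age", "address_str", "city", "state")
        for tag, value in zip(tags, row):
            # Assigning to .text lets the serializer escape & and < for us.
            ET.SubElement(details, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Stand-in database with one made-up row.
con = sqlite3.connect(":memory:")
con.execute(
    "create table associate (name varchar(30), age integer, "
    "address varchar(30), city varchar(30), state varchar(2))"
)
con.execute(
    "insert into associate values (?, ?, ?, ?, ?)",
    ("Das & Sons", 30, "D 210, sector 13", "Noida", "UP"),
)
xml_out = associate_xml(con)
con.close()

print(xml_out)
```

The "&" in the sample name comes out as an entity reference, which keeps the response well formed.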

XML in Real Life
Content Management
Content management is an integral part of any enterprise. With a content management system, everybody has access to the information they need and are allowed to see, as and when required.
The following example shows how an effective content management system can help a company improve customer satisfaction and loyalty, increase revenue and compete with third party and after-market parts suppliers.
Content Management Backbone
What is Content Management?
"Content management encompasses a set of processes and technologies, enabling the creation and packaging of content (documents, complex media, applets, components, etc.) as part of a dynamic and integrated Web-centric environment."
META Group
Let's examine this definition further. First, content management requires new, enabling technology. Second, content is not merely documents and words, but graphics, audio, video clips, live feeds and software components. This leads to two related questions that will be discussed later in this paper: What constitutes content? How can content be reused? Finally, the definition is decidedly web-oriented, bringing up two questions: Can content be shared between web and non-web uses? Are there new technologies that can assist in this process?
Enterprise content management emphasizes the need to address content management across all forms and formats of information stored throughout the extended enterprise. Within the enterprise, information is created using a wide variety of methods and tools, and this information is revised frequently. Enterprise content management takes the smallest, most appropriate units of information and allows them to be re-purposed and delivered in a personalized form to the individual requiring information.
What is Content?
In understanding content management, it may be helpful to distinguish it from document management and knowledge management. Document management deals with maintaining and storing documents. Knowledge management is concerned with making information accessible for decision making through index, query and search mechanisms. While content management shares some of the attributes of both document management (storing information) and knowledge management (accessing information), it goes beyond them to create a system for re-purposing and using information to drive business processes.
The traditional method used to transfer a document into a document management repository won't work with a live feed. Nor will the standard method of linking, object linking and embedding (OLE). Because "content" encompasses a wide variety of information objects, an expanded repository and a new system of linking are required. Content management requires a new enabling technology, to accommodate the dynamic array of information. With
One of the biggest advantages of a content management system is that content can be created using the best available tools for the job. Simple text documents can be created in Word, while engineering drawings are built in sophisticated CAD/CAM software. Complex technical documents might use Interleaf 7; marketing literature, on the other hand, may be created using Adobe Illustrator, CorelDraw, etc.
Structure and Format
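The first line below marks up content for presentation (HTML); the other three identify the same content by its structure (XML):
<P>Associate details<br /> Name: Dharampal Das</P>
<ebiz><associate><name>Dharampal Das</name></associate></ebiz>
<ebiz><associate-name>Dharampal Das</associate-name></ebiz>
<ebiz><name isAssociate="yes">Dharampal Das</name></ebiz>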
The idea of identifying content is not new; numerous authoring tools define information formatting.
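Structural markup pays off when a program has to find a piece of content. The following Python sketch (standard library only) pulls the associate's name out of each of the three XML variants shown above; the variants dictionary and the path expressions are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Each entry: the XML variant and the path that locates the name in it.
variants = {
    "nested": (
        "<ebiz><associate><name>Dharampal Das</name></associate></ebiz>",
        "associate/name",
    ),
    "hyphenated": (
        "<ebiz><associate-name>Dharampal Das</associate-name></ebiz>",
        "associate-name",
    ),
    "attribute": (
        '<ebiz><name isAssociate="yes">Dharampal Das</name></ebiz>',
        "name[@isAssociate='yes']",
    ),
}

names = {
    label: ET.fromstring(doc).find(path).text
    for label, (doc, path) in variants.items()
}

print(names)
```

In the HTML line the name is just text after "Name:", so a program would have to guess where it starts; in the XML variants the markup itself says which text is the name.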
The SGML Approach
The idea that markup should be standard and separate from format information led to the creation of Standard Generalized Markup Language (SGML) in 1978. SGML became an ISO standard in 1986. SGML provided two key markup innovations. The first was to provide a language for describing markup, not just a particular set of markup elements. The second was to separate the tagging of content from its presentation or style. In other words, you do not mark up content according to SGML, you write an SGML application that tags the content according to the rules set forth in the Document Type Definition (DTD). These rules do not define whether or not the content is centered or bold. Instead they define the structural elements that the content represents. DTD is used as the grammar for SGML documents. The DTD defines what SGML elements are and an application aware of the specific DTD tags the content accordingly.
In looking at DHTML with CSS, Microsoft made the following observation in a technical perspective on XML:
"CSS can still be used for simply structured XML data-and we anticipate that in such situations it will be useful. However, CSS does not provide a display structure that deviates from the structure of the data source. With XSL (eXtensible Style Language), it is possible to generate presentation structures (in HTML for instance) that are very different from the original XML data structure."
HTML is perhaps the easiest and fastest way to present information, but it has content delivery limitations, especially when content is stored in databases, when there are complex interrelationships, and when the content is bound dynamically at the time of delivery. In other words, HTML falls short when the content to be presented is dynamic. Without the ability to generate multiple, different presentations and to deliver a variety of content dynamically, enterprise content management is not possible.
Enter XML
In 1996, the W3C (World Wide Web Consortium) and 80 SGML experts joined forces to develop a permanent solution to the problems of HTML. As a result, a new language called XML (eXtensible Markup Language) was developed, together with a new style language called XSL (eXtensible Stylesheet Language) and, later, a new link language called XLink (XML Linking Language). XML is a simplified subset of SGML that is easy to use, designed specifically for the Web, and oriented toward content structure, not style.
XML is among the fastest growing web technologies. It has a number of advantages over SGML. First, XML is simpler to use and process than SGML, making it more likely that low cost tools that accept XML will be widely available. Second, because XML has been developed as an enhancement to the Web, it has broad industry support. XML has been adopted by many companies such as Sun Microsystems and Microsoft, giving it a prominent place in Unix and Windows workplaces. Third, significant progress has already been made in defining standard XML DTDs for a variety of applications.
XML, combined with Java and object-oriented data technology, has become the enabling technology that makes enterprise content management possible.
True Enterprise Content Management
To achieve these goals a content management system must allow content to be:
Created using familiar tools at any place within the enterprise
Structured and accessed in units appropriate to its meaning  
Personalized and used in one-to-one marketing  
Reused as often and in any combination desired  
Easily updated and kept current  
Faithfully rendered in a variety of presentation media. 

XML Technologies
“XML is going to be the future and standard of all web development in coming days.”
XML and related technologies are becoming the de facto standard for web technologies and B2B (business-to-business) development.
XML is a simple, flexible, text-based language that gives us the freedom to define how our data is represented (what contains what). This flexibility makes web developers' work easier, because XML lets them exchange data regardless of the platform they use. XML is all about the structure of data; it knows nothing about how to present that data.
Major vendors such as Sun Microsystems, Microsoft, Oracle and others either support XML or use XML-related technologies.
XML based Technologies
XHTML (Extensible HTML)
A stricter version of HTML: HTML tags plus XML rules. XHTML does not add any new tags; instead, it forces the author to write valid, well-formed HTML that conforms to XML rules.
XHTML stands for EXtensible HyperText Markup Language
XHTML is aimed to replace HTML
XHTML is almost identical to HTML 4.01
XHTML is a stricter and cleaner version of HTML
XHTML is HTML defined as an XML application
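What "HTML tags plus XML rules" means in practice can be seen by handing markup to an XML parser. This Python sketch (standard library only) checks well-formedness only, not validity against the XHTML DTD; the two sample strings and the helper name parses_as_xml are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup as well formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Common HTML habit: <br> is left open, which XML rules do not allow.
html_style = "<p>Associate details<br> Name: Dharampal Das</p>"

# XHTML style: every element is closed, here with the empty-element form.
xhtml_style = "<p>Associate details<br /> Name: Dharampal Das</p>"

html_ok = parses_as_xml(html_style)
xhtml_ok = parses_as_xml(xhtml_style)

print(html_ok, xhtml_ok)
```

A browser's HTML parser would tolerate the first string, but an XML parser stops at the unclosed element, which is exactly the strictness XHTML asks of authors.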
XML DOM (XML Document Object Model)
The XML Document Object Model (XML DOM) defines a standard way for accessing and manipulating XML documents.
The DOM presents an XML document as a tree structure (a node tree), in which the elements, their text, and their attributes are all nodes. Every node can be accessed through the DOM tree: its contents can be modified or deleted, and new elements can be created.
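The four operations mentioned above (access, modify, delete, create) look roughly like this with Python's built-in minidom module, which follows the W3C DOM method names. The small input document and the replacement value "Delhi" are made up for this sketch.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<ebiz><details><name>Dharmedra Das</name><city>Noida</city></details></ebiz>"
)
details = doc.getElementsByTagName("details")[0]

# Access: the text of <city> is a child text node of the element node.
city = details.getElementsByTagName("city")[0]
original_city = city.firstChild.data

# Modify: change the text node in place.
city.firstChild.data = "Delhi"

# Create: build a new element with a text child and attach it to the tree.
state = doc.createElement("state")
state.appendChild(doc.createTextNode("UP"))
details.appendChild(state)

# Delete: remove the <name> element from its parent.
name = details.getElementsByTagName("name")[0]
details.removeChild(name)

result = doc.documentElement.toxml()
print(original_city)
print(result)
```

Note that the text inside an element is a node of its own, which is why the code reads city.firstChild.data rather than a property on the element itself.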
XSL (Extensible Style Sheet Language)
XSL consists of three parts: XSLT, a language for transforming XML documents; XPath, a language for navigating in XML documents; and XSL-FO, a language for formatting XML documents.
XSLT (XSL Transformations)
is used to transform XML documents into other XML formats, like XHTML.
XPath
is a language for navigating in XML documents.
XSL-FO (Extensible Style Sheet Language Formatting Objects)
is an XML based markup language describing the formatting of XML data for output to screen, paper or other media.
XLink (XML Linking Language)
is a language for creating hyperlinks in XML documents.
XPointer (XML Pointer Language)
allows the XLink hyperlinks to point to more specific parts in the XML document.
DTD (Document Type Definition)
is used to define the legal elements in an XML document.
XSD (XML Schema)
is an XML-based alternative to DTDs.
XForms (XML Forms)
uses XML to define form data.
XQuery (XML Query Language)
is designed to query XML data.
SOAP (Simple Object Access Protocol)
is an XML-based protocol to let applications exchange information over HTTP.
WSDL (Web Services Description Language)
is an XML-based language for describing web services.
RDF (Resource Description Framework)
is an XML-based language for describing web resources.
RSS (Really Simple Syndication)
is a format for syndicating news and the content of news-like sites.
WAP (Wireless Application Protocol)
was designed to show internet contents on wireless clients, like mobile phones.
SMIL (Synchronized Multimedia Integration Language)
is a language for describing audiovisual presentations.
SVG (Scalable Vector Graphics)
defines graphics in XML format.
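Of the technologies listed above, XPath is one of the easiest to try out. Python's ElementTree module supports a limited subset of XPath, which is enough for a minimal sketch; the sample document (including its second, invented book record and the lang attribute) is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

library = ET.fromstring(
    "<ebiz>"
    "<details lang='en'><title>XML Tutorial</title>"
    "<author>Ravish Tiwari</author></details>"
    "<details lang='hi'><title>XSLT Basics</title>"
    "<author>Sample Author</author></details>"
    "</ebiz>"
)

# Path steps: every <title> that is a child of a <details> child of the root.
all_titles = [t.text for t in library.findall("./details/title")]

# Attribute predicate: only <details> elements whose lang attribute is "en".
english = [t.text for t in library.findall("./details[@lang='en']/title")]

# Child-text predicate: the <details> whose <author> has the given text.
by_author = library.find(".//details[author='Ravish Tiwari']/title").text

print(all_titles)
print(english)
print(by_author)
```

Full XPath, as used inside XSLT, offers much more (functions, axes, arithmetic); the subset here covers only simple paths and predicates.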
List of XML Schemas
This is a list of XML schemas in use on the Internet sorted by purpose. XML schemas can be used to create XML documents for a wide range of purposes such as syndication, general exchange, and storage of data in a standard format.
Bookmarks
XBEL
XML Bookmark Exchange Language.
Graphical User Interfaces
GLADE
GNOME's User Interface Language (GTK+)
KParts
KDE's User Interface Language (Qt)
XUL
XML User Interface Language (Native) 
Mathematical
MathML
Mathematical Markup Language. 
Metadata
RDF
Resource Description Framework 
Music Playlists
XSPF
XML Shareable Playlist Format 
News Syndication
Atom
Atom feed
RSS
Really Simple Syndication
Paper and Forest Products
papiNet
XML format for exchange of business documents and product information in the paper and forest products industries. 
Statistics
SDMX
SDMX-ML is a format for eXchange and sharing of Statistical Data and Metadata 
Vector Images
SVG
Scalable Vector Graphics 
