
Thursday, June 16, 2011

Chapter 5 MIS


Key Terms
Attribute — a piece of information describing a particular entity.

Business Intelligence — Applications and technologies to help users make better business decisions.

Data administration — a special organizational function for managing the organization’s data resources that is concerned with information policy, data planning, maintenance of data dictionaries, and data quality standards.

Data cleansing — activities for detecting and correcting data in a database or file that are incorrect, incomplete, improperly formatted, or redundant. Also known as data scrubbing.

Data definition language — the component of a database management system that defines each data element as it appears in the database.

Data dictionary — an automated or manual tool for storing and organizing information about the data maintained in a database.

Data governance — deals with the policies and processes for managing the availability, usability, integrity, and security of the data employed in an enterprise, with special emphasis on promoting privacy, security, data quality, and compliance with government regulations.

Data inconsistency — the presence of different values for the same attribute when the same data are stored in multiple locations.

Data manipulation language — a language associated with a database management system that end users and programmers use to manipulate data in the database.

Data mart — a small data warehouse containing only a portion of the organization’s data for a specified function or population of users.

Data mining — analysis of large pools of data to find patterns and rules that can be used to guide decision making and predict future behavior.

Data quality audit — a survey and/or sample of files to determine accuracy and completeness of data in an information system.

Data redundancy — the presence of duplicate data in multiple data files.

Data warehouse — a database with reporting and query tools that stores current and historical data extracted from various operational systems and consolidated for management reporting and analysis.

Database — a group of related files.

Database (rigorous definition) — a collection of data organized to service many applications at the same time by storing and managing data so that they appear to be in one location.

Database administration — refers to the more technical and operational aspects of managing data, including physical database design and maintenance.

Database management system (DBMS) — special software to create and maintain a database and enable individual business applications to extract the data they need without having to create separate files or data definitions in their computer programs.

Database server — a computer in a client/server environment that is responsible for running a database management system (DBMS) to process structured query language (SQL) statements and perform database management tasks.

Distributed database — a database that is stored in more than one physical location. Parts or copies of the database are physically stored in one location, and other parts or copies are stored and maintained in other locations.

Entity — a person, place, thing, or event about which information must be kept.

Entity-relationship diagram — a methodology for documenting databases illustrating the relationship between various entities in the database.

Field — a grouping of characters into a word, a group of words, or a complete number, such as a person’s name or age.

File — A group of records of the same type.

Foreign key — field in a database table that enables users to find related information in another database table.

Information policy — formal rules governing the maintenance, distribution, and use of information in an organization.

Key field — a field in a record that uniquely identifies instances of that record so that it can be retrieved, updated, or sorted.

Normalization — the process of creating small stable data structures from complex groups of data when designing a relational database.

Object-oriented DBMS — an approach to data management that stores both data and the procedures acting on the data as objects that can be automatically retrieved and shared; the objects can contain multimedia.

Object-relational DBMS — a database management system that combines the capabilities of a relational database management system (DBMS) for storing traditional information and the capabilities of an object-oriented DBMS for storing graphics and multimedia.

Online analytical processing (OLAP) — capability for manipulating and analyzing large volumes of data from multiple perspectives.

Predictive analysis — use of data mining techniques, historical data, and assumptions about future conditions to predict outcomes of events.

Primary key — unique identifier for all the information in any row of a database table.

Program-data dependence — the close relationships between data stored in files and the software programs that update and maintain those files. Any change in data organization or format requires a change in all the programs associated with those files.

Record — a group of related fields.

Relational DBMS — a type of logical database model that treats data as if they were stored in two-dimensional tables. It can relate data stored in one table to data in another as long as the two tables share a common data element.

Structured query language (SQL) — the standard data manipulation language for relational database management systems.

Tuple — a row or record in a relational database.

Review Questions

1. What are the problems of managing data resources in a traditional file environment and how are they solved by a database management system?

List and describe each of the components in the data hierarchy.

The data hierarchy includes bits, bytes, fields, records, files, and databases. Data are organized in a hierarchy that starts with the bit, which is represented by either a 0 (off) or a 1 (on). A group of bits forms a byte, which represents a single character. A grouping of characters into a word or a complete number forms a field; a group of related fields makes up a record; a group of records of the same type constitutes a file; and a group of related files makes up a database.
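
A minimal Python sketch of the same hierarchy, using hypothetical student data: characters (each stored as a byte of bits) form a field, related fields form a record, records of the same type form a file, and related files form a database.

# Hypothetical student data illustrating the data hierarchy.
field = "Maria Lopez"                       # field: a grouping of characters (each character is one byte of 8 bits)
record = {"name": "Maria Lopez",            # record: a group of related fields
          "course": "MIS 101",
          "grade": "A"}
student_file = [record,                     # file: a group of records of the same type
                {"name": "John Smith", "course": "MIS 101", "grade": "B"}]
database = {"STUDENT": student_file,        # database: a group of related files
            "COURSE": [{"course": "MIS 101", "instructor": "Prof. Chen"}]}
print(len(database["STUDENT"]))             # 2 records in the STUDENT file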

Define and explain the significance of entities, attributes, and key fields.

An entity is a person, place, thing, or event about which information must be kept.
An attribute is a piece of information describing a particular entity.
A key field is a field in a record that uniquely identifies instances of that record so that it can be retrieved, updated, or sorted. For example, a person’s name cannot be a key field because another person may have the same name, whereas a Social Security number is unique. Similarly, a product name may not be unique, but a product number can be designed to be unique.
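
A small sketch with hypothetical product records showing why a non-unique attribute makes a poor key: two different products share the same name, so indexing by name silently loses one of them, while indexing by product number keeps both.

# Hypothetical product records; two different products share the same name.
products = [
    {"product_number": "P-1001", "product_name": "Widget", "price": 9.99},
    {"product_number": "P-2002", "product_name": "Widget", "price": 14.99},
]
by_name = {p["product_name"]: p for p in products}      # name is not unique
by_number = {p["product_number"]: p for p in products}  # product number is unique
print(len(by_name))    # 1: the duplicate name overwrote a record
print(len(by_number))  # 2: each record can be retrieved by its key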

List and describe the problems of the traditional file environment.

Problems with the traditional file environment include data redundancy and confusion, program-data dependence, lack of flexibility, poor security, and lack of data sharing and availability. Data redundancy is the presence of duplicate data in multiple data files. Program-data dependence is the tight relationship between data stored in files and the specific programs required to update and maintain those files. Lack of flexibility refers to the fact that it is very difficult to create new reports from data when needed. Poor security results from the lack of control over the data because the data are so widespread. Data sharing is virtually impossible because it is distributed in so many different files around the organization.

Define a database and a database management system and describe how it solves the problems of a traditional file environment.

A database is a collection of data organized to service many applications efficiently by storing and managing data so that they appear to be in one location. It also minimizes redundant data. A database management system (DBMS) is special software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs. A DBMS can reduce the complexity of the information systems environment, reduce data redundancy and inconsistency, eliminate program-data dependence, improve security, and increase data sharing, access, and availability.


2. What are the major capabilities of a DBMS and why is a relational DBMS so powerful?

Name and briefly describe the capabilities of a DBMS.

A DBMS includes capabilities and tools for organizing, managing, and accessing the data in the database. The principal capabilities of a DBMS include data definition language, data dictionary, and data manipulation language.
The data definition language specifies the structure and content of the database.
The data dictionary is an automated or manual file that stores information about the data in the database, including names, definitions, formats, and descriptions of data elements.
The data manipulation language, such as SQL, is a specialized language for accessing and manipulating the data in the database.
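
A minimal sketch of these three capabilities using Python’s built-in sqlite3 module (the SUPPLIER table and its columns are hypothetical): CREATE TABLE plays the role of the data definition language, PRAGMA table_info serves as a rough stand-in for a data dictionary listing each data element’s name and format, and INSERT/SELECT are SQL data manipulation statements.

import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# Data definition language: specify the structure and content of the database.
conn.execute("""CREATE TABLE SUPPLIER (
                    supplier_number INTEGER PRIMARY KEY,
                    supplier_name   TEXT NOT NULL,
                    supplier_city   TEXT)""")

# Data dictionary (rough equivalent): names, types, and constraints of the data elements.
for column in conn.execute("PRAGMA table_info(SUPPLIER)"):
    print(column)   # (cid, name, type, notnull, default_value, pk)

# Data manipulation language (SQL): add and retrieve data.
conn.execute("INSERT INTO SUPPLIER VALUES (8259, 'CBM Inc.', 'Dayton')")
print(conn.execute("SELECT supplier_name FROM SUPPLIER WHERE supplier_city = 'Dayton'").fetchall())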

Define a relational DBMS and explain how it organizes data.

The relational database is the primary method for organizing and maintaining data today in information systems.  It organizes data in two-dimensional tables with rows and columns called relations.  Each table contains data about an entity and its attributes.  Each row represents a record and each column represents an attribute or field.  Each table also contains a key field to uniquely identify each record for retrieval or manipulation. 

List and describe the three operations of a relational DBMS.

In a relational database, three basic operations are used to develop useful sets of data: select, project, and join.
Select operation creates a subset consisting of all records in the file that meet stated criteria. In other words, select creates a subset of rows that meet certain criteria.
Join operation combines relational tables to provide the user with more information than is available in individual tables.
Project operation creates a subset consisting of columns in a table, permitting the user to create new tables that contain only the information required.
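
The same three operations expressed in SQL and run through sqlite3, with hypothetical PART and SUPPLIER tables: the WHERE clause performs the select, the column list performs the project, and the JOIN combines the two tables on their common data element.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SUPPLIER (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE PART (part_number INTEGER PRIMARY KEY, part_name TEXT, supplier_number INTEGER);
    INSERT INTO SUPPLIER VALUES (8259, 'CBM Inc.'), (8261, 'B. R. Molds');
    INSERT INTO PART VALUES (137, 'Door latch', 8259), (152, 'Door handle', 8261);
""")

rows = conn.execute("""
    SELECT PART.part_name, SUPPLIER.supplier_name              -- project: only the columns required
    FROM PART JOIN SUPPLIER
         ON PART.supplier_number = SUPPLIER.supplier_number    -- join: combine the two tables
    WHERE PART.part_number = 137                               -- select: rows meeting stated criteria
""").fetchall()
print(rows)   # [('Door latch', 'CBM Inc.')]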

3. What are some important database design principles?

Define and describe normalization and referential integrity and explain how they contribute to a well-designed relational database.

Normalization is the process of creating small stable data structures from complex groups of data when designing a relational database. Normalization streamlines relational database design by removing redundant data such as repeating data groups. A well-designed relational database will be organized around the information needs of the business and will probably be in some normalized form. A database that is not normalized will have problems with insertion, deletion, and modification.

Referential integrity rules ensure that relationships between coupled tables remain consistent. When one table has a foreign key that points to another table, you may not add a record to the table with the foreign key unless there is a corresponding record in the linked table. 
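
A sketch of both ideas with sqlite3 and hypothetical PART and ORDERS tables: part details are normalized into their own table rather than repeated in every order row, and a foreign-key constraint enforces referential integrity, so an order that points to a nonexistent part is rejected.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when this is enabled

conn.executescript("""
    -- Normalized design: part details are stored once instead of repeating in every order.
    CREATE TABLE PART (part_number INTEGER PRIMARY KEY, part_name TEXT, unit_price REAL);
    CREATE TABLE ORDERS (order_number INTEGER PRIMARY KEY,
                         order_date   TEXT,
                         part_number  INTEGER NOT NULL REFERENCES PART(part_number));
    INSERT INTO PART VALUES (137, 'Door latch', 22.00);
    INSERT INTO ORDERS VALUES (3502, '2011-06-16', 137);
""")

try:
    # Violates referential integrity: there is no part 999 in the linked PART table.
    conn.execute("INSERT INTO ORDERS VALUES (3503, '2011-06-17', 999)")
except sqlite3.IntegrityError as err:
    print("rejected:", err)   # FOREIGN KEY constraint failed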

Define a distributed database and describe the two main ways of distributing data.

A distributed database is one that is stored in more than one physical location. A distributed database can be partitioned or replicated. When partitioned, the database is divided so that each remote processor has access to the data it needs to serve its local area. These databases can be updated locally and later synchronized with the central database. With replication, the database is duplicated at various remote locations.
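
A conceptual sketch in plain Python (hypothetical regions and customer records) contrasting the two approaches: a partitioned database routes each record only to the location that serves it, while a replicated database stores a full copy at every location.

# Each dict below stands in for the database held at one remote location.
partitions = {"east": [], "west": []}   # partitioned: each site holds only its own slice
replicas   = {"east": [], "west": []}   # replicated: each site holds the full database

customers = [{"id": 1, "name": "Acme Corp", "region": "east"},
             {"id": 2, "name": "Zenith LLC", "region": "west"}]

for customer in customers:
    partitions[customer["region"]].append(customer)   # route to the local site only
    for site in replicas.values():                    # copy to every site
        site.append(customer)

print(len(partitions["east"]), len(replicas["east"]))   # 1 2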

Define a data warehouse, explaining how it works and how it benefits organizations.

A data warehouse is a database with archival, querying, and data exploration tools (i.e., statistical tools) and is used for storing historical and current data of potential interest to managers throughout the organization and from external sources. The data originate in many of the operational areas and are copied into the data warehouse as often as needed.
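
A small sketch of the copy step using sqlite3’s ATTACH (the sales table and column names are hypothetical): rows originating in an operational database are periodically extracted and loaded into a separate warehouse database that accumulates current and historical data for reporting and analysis.

import sqlite3

ops = sqlite3.connect(":memory:")   # stands in for an operational system's database
ops.executescript("""
    CREATE TABLE sales (order_id INTEGER, product TEXT, amount REAL, order_date TEXT);
    INSERT INTO sales VALUES (1, 'Door latch', 22.0, '2011-06-15'),
                             (2, 'Door handle', 31.0, '2011-06-16');

    ATTACH DATABASE ':memory:' AS warehouse;
    CREATE TABLE warehouse.sales_history AS SELECT * FROM sales WHERE 0;  -- empty copy of the schema
""")

# Periodic extract-and-load: consolidate operational rows into the warehouse.
ops.execute("INSERT INTO warehouse.sales_history SELECT * FROM sales")
print(ops.execute("SELECT COUNT(*), SUM(amount) FROM warehouse.sales_history").fetchone())   # (2, 53.0)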

Define business intelligence and explain how it is related to database technology.

Powerful tools are available to analyze and access information that has been captured and organized in data warehouses and data marts.  These tools enable users to analyze the data to see new patterns, relationships, and insights that are useful for guiding decision making.  These tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions are often referred to as business intelligence.  Principal tools for business intelligence include software for database query and reporting tools for multidimensional data analysis and data mining.
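
A minimal query-and-reporting sketch in that spirit (hypothetical sales figures): grouping one measure by two dimensions at once, product and region, is the kind of multidimensional view that OLAP-style tools provide.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (product TEXT, region TEXT, amount REAL);
    INSERT INTO sales VALUES ('Nut', 'East', 50), ('Nut', 'West', 70),
                             ('Bolt', 'East', 25), ('Bolt', 'West', 30);
""")

# View the amount measure along two dimensions at once: product and region.
for row in conn.execute("""SELECT product, region, SUM(amount)
                           FROM sales GROUP BY product, region"""):
    print(row)   # e.g. ('Bolt', 'East', 25.0), ('Bolt', 'West', 30.0), ...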

Define data mining, describing how it differs from OLAP and the types of information it provides.

Data mining provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior.  The patterns and rules are used to guide decision making and forecast the effect of those decisions.  The types of information obtained from data mining include associations, sequences, classifications, clusters, and forecasts.
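
A tiny association-style example in plain Python with hypothetical purchase baskets: counting which items appear together most often is the simplest form of the pattern finding described above, and the frequent pairs suggest association rules.

from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
baskets = [
    {"chips", "soda", "salsa"},
    {"chips", "soda"},
    {"bread", "soda"},
    {"chips", "salsa"},
]

pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(sorted(basket), 2))   # count item pairs bought together

print(pair_counts.most_common(2))   # e.g. [(('chips', 'salsa'), 2), (('chips', 'soda'), 2)]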

Explain how text mining and Web mining differ from conventional data mining.

Conventional data mining focuses on data that have been structured in databases and files. Text mining concentrates on finding patterns and trends in unstructured data contained in text files. The data may be in email, memos, call center transcripts, survey responses, legal cases, patent descriptions, and service reports. Text mining tools extract key elements from large unstructured data sets, discover patterns and relationships, and summarize the information.
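
A minimal text-mining sketch over hypothetical call-center comments: extracting and counting key terms from unstructured text is the most basic way of surfacing its recurring themes.

import re
from collections import Counter

# Hypothetical call-center transcript snippets (unstructured text).
comments = [
    "The billing page crashed twice and billing support never called back.",
    "Great product, but the billing statement was confusing.",
    "App crashed during checkout.",
]

stopwords = {"the", "and", "but", "was", "a", "during", "never"}
terms = Counter()
for comment in comments:
    for word in re.findall(r"[a-z]+", comment.lower()):
        if word not in stopwords:
            terms[word] += 1

print(terms.most_common(3))   # 'billing' and 'crashed' surface as recurring themes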

Web mining helps businesses understand customer behavior, evaluate the effectiveness of a particular Web site, or quantify the success of a marketing campaign. Web mining looks for patterns in data through:
Web content mining: extracting knowledge from the content of Web pages
Web structure mining: examining data related to the structure of a particular Web site
Web usage mining: examining user interaction data recorded by a Web server whenever requests for a Web site’s resources are received
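
A small usage-mining sketch over hypothetical, simplified access-log lines: the interaction data a Web server records for each request can be parsed and aggregated to show which pages visitors request most often.

import re
from collections import Counter

# Hypothetical Web server access-log lines (simplified).
log_lines = [
    '192.168.1.10 - - [16/Jun/2011:10:02:11] "GET /products.html HTTP/1.1" 200',
    '192.168.1.11 - - [16/Jun/2011:10:02:15] "GET /products.html HTTP/1.1" 200',
    '192.168.1.10 - - [16/Jun/2011:10:03:02] "GET /checkout.html HTTP/1.1" 200',
]

page_hits = Counter()
for line in log_lines:
    match = re.search(r'"GET (\S+) HTTP', line)   # pull the requested page out of each log entry
    if match:
        page_hits[match.group(1)] += 1

print(page_hits.most_common())   # /products.html is the most requested page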

Describe how users can access information from a company’s internal databases through the Web.

Conventional databases can be linked via middleware to the Web or a Web interface to facilitate user access to an organization’s internal data. A user accesses a corporate Web site over the Internet using Web browser software on his or her client PC. The Web browser software requests data from the organization’s database, using HTML commands to communicate with the Web server. Because many back-end databases cannot interpret commands written in HTML, the Web server passes these requests for data to special middleware software that translates the HTML commands into SQL so that they can be processed by the DBMS working with the database.
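
A minimal middleware-style sketch using only Python’s standard library (the CUSTOMER table and the example URL are hypothetical): the handler takes the browser’s request, translates its query parameter into an SQL statement, has the DBMS (sqlite3 here) process it, and returns the rows that would then be rendered back to the user as a Web page.

import sqlite3
from urllib.parse import urlparse, parse_qs

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO CUSTOMER VALUES (1, 'Acme Corp', 'Dayton'), (2, 'Zenith LLC', 'Toledo');
""")

def handle_request(url):
    """Middleware step: translate a Web request into SQL the DBMS can process."""
    params = parse_qs(urlparse(url).query)
    city = params.get("city", [""])[0]
    return conn.execute("SELECT name FROM CUSTOMER WHERE city = ?", (city,)).fetchall()

# The browser would send a request like this to the Web server over the Internet:
print(handle_request("http://example.com/customers?city=Dayton"))   # [('Acme Corp',)]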

Describe the roles of information policy and data administration in information management.

An information policy specifies the organization’s rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information.  Information policy lays out specific procedures and accountabilities, identifying which users and organizational units can share information, where information can be distributed, and who is responsible for updating and maintaining the information.

Data administration is responsible for the specific policies and procedures through which data can be managed as an organizational resource.  These responsibilities include developing information policy, planning for data, overseeing logical database design and data dictionary development, and monitoring how information systems specialists and end-user groups use data.

In large corporations, a formal data administration function is responsible for information policy, as well as for data planning, data dictionary development, and monitoring data usage in the firm.

Explain why data quality audits and data cleansing are essential.

Data that are inaccurate, incomplete, or inconsistent create serious operational and financial problems for businesses because they may create inaccuracies in product pricing, customer accounts, and inventory data, and lead to inaccurate decisions about the actions that should be taken by the firm.  Firms must take special steps to make sure they have a high level of data quality.  These include using enterprise-wide data standards, databases designed to minimize inconsistent and redundant data, data quality audits, and data cleansing software.

A data quality audit is a structured survey of the accuracy and level of completeness of the data in an information system.  Data quality audits can be performed by surveying entire data files, surveying samples from data files, or surveying end users for their perceptions of data quality.

Data cleansing consists of activities for detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant.  Data cleansing not only corrects data but also enforces consistency among different sets of data that originated in separate information systems.
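
A small sketch of both steps over hypothetical customer records: the audit surveys the file and reports how many records are incomplete or improperly formatted, and the cleansing pass enforces a consistent format and removes the redundant duplicate.

# Hypothetical customer records consolidated from several source systems.
customers = [
    {"id": 1, "name": "Acme Corp",  "zip": "45402"},
    {"id": 2, "name": "acme corp",  "zip": "45402"},   # redundant duplicate, inconsistent case
    {"id": 3, "name": "Zenith LLC", "zip": ""},        # incomplete
    {"id": 4, "name": "Apex Inc",   "zip": "9021"},    # improperly formatted ZIP code
]

# Data quality audit: survey the file and measure accuracy and completeness.
failing = [c for c in customers if len(c["zip"]) != 5 or not c["zip"].isdigit()]
print(f"{len(failing)} of {len(customers)} records fail the quality check")

# Data cleansing: keep only correctly formatted records and drop redundant duplicates.
cleaned, seen = [], set()
for c in customers:
    key = c["name"].strip().lower()
    if key not in seen and len(c["zip"]) == 5 and c["zip"].isdigit():
        seen.add(key)
        cleaned.append(c)
print(cleaned)   # only the valid, de-duplicated records remain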



