CIS CH4 Introduction to Information Systems
Term | Definition |
---|---|
structured information | facts & data that are reasonably ordered, or that can be broken down into component parts & organized into hierarchies. |
unstructured information | information that has no inherent structure or order, & the parts can't be easily linked together. |
semi-structured information | info category that falls btw structured & unstructured info. Includes facts & data that show at least some structure, such as web pages & documents, which bear creation dates, titles, & authors. |
metadata | data about data that clarifies the nature of the information |
Most important characteristics that affect quality | accuracy, precision, completeness, consistency, timeliness, bias, and duplication |
accuracy (characteristic affecting quality) | mistakes in birth dates, spelling, or price reduce the quality of the information |
precision (characteristic affecting quality) | rounding to the nearest mile might not reduce quality much when you estimate the drive to the mall, but for property surveys "about 2 miles" is unacceptable. |
completeness (characteristic affecting quality) | omitting the zip code on customer's address record might not be a problem b/c zip can be determined by address; leaving out house # would delay order. |
consistency (characteristic affecting quality) | reports showing "total sales by region" may conflict b/c ppl generating rpts are using slightly different definitions. When results are inconsistent, the quality of both reports is in question. |
timeliness (characteristic affecting quality) | outdated info has less value than up-to-date info & thus is lower quality unless you are looking for historical trends. Actual definition for what is up-to-date varies. W/ stock trading timeliness measured in fractions of a second. |
bias (characteristic affecting quality) | biased info lacks objectivity, & that reduces its value & quality. To make sales seem higher, a mgr might choose to include canceled orders, though CEO might not be pleased. |
duplication (characteristic affecting quality) | info can be redundant, resulting in misleading & exaggerated summaries. In customer records, ppl can easily appear more than once if their address changes. |
table | group of records for the same entity, such as employees. Each row is one record, and the fields of each record are arranged in the table's columns. |
record | a means to represent an entity, which might be a person, product, purchase order, event, building, vendor, book, video, or some other "thing" that has meaning to ppl. Record is made up of attributes of that thing. |
field | an attribute of an entity. A field can contain numeric data or text, or a combination of the two. |
data definition | specifies the characteristics of a field, such as the type of data it will hold or the maximum # of characters it can contain. |
batch processing | the process of sequentially executing operations on each record in a large batch. |
major advantages of a database vs. file processing systems | reduced redundancy & inconsistency, improved info integrity & accuracy, improved ability to adapt to changes, improved performance & scalability, increased security. |
problems with file processing systems | data redundancy & inconsistency, lack of data integration, inconsistent data definitions, and data dependence. |
database | an integrated collection of information that is logically related & stored in such a way as to minimize duplication & facilitate rapid retrieval. |
database management software (DBMS) | software used to create & manage a database; it also provides tools for ensuring security, replication, retrieval, & other administrative & housekeeping tasks. |
relational database | the widely used database model that organizes info into tables of records that are related to one another by linking a field in one table to a field in another table w/ matching data. |
database architecture, 3 types of relationships | one-to-one, one-to-many, and many-to-many |
one-to-one database architecture relationship | relatively easy to accommodate & even file processing systems can handle it. Ex: each person has only one birth date. |
one-to-many database architecture relationship | more challenging than one-to-one; ex. a person might have one or more dependents or one or more ee's reporting to him or her. |
many-to-many database architecture relationship | more complicated to support than one-to-one or one-to-many; Ex. situation in which a person might be working on any # of projects, ea of which can have any # of ee's assigned to it. |
early database architecture: hierarchical | resembles an organizational chart or an upside down tree |
early database architecture: network | resembles a lattice or web rather than the upside down tree (hierarchical). Records can be linked in multiple ways, supporting many-to-many relationships. |
modern database architecture: relational | maintains records in rows w/in tables, & links btw the tables are created by linking a field in one table to a field in another table w/ matching data. Most widely used. |
modern database architecture: object-oriented | represents info in the form of objects, & uses object-oriented programming languages to access them; used especially for organizing complex data types such as graphics & multimedia. |
modern database architecture: XML | organizes data using XML tags; used especially for managing web content & web-based resources. |
data model | a model used for planning the org's database that identifies what kind of info is needed, what entities will be created, & how they are related to one another. |
primary key | a field, or a group of fields, that makes each record unique in a table |
autonumbering | process that assigns incremental #'s to records as they are created to ensure that each record has a unique primary key. |
normalization | a process that refines entities & their relationships to help minimize duplication of information in tables. |
functionally dependent | for each value of the table's primary key, there should be just one value for each of the attributes in the record, and the primary key should determine that value; the attribute should be functionally dependent on the value of the primary key. |
foreign keys | a primary key that appears as an attribute in a different table is a foreign key in that table. Foreign keys can be used to link the records in two tables together. |
structured query language (SQL) | a standard query language, widely used to manipulate information in relational databases. |
interactive voice response (IVR) | a technology that facilitates access to the database from signals transmitted by telephone to retrieve information and enter data. |
scalability | a system's ability to handle rapidly increasing demand. |
referential integrity | a rule enforced by the database management system that ensures that every foreign key entry actually exists as a primary key entry in its main table. |
database schema | a graphic that documents the data model & shows the tables, attributes, keys, & logical relationships for a database. |
data dictionary | documentation that contains the details of each field in every table, including user-friendly descriptions of the field's meaning. |
shadow system | smaller databases developed by individuals outside of the IT dept that focus on their creator's specific info requirements. |
master data management | an approach that addresses the underlying inconsistencies in the way ee's use data by attempting to achieve consistent & uniform definitions for entities & their attributes across all business units |
data steward | a combination of watchdog & bridge builder, a person who ensures that ppl adhere to the definitions for the master data in their organizational units. |
data warehouse | a central data repository containing info drawn from multiple sources that can be used for analysis, intelligence gathering, & strategic planning. |
Examples of internal data sources | operational data, customer data, inventory, manufacturing data, archived historical data, metadata drawn from documents - multimedia - & other sources, website traffic data. |
Examples of external data sources | census data, GPS/mapping coordinates, and competitor information |
data warehouse architectures: relational database | co's often use the same relational DBMS for their data warehouse as they use for their operational database, but loaded onto a separate server & tuned for fast retrieval & reporting. |
data warehouse architectures: data cubes | this architecture creates multidimensional cubes that accommodate complex, grouped data arranged in hierarchies. Retrieval is very fast b/c data are already grouped in logical dimensions, such as sales by product, city, region, & country. |
data warehouse architectures: virtual federated warehouse | this approach relies on a cooperating collection of existing databases; software extracts & transforms the data in real time rather than taking snapshots at periodic intervals. |
data warehouse architectures: data warehouse appliance | the appliance is a prepackaged data warehouse solution offered by vendors that includes the hardware & software, maintenance, & support. |
data warehouse architectures: NoSQL | database mgmt systems suited for storing & analyzing big data. NoSQL stands for "not only SQL." |
data warehouse architectures: in-memory database | relies on main memory to store the database, rather than secondary storage devices, which vastly increases access speeds. |
The 3 V's that are the defining features of big data | volume, velocity, and variety |
3 V's defining features of big data: Volume | data collections can take up petabytes of storage, & are continually growing. |
3 V's defining features of big data: Velocity | many data sources change & grow at very fast speeds. The nightly ETL process often used for data warehouses is not adequate for many real-time demands. |
3 V's defining features of big data: Variety | relational databases are very efficient for structured info stored in tables, but businesses can benefit from analyzing semi-structured & unstructured data as well. |
extract, transform, and load (ETL) | a common strategy for drawing info from multiple sources by extracting data from its home database, transforming & cleansing it to adhere to common data definitions, & then loading it into the data warehouse. |
big data | collections of data that are so enormous in size, so varied in content, & so fast to accumulate that they are difficult to store & analyze using traditional approaches. |
data mining | a type of intelligence gathering that uses statistical techniques to explore records in a data warehouse, hunting for hidden patterns & relationships that are undetectable in routine reports. |
information resources can be described as.... | structured, unstructured, or semi-structured |
the quality of information is affected by several characteristics, such as: | accuracy, precision, completeness, consistency, & timeliness. |
managing information with the help of computers relies on the use of... | tables, records, & fields, and each field should have a data definition. |
most widely used database | relational database |
relationships between tables are created by.... | linking a field in one table to a field in another table w/ matching data. |
the development of a relational database begins with | the planning phase |
the planning phase of a relational database does what? | identify the entities, their attributes, and their relationships. |
normalization in the development of a relational database | tables are created in such a way as to eliminate redundancy as much as possible and ensure that tables can be related to one another in a way that reflects their actual relationships. |
primary keys | ensure each record in a table is unique, & foreign keys help establish relationships among tables. |
information is retrieved from the database using... | query languages, such as SQL |
integration strategies, like master data management, are needed to... | coordinate how data are entered & provide enterprise-wide summaries for strategic planning. |
the data warehouse draws information from multiple sources to create one information storehouse that can be used for... | reporting, analysis, and research. |
sources can be both... | internal and external |
3 steps used to create the warehouse, which is refreshed w/ updated info daily or more often | extract, transform, and load |
big data refers to... | immense data collections that feature the 3 V's: high volume, velocity, and variety |
data ownership issues arise because | data have to be shared by all the stakeholders in the organization, and also for databases w/o boundaries like Craigslist - records entered by someone outside of the org. |
what is needed to balance all the stakeholders' requirements? | leadership, cooperation, negotiation, & a well-designed database. |
3 categories that describe the nature of information resources | structured information, unstructured information, and semi-structured information |
nature of info resources: structured information | based on facts and data; info in ordered parts that can be broken down into hierarchy; transaction sales w/ distinct fields for customer #, date, type of order, amt paid, etc. |
nature of info resources: unstructured information | no particular structure or order; photos, handwritten notes, etc. |
nature of info resources: semi-structured information | web pages w/ date, title, & author, or spreadsheets used by multiple ppl to track info. |
what does metadata describe for structured data | describes and provides info about every field, table, and corresponding relationship. |
what does metadata describe for unstructured and semi-structured data | describes the properties of a document or resource, which is useful b/c it layers some structure onto the information. |
characteristics of information that affect quality are... | accuracy, precision, completeness, consistency, timeliness, bias, and duplication. |
example of accuracy (a characteristic of info that affects quality) | information mistakes decrease the accuracy of information, like incorrect dates, spelling, or price. |
example of precision (a characteristic of info that affects quality) | rounding values such as speeds or distances reduces precision; whether that matters depends on how the info will be used. |
example of completeness (a characteristic of info that affects quality) | omitting information like a house number on an address |
example of consistency (a characteristic of info that affects quality) | different people submitting information have a slightly different definition for details. |
example of timeliness (a characteristic of info that affects quality) | "up-to-date" may mean updated annually for some uses or to the second for others |
example of bias (a characteristic of info that affects quality) | keeping records up-to-date for one set and not for another skews the info & decreases its quality |
example of duplication (a characteristic of info that affects quality) | some records may duplicate, leading to inaccurate or incorrect information |
early design approaches to managing info resources were | documents were placed in envelopes, then in files in a filing cabinet; this file-based approach was eventually transferred to computers. |
major disadvantages of file processing systems | data redundancy and inconsistency, lack of data integration, inconsistent data definitions, and data dependence. |
data redundancy and inconsistency | 2 separate departments might spell an ee's name differently, leading to 2 records for the same ee. |
lack of data integration | info kept in separate systems is hard to combine, so pulling together data from different files is difficult. |
inconsistent data definitions | programmers writing separate programs may define the same field differently, such as its data type or maximum length. |
data dependence | programs are tied to the structure of the data files they use, so changing how the data are stored means changing the programs too. |
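the 3 architectures that emerged for integrated databases are | hierarchical (resembles an org chart or upside down tree), network (lattice that supports many-to-many relationships), and relational (tables linked by fields w/ matching data; most widely used). They differ in how well they handle one-to-one, one-to-many, & many-to-many relationships. |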
steps in planning a relational data model | develop a data model of the entities to be tracked, their attributes, & how they relate to one another; identify primary keys that make each record unique; normalize the data model to minimize duplication |
how are primary keys and foreign keys used to create links btw tables in a relational database | the primary key of 1 table appears as a foreign key in another; matching values in those 2 fields link the records in the 2 tables. |
what is the typical strategy to access a database | most use an application interface where they can access, edit, and retrieve information safely and securely. |
what is the role of a database administrator | a DBA monitors and optimizes performance of databases, troubleshoots, protects, backs up, and helps develop new ones. One of the fastest developing careers in US |
how is SQL used to query a database | Structured Query Language is a standard query language; a query states which fields to retrieve, from which tables, & under what conditions, and the DBMS returns the matching records. |
how is IVR used to query a database | Interactive Voice Response uses signals transmitted from the phone to retrieve info from an account, enter data, or select #'s/options. |
what are shadow systems and why are they sometimes used in organizations | smaller databases created by individuals outside of IT to focus on the creator's specific information. Used to get jobs done more efficiently and can be created w/ Access or Excel. Bad if person leaves. |
role of master data management in an organization's integration strategy | help to achieve uniform definitions so that everyone is using the terms in the same manner for accuracy |
3 steps in building a data warehouse | extract, transform, load |
internal sources of data for a data warehouse | customer records, transactions, inventory, assets, HR. |
external sources of data for a data warehouse | census data, GPS mapping or coordinates, competitor information |
4 examples of data warehouse architecture | relational database, data cube, virtual federated warehouse, and data warehouse appliance |
which approach of data warehouse architecture is suitable for today? | virtual federated warehouse because it gives real-time information vs. periodic snapshots |
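difference btw data mining and data dredging | data mining uses statistical techniques to hunt for hidden patterns & relationships in the data; data dredging sifts through data for any relationship at all w/o a specific question in mind, which risks turning up meaningless coincidences. |
examples of databases w/o boundaries | databases in which ppl outside the enterprise enter and manage records; ex. Craigslist |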
ownership issues affect information management... | companies may set policies that they own information on their devices, like flash drives or computers |
Information management needs differ among stakeholder groups; balancing them requires... | leadership, cooperation, negotiation, and a well-designed database |
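
The cards on primary keys, autonumbering, foreign keys, referential integrity, and SQL can be tied together in a small sketch. It uses Python's built-in `sqlite3` module with an in-memory database. The table names (`departments`, `employees`), the field names, and the sample rows are hypothetical and chosen only for illustration; they are not from the chapter.

```python
import sqlite3


def build_demo_db():
    """Create two related tables in an in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Each table has an autonumbered primary key that makes every record unique.
    conn.execute(
        """CREATE TABLE departments (
               dept_id INTEGER PRIMARY KEY AUTOINCREMENT,
               name    TEXT NOT NULL
           )"""
    )
    # employees.dept_id is a foreign key: it must match a departments.dept_id.
    conn.execute(
        """CREATE TABLE employees (
               emp_id    INTEGER PRIMARY KEY AUTOINCREMENT,
               last_name TEXT NOT NULL,
               dept_id   INTEGER NOT NULL REFERENCES departments(dept_id)
           )"""
    )
    conn.executemany(
        "INSERT INTO departments (name) VALUES (?)", [("Sales",), ("IT",)]
    )
    conn.executemany(
        "INSERT INTO employees (last_name, dept_id) VALUES (?, ?)",
        [("Nguyen", 1), ("Ortiz", 1), ("Patel", 2)],
    )
    conn.commit()
    return conn


def employees_in(conn, dept_name):
    """SQL query that links the two tables through the matching dept_id fields."""
    rows = conn.execute(
        """SELECT e.last_name
             FROM employees AS e
             JOIN departments AS d ON e.dept_id = d.dept_id
            WHERE d.name = ?
            ORDER BY e.last_name""",
        (dept_name,),
    ).fetchall()
    return [row[0] for row in rows]


def try_orphan_insert(conn):
    """Try to add an employee whose dept_id matches no department.

    Returns False when the DBMS rejects the row (referential integrity).
    """
    try:
        conn.execute(
            "INSERT INTO employees (last_name, dept_id) VALUES (?, ?)",
            ("Quinn", 99),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

Calling `employees_in(build_demo_db(), "Sales")` retrieves the employees of one department through the one-to-many link, and `try_orphan_insert` shows the DBMS refusing a foreign key value that has no matching primary key.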