Ch 2 Data Models
Data modeling, business rules, and levels of abstraction of data models
Term | Definition |
---|---|
data modeling | The process of creating a specific data model for a determined problem domain. |
data model | A representation, usually graphic, of a complex “real-world” data structure. Data models are used in the database design phase of the Database Life Cycle. |
business rule | A description of a policy, procedure, or principle within an organization. For example, a pilot cannot be on duty for more than 10 hours during a 24-hour period, or a professor may teach up to four classes during a semester. |
attribute | A characteristic of an entity or object. An attribute has a name and a data type. |
conceptual model | The output of the conceptual design process. The conceptual model provides a global view of an entire database and describes the main data objects, avoiding details. |
Crow’s Foot notation | A representation of the entity relationship diagram that uses a three-pronged symbol to represent the “many” sides of the relationship. |
entity | A person, place, thing, concept, or event for which data can be stored. |
entity relationship diagram (ERD) | A diagram that depicts an entity relationship model’s entities, attributes, and relations. |
external model | The application programmer’s view of the data environment. Given its business focus, an external model works with a data subset of the global database schema. |
external schema | The specific representation of an external view; the end user’s view of the data environment. |
tuple | In the relational model, a table row. |
relational model | Developed by E. F. Codd of IBM in 1970, the relational model is based on mathematical set theory and represents data as independent relations. |
relation (a) | Each relation (table) is conceptually represented as a two-dimensional structure of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns). |
relational diagram | A graphical representation of a relational database’s entities, the attributes within those entities, and the relationships among the entities. |
relation (b) | A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model. |
physical model | A model in which physical characteristics such as location, path, and format are described for the data. The physical model is both hardware- and software-dependent. See also physical design. |
one-to-one (1:1 or 1..1) relationship | Associations among two or more entities that are used by data models. In a 1:1 relationship, one entity instance is associated with only one instance of the related entity. |
one-to-many (1:M or 1..*) relationship | Associations among two or more entities that are used by data models. In a 1:M relationship, one entity instance is associated with many instances of the related entity. |
many-to-many (M:N or *..*) relationship | Association among two or more entities in which one occurrence of an entity is associated with many occurrences of a related entity and one occurrence of the related entity is associated with many occurrences of the first entity. |
constraint | A restriction placed on data, usually expressed in the form of rules. For example, “A student’s GPA must be between 0.00 and 4.00.” |
connectivity | The type of relationship between entities. Classifications include 1:1, 1:M, and M:N. |
big data | A movement to find new and better ways to manage large amounts of web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost. |
American National Standards Institute (ANSI) | The group that accepted the DBTG recommendations and augmented database standards in 1975 through its SPARC committee. |
3 Vs | Three basic characteristics of Big Data databases: volume, velocity, and variety. |
class | A collection of similar objects with shared structure (attributes) and behavior (methods). A class encapsulates an object’s data representation and a method’s implementation. Classes are organized in a class hierarchy. |
class diagrams | A diagram used to represent data and their relationships in UML object notation. |
class hierarchy | The organization of classes in a hierarchical tree in which each parent class is a superclass and each child class is a subclass. See also inheritance. |
client node | One of three types of nodes used in the Hadoop Distributed File System (HDFS). The client node acts as the interface between the user application and the HDFS. See also name node and data node. |
data definition language | The language that allows a database administrator to define the database structure, schema, and subschema. |
data manipulation language | The set of commands that allows an end user to manipulate the data in the database, such as SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK. |
data node | One of three types of nodes used in the Hadoop Distributed File System (HDFS). The data node stores fixed-size data blocks (that could be replicated to other data nodes). See also client node and name node. |
entity occurrence | A row in a relational table. |
entity relationship (ER) model | A data model that describes relationships (1:1, 1:M, and M:N) among entities at the conceptual level with the help of ER diagrams. |
entity set | A collection of like entities. |
extended relational data model (ERDM) | A model that includes the object-oriented model’s best features in an inherently simpler relational database structural environment. See extended entity relationship model (EERM). |
Hadoop | A Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework. Hadoop uses low-cost hardware to create clusters of thousands of computer nodes to store and process data. |
Hadoop Distributed File System (HDFS) | A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds. |
hardware independence | A condition in which a model does not depend on the hardware used in the model’s implementation. Therefore, changes in the hardware will have no effect on the database design at the conceptual level. |
candidate key | A minimal superkey; that is, a key that does not contain a subset of attributes that is itself a superkey. See key. |
closure | A property of relational operators that permits the use of relational algebra operators on existing tables (relations) to produce new relations. |
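
Several of the terms above (1:M and M:N relationships, constraint, tuple) can be seen together in a small relational schema. The following is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, sample rows, and helper functions are all hypothetical and chosen only for illustration. It assumes one professor teaches many classes (1:M) and that students and classes are related M:N through a bridge table.

```python
import sqlite3

# Hypothetical schema: every table and column name is an assumption for illustration.
SCHEMA = """
CREATE TABLE professor (
    prof_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE student (
    stu_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    -- constraint: "A student's GPA must be between 0.00 and 4.00."
    gpa    REAL CHECK (gpa BETWEEN 0.00 AND 4.00)
);
CREATE TABLE class (
    class_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    -- 1:M: each class row points to exactly one professor
    prof_id  INTEGER NOT NULL REFERENCES professor(prof_id)
);
CREATE TABLE enroll (
    -- bridge table: splits the M:N student/class relationship into two 1:M links
    stu_id   INTEGER REFERENCES student(stu_id),
    class_id INTEGER REFERENCES class(class_id),
    PRIMARY KEY (stu_id, class_id)
);
"""

def build_demo():
    """Create an in-memory database with a few sample tuples (rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO professor VALUES (1, 'Ada')")
    conn.executemany("INSERT INTO class VALUES (?, ?, ?)",
                     [(10, "Databases", 1), (11, "Data Modeling", 1)])
    conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                     [(100, "Bo", 3.5), (101, "Cy", 2.9)])
    conn.executemany("INSERT INTO enroll VALUES (?, ?)",
                     [(100, 10), (100, 11), (101, 10)])
    conn.commit()
    return conn

def classes_taught(conn, prof_id):
    """1:M side: all classes associated with one professor."""
    rows = conn.execute(
        "SELECT title FROM class WHERE prof_id = ? ORDER BY class_id", (prof_id,))
    return [r[0] for r in rows]

def students_in(conn, class_id):
    """M:N side: all students associated with one class, via the bridge table."""
    rows = conn.execute(
        "SELECT s.name FROM student s JOIN enroll e ON e.stu_id = s.stu_id "
        "WHERE e.class_id = ? ORDER BY s.stu_id", (class_id,))
    return [r[0] for r in rows]

def violates_constraint(conn, gpa):
    """Return True if the GPA check constraint rejects the value."""
    try:
        conn.execute("INSERT INTO student (name, gpa) VALUES ('probe', ?)", (gpa,))
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.rollback()  # leave the sample data unchanged either way
    return False
```

Calling `build_demo()` and then `classes_taught(conn, 1)` lists both classes for the single professor, while `students_in(conn, 10)` walks the bridge table to list the students enrolled in that class. `violates_constraint(conn, 4.5)` shows the GPA rule being enforced by the database itself rather than by application code.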