Save
Upgrade to remove ads
Busy. Please wait.
Log in with Clever
or

show password
Forgot Password?

Don't have an account?  Sign up 
Sign up using Clever
or

Username is available taken
show password


Make sure to remember your password. If you forget it there is no way for StudyStack to send you a reset link. You would need to create a new account.
Your email address is only used to allow you to reset your password. See our Privacy Policy and Terms of Service.


Already a StudyStack user? Log In

Reset Password
Enter the associated with your account, and we'll email you a link to reset your password.
focusNode
Didn't know it?
click below
 
Knew it?
click below
Don't Know
Remaining cards (0)
Know
0:00
Embed Code - If you would like this activity on your web page, copy the script below and paste it into your web page.

  Normal Size     Small Size show me how

DMBOK - Chp 5

Data Modeling and Design

QuestionAnswer
Data Modeling The process of discovering, analyzing, and scoping data requirements, and then representing and communicating these data requirements in a precise form called the data model.
Goal of data modeling and design To confirm and document an understanding of different perspectives, which leads to applications that more closely align with current and future business requirements, and creates a foundation to successfully complete broad-scoped initiatives.
Data modeling and design business drivers Provide a common vocabulary around data Capture and document explicit knowledge about an organization's data and systems Serve as a primary communication tool during projects
Confirming and documenting understanding of different perspectives facilitates Formalization - a data model documents concise definition of data structures and relationships Scope - a data model can help explain the boundaries for data context Knowledge retention - a data model can preserve corporate memory regarding a system
A data model describes an organization's data as the organization understands it, or as the organization wants it to be. The model is a form of documentation for data requirements and data definitions resulting from the modeling process.
Types of data that can be modeled Category information Resource information Business event information Detail transaction information
Data model components Entities Relationships Attribute Domain
Data entity a thing about which an organization collects information, sometimes referred to as the nouns of an organization; answers fundamental questions of who, what, when, where, why, or how
Entity type A common alias for entity, as a type of something being represented; e.g., Jane is an entity and Employee is the entity type
Entity instance Occurrences or values of a particular entity
Aliases for entity in other schemes Dimensional schemes: dimension and fact table Object-oriented: class or object Time-based: hub, satellite, and link NoSQL: document or node
Entity definitions Essential contributors to the business value of any data model. Clarify the meaning of business vocabulary and provide rigor to the business rules governing entity relationships.
Data relationship The association between entities.
Aliases for relationships in other schemes Dimensional: navigation path NoSQL: edge or link
Relationship cardinality Captures how many of one entity (entity instances) participates in the relationship with how many of the other entity. Options are zero, one, or many
Arity of Relationships Number of entities in a relationship. Common examples are unary, binary, and ternary
Unary (Recursive) Relationship Self-referencing relationship involving only one entity. A one to many relationship describes a hierarchy, whereas a many-to-many describes a network or graph.
Binary Relationship Most common on a traditional data model diagram, involves two entities.
Ternary Relationship A relationship that includes three entities.
Foreign Key Used in physical and sometimes logical relational data modeling schemes to represent a relationship.
Data attribute A property that identifies, describes, or measures an entity. The physical correspondent is a column, field, tag, or node in a table, view, document, graph, or file.
Data identifier Also called a key, set of one or more attributes that uniquely defines an instance of an entity.
Construction-type Keys Simple Surrogate Compound Composite
Simple key One attribute that uniquely identifies an entity instance
Surrogate key A unique identifier for a table
Compound key Set of two or more attributes that together uniquely identify an entity instance
Composite key Contains one compound key and at least one other simple or compound key or non-key attribute. E.g., key on a multi-dimensional fact table
Function-type Keys Super key Candidate key Business key Primary key Alternate key
Super key Any set of attributes that uniquely identify an entity instance
Candidate key Minimal set of one or more attributes that identifies the entity instance to which it belongs; minimal means that no subset of the candidate key uniquely identifies the entity instance
Business key One or more attributes that a business professional would use to retrieve a single entity instance
Primary key The candidate key that is chosen to be the unique identifier for an entity
Alternate key candidate key that although unique, was not chosen as the primary key
Independent entity Entity where the primary key contains only attributes that belong to that entity
Dependent entity Entity where the primary key contains at least one attribute from another entity; Has at least one identifying relationship (primary key of the parent is migrated as a foreign key to the child)
Data domain The complete set of possible values that an attribute can be assigned.
Domain types Data Type (e.g., integer, date, etc) Data Format (e.g., alpha only, allowable characters, etc) List Range Rule-based (e.g., attribute 1 must be > attribute 2)
Data Modeling Schemes Relational Dimensional Object-Oriented (UML) Fact-Based Time-Based NoSQL
Relational scheme Provides a systematic way to organize data so that they reflect their meaning. Reduces redundancy in data storage. Most effectively managed in terms of two-dimensional relations.
Dimensional scheme Capture business questions focused on a particular business process. Data is structured to optimize the query and analysis of large amounts of data.
Fact Table Used within dimensional scheme, rows correspond to particular measurements and are numeric.
Dimension Table Used within dimensional scheme, table represent the important objects of the business and contain mostly textual descriptions. Serve as the primary source for 'query by' or 'report by' constraints.
Snowflaking Normalizing the flat, single-table, dimensional structure in a star schema into the respective component hierarchical or network structures
Grain Description of a single row of data in a fact table
Conformed Dimensions Built with the entire organization in mind instead of just a particular project; this allows these dimensions to be shared across dimensional models, due to containing consistent terminology and values
Conformed Facts Use standardized definitions of terms across individual marts.
Object-Oriented scheme (UML) Unified Modeling Language is a graphical language for modeling software.
Fact-Based Modeling (FBM) Family of conceptual modeling languages, based in the analysis of natural verbalization (plausible sentences) that might occur in the business domain.
Object Role Modeling (ORM or ORM2) Model-driven engineering approach that starts with typical examples of required information or queries presented in any external formulation familiar to users, and then verbalizes these examples at the conceptual level, in terms of simple facts
Time-Based Scheme Used when data values must be associated in chronological order and with specific time
Data Vault Detail-oriented, time-based, and uniquely linked set of normalized tables that support one or more functional areas of business. Hybrid approach between third normal form and star schema
NoSQL Category of databases built on non-relational technology. Four main types: document, key-value, column-oriented, and graph
Document (NoSQL) Instead of taking a business subject and breaking it up into multiple relational structures, they frequently store business subject in one structure called a docuent.
Key-Value (NoSQL) Allow an application to store its data in only two columns with the feature of storing both simple and complex information.
Column-oriented (NoSQL) Similar to RDBMS but can work with more complex data types including unformatted text and imagery.
Graph (NoSQL) Designed for data whose relations are well represented as a set of nodes with an undetermined number of connections between these nodes.
Conceptual data model Captures the high-level data requirements as a collection of related concepts. It contains only the basic and critical business entities within a given realm and function, with a description of each entity and the relationship between entities.
Logical data model Detailed representation of data requirements, usually in support of a specific usage context, such as application requirements. Independent of any technology or specific implementation constraints. Extended from the conceptual model by adding attributes.
Physical data model (PDM) Represents detailed technical solution, often using the logical model as a starting point and then adapted to work within a set of hardware, software, and network tools. Built for a particular technology.
Canonical model Used for data in motion between systems. This model describes the structure of data being passed as packets or messages. A variant of a physical scheme.
View A virtual table; provides a means to look at data from one or many tables that contain or reference the actual attributes.
Partitioning the process of splitting a table. Vertically: create subset of columns Horizontally: create subset of rows
Denormalization The deliberate transformation of normalized logical data model entities into physical tables with redundant or duplicate data structures. E.g., intentionally puts one attribute in multiple places.
Reasons to denormalize Combining data from multiple other tables in advance to avoid costly run-time joins Creating smaller, pre-filtered copies of data to reduce costly run-time calculations Pre-calculating and storing costly data calculations Enforce user security
Normalization The process of applying rules in order to organize business complexity into stable data structures. The basic goal is to keep each attribute in only one place to eliminate redundancy and inconsistencies.
First normal form (1NF) Ensures each entity has a valid primary key, and every attribute depends on the primary key; removes repeating groups, and ensures each attribute is atomic (not multi-valued.)
Second normal form (2NF) Ensures each entity has the minimal primary key and that every attribute depends on the complete primary key.
Third normal form (3NF) Ensures each entity has no hidden primary keys and that each attribute depends on no attributes outside the key. The term 'normalized model' usually means the data is in 3NF.
Boyce/Codd normal form (BCNF) Resolves overlapping composite candidate keys.
Fourth normal form (4NF) Resolves all many-to-many-to-many relationships (and beyond) in pairs until they cannot be broken down into any smaller pieces.
Fifth normal form (5NF) Resolves inter-entity dependencies into basic pairs, and all join dependencies use parts of primary keys
Abstraction the removal of details in such a way as to broaden applicability to a wide class of situations while preserving the important properties and essential nature from concepts or subjects.
Generalization (method of abstraction) Groups the common attributes and relationships of entities into supertype entities
Specialization (method of abstraction) Separates distinguishing attributes within an entity into subtype entities
Deliverables of data modeling process Diagram Definitions Issues and outstanding questions Lineage
Creating conceptual data model steps Select scheme Select notation complete initial conceptual data model Incorporate enterprise terminology Obtain sign-off
Creating logical data model steps Analyze information requirements Analyze existing documentation Add associative entities Add attributes Assign domains Assign keys
Creating physical data model steps Resolve logical abstractions Add attribute details Add reference data objects Assign surrogate keys Denormalize for performance Index for performance Partition for performance Create views
Reverse engineering The process of documenting an existing database, starting with the physical data model and working towards logical and conceptual data models.
Tools to assist data modelers Data Modeling Lineage Data profiling Metadata repositories Data Model patterns Industry data models
ISO 11179 Metadata Registry international standard for representing Metadata in an organization.
Database design principles Performance and ease of use Reusability Integrity Security Maintainability
Data professionals must balance data requirements of the information consumers and the application requirements of data producers
Design reviews should be held with subject matter experts representing different backgrounds, skills, expectations, and opinions.
Components of Data Model and Design quality management Develop Data Modeling and Design Standards Review Data Model and Database Design Quality Manage Data Model Versioning and Integration
Created by: sshupert
 

 



Voices

Use these flashcards to help memorize information. Look at the large card and try to recall what is on the other side. Then click the card to flip it. If you knew the answer, click the green Know box. Otherwise, click the red Don't know box.

When you've placed seven or more cards in the Don't know box, click "retry" to try those cards again.

If you've accidentally put the card in the wrong box, just click on the card to take it out of the box.

You can also use your keyboard to move the cards as follows:

If you are logged in to your account, this website will remember which cards you know and don't know so that they are in the same box the next time you log in.

When you need a break, try one of the other activities listed below the flashcards like Matching, Snowman, or Hungry Bug. Although it may feel like you're playing a game, your brain is still making more connections with the information to help you out.

To see how well you know the information, try the Quiz or Test activity.

Pass complete!
"Know" box contains:
Time elapsed:
Retries:
restart all cards