Save
Upgrade to remove ads
Busy. Please wait.
Log in with Clever
or

show password
Forgot Password?

Don't have an account?  Sign up 
Sign up using Clever
or

Username is available taken
show password


Make sure to remember your password. If you forget it there is no way for StudyStack to send you a reset link. You would need to create a new account.
Your email address is only used to allow you to reset your password. See our Privacy Policy and Terms of Service.


Already a StudyStack user? Log In

Reset Password
Enter the associated with your account, and we'll email you a link to reset your password.
focusNode
Didn't know it?
click below
 
Knew it?
click below
Don't Know
Remaining cards (0)
Know
0:00
Embed Code - If you would like this activity on your web page, copy the script below and paste it into your web page.

  Normal Size     Small Size show me how

DMBOK - Chp 6

Data Storage

QuestionAnswer
What is the definition of data storage and operations? The design, implementation, and support of stored data to maximize it's value.
What are the two sub-activities of Data Storage and Operations? Database support and database technology support
What are the 4 inputs to Data Storage & Ops Data Architecture; Data Requirements, Data Models; Service Level Agreements
Definition of database? Any collection of stored data regardless of structure or content.
What type of improvement work should be done in collaboration with data modeling and data architecture? Automating database development processes, developing tools, and processes that shorten each development cycle, reduce errors and rework and minimize the impact on the development team.
Developing and promoting the use of abstracted and reusable data objects prevents what kind of issue in the database? Developing and promoting the use of abstracted and reusable data objects prevents applications from being tightly coupled to database schemas - 'object-relational impedance mismatch'
Database Administrators and developers need to agree on accepted methods for doing what? Methods of ensuring data integrity and data security.
What is the definition of an instance? An execution of database software controlling access to a certain area of storage. An organization will usually have multiple instances executing concurrently, using different areas of storage. Each instance is independent of all other instances.
What is the definition of node? An individual computer hosting either processing or data as part of a distributed database .
What is the definition of schema? A subset of a database objects contained within the dtabase or an instance. Schemas are used to organize objects into more manageable parts.
What is the definition of database abstraction? A common application interface (API) is used to call database functions, such than an application can connect to multiple different databases without the programmer having to know all function calls for all possible databases.
Data lifecycle management involves implement which 5 main policies and procedures. Acquisition, migration, retention, expiration, and disposition.
What are the 3 characteristics of a process for moving database changes to production environment? Processes should be controlled, documented and auditable
What are the 3 types of DBAs described in this section? Production, Application, Procedure and Development
What are some of the primary data operations management responsibilities for Production DBAs? Ensure performance and reliability; error reporting; implement backup & recovery mechanisms; implement clustering and failover mechanisms if continual data availability is a requirement; execute other database maintenance activities such as archiving;
Which role creates the deliverable of mechanisms and processes for monitoring database performance such as workloads and data volume. Production database administrator
How is an application DBA different from a Production DBA? Application DBA is responsible for all the environments of one database while a Production DBA is responsible for data operations management.
Which type of DBA leads the review and administration of procedural database objects? A procedural DBA
Which type of DBA leads the designing, creating, and managing of special use databases such as a training 'sandbox'? A development DBA
What element of data storage are Network Storage Administrators concerned with? NSAs are concerned with the hardware and software supporting data storage arrays as they have different needs and monitoring requirements.
What are the two types of database architecture types? A database can be classified as either centralized or distributed.
How is a centralized database different from a distributed database? A centralized system manages a single database while a distributed system manages multiple databases on multiple systems.
What are the 2 classifications for distributed systems? Federated and non-federated
An integration of component database mgmt systems that are not autonomous. They are controlled, managed and governed by a centralized DBMS. Non-federated database system.
Database type that is best for heterogeneous and distributed integration projects such as enterprise information integration. Federated databases
A type of federated database used to securely manage financial transactions, contract management, or health information. Blockchain database
The two basic types of database processing ACID & BASE
What does the acronym ACID stand for? Atomicity Consistency Isolation Durability
Atomicity (A of the ACID acronym) All operations are performed or none of them are so if one part of the transaction fails, the entire transaction failes
Consistency (C of the ACID acronym) The transaction must meet all rules defined by the system at all times and must void half-completed transactions
Isolation (I of the ACID acronym) Each transaction is independent unto itself
Durability (D of the ACID acronym) Once complete, the transaction cannot be undone.
BASE Basically Available, Soft state, Eventual consistency
Basically Available (BA of the BASE acronym) System guarantees some level of availability to the data even when there are node failures.
Soft state (S of the BASE acronym) The data is in a constant state of flux; while a response may be given, the data is not guaranteed to be current
Eventual consistency (E of the BASE acronym) The data will eventually be consistent through all nodes and in all databases, but not every transaction will be consistent at every moment.
CAP Theorem A distributed system cannot comply with all parts of ACID at all times. At most 2 of the 3 properties of consistency, availability, and partition tolerance can exist in any shared-data system.
A very stable method of storing data persistently Disk and Storage Area Networks (SAN)
Permanent story loaded into volatile memory when the system is turned on, and all processing occurs within the memory array. IMDB (In-Memory Database)
Designed to handle data sets in which data values are repeated to a great extent. Columnar-based databases
Combines access speed of memory-based storage with the persistence of disk-based storage Flash memory
Production environment The technical environment where all business processes occur.
Pre-production environments Environments are used to develop and test changes before such changes are introduced to the production environment.
Development environments This environment is used to create and test code for changes in separate environments, which then are combined the QA environment for full integration testing.
Test environment Used to execute quality assurance and user acceptance testing.
Sandbox or experimental environment An alternate environment that allows read-only connections to production data and can be managed by the users.
The 3 general ways databases are organized hierarchical, relational, non-relational
Most rigid of the 3 organizational models hierarchical
Relational databases Based on set theory and relational algebra, where data elements or attributes are related into tuples (rows)
Multidimensional database technologies Data stored in a structure that allows searching using several data element filter simultaneously.
Temporal database A relational database with built-in support for handling data involving time usually valid time and transaction time.
Non-relational databases A database where the data structure is no longer bound to a tabular relational design.
Column-oriented database A non-relational database is mostly used in Business Intelligence applications as they can compress redundant data.
Spatial database A database that is optimized to store and query data that represents objects defined in a geometric space
Object/Multi-media Includes a hierarchical storage management system for the efficient management of a hierarchy of magnetic and optical storage media.
Flat file database Any of various means to encode a data set as a single file such as a plain text file or a binary file.
Key-Value Pair database A database contains sets of two items: a key identifier and a value.
Triplestore databases A data entry composed of subject-predicate-object where the subject denotes a resource, the predicate expresses a relationship between the subject and the object, and the object itself.
Specialized databases Designed for specialized situations that are managed differently from traditional relations databases such as CAD/CAM, and GIS
Archiving The process of moving data off immediately accessible storage media and onto media with lower retrieval performance.
Partitioning The archival process must be aligned with this strategy to ensure optimal availability and retention.
Tests of archive restoration It is wise to schedule these regularly so as to avoid surprises in an emergency
When thinking about capacity and growth projections what three elements need to be considered? Size to hold all data, how much data goes in and how quickly, and how much data comes out and how quickly
Change data capture The process of detecting that data has changed and ensuring that information relevant to the change is stored appropriately.
Purging The process of completely removing data from storage media such that it cannot be recovered.
Goal of data storage management The cost of maintaining data should not exceed its value to the organization.
Replication the same data is stored on multiple storage devised.
Active Replication Recreating and storing the same data at every replica from every other replica
Passive Replication Recreating and storing the data on a single primary replica from and then transforming its resultant state to other secondary replicas
Horizontal data scaling Data scaling that uses more replicas
Vertical data scaling Data scaling that has data replicas located further away geographically
Multi-master replication Updates can be submitted to any database node and then ripple through to other servers.
Replication transparency When data is replicated between database servers so that the information remains consistent throughout the database system and users cannot tell or even which database copy they are using.
Mirroring replication pattern Updates to the primary database are replicated immediately to the secondary database as part of a two-phase commit process.
Log shipping replication pattern When a secondary server receives and applies copies of the primary database's transaction logs at regular intervals.
Resiliency The measure of how tolerant a system is to error conditions.
Highly resilient If a system can tolerate a high level of processing errors and still function as expected, it is high resilient.
Resilient A database that can detect and either abort or automatically recover from common processing errors.
Not resilient A database that crashes upon the first unexpected condition.
3 recovery types Immediate, critical, non-critical
Data processing errors data load failures, query return failures, obstacles to completing ETL or other processes
Methods to increase resiliency Trap and re-route, detect and ignore, and flags
Sharding A process where small chunks of the database are isolated and can be updated independently of other shards so replication is merely a file copy.
What are the tw main activities in Data Operations and Storage? Database technology support and database operations support.
Database technology support The selecting and maintaining the software that stores and manages the data.
Factors to consider when selecting DBMS software Architecture and complexity; volume, velocity, application profile , business intelligence, hardware platform, operating system support; availability of supporting tools; performance benchmarks, scalability, memory and storage requirements
What elements other than data should be considered when planning for capacity and growth? Other elements to be included in the planning are indexes, logs, and any redundant images such as mirrors.
Part of managing databases is defining the requirements. What are the 3 types of requirements? Storage, Usage and Access
Business continuity A plan for the business to continue in the event of a disaster or adverse event that impacts systems and their ability to use their data.
What two factors must be balanced when making plans for creating back ups The importance of the data must be balanced against the cost of protecting it.
How should back up files be stored? A separate filesystem, separate storage medium, in a secure off-site facility
Configuration identification process To identify the attributes that define every aspect of a configuration for end-user purposes.
Configuration change control process Set of processes and approval stages required to change configuration item's attributes and to re-baseline them.
Configuration status accounting Ability to record and report on the configuration baseline associated with each configuration item at any point in time.
Configuration audits A physical configuration audit ensures that a configuration item is installed in accordance with the requirements. A functional configuration audit ensures that performance attributes of a configuration item are achieved.
Physical data model Storage objects, indexing objects and any encapsulated code objects required to enforce data quality rules, connect database objects, and achieve database performance.
Database performance depends on which two interdependent facets? Availability and speed
Availability The percentage of time that a system or database can be used for productive work.
Recoverability The ability to reestablish service after interruption, and correct errors caused by unforeseen events or component failures.
Reliability The ability to deliver service at specified levels for a stated period
Serviceability The ability to identify the existence of problems, diagnose their causes, and repair/solve them
Built-in query optimizer (DB issues) Uses stored statistics about the data and indexes to make decisions about how to execute a given query most effectively.
Poor coding (DB issues) The most common cause of poor database performance.
Inefficient complex table joins (DB issues) Uses views to pre-define complex table joins. Unlike stored procedures, these are opaque to the query optimizer
Insufficient indexing (DB issues) Create the indexes necessary to support complex queries and queries involving large tables
Application activity (DB issues) Applications should be running on a server separate from the Database mgmt system to avoid database performance issues.
Overloaded servers (DB issues) For DBMS supporting multiple databases and applications there may be a breaking point.
Database volatility (DB issues) Large numbers of table inserts and deletes over a short while can create inaccurate database distribution statistics.
Runaway queries (DB issues) User may unintentionally submit queries that use a majority of the system's shared resources.
Test data set Data that has been specifically identified to test a system. This can include verifying that a given set of inputs produces the expected output or challenging the ability of programming to respond to unusual or extreme input.
Data migration The process of transferring data between storage types, formats or computer systems with as little change as possible.
Phases of data migration Design, extraction, remediation, load, verification
Tools for managing databases Modeling, monitoring, managing, and support
Test in lower environments The best practice of testing upgrades and patches on the lowest level environment first - usually development and installing on the production environment last.
Risk and readiness revolves around which two central ideas? Risk of data loss and risks related to technology readiness
Common data storage metrics Count of DBs by type; Aggregated transaction statistics; capacity metrics (amount of storage used, number of storage containers, number of data objects); storage service usage; requests made against the storage services; improvement to performance
Performance metrics transaction frequency and quantity; query performance; API service performance
Operational metrics Aggregated statistics about retrieval time; backup size; data quality measurement; availability
Service metrics issue submission, resolution, escalation count by type; issue resolution time
Information asset tracking Ensuring that an organization complies with all licensing agreements and regulatory requirements through yearly audits of software license and annual support costs
Data audit Evaluation of a data set based on defined criteria. Usually performed to investigate specific concerns about a data set to determine if the data was stored in compliance with contractual and methodological requirements
Data validation The process of evaluating stored data against established acceptance criteria to determine its quality and usability.
Created by: sshupert
 

 



Voices

Use these flashcards to help memorize information. Look at the large card and try to recall what is on the other side. Then click the card to flip it. If you knew the answer, click the green Know box. Otherwise, click the red Don't know box.

When you've placed seven or more cards in the Don't know box, click "retry" to try those cards again.

If you've accidentally put the card in the wrong box, just click on the card to take it out of the box.

You can also use your keyboard to move the cards as follows:

If you are logged in to your account, this website will remember which cards you know and don't know so that they are in the same box the next time you log in.

When you need a break, try one of the other activities listed below the flashcards like Matching, Snowman, or Hungry Bug. Although it may feel like you're playing a game, your brain is still making more connections with the information to help you out.

To see how well you know the information, try the Quiz or Test activity.

Pass complete!
"Know" box contains:
Time elapsed:
Retries:
restart all cards