Information Storage Management

What is Data? Data is a collection of raw facts from which conclusions can be drawn.
What are the two categories of data? Structured and Unstructured
Describe Structured Data Structured data is organized in rows and columns in a rigidly defined format so that applications can retrieve and process it efficiently.
Describe Unstructured Data Data is unstructured if its elements cannot be stored in rows and columns, and is therefore difficult to query and retrieve by business applications.
Define Information Information is the intelligence and knowledge derived from data.
What is the value of information to a business? Identifying new business opportunities. Identifying patterns that lead to changes in existing business. Creating a competitive advantage.
How is the type of storage to be used determined? The type of storage used is based on the type of data and the rate at which it is created and used.
Describe RAID Redundant Array of Independent Disks. RAID is used in all storage architectures such as DAS, SAN and so on.
Describe DAS Direct Attached Storage. Connects directly to the server (host) or a group of servers in a cluster. Storage can either be internal or external to the server. External DAS alleviated the challenges of limited internal storage capacity.
Describe SAN Storage Area Network. This is a dedicated, high performance Fibre Channel (FC) network to facilitate Block Level communication between servers and storage. Storage is partitioned and assigned to a server for accessing its data.
What are the benefits of SAN? SAN offers scalability, availability, performance and cost benefits compared to DAS.
Describe NAS Network Attached Storage. Dedicated storage for File Serving applications. Connects to an existing communication network (LAN) and provides file access to heterogeneous clients.
What are the benefits of NAS? NAS offers higher availability, scalability, performance and cost benefits compared to general purpose file servers.
What is IP SAN? Internet Protocol Storage Area Network. IP SAN is a convergence of technologies used in SAN and NAS. It provides Block Level communication across a LAN or WAN resulting in greater consolidation and availability of data.
What are the five core elements of Data Center Infrastructure? 1.Application / User Interface 2.Database (More commonly referred to as a Database Management System) 3.Server and Operating System 4.Network 5.Storage Array
What are the seven key requirements for data center elements? Performance Availability Scalability Security Data Integrity Capacity Manageability
What are the four activities within the Information Life Cycle Management Process? Classifying data Implementing Policies Managing the Environment Organizing Storage Resources
What are the benefits of implementing Information Life Cycle Management? Improved Utilization Simplified Management Simplified Backup and Recovery Maintaining Compliance Lower Total Cost of Ownership
What are the three most basic components of a storage system environment? Host Connectivity (Network) Storage Array
What are the physical components of a host? CPU Storage Input / Output (I/O) Device
What are the three methods of communication between I/O devices and the host? User to Host (Keyboard, Mouse, etc.) Host to Host (via Network Interface Card) Host to Storage Device (via Host Bus Adapter)
What are the logical components of a host? Applications Operating System File System Volume Manager Device Drivers
What are the two application data access classifications? Block Level (Data stored and retrieved in Blocks specifying the LBA) File Level (Data stored and retrieved by specifying the name and path of the files)
Define Protocol A defined Format for communication between sending and receiving devices.
What are the three major communication protocols for system components? Tightly Connected Entities Directly Attached Entities Network Connected Entities
Give three storage media options Magnetic Tape Optical Disks Disk Drives
What are the key components of a disk drive? Platter Spindle Read / Write Head Actuator Arm Assembly Controller
What are the two ways of accessing data on a platter? Cylinder, Head, Sector (CHS) Logical Block Addressing (LBA)
What are the things that affect disk drive performance? Electromechanical Device Disk Service Time
What are the components that comprise service time? Seek Time Rotational Latency Data Transfer Rate
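The three service time components can be added up in a short sketch. It assumes the usual convention that average rotational latency is the time for half a revolution, and that transfer time is the block size divided by the data transfer rate. The function names and the sample figures (5 ms seek, 15,000 rpm, 32 KB block, 40 MB/s) are illustrative and are not taken from the card set.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: the time for half a revolution."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(block_kb: float, rate_mb_per_s: float) -> float:
    """Time in ms to move one block at the given transfer rate (1 MB = 1024 KB)."""
    return block_kb / 1024.0 / rate_mb_per_s * 1000.0


def service_time_ms(seek_ms: float, rpm: float,
                    block_kb: float, rate_mb_per_s: float) -> float:
    """Disk service time = seek time + rotational latency + transfer time."""
    return (seek_ms
            + rotational_latency_ms(rpm)
            + transfer_time_ms(block_kb, rate_mb_per_s))


# Hypothetical drive: 5 ms average seek, 15,000 rpm, 32 KB I/O, 40 MB/s
example_ms = service_time_ms(seek_ms=5.0, rpm=15_000, block_kb=32, rate_mb_per_s=40)
```

For small random I/Os the seek and rotational terms dominate, which is why the transfer term contributes well under a millisecond in this example.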
What are the three seek time specifications? Full Stroke Average Track to Track
Define 'Little's Law' It is the relationship between the number of requests in a queue and the response time. N=a x R N = Total number of requests in the system a = The arrival rate R = Average response time
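A minimal sketch of Little's Law as stated on the card, N = a x R, along with the rearranged form R = N / a. The function names and the sample workload (200 requests per second, 5 ms response time) are illustrative assumptions.

```python
def requests_in_system(arrival_rate: float, response_time: float) -> float:
    """Little's Law: N = a * R."""
    return arrival_rate * response_time


def average_response_time(n_requests: float, arrival_rate: float) -> float:
    """Little's Law rearranged: R = N / a."""
    return n_requests / arrival_rate


# Hypothetical workload: 200 requests/s arriving, 0.005 s average response time
n = requests_in_system(arrival_rate=200.0, response_time=0.005)
r = average_response_time(n_requests=4.0, arrival_rate=200.0)
```

With these figures about one request is in the system on average; if four requests are in the system at the same arrival rate, the average response time rises to about 20 ms.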
What does RAID provide? Increased Capacity Higher Availability Increased Performance
What are the components of a RAID Array? Host RAID Controller RAID Array Physical Array Logical Array Hard Disks
What are the common RAID Levels? 0 1 Nested RAID 3 4 5 6
Describe RAID 0 A striped array with no fault tolerance.
Describe RAID 1 Disk Mirroring
Describe Nested RAID Combines the benefits of multiple RAID configurations. 0+1 (striping and mirroring, a mirrored stripe): data is striped across HDDs, then the entire stripe is mirrored. 1+0 (mirroring and striping, a striped mirror): incoming data is first mirrored, then both copies are striped across the HDDs.
Describe RAID Parity A method of protecting striped data from HDD failure without the cost of mirroring. An HDD is added to the stripe width to hold parity. Parity is a mathematical construct that allows re-creation of missing data. It is a redundancy check that ensures full protection of data.
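The re-creation of missing data from parity can be illustrated with bytewise XOR, which is the usual parity operation in single-parity RAID. This is a sketch only: the strip contents and the function name `xor_parity` are made up for illustration, and a real controller works on much larger strips in hardware or firmware.

```python
from functools import reduce


def xor_parity(strips: list[bytes]) -> bytes:
    """XOR equal-length strips together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


# Three hypothetical data strips, each on its own disk
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)  # stored on the added parity disk

# Simulate losing the disk holding data[1]: XOR the survivors with the parity
rebuilt = xor_parity([data[0], data[2], parity])
```

Because XOR is its own inverse, combining the surviving strips with the parity strip yields exactly the missing strip.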
Describe RAID 3 Stripes data for high performance and uses parity for improved fault tolerance. Parity information is stored on a dedicated drive so data can be reconstructed if a drive fails. Always reads and writes complete stripes of data across all disks. Provides good bandwidth for transferring large volumes of data.
Describe RAID 4 Stripes data for high performance and uses parity for improved fault tolerance. Unlike RAID 3, disks in RAID 4 can be accessed independently, so specific data elements can be read or written on a single disk without reading or writing an entire stripe.
Describe RAID 5 Drive (strips) are independently accessible Parity is distributed across all disks Preferred for messaging, data mining, medium performance media serving and Relational Database Management System (RDBMS) implementations in which DBAs optimize data acces
Describe RAID 6 Dual Parity Distributes parity across all disks Can survive two disk failures Rebuild operation may take longer due to the presence of two parity sets
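As a rough comparison of the RAID levels described above, the sketch below estimates how many disks' worth of capacity remains for user data. It assumes equal-sized disks and two-way mirroring; the function name `usable_disks` is hypothetical.

```python
def usable_disks(level: str, n_disks: int) -> float:
    """Return how many disks' worth of capacity holds user data."""
    if level == "0":
        return n_disks          # striping only, no redundancy
    if level in ("1", "0+1", "1+0"):
        return n_disks / 2      # every block is stored twice
    if level in ("3", "4", "5"):
        return n_disks - 1      # one disk's worth of parity
    if level == "6":
        return n_disks - 2      # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


# Hypothetical 8-disk set under each level
capacities = {lvl: usable_disks(lvl, 8) for lvl in ("0", "1+0", "5", "6")}
```

The parity levels give up far less capacity than mirroring, which is the cost advantage the RAID Parity card refers to.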
What is a 'Hot Spare'? Refers to a spare HDD in a RAID array that temporarily replaces a failed HDD of a RAID set.
What is EMC^2's Best practice concerning Hot Spares? For every two Disk Array Enclosures (DAE) one Hot Spare will be used.
What is an intelligent Storage System? Part 1 RAID Arrays that are: Highly optimised for I/O processing Have large amounts of cache for improving I/O performance
What is an intelligent Storage System? Part 2 Have operating environments that provide: Intelligence for managing cache Array resource allocation Connectivity for heterogeneous hosts
What are the benefits of an intelligent storage system? Increased capacity Improved performance Easier data management Improved data availability & protection Enhanced business continuity & support Improved security and access control
What are the components of an intelligent storage system? Front end, Cache, Back end, & Physical disks
What is the function of the 'Front End' in an intelligent storage system? The front end provides the interface between the storage system and the host. It consists of two components: Front End Ports Front End Controllers
What is the function of a front end port? The front end ports enable hosts to connect to the intelligent storage system. Each front end port has processing logic that executes the appropriate transport protocol, such as SCSI, FC or iSCSI for storage connections.
What is the function of a front end controller? The front end controllers route data to and from cache via the internal data bus. When cache receives write data, the controller sends an acknowledgement message back to the host. Controllers optimize I/O processing by using command queuing algorithms.
Describe command queuing Command queuing is a technique implemented on front end controllers. It determines the execution order of received commands and can reduce unnecessary drive head movements and improve disk performance.
What are the most commonly used command queuing algorithms? First In First Out (FIFO): the default algorithm. Seek Time Optimization: commands are executed based on optimising read/write head movements, which may result in reordering of commands. Access Time Optimization: commands are executed based on the combination of seek time
Describe Cache Cache is semiconductor memory where data is placed temporarily to reduce the time required to service I/O requests from the host.
Describe the ways that cache is implemented in write operations WriteThrough-Data placed in cache and immediately written to disk and an ack is sent to host WriteBack-Data is placed in the cache and an ack is sent to the host immediately.
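The difference between the two write policies comes down to when the disk is updated relative to the acknowledgement. The toy model below is a sketch under simplifying assumptions: the class `WriteCache`, its dictionaries standing in for cache and disk, and the explicit `flush` method are all hypothetical.

```python
class WriteCache:
    """Toy model of write-through versus write-back caching."""

    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.cache = {}     # LBA -> data held in cache
        self.disk = {}      # LBA -> data committed to disk
        self.dirty = set()  # LBAs in cache that are not yet on disk

    def write(self, lba: int, data: bytes) -> str:
        self.cache[lba] = data
        if self.write_back:
            self.dirty.add(lba)    # ack now, commit to disk later
        else:
            self.disk[lba] = data  # commit to disk before the ack
        return "ack"

    def flush(self) -> None:
        """Commit all dirty pages to disk (write-back only has any)."""
        for lba in sorted(self.dirty):
            self.disk[lba] = self.cache[lba]
        self.dirty.clear()


wt = WriteCache(write_back=False)
wt.write(7, b"a")  # data is on disk by the time the ack returns

wb = WriteCache(write_back=True)
wb.write(7, b"a")  # data is only in cache until flush() runs
```

Write-back responds faster because the host does not wait for the disk, but the dirty pages are at risk until they are flushed, which is why cache protection matters.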
What is a Read Cache Hit? If the requested data is found in the cache it is called a read cache hit or a read hit and the data is sent to the host without any disk operation.
What is a Cache Miss? If the requested data is not found in the cache, it is called a cache miss and the data must be read from disk.
Describe two cache management algorithms implemented by intelligent storage systems to pro-actively maintain a free set of pages. Least Recently Used (LRU) & Most Recently Used (MRU)
What does the Least Recently Used (LRU) algorithm do? An algorithm that continuously monitors data access in cache and identifies the cache pages that have not been accessed for a long time.
What does the Most Recently Used (MRU) algorithm do? An algorithm that is the converse of LRU. In MRU the pages that have been accessed most recently are freed up or marked for reuse.
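The two policies differ only in which end of the access order gives up a page. The sketch below tracks access order with `collections.OrderedDict`; the class name `PageCache` and the tiny two-page capacity are illustrative assumptions, and a real array frees pages proactively rather than only on a miss.

```python
from collections import OrderedDict


class PageCache:
    """Tracks cache pages in access order: oldest first, newest last."""

    def __init__(self, capacity: int, policy: str = "LRU"):
        self.capacity = capacity
        self.policy = policy
        self.pages = OrderedDict()

    def access(self, page_id: str) -> bool:
        """Return True on a cache hit, False on a miss."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # now the most recently used
            return True
        if len(self.pages) >= self.capacity:
            # LRU frees the oldest page (front); MRU frees the newest (back)
            self.pages.popitem(last=(self.policy == "MRU"))
        self.pages[page_id] = None
        return False


lru = PageCache(capacity=2, policy="LRU")
mru = PageCache(capacity=2, policy="MRU")
for cache in (lru, mru):
    for page in ("a", "b", "a", "c"):
        cache.access(page)
```

After the same access sequence, the LRU cache has dropped page "b" (untouched for longest) while the MRU cache has dropped page "a" (touched most recently).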
Describe 'Watermarking' in cache management Flushing is the process of committing data from the cache to the disk. On the basis of the I/O access rate and pattern, high and low levels called watermarks are set in cache to manage the flushing process. This provides headroom in write cache for improved performance.
List the 3 types of "watermarks" 100% High Water Mark Low Water mark
Describe 'Idle Flushing' Idle Flushing occurs continuously, at a modest rate, when the cache utilisation level is between the high and low watermark.
Describe 'High Watermark Flushing' Activated when cache utilization hits the high watermark. The system dedicates some additional resources to flushing. This type of flushing has minimal impact on host I/O processing.
Describe 'Forced Flushing' Forced Flushing occurs in the event of a large I/O burst when the cache reaches 100% of its capacity, which significantly affects the I/O response time. In Forced Flushing, dirty pages are forcibly flushed to disk.
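The three flushing modes can be read as a simple decision on cache utilization. In this sketch the watermark values (40% and 80%) are arbitrary placeholders, and returning "none" below the low watermark is an assumption, since the cards do not say what happens there.

```python
def flush_mode(utilization: float, low: float = 0.4, high: float = 0.8) -> str:
    """Pick a flushing mode from cache utilization (0.0 to 1.0)."""
    if utilization >= 1.0:
        return "forced"          # cache is full; dirty pages forcibly flushed
    if utilization >= high:
        return "high-watermark"  # extra resources dedicated to flushing
    if utilization >= low:
        return "idle"            # continuous flushing at a modest rate
    return "none"                # assumption: nothing to do below the low mark


modes = [flush_mode(u) for u in (0.2, 0.5, 0.9, 1.0)]
```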
Give two methods of Cache Data Protection Cache Mirroring & Cache Vaulting
Describe Cache Mirroring Each write to cache is held in two different memory locations on two independent memory cards.
Describe Cache Vaulting A set of physical disks called vault drives are used to dump the contents of the cache in the event of a power failure.
In an intelligent storage system, what is the 'back end'? The back end provides the interface between the cache and physical disks. From the cache data is sent to the back end and then routed to the destination disk.
Describe the two components of the "back end" Back End Ports: Connect to the Physical Disks Back End Controllers: Communicates with the disks when performing reads and writes and also provides additional, but limited temporary data storage.
What is a LUN? Physical drives or groups of RAID drives can be logically split into volumes, referred to as Logical Unit Numbers (LUNs). The use of LUNs improves disk utilization by allocating only the portion of disk space a host needs, leaving the unused disk space to be allocated to other hosts.
What is LUN Masking? LUN Masking is an access control mechanism that provides data access control by defining which LUNs a host can access. LUN masking is typically implemented at the front end controller.
What does LUN Masking ensure? LUN Masking ensures that volume access by servers is controlled appropriately, preventing unauthorised or accidental use in a distributed environment. Usually implemented on storage arrays.
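LUN masking amounts to a lookup table from host identity to permitted LUNs. The sketch below keys the table on host WWNs, which is a common choice but an assumption here; the WWN values, LUN numbers, and the name `MASKING_TABLE` are invented for illustration.

```python
# Hypothetical masking table: host WWN -> set of LUNs that host may access
MASKING_TABLE: dict[str, set[int]] = {
    "10:00:00:00:c9:2a:3b:4c": {0, 1},
    "10:00:00:00:c9:5d:6e:7f": {2},
}


def can_access(host_wwn: str, lun: int) -> bool:
    """A host sees a LUN only if the masking table lists it; default is deny."""
    return lun in MASKING_TABLE.get(host_wwn, set())
```

An unknown host gets an empty set, so access is denied unless it has been granted explicitly.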
Describe the capabilities of a high end storage array Large storage capacity Huge cache to service host I/Os Fault tolerance architecture Multiple front end ports & support to interface protocols High scalability Ability to handle large amounts of concurrent I/Os Designed for large Corps Called Active
Describe the capabilities of a Midrange storage array Host can perform I/Os to LUNs only through active paths Other paths remain passive until active path fails Have two controllers, each with cache, RAID controllers & disk drive ints Designed for small and medium Corps less scalable than a high end arra
Describe the characteristics of the CLARiiON CX-4 Ultraflex technology Scalable up to 960 disks Supports flash drives Supports RAID 0,1, 1+0, 3, 5, 6 Supports 16GB of cache per controller (x2) Supports storage based local and remote data replication via SnapView & MirrorView CLARiiON Messaging Inte
Describe the characteristics of the Symmetrix DMX-4 Incre scalable to 2,400 disks Dynamic global cache memory 16GB-512GB Advanced proc power High data processing bandwidth (up to 128 GB/s) Supports RAID 1, 1+0 (AKA 10 for mainframe), 5, 6 Storage based local & remote replication through TimeFinder &
Describe the characteristics of the Symmetrix VMAX Series 96 to 2400 drives up to 2 PB 1-8 VMAX engines Up to 1TB global mem Twice the host ports FC, iSCSI Gb Ethernet FICON up to 128 ports 8Gb/s FC FICON & FC SRDF x2 the backend connections for flash Quad core 2.3GHz processors to provide more than x2 the
What is DAS? Direct Attached Storage is an architecture where storage connects directly to servers. Internal or External
Describe Internal DAS Internal DAS is internally connected to the host by a serial or parallel bus.
Describe External DAS In External DAS Architectures, the server connects directly to the external storage device. In most cases, communication between the host and the storage device takes place over SCSI or FC protocol.
What are the benefits of DAS? Ideal for data provisioning Quick deployment for small environments Simple to deploy Reliable Low capital expense Low complexity
What are the four DAS connectivity options? ATA and SATA SCSI FC Bus and Tag (primarily for external mainframe)
What are the two types of DAS Management? Internal: Host provides disk partitioning and file system layout. External: Array based management, lower TCO for managing data and storage infrastructure.
What are some of the challenges of DAS, in regards to scalability? Scalability is limited: Number of connectivity ports to hosts Number of addressable disks Distance limitations
What are some of the challenges of DAS? Downtime required for maintenance w/internal DAS Limited ability to share resources.
What is the definition of SCSI? Small Computer System Interface. SCSI is all about an initiator sending a command to a target.
What does SCSI communication involve? SCSI Initiator Device: Issues commands to SCSI target devices. SCSI Target Device: Executes commands issued by initiators.
Describe SCSI-1 Defined cable length, signalling characteristics, commands, and transfer modes. Uses an 8-bit narrow bus (supports 8 devices)
Describe SCSI-2 Defined the Common Command Set (CCS), 16 bit, improved performance and reliability
Describe SCSI-3 Latest version, comprised different but related standards, rather than one large document.
What is SCSI Addressing? Used to uniquely identify hosts and devices with an ID number (0-15). The UNIX naming convention identifies a disk using three identifiers: initiator ID, target ID, and a LUN.
Explain the Structure and Organization of FC Data Exchange Operation (conversation) Sequence (Sentence) Frame (word)
What SCSI ID has the highest priority? 7
What is a SCSI Port? Are physical connectors that the SCSI cable plugs into for communication with a SCSI device. SCSI device may contain initiator port, target port and target/initiator port. To service reqs from multiple devices, a SCSI device may also have multiple ports
WWN World Wide Names: a unique 64-bit identifier which is static to the port. Used to physically identify ports. Like a NIC's MAC Address Every HBA has one Burned into an array port
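Since a WWN is a 64-bit value, it fits in a single integer. The sketch below converts between that integer and the colon-separated hex notation in which WWNs are commonly written. The notation and the sample value are assumptions, not something the card specifies.

```python
def format_wwn(value: int) -> str:
    """Render a 64-bit WWN as eight colon-separated hex bytes."""
    if not 0 <= value < 2**64:
        raise ValueError("a WWN must fit in 64 bits")
    return ":".join(f"{byte:02x}" for byte in value.to_bytes(8, "big"))


def parse_wwn(text: str) -> int:
    """Parse colon-separated hex notation back into a 64-bit integer."""
    parts = text.split(":")
    if len(parts) != 8:
        raise ValueError("expected 8 colon-separated bytes")
    return int.from_bytes(bytes(int(part, 16) for part in parts), "big")


# Hypothetical HBA port WWN
wwn_text = format_wwn(0x10000000C92A3B4C)
```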
What is SAN? Storage Area Network. Is a dedicated high speed network for block level access.
Describe a SAN Carries data between servers (AKA Hosts) and storage devices through FC switches. Provides Block Level data access. Consolidates resources centralizing storage and management
Fibre Channel Addressing is dynamically assigned during fabric login. Used to communicate between nodes within SAN. Like an IP Address on a NIC. Address format: 24 bit, dynamically assigned
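The 24-bit address can be unpacked with bit shifts. The sketch assumes the layout generally used in switched fabrics, three 8-bit fields for domain, area, and port; that layout comes from general Fibre Channel practice and is not stated on the card. The function name and sample address are illustrative.

```python
def split_fc_address(fcid: int) -> tuple[int, int, int]:
    """Split a 24-bit FC address into (domain, area, port), 8 bits each."""
    if not 0 <= fcid < 2**24:
        raise ValueError("an FC address must fit in 24 bits")
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF


# Hypothetical address assigned at fabric login
domain, area, port = split_fc_address(0x010200)
```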
What are the components of SAN? Servers Network infrastructure Storage
Give a more granular description of SAN components. Node Ports Cabling Interconnecting Devices (FC Switches or hubs) Storage Arrays SAN Management Software
List the 5 Fibre Channel Protocol Stacks FC-0: Physical interface FC-1: Transmission Layer FC-2: Transport Layer FC-3: Has not been implemented FC-4: Upper Layer Protocol
Give a Fibre Channel Architecture Overview Uses channel technology High performance with low protocol overheads FCP is SCSI-3 over FC network Has five layers
What is Fibre Channel SAN and its components? Moves blocks of data over fibre optic cables using SCSI commands between initiator and target. Components: director/switch, host (node), storage (node), cables, management software to control ports/switches.
FLOGI Fabric Login: between an N_Port and an F_Port, that is, between node and switch (switch/array or initiator/target). 1st in process
What are the two types of optical cables? Single Mode: Can carry single beams of light with a distance of up to 10 km. Multi Mode: Can carry multiple beams of light simultaneously at a distance of up to 500 m. (Note: multi mode cable can suffer from modal dispersion)
PLOGI Port Login : between N_Port to N_Port (initiator to target initial contact) 2nd in process
PRLI Process login (figure out how to talk by a common language - SCSI) 3rd in series
What are the different types of SAN connectors? Node Connectors: Standard Connector (SC) Duplex Connectors Lucent Connector (LC) Duplex Connector Patch Panel Connectors Straight Tip (ST) Simplex Connectors
ISL Inter Switch Links - connects two or more FC Switches to each other using E-Ports. Used to transfer host to storage data as well as the fabric management traffic from one switch to another Also one of the scaling mechanisms in SAN connectivity
What is an N_Port N_port is a port on the node (e.g. host or storage device) used with both FC-P2P or FC-SW topologies. Also known as Node port.
What is an NL_Port NL_port is a port on the node used with an FC-AL topology. Also known as Node Loop port.
What is an F_Port F_port is a port on the switch that connects to a node point-to-point (i.e. connects to an N_port). Also known as Fabric port. An F_port is not loop capable.
What is an FL_Port FL_port is a port on the switch that connects to a FC-AL loop (i.e. to NL_ports). Also known as Fabric Loop port.
What is an E_Port E_port is the connection between two fibre channel switches. Also known as an Expansion port. When E_ports between two switches form a link, that link is referred to as an inter-switch link (ISL).
What are the three commonly used SAN Interconnecting Devices? Hubs: Physically connect nodes in a logical loop or a physical star topology. Switches: More intelligent than hubs and directly route data from one physical port to another. Directors: Larger than switches and deployed for data center implementations.
Describe the SAN Interconnectivity Option called FC-SW? Fibre Channel switched fabric (FC-SW) - provides interconnected devices, dedicated bandwidth, and scalability. Also known as fabric connect.
Describe the SAN Interconnectivity Option called FC-AL? Fibre Channel Arbitrated Loop (FC-AL): devices are attached to a shared loop. Devices on the loop must arbitrate to gain control of the loop. At any given time, only ONE device can perform I/O operations on the loop.
What is the simplest form of SAN Interconnectivity? Point to Point - two devices are connected directly to each other (like DAS).
Describe SAN Management Software? A suite of tools used in a SAN to manage the interface between host and storage arrays. Provides integrated management of SAN environment. Web based GUI or CLI
What is Core-Edge Fabric? two types of switch tiers - the edge tier (comprised of switches) and the core tier (enterprise directors)
What are the two types of Core-Edge Fabric? Single Core: all hosts are connected to the edge tier and all storage is connected to the core tier. Dual Core: can be expanded to include more core switches - enables load balancing.
Describe the Fabric Topology Mesh each switch is connected directly to the other switches by using ISLs. Promotes enhanced SAN connectivity.
Name the different types of Fabric Topology Mesh Full Mesh: every switch is connected to another switch in the topology - appropriate for a smaller # of switches Partial Mesh: several hops or ISLs may be required for traffic to reach its destination. Can cause latency issues.
Describe the term Zoning in Fabric Management An FC switch function that enables nodes within the fabric to be logically segmented into groups that can communicate with each other. Access control is done on the switch or fabric, versus LUN masking, which is done on the array. Setting up relationship between initiator & tar
What are the Storage Over IP protocol Options? iSCSI (SCSI over IP) or FCIP (Fibre Channel-to-IP bridge / tunnel)
Describe iSCSI An IP-based protocol that establishes and manages connections between storage, hosts and bridge devices over IP. Carries block level data over IP. Encapsulates SCSI commands and data so that the encapsulated commands and data blocks can be transported by TCP/IP pac
Describe FCIP Point to Point Fibre Channel end points
Describe the components of a Zone Members: nodes within the SAN that can be included in a zone. Zones: comprise a set of members. Zone Set: a group of zones that can be activated or deactivated as a single entity in a fabric. *Only one zone set per fabric can be active at a
Describe the Types of Zoning Port Zoning (hard zoning): uses FC addressing of the physical ports to define the zones (most secure - EMC general Practice). WWN zoning (soft zoning): uses world wide names to define zones.
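Members, zones, and the active zone set map naturally onto nested collections. In this sketch two members may communicate only if some zone in the active zone set contains both. The zone names, member labels, and the name `ACTIVE_ZONE_SET` are hypothetical; in practice the members would be port addresses or WWNs, depending on the zoning type.

```python
# Hypothetical active zone set: zone name -> set of members
ACTIVE_ZONE_SET: dict[str, set[str]] = {
    "zone_db": {"host_db", "array_port_0"},
    "zone_mail": {"host_mail", "array_port_1"},
}


def can_communicate(a: str, b: str,
                    zone_set: dict[str, set[str]] = ACTIVE_ZONE_SET) -> bool:
    """Two members can talk only if at least one zone contains both."""
    return any(a in members and b in members for members in zone_set.values())
```

A member may appear in several zones, so the check scans every zone rather than looking up a single one.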
What are the components of iSCSI? iSCSI host initiators iSCSI Targets LAN for IP Storage Network
What are iSCSI Host Initiators Host computer using a NIC or iSCSI HBA to connect to storage
What are iSCSI Targets Storage array with embedded iSCSI capable network port
What makes up the LAN for an IP Storage Network? Interconnected Ethernet switches and/or routers.
What is NAS and what are the benefits? Efficiency Flexibility Centralized storage Simplifies management Scalable High Availability Secure
What are the iSCSI host connectivity options? Software Initiators TCP Offload Engine (TOE)
What is a Software Initiator? Code that can be loaded onto a host to provide the translation between the storage I/O calls and the network interface
What is a TCP Offload Engine (TOE)? Moves the TCP processing load off the host CPU onto the NIC to free up processing cycles for application execution.
What are the components of NAS? NAS Head (CPU and Memory) NIC Card(s) Operating System to manage NAS functions NFS (unix) and CIFS (microsoft) Industry-standard storage protocols Storage Array
Describe the NAS File Sharing Protocols CIFS - Common Internet File System Protocol. Microsoft Environment sharing protocol. NFS - Network File System Protocol. UNIX environment file sharing protocol.
Describe Step 1 of the NAS I/O Process The requester packages the I/O request into TCP/IP and forwards it to a remote file system, which is handled by the NAS.
Describe Step 2 of the NAS I/O Process The NAS converts the I/O into an appropriate physical storage request (block level I/O).
Describe Step 3 of the NAS I/O Process When the data is returned from the physical storage pool, the NAS processes and repackages it into a file protocol response.
Describe Step 4 of the NAS I/O Process The NAS packages this response into TCP/IP again and forwards it to the client through the network.
What are the three iSCSI Topologies? Native Connectivity: Does not have any FC components; performs all communication over IP. Bridged Connectivity: Enables the coexistence of FC with IP by providing iSCSI to FC bridging functionality. Combining FCP and Native Connectivity
Describe the types of NAS Implementations Integrated NAS: has all components of NAS in a single enclosure. Connects to the IP network to provide connectivity to the clients and service the file I/O requests. Gateway NAS: has independent NAS head and one or more storage arrays (2 protocols)
What are the two ways in which iSCSI discovery takes place? SendTargets Discovery: the initiator is manually configured with the target. Internet Storage Name Service (iSNS): Initiators and targets automatically register themselves with iSNS server. iSNS is a client / server model.
Describe how Managing an Integrated System (NAS Connectivity) works Both the NAS component and the storage array are managed via NAS management software
Describe managing a Gateway System (NAS) NAS component managed via NAS Management software and the storage array is managed via array management software
What are the two types of iSCSI names? IQN: iSCSI Qualified Name - ex: iqn.2008-02.com.example:op EUI: Extended Unique Identifier - eui.020234k2034j03D34
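The IQN example on the card follows the pattern iqn.yyyy-mm.reversed-domain, optionally followed by a colon and an identifier. The regular expression below is a simplified check of that shape, not a complete validator for the iSCSI naming rules; the pattern and the function name `parse_iqn` are illustrative assumptions.

```python
import re

# Simplified shape: iqn.<yyyy-mm>.<reversed domain>[:<identifier>]
IQN_PATTERN = re.compile(
    r"^iqn\.(\d{4}-(?:0[1-9]|1[0-2]))"   # year and month
    r"\.([a-z0-9]+(?:[.-][a-z0-9]+)*)"   # reversed domain name
    r"(?::(.+))?$"                       # optional identifier after a colon
)


def parse_iqn(name: str):
    """Return (date, reversed_domain, identifier), or None if not IQN-shaped."""
    match = IQN_PATTERN.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


parsed = parse_iqn("iqn.2008-02.com.example:op")
```

The identifier is None when the name stops at the reversed domain, and names in other formats, such as EUI names, return None.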
What is an EMC Celerra Celerra is a dedicated high-performance infrastructure for FILE LEVEL I/Os Consists of: Data movers (file servers in cabinet) Control Station (sets up data movers and initially configs them) Specialized OS - DART - Linux Red Hat
Describe how to join the building blocks in Integrated NAS and Gateway NAS? Integrated: the system is assigned dedicated NAS storage. No other SAN hosts are connected to the storage; the array is dedicated solely to NAS provisioning. Gateway: the NAS system is assigned separately apportioned storage within the array, with two separate sections for SAN and NAS.
What is an FCIP Frame? Encapsulates FC frames in IP packets FCIP router is used for encapsulation FC Router at other end removes IP wrapper and sends FC data to other fabric Includes security, data integrity, congestion and performance specifications
What is Fibre Channel over Ethernet? Maps Fibre Channel protocol natively over Ethernet.
Describe Lossless Ethernet To support Fibre Channel frames over Ethernet, no frames can be dropped throughout the entire transmission. No frame drop due to congestion or buffer overflow. PAUSE capability of Ethernet is used to achieve the lossless fabric.
Describe the FCoE Physical Elements Host Interface: CNA (converged network adapter) - ex: a PCIe card on the host that consolidates NICs and HBAs 10 Gbps connectivity options: either copper or standard optical
Describe the benefits of FCoE Lower capital expenditure Reduced power and cooling requirements Enabler for consolidated network infrastructure Lower TCO
What is Virtualization? The technique of abstracting physical resources into a logical view. Increases utilization Simplifies resource management Reduces downtime (planned and unplanned) Improved performance of IT resources
What are the challenges of storing fixed content? Fixed content is growing at more than 90% annually. New regulations require retention and data protection Often, long term preservation is required Simultaneous multi-user online access is preferable to offline storage Need faster access to fixed conte
What is a swap file (used in memory virtualization)? A portion of the hard disk that functions like physical memory (RAM) to the operating system.
Describe Network Virtualization creates virtual networks whereby each application sees its own logical network independent of the physical networks. EX: Virtual LAN (VLAN) - centralized configuration of devices
What are the 3 traditional storage solutions for archive? Online Nearline Offline Traditional archival solutions were offline.
What is Server Virtualization? enables multiple operating systems and applications to run simultaneously on different virtual machines created on the same physical server (or group of servers). Provides a layer of abstraction between the OS and the underlying hardware.
What is storage virtualization? Process of presenting a logical view of physical storage resources to hosts Logical storage appears and behaves as physical storage directly connected to host
What are the shortcomings of traditional archival solutions? Tape is slow Optical is expensive and requires vast amounts of media Recovering files from tape and optical is often time consuming Data on tape and optical is subject to media degradation Both solutions require sophisticated media management
What does SNIA Storage Virtualization Taxonomy provide? The Storage Networking Industry Association (SNIA) storage virtualization taxonomy provides a systematic classification of storage virtualization, with three levels: WHAT, WHERE, and HOW Specifies the types of virtualization:
What is Content Addressed Storage? Object oriented, location-independent approach to data storage Repository for the "objects" Access mechanism to interface with repository Globally unique identifiers provide access to objects
What are the benefits of CAS? Content authenticity Content integrity Location independence Single instance storage Retention enforcement Record level protection and disposition Technology independence Fast record retrieval
Describe block level storage virtualization -Ties together multiple independent storage arrays. Presented to host as a single storage device Mapping used to redirect I/O on this device to underlying physical arrays. -Deployed in a SAN environment
What are the Physical Elements of CAS? Storage devices (CAS based) Storage Node Access node Servers (to which storage devices get connected) Client
Describe the Application Programming Interface (API)? A set of function calls that enables communication between applications or between an application and an operating system.
Describe file-level virtualization addresses the NAS challenges by eliminating the dependencies between the data accessed at the file level and the location where the files are physically stored.
Describe EMC's Invista Provides block-level storage virtualization in heterogeneous storage environments. Supports dynamic volume mobility for volume extension and data migration between different storage tiers without any downtime.
What is a BLOB? The distinct bit sequence (DBS) of user data represents the actual content of a file and is independent of the file name and physical location.
Describe Control Path Cluster Control Path Cluster (CPC): a storage device running Invista that is located OUTSIDE of the data path (handles any requests which are NOT I/Os).
Describe Data Path Controller Data Path Controller (DPC): a special purpose SAN switch/blade which operates inside the data path and handles the I/O requests. If a request is not an I/O, it routes the request to the CPC.
Describe EMC Centera Architecture Deals with the storage and retrieval of fixed content. Based on RAIN (redundant array of independent nodes: access and storage). 1 TB of usable capacity in each node. Two 24-port 2 gigabit internal switches. Is self-healing.
What is a C-Clip? A package containing the user's data and associated metadata C-Clip ID is the CA that the system returns to the client application
What data protection does the Centera use? CPP - Content Protection Parity CPM - Content Protection Mirroring
Describe Content Address An identifier that uniquely addresses the content of a file and not its location. Unlike location based addresses, content addresses are inherently stable and, once calculated, they never change and always refer to the same content
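The idea of addressing content rather than location can be illustrated with a small sketch. This is not how any particular CAS product computes its addresses; the use of SHA-256 and the `ContentStore` class are assumptions made purely for illustration.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the bytes themselves (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hypothetical in-memory object repository keyed by content address."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        # Identical content maps to the same key, so it is stored only once.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Integrity check: the stored bytes must still hash to their address.
        if content_address(data) != address:
            raise ValueError("content does not match its address")
        return data

    def __len__(self):
        return len(self._objects)
```

Storing the same bytes twice returns the same address and keeps a single copy, which is the single instance storage benefit listed for CAS.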
Describe the C-Clip Descriptor File (CDF) The additional XML file that the system creates when making a C-Clip. This file includes the content addresses for all referenced BLOBs and associated metadata.
What are the features of CAS? Integrity Checking Data protection (local and remote) Load balancing Scalability Self - diagnosis and repair Report generation and event notification Fault tolerance Audit trails
What is virtualization? It's a technique of abstracting physical resources into a logical view. Benefits: increases utilization, simplifies resource management, reduces downtime, improves performance of IT resources
What are the four forms of virtualization? Memory, networks, servers, and storage
How does virtual memory work? makes an application appear as if it has its own contiguous logical memory independent of the existing physical memory resource. Done by virtual memory managers (VMM) Space used by VMMs on the disk is known as a swap file
What is a SWAP file? The portion of the hard disk that functions like physical memory (RAM) to the operating system. It gives the illusion of more physical memory than is actually installed.
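As a rough sketch of how a virtual memory manager and a swap file interact, the toy class below keeps a fixed number of pages in "RAM" and moves the rest to "swap". The class name, the page granularity, and the least-recently-used eviction policy are all assumptions for illustration, not a description of any real operating system.

```python
from collections import OrderedDict


class VirtualMemoryManager:
    """Toy VMM: a few physical frames backed by a swap area (LRU eviction assumed)."""

    def __init__(self, physical_frames):
        self.physical_frames = physical_frames
        self.ram = OrderedDict()  # page -> data, least recently used first
        self.swap = {}            # page -> data held in the swap file

    def _make_resident(self, page):
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return
        if len(self.ram) >= self.physical_frames:
            # RAM is full: push the least recently used page out to swap.
            victim, data = self.ram.popitem(last=False)
            self.swap[victim] = data
        # Bring the page in from swap if it was there, else start it empty.
        self.ram[page] = self.swap.pop(page, None)

    def write(self, page, data):
        self._make_resident(page)
        self.ram[page] = data

    def read(self, page):
        self._make_resident(page)
        return self.ram[page]
```

The application simply reads and writes pages; it never sees which ones are currently in RAM and which are in swap.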
How does Network Virtualization Work? Creates virtual networks whereby each application sees its own logical network independent of the physical network. EX: Virtual LAN (VLAN) - enables centralised configuration of devices located in physically diverse locations.
What are the benefits of Virtual Memory? Removes physical memory limits. Runs multiple applications at once.
What are the benefits of Virtual Networks? Common network links with access-control properties of separate links Manage logical networks instead of physical networks Virtual SANs provide similar benefits for SANs
What is Business Continuity? Preparing for, responding to & recovering from an application outage that adversely affects business ops. Addresses unavailability & degraded application performance. Integrated & enterprise-wide process & set of activities to ensure "info availability"
What are the benefits of server virtualization? Break dependencies between operating system and hardware Manage OS and application as a single unit Strong fault tolerance Hardware - independent
How does storage virtualization work? the process of presenting a logical view of the physical storage resources to a host. Appears and behaves as physical storage
What are the benefits of storage virtualization? Increased storage utilization Adding or deleting storage without affecting applications availability Non-disruptive data migration - KEY
What is Information Availability (IA)? Accessibility: Information should be accessible in the right place and to the right user Reliability: Information should be reliable and correct Timeliness:Information must be available whenever required
Describe the Multi-Level Approach to Storage Virtualization Server: path management, volume management, replication. Storage Network: path redirection, load balancing - ISL trunking, access control - zoning (ex - PowerPath). Storage: volume management - LUNs, access control (LUN masking), replication, RAID
What are the two types of storage virtualization configs? Out of Band - the virtualized environment configuration is stored externally to the data path - minimal latency. In Band - the implementation places the virtualization function inside the data path - additional latency
What are some causes of information unavailability? Planned Outages (80%) Unplanned Outages (20%) Disaster (<1%)
What is block-level storage virtualization? Ties together multiple independent storage arrays and presents them to the host as a single storage device. Mapping is used to direct the I/O on this device to underlying physical arrays
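The mapping step can be sketched as a lookup from a virtual block address to a physical array and offset. The `VirtualVolume` class, the simple concatenation layout, and the array names are assumptions for illustration; real products use their own mapping schemes.

```python
from bisect import bisect_right


class VirtualVolume:
    """Hypothetical virtual device built by concatenating extents from several arrays."""

    def __init__(self, extents):
        # extents: list of (array_name, size_in_blocks), laid end to end
        self._names = []
        self._starts = []
        total = 0
        for name, size in extents:
            self._names.append(name)
            self._starts.append(total)
            total += size
        self.size = total  # capacity the host sees as one device

    def resolve(self, virtual_lba):
        """Map a host block address to (array_name, block address on that array)."""
        if not 0 <= virtual_lba < self.size:
            raise IndexError("block address outside the virtual volume")
        i = bisect_right(self._starts, virtual_lba) - 1
        return self._names[i], virtual_lba - self._starts[i]
```

For example, with `VirtualVolume([("array_a", 100), ("array_b", 50)])` the host sees one 150-block device, and block 100 is redirected to block 0 of `array_b`.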
What are some of the impacts of Downtime? Lost productivity Damaged Reputation Lost Revenue Financial Performance Other expenses
What is file level virtualization? Addresses the NAS challenges by eliminating the dependencies between the data accessed at the file level and the location where the files are physically stored.
What are main hardware components of Invista? Control Path Cluster (CPC): stores configuration parameters OUTSIDE of the data path. Data Path Controller (DPC): special purpose SAN switch blade which routes I/Os INSIDE the data path. If a request is not an I/O, it sends it to the CPC
What is Disaster Recovery? Coordinated process of restoring systems, data & infrastructure required to support ongoing business operations in the event of a disaster Restoring previous copies of data & applying logs to that copy to bring it to a known point of consistency
What are the benefits of virtual provisioning? Reduce administrative costs (people) Reduce storage costs by deploying assets as needed Reduce operating costs (fewer disks) Reduce downtime
What is Disaster Restart? The process of restarting from disaster using mirrored consistent copies of data and applications Generally implies the use of replication technologies
In virtual provisioning - what is thin pool expansion? Adding drives to a thin pool non-disruptively increases the available shared capacity for all the Thin LUNs in the pool
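A minimal sketch of the idea, assuming a block-granular pool: thin LUNs report a large size to the host but draw physical blocks from a shared pool only on first write, and adding a drive raises the capacity available to every LUN. `ThinPool` and `ThinLUN` are hypothetical names, not a vendor API.

```python
class ThinPool:
    """Shared physical capacity, counted in blocks."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.allocated = 0

    def add_drive(self, blocks):
        # Expansion: existing LUNs keep running and simply see more free space.
        self.physical_blocks += blocks

    @property
    def free(self):
        return self.physical_blocks - self.allocated


class ThinLUN:
    """LUN whose reported size can exceed what the pool physically holds."""

    def __init__(self, pool, reported_blocks):
        self.pool = pool
        self.reported_blocks = reported_blocks
        self._written = set()

    def write(self, block):
        if not 0 <= block < self.reported_blocks:
            raise IndexError("block outside the LUN")
        if block not in self._written:
            # Physical space is consumed only on the first write to a block.
            if self.pool.free == 0:
                raise RuntimeError("thin pool exhausted")
            self.pool.allocated += 1
            self._written.add(block)
```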
Describe the "Cloud" Approach to Storage A cost effective approach to handling Internet era data growth. Five requirements: infinite scale, no boundaries, operationally efficient, self-managing, self-healing
Define cloud computing is an emerging IT development, deployment, and delivery model, enabling real time delivery of products, services and solutions over the Internet
What is Recovery Point Objective? A point in time to which systems and data must be recovered after an outage The amount of data loss that a business can endure
What are the key attributes of Cloud Services? Offsite third party provided Accessed via Internet Minimal to no IT skills required to implement Provisioning Pricing User interface system interface Shared resources
What is Recovery Time Objective? The time within which systems, applications or functions must be recovered after an outage. The amount of downtime that a business can endure and survive.
What are the solutions and supporting technologies that enable business continuity and uninterrupted data availability? Eliminating single points of failure Multi-pathing software Backup and replication: backup and recovery, local replication, remote replication
What are the three types of backup granularity? Full Backup Incremental Cumulative
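The difference between the three granularities comes down to which reference point a file's last change is compared against. The sketch below assumes the usual definitions (incremental: changed since the last backup of any kind; cumulative: changed since the last full backup); the function name and the use of plain numbers as timestamps are illustrative assumptions.

```python
def files_to_back_up(kind, mtimes, last_full, last_backup):
    """Select files for a backup run.

    mtimes: {path: last modification time}
    last_full: time of the most recent full backup
    last_backup: time of the most recent backup of any kind
    """
    if kind == "full":
        # Everything, regardless of when it changed.
        return sorted(mtimes)
    if kind == "incremental":
        # Only what changed since the most recent backup of any kind.
        return sorted(p for p, t in mtimes.items() if t > last_backup)
    if kind == "cumulative":
        # Everything that changed since the most recent full backup.
        return sorted(p for p, t in mtimes.items() if t > last_full)
    raise ValueError(f"unknown backup kind: {kind}")
```

A restore from incrementals needs the last full backup plus every incremental since; a restore from cumulatives needs the last full backup plus only the latest cumulative.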
Define Single Point of Failure The failure of a component that can terminate the availability of the entire system or IT service.
What are the different types of Backup Methods? Cold - offline Hot - online Open File (either have to retry or have a SW agent) Point in Time (PIT) Replica Backup file metadata for consistency Bare metal recovery
What are some advantages of Multi-pathing Software? Configures multiple paths to increase data availability Helps to recognize and utilize alternate I/O paths to data Provides load balancing to improve data path utilization
Backup Architecture and Process Backup Client - sends backup data to backup server or storage node Backup Server - manages backup operations and maintains backup catalog Storage Node - Responsible for writing data to backup device
What is local replication? Data from the production devices (LUN) is copied to replica devices within the same array The replicas can then be used for restore operations in the event of data corruption or other events
What are Direct Attached Backups? a backup device is attached directly to the client. Only the metadata is sent to the backup server through the LAN. - Frees LAN from backup traffic
What are LAN Based Backups? All servers are connected to the LAN & all storage devices are directly attached to the storage node. The data to be backed up is transferred from the backup client (source) to the backup device (destination) over the LAN, which may affect network performance.
Describe some attributes of EMC PowerPath Host based software Resides between the application and SCSI Device Driver Provides intelligent I/O path management Is transparent to the application Automatic detection and recovery from host to array path failures
What are SAN based backups? backup devices and clients attached to the SAN
Mixed Backup (2 Clients) uses both LAN and SAN - the data goes through both the LAN and the FCSAN
Describe flushing host buffer Flush memory (buffer) on the host before you make the copy. Done by the sync daemon (unix)
What is the dependent write I/O Principle? Dependent Write: a write I/O that will not be issued by an application until a prior related write I/O has completed - a LOGICAL dependency, NOT a time dependency
What are the two local replication technologies? Host based -Logical volume manager (LVM) based mirroring -File system snapshot Storage array based -Full volume mirroring -Pointer based full volume replication -Pointer based virtual replication
What is LVM Based Mirroring? the LVM is responsible for creating and controlling the host-level logical volume. Components: physical volumes (disk), volume groups, and logical volumes.
What is File System Snapshot? Is a pointer based replica that requires a fraction of the space used by the original file system.
What are the limitations of host based replications? -LVM based replicas add overhead on host CPUs -Host based replicas can usually be presented back only to the same server -Keeping track of changes is a challenge after the replica has been created
Describe how a storage array based local replication works? -Replication is performed by the Array Operating Environment -Replicas are on the same array Types: -Full-volume mirroring -Pointer-based full volume replication - Clone -Pointer -based virtual replication - snap
What is pointer based full volume replication? A clone -Provides a full copy of the source data on the target -Target device is made accessible for business operation as soon as the replication session is started -Point-in- time is determined by the time of session activation
Describe detached full volume mirroring After synchronisation is complete, the target can be detached from the source and made available for BC operations. -PIT is determined by the time of detachment -After detachment, re-synchronisation can be incremental
What is COFA? Copy On First Access - Data is copied from the source to the target only when: A write is issued for the first time after the PIT to a specific address on the source A read or write is issued for the first time after the PIT to a specific address on the target
What is full copy mode? On session start, the entire contents of the source device are copied to the target device in the background. -Most vendors also provide software to track changes made to the source or target
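The two triggers for a copy can be sketched with a toy session object. This assumes block-granular copying and that the point in time (PIT) is the moment the session object is created; `CofaSession` and its method names are hypothetical.

```python
class CofaSession:
    """Toy Copy On First Access session; the PIT is the moment it is created."""

    def __init__(self, source):
        self.source = source  # live production blocks (a list)
        self.target = {}      # address -> data, holding only the blocks copied so far

    def _copy_if_first_access(self, addr):
        # Copy the PIT data once, the first time this address is touched.
        if addr not in self.target:
            self.target[addr] = self.source[addr]

    def write_source(self, addr, data):
        # First write to the source after the PIT: save the old data first.
        self._copy_if_first_access(addr)
        self.source[addr] = data

    def read_target(self, addr):
        # First read from the target after the PIT pulls the block across.
        self._copy_if_first_access(addr)
        return self.target[addr]

    def write_target(self, addr, data):
        # First write to the target also triggers the copy, then overwrites it.
        self._copy_if_first_access(addr)
        self.target[addr] = data
```

Blocks that are never accessed on either side are never copied, which is why the replica is available immediately after the session starts.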
What is synchronous Replication? data is committed at both the source site and the target site before the write is acknowledged to the host. Any write to the source must be transmitted to and acknowledged by the target before signalling a write complete to the host.
What are the challenges of synchronous replication? -Response time extension for applications (data must be transmitted to the target site before the write can be acknowledged) -Bandwidth - needs high bandwidth -Rarely deployed beyond 200 km (125 miles)
What does asynchronous replication mean? A write is committed to the source and immediately acknowledged to the host. Data is buffered at the source & transmitted to the remote site later. Writes are time stamped & applied to the target in the order in which they were received. Needs only average bandwidth and can be deployed over long distances
What are the two remote replication technologies? Host Based: Logical Volume Manager (LVM) based replication, which supports both synchronous and asynchronous modes, and log shipping Storage Array Based: supports both synchronous and asynchronous modes. Disk buffered - consistent PITs - combines local and remote replication
LVM Based Replication is performed and managed at the volume group level. Writes to the source volume are transmitted to the remote host by LVM. The LVM on the remote host receives the writes and commits them to the remote volume group.
What are the advantages and disadvantages of LVM? Pro -Different storage arrays & RAID protection can be used at the source & target sites. -Response time issues can be eliminated with asynchronous mode, with extended RPO. Con -Extended network outages require large log files & result in higher CPU overhead on the host
What is host based log shipping? Transactions to the source database are captured in logs, which are periodically transmitted by the source host to the remote host. The remote host receives the logs and applies them to the remote database.
What is storage array based remote replication? -Replication is performed by the array operating environment so that host CPU resources can be devoted to production, arrays communicate over dedicated channels -Replicas are on different arrays. Most used for disaster recovery.
How does array based synchronous replication work? 1. Write is received by the source array from the host 2. Write is transmitted by the source array to the target array 3. Target array sends acknowledgement to the source array 4. Source array signals write complete to host
How does array based asynchronous replication work? 1. Write is received by the source array from the host 2. Source array signals write complete to host 3. Write is transmitted by source array to the target array 4. Target array sends acknowledgement to the source array
How do you ensure consistency in asynchronous replication? You can maintain write order (attach a time stamp to each write) or use dependent write consistency (buffer the writes in the cache of the source array for a period of time)
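The difference between the synchronous and asynchronous step orders can be sketched with a toy pair of arrays. The model ignores latency and failures; the `ReplicatedPair` class and its methods are assumptions for illustration only.

```python
from collections import deque


class ReplicatedPair:
    """Toy source/target array pair; mode is "sync" or "async"."""

    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError(mode)
        self.mode = mode
        self.source = []
        self.target = []
        self.pending = deque()  # writes buffered at the source (async only)

    def host_write(self, data):
        self.source.append(data)
        if self.mode == "sync":
            # The target commits the write before the host is told it is complete.
            self.target.append(data)
        else:
            # The host is acknowledged now; transmission happens later.
            self.pending.append(data)
        return "write complete"

    def drain(self):
        # Apply buffered writes to the target in the order they were received.
        while self.pending:
            self.target.append(self.pending.popleft())
```

In asynchronous mode, whatever is still in `pending` when the source site is lost is the data the business loses, which is why asynchronous replication implies a non-zero RPO.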
What is Array based Disk Buffered Replication? local and remote replication technologies can be combined to create consistent PIT copies of data on target arrays. -RPO is in HOURS -Lower bandwidth is required -Extended distance
What is three site replication and what are the two types? -There is a bunker site between replication sites. 1. Cascade/Multihop 2. Triangle/Multi-target - SRDF/Star in Symmetrix - concurrent replication of source to two different arrays.
Describe SAN Based Remote Replication -Replicate from one storage array to any other storage array over SAN/WAN - can implement tiered storage, do data migrations, and remote vault -Heterogeneous array support -No impact on LAN or servers
What are the terminologies in SAN Based Replication? Control Array: responsible for replication operations Remote Array: to/from which data is being replicated Operation (2): -Push: data is pushed from control array to remote array -Pull: data is pulled to the control array from remote array
What are the network options for Remote Replication? A dedicated or a shared network must be in place for remote replication -Uses an optical network for extended distances: DWDM and SONET
What is DWDM? Dense wavelength division multiplexing (DWDM) puts data from different sources on an optical fiber, with each signal carried on its own separate wavelength. Up to 32 protected and 64 unprotected separate wavelengths of data can be multiplexed into a light stream transmitted on a single optical fiber.
What is SONET? -Synchronous Optical Network is Time Division Multiplexing (TDM) technology -Implemented over long distances
What are the types of EMC remote replication? Symmetrix Arrays CLARiiON Arrays SAN Copy
What are the attributes of a Storage Security Framework? Confidentiality Integrity of Data Availability of Data
Define the Risk Triad Refers to the risk in terms of threats, assets and vulnerabilities.
What are the two types of attacks that can be carried out on an IT infrastructure? Active & Passive
What are three factors to consider when assessing the extent to which an environment is vulnerable to security threats? Attack Surface: The various entry points that an attacker can use to launch an attack Attack Vector: A step in a series of steps necessary to complete an attack Work Factor: The amount of time and effort required to exploit an attack vector.
In terms of security vulnerabilities, what are some of the solutions to protect critical assets? Minimize the attack surface Maximize the work factor Manage vulnerabilities
What are some technical countermeasures to network vulnerabilities? Implementations in computer hardware, software and firmware
What are some non-technical countermeasures to network vulnerability? Administrative Policies and Standards Physical Standards Guards Gates
What are the three Security Domains? Application: Involves access to stored data through the storage network Management: Involves access to storage and interconnect devices and to the data residing on those devices Backup & Data Storage: BURA access
What does BURA stand for? Backup, Recovery and Archive
What are some of the threats in the Application Access Domain? Spoofing user / host identity Elevation of user / host privileges
What are some of the threats in the Management Access Domain? Spoofing user / administrator identity Elevation of user / administrator privileges Tampering with Data Denial of Service Network snooping
What are some of the threats in the BURA Domain? Spoofing of User / Administrator Identity Elevation of User / Administrator privilege Tampering with Data Denial of Service Network snooping
What are some of the Security Controls used in Storage Infrastructure? User Authentication User Authorization Host and Storage authentication Access Control to Storage Objects Storage Access Monitoring Infrastructure Integrity Management Network Encryption Management Access Control Primary to secondary Access Control
What are some of the security implementations in SAN? Traditional FC SANs are configured as an isolated private network making them inherently more secure Authenticating FC Entities Setting up Session Keys
What are some of the basic SAN Security Mechanisms? Array Based Volume Access Controls Security on FC Switch Ports Switch-Wide and Fabric-Wide Access Control Logical Partitioning of a fabric: Virtual SAN (VSAN)
What is Array Based Volume Access Control? LUN Masking: Filters the list of LUNS that an HBA can access S_ID Lockdown (EMC Symmetrix Arrays): Stronger variant of masking Port Zoning: Zone Member is of the form (Switch_Domain_ID_Port_Number).
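LUN masking amounts to a per-initiator filter applied by the array. The sketch below assumes each HBA is identified by its WWN; the `StorageArray` class is hypothetical and not an actual array management API.

```python
class StorageArray:
    """Toy array that filters the LUN list per HBA (identified by WWN)."""

    def __init__(self, luns):
        self.luns = set(luns)
        self.masks = {}  # HBA WWN -> set of LUNs it is allowed to see

    def mask(self, wwn, luns):
        unknown = set(luns) - self.luns
        if unknown:
            raise ValueError(f"no such LUNs: {sorted(unknown)}")
        self.masks[wwn] = set(luns)

    def visible_luns(self, wwn):
        # An HBA with no masking entry sees nothing.
        return sorted(self.masks.get(wwn, set()))
```

Two hosts attached to the same array port can therefore be shown different sets of LUNs.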
What are some security measures on FC Switch Ports? Port Binding: Limits devices that can attach to a particular switch port Port Lockdown / Port Lockout: Restricts the type of initialization of a switch port
What are some of the components of Switch-Wide and Fabric-Wide Access Control? Access Control Lists (ACL) Fabric Binding
What is Logical partitioning of a fabric? Administrator allocates switch ports to different VSANs A switch port can be in only one VSAN at a time Each VSAN has its own distinct active zone set and zones Fabric events in one VSAN are not propagated to the others Role based Management
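The rule that a switch port belongs to only one VSAN at a time can be modelled with a single port-to-VSAN mapping. This sketch ignores zoning inside each VSAN; the `Fabric` class and the port names are assumptions for illustration.

```python
class Fabric:
    """Toy fabric in which each switch port belongs to at most one VSAN."""

    def __init__(self):
        self._vsan_of_port = {}

    def assign(self, port, vsan):
        # One entry per port, so reassigning moves the port out of its old VSAN.
        self._vsan_of_port[port] = vsan

    def members(self, vsan):
        return sorted(p for p, v in self._vsan_of_port.items() if v == vsan)

    def same_vsan(self, port_a, port_b):
        # Devices on ports in different VSANs are isolated from each other.
        vsan_a = self._vsan_of_port.get(port_a)
        return vsan_a is not None and vsan_a == self._vsan_of_port.get(port_b)
```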
What are some authentication and authorization mechanisms used in NAS? Kerberos and Directory Services Firewalls
What are the types of Windows Access Control Lists used in NAS? Discretionary ACLs (DACLs) System ACLs (SACLs)
Describe the UNIX Permissions used in NAS file sharing Every file and directory (folder) has three access permissions Rights for the file owner Rights for the group you belong to Rights for all others in the facility
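The three permission classes can be decoded from a numeric mode with Python's standard `stat` module. The `describe` helper is a hypothetical name used here for illustration; `stat.filemode` is the standard library function that produces the familiar `ls -l` style string.

```python
import stat


def describe(mode):
    """Split a permission mode into rwx strings for owner, group, and others."""
    classes = {
        "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "others": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }
    return {
        who: "".join(ch if mode & bit else "-" for ch, bit in zip("rwx", bits))
        for who, bits in classes.items()
    }


# Mode 754: owner may read, write, and execute; group may read and execute; others may only read.
print(describe(0o754))
print(stat.filemode(stat.S_IFREG | 0o754))
```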
What is CHAP? Challenge Handshake Authentication Protocol.
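A minimal sketch of the CHAP exchange as defined in RFC 1994, where the response is an MD5 hash over the identifier octet, the shared secret, and the challenge. The function names are hypothetical, and details specific to iSCSI deployments (such as how the values are encoded on the wire) are left out.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Compute the response: MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Authenticator side: recompute the hash and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# The authenticator sends a random challenge; the secret itself never crosses the wire.
challenge = os.urandom(16)
response = chap_response(1, b"shared-secret", challenge)
print(verify(1, b"shared-secret", challenge, response))
```

Because each challenge is random, a captured response cannot simply be replayed against a later challenge.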
What is iSNS Discovery Domain? iSNS stands for Internet Storage Name Service. Discovery domains function the same way as FC Zones. They provide functional groupings of devices in an IP-SAN. In order for devices to communicate with one another, they must be configured in the same discovery domain
What are the major storage infrastructure components that should be monitored? Servers, Databases and Applications Network (SAN & IP) Storage Arrays
Created by: AZ4ME