Friday, June 11, 2021

Storage Basics

(Audience: IT security person who needs a basic understanding of IT storage.)

Storage is one of the IT disciplines that is a little less visible to users.

A typical server has one or more components that store data -- hard disks, SSDs, and/or tapes.  The basic idea of IT storage is to move some or all of these components out of the servers and into their own dedicated devices.

Why do we separate storage?

Separating out storage yields several benefits:

Storage devices can provide larger capacity than a single server can hold

Servers only have so many internal slots.  If a server needs more storage than its drive slots can support, dedicated storage is the only option.

Storage devices can more easily provide a mix of storage sizes

If you have some servers that need a tiny amount of storage and some that need a lot, it can be hard to order servers in the right balance.  Special storage devices can have big pools of physical disks that are divided up between servers.  So you can more easily allocate a little storage to the servers that need a little and a lot of storage to the servers that need a lot.

Dynamic growth

When disks are internal to a server, adding capacity while the server is already "live" can be difficult.  When the storage is provided by dedicated storage devices, it is easier to add capacity "live", without downtime or rebuilds.
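The pooling and live-growth ideas above can be sketched in a few lines of Python.  This is only a toy model: the StoragePool class, the server names, and the capacities are all made up for illustration and do not correspond to any real product.

```python
class StoragePool:
    """Toy model of a shared disk pool that is carved up between servers."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocations = {}  # server name -> GB currently allocated

    def free_gb(self):
        """Capacity that has not been handed to any server yet."""
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, server, size_gb):
        """Give `server` an additional `size_gb` from the pool, if it fits."""
        if size_gb <= 0:
            raise ValueError("size must be positive")
        if size_gb > self.free_gb():
            raise ValueError("pool does not have enough free capacity")
        self.allocations[server] = self.allocations.get(server, 0) + size_gb


# Hypothetical servers and sizes.
pool = StoragePool(capacity_gb=10_000)
pool.allocate("web01", 50)      # a server that needs a little
pool.allocate("db01", 4_000)    # a server that needs a lot
pool.allocate("db01", 1_000)    # growing an existing allocation later
print(pool.allocations, "free:", pool.free_gb())
```

The point is that capacity is a property of the pool rather than of any one server, so a small or large slice (or a later increase) is just bookkeeping against the shared total.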

Robust, consistent handling of disk failures

Storage components such as disks tend to fail.  If the storage has RAID, a failed disk is easy to fix, so long as someone notices and replaces it quickly.  When disks are inside servers, the ability to fix disk failures depends on the system administrators, operating systems, supporting software, and configurations.  These can be of inconsistent quality.  Centralizing storage allows failure handling to be performed in one place and performed consistently.
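To see why RAID makes a single disk failure recoverable, here is a minimal sketch of parity-based recovery.  It assumes a parity scheme in the style of RAID 5 (the text above does not say which RAID level is in use, and mirroring is another common approach), and it uses tiny made-up byte strings in place of real disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one small block.
data_disks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all the data blocks.
parity = xor_blocks(data_disks)

# Suppose the second disk fails.  XOR-ing the surviving data blocks
# with the parity block reproduces the missing block.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_disks[1])
```

This only works while at most one disk in the group is missing, which is why it matters that someone notices the first failure quickly.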

File sharing

Sometimes there is a need to share the same files with many servers, such as network shares, user home directories, or roaming profiles.  This is often easiest and most reliable with dedicated storage devices.

High performance

Some scenarios, such as high-performance computing, require storage that is much faster than normal.  These needs are often best served by dedicated storage hardware.

What kinds of products are out there?

Storage products fall into a number of categories.

Direct-attached storage (DAS)

This is the simplest to understand.  It's typically a box that looks like a server, but contains lots of disks, and connects to a server with an interface cable. 

Some of those interfaces are specific to storage, such as Fibre Channel or SCSI.  Others could potentially be used for multiple purposes, such as Fibre Channel over Ethernet (FCoE) or iSCSI.  If the interface is FCoE or iSCSI, it might be dedicated to storage or it might be shared with other network traffic.

A logical disk carved out of a physical disk pool is typically referred to as a "LUN" (logical unit number).  It is shared at the "block" level, meaning that the servers it is shared with see it as a virtual disk that they can format with partitions and filesystems.

Storage Area Network (SAN)

As with direct-attached storage, we have a box with disks inside.  But now, instead of connecting it directly to one server, we attach it to a network.  As with direct-attached, we have multiple options for how to make the connection -- storage-specific options such as Fibre Channel, or potentially dual-use technologies such as FCoE or iSCSI, which may or may not actually share the network with other traffic.  The box or boxes can typically carve up the storage so that different servers can be given different-sized pieces, and the sizes can be changed for "live" servers.

Once more, the SAN presents logical disks to servers at the "block" level.  The servers still need to partition and format the disks.

Network Attached Storage (NAS)

A NAS is a central file store that is shared with one or more computers over a network.  Usually the network is an IP network.  The network could be dedicated to storage, but usually it is shared with other traffic.

The servers see the storage as files and directories, rather than the "blocks" that we see with DAS and SAN.  So this is called "file storage."

This typically works via features built into the server operating systems.  Both the NAS and the operating system need to support special "storage protocols" such as NFS or CIFS/SMB.  It can require systems administration setup and support.

This kind of storage can also be useful from desktops.
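The difference between "block" storage (DAS and SAN) and "file" storage (NAS) can be sketched as two toy Python classes.  Everything here is hypothetical and held in memory: BlockDevice, FileShare, the 512-byte block size, and the paths are assumptions for illustration, not a real storage interface.

```python
import io

BLOCK_SIZE = 512  # bytes per block; an assumed size for illustration


class BlockDevice:
    """Toy LUN: the server sees only numbered, fixed-size blocks."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._disk = io.BytesIO(bytes(BLOCK_SIZE * num_blocks))

    def _check(self, n):
        if not 0 <= n < self.num_blocks:
            raise IndexError("block number out of range")

    def write_block(self, n, data):
        self._check(n)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self._disk.seek(n * BLOCK_SIZE)
        self._disk.write(data)

    def read_block(self, n):
        self._check(n)
        self._disk.seek(n * BLOCK_SIZE)
        return self._disk.read(BLOCK_SIZE)


class FileShare:
    """Toy NAS share: the server sees named files inside directories."""

    def __init__(self):
        self._files = {}  # full path -> file contents

    def write_file(self, path, content):
        self._files[path] = content

    def read_file(self, path):
        return self._files[path]

    def list_dir(self, directory):
        """Return every stored path underneath `directory`."""
        prefix = directory.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))


lun = BlockDevice(num_blocks=8)
lun.write_block(3, b"x" * BLOCK_SIZE)   # the server decides what block 3 means

share = FileShare()
share.write_file("/home/alice/notes.txt", b"hello")
print(share.list_dir("/home/alice"))
```

With the block device, the storage has no idea what the blocks contain; the server's own filesystem gives them meaning.  With the file share, the storage device itself keeps track of names and directories.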

Object storage

This is a relatively new kind of storage.  Like Network Attached Storage, it is accessible using regular IP technology.  But it is somewhat different in that the interface is typically an "application programming interface" (API), often via web protocols, rather than special storage protocols.

This type of storage is typically provisioned not at the individual file level, but as "buckets" of storage that can hold a collection of related data.

This type of storage is specifically designed to not require operating system support.  Applications and users can provision storage objects and access them by talking directly to the object storage API.
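The bucket-and-API model can be sketched with a small in-memory stand-in.  The ObjectStore class and its method names (create_bucket, put_object, get_object, list_objects) are hypothetical; real object storage services expose comparable operations over web protocols, but their exact calls differ and are not shown here.

```python
class ObjectStore:
    """Toy object store: buckets hold objects, addressed by a key."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {object key -> bytes}

    def create_bucket(self, bucket):
        self._buckets.setdefault(bucket, {})

    def put_object(self, bucket, key, data):
        self._buckets[bucket][key] = data

    def get_object(self, bucket, key):
        return self._buckets[bucket][key]

    def list_objects(self, bucket, prefix=""):
        """Return the keys in a bucket that start with `prefix`."""
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))


# An application talks to the store directly; nothing is mounted or formatted.
store = ObjectStore()
store.create_bucket("app-logs")
store.put_object("app-logs", "2021/06/11/web01.log", b"GET / 200\n")
store.put_object("app-logs", "2021/06/11/db01.log", b"connection ok\n")
print(store.list_objects("app-logs", prefix="2021/06/"))
```

Note that the keys only look like paths.  In this sketch, as in many object stores, there are no real directories, just a flat set of keys inside each bucket.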

Security concerns

Like any technology, storage has security concerns.

Access controls

How do the storage devices know which client(s) should be allowed to access which resources?  What access level (read-only vs. read-write) should be applied to a given client's access to a given resource?

If the storage is connected to only one server, then access controls are easy.  If it is connected to multiple servers through a special-purpose or dedicated network, then access control must be carefully managed.  And if it is connected through a network that also carries other kinds of traffic in a less controlled fashion, then even more robust network controls are required.
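The two questions above (which client may reach which resource, and at what access level) amount to a lookup table with a default of "deny".  The sketch below uses hypothetical client and resource names; real storage devices each have their own way of expressing these rules.

```python
from enum import Enum


class Access(Enum):
    NONE = 0
    READ_ONLY = 1
    READ_WRITE = 2


# Hypothetical rules: (client, resource) -> access level.
ACL = {
    ("db01", "lun-7"): Access.READ_WRITE,
    ("backup01", "lun-7"): Access.READ_ONLY,
}


def is_allowed(client, resource, wants_write):
    """Default-deny check: anything not listed in the ACL gets no access."""
    level = ACL.get((client, resource), Access.NONE)
    if wants_write:
        return level is Access.READ_WRITE
    return level in (Access.READ_ONLY, Access.READ_WRITE)


print(is_allowed("db01", "lun-7", wants_write=True))       # listed read-write
print(is_allowed("backup01", "lun-7", wants_write=True))   # listed read-only
print(is_allowed("laptop99", "lun-7", wants_write=False))  # not listed at all
```

A table like this is only as trustworthy as the storage device's ability to know that "db01" really is db01, which is the subject of the next section.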

Client trust and authentication

To what extent does the storage trust the servers and/or desktops that access it?  For example, NAS devices that talk NFS version 3 typically assume that the client user is who the client machine says they are.  So if a user can place a laptop they control on the network, at an IP or subnet in the NAS access list, then they can claim to be any user.

Some storage protocols (e.g. NFSv4 and AFS) can utilize Kerberos tickets to avoid this issue.  But they are complicated to set up and manage.
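The NFS version 3 trust problem described above can be sketched as follows.  This is a deliberately simplified model, not the real protocol: the subnet, the file paths, the user IDs, and the can_read function are all invented for illustration.  It only captures the idea that the server checks where a request comes from and then believes whatever user ID the client supplies.

```python
import ipaddress

# Hypothetical export rule: any client on this subnet may mount the share.
ALLOWED_SUBNETS = [ipaddress.ip_network("10.20.0.0/24")]

# Hypothetical files and the numeric user IDs that own them.
FILE_OWNERS = {
    "/home/alice/notes.txt": 1001,
    "/home/bob/plans.txt": 1002,
}


def can_read(client_ip, claimed_uid, path):
    """Trust the network location, then trust the client's claimed user ID."""
    ip = ipaddress.ip_address(client_ip)
    if not any(ip in net for net in ALLOWED_SUBNETS):
        return False  # the only real check: is the client in the access list?
    return FILE_OWNERS.get(path) == claimed_uid  # the claim is never verified


# A legitimate server acting for alice (uid 1001).
print(can_read("10.20.0.15", 1001, "/home/alice/notes.txt"))

# A laptop the attacker controls, on the allowed subnet, simply claims uid 1002.
print(can_read("10.20.0.77", 1002, "/home/bob/plans.txt"))

# The same claim from outside the allowed subnet is rejected.
print(can_read("192.168.1.5", 1002, "/home/bob/plans.txt"))
```

The second call succeeds because nothing proves the laptop's user really is bob.  Ticket-based schemes such as Kerberos add that missing proof.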

Encryption

Many storage technologies do not encrypt data in transit by default.

Network trust and dedicated network

As mentioned above, there are a number of issues that revolve around network trust -- access controls, client trust, and encryption.  Security personnel are likely to want dedicated networks for storage, especially for SAN storage.

Management controls

Many storage devices have management software for configuration and reporting.  Even if the actual storage traffic goes through a dedicated interface, the management function usually needs to be on a regular IP network.  It needs to have security controls, like any other device on a regular IP network -- security scans, best practice configuration checklists, and all the rest.
