5 Things to NEVER DO with Your Passwords

Password security is one of many elements of our ongoing efforts to protect our customers’ data. But even though we have all heard many times how important password protection is, we still see basic password protection rules broken all the time.

So here is a quick refresher:

  1. Never write down your password on paper. Never stick a note with your password on it to your laptop keyboard, or tape it to your monitor, or hang it on your cubicle wall. Just don’t do it!
  2. Never keep passwords in a clear text (non-encrypted) file on your laptop, on a server, or on any storage device. Just don’t do it!
  3. Never make it easy for a hacker to guess your password by including the company name, vendor name, your name, server name, application name, department name, pet’s name, kid’s name, spouse’s name, birthday, anniversary, or any combination of the above. Substituting some symbols for letters, like P@yr0ll or S@l$sF0rce, is still not OK. Personal information is easy to find on the internet, and the symbol substitution won’t fool a good hacker. Just don’t do it!
  4. Never log on to anything while sharing your screen in a web meeting. A quick screenshot can be taken by anyone watching. Just don’t do it!
  5. Never send passwords to colleagues, clients, vendors, or anyone else in a non-encrypted email, or in a Slack message, Google chat, or any other “open” channel. Just don’t do it!

OK, so I told you what not to do. Now how can you cope with all the passwords you have to remember?

The approach that I use is to minimize the number of passwords I have to remember by using a password store application (aka a password manager) like LastPass. This tool and others like it securely store many of my passwords, so I only need to remember the master password that opens my password store. All my other passwords are randomly generated, very strong passwords that I don’t even try to remember. My master password is a complex string of characters and numbers, but since it’s the only password I need to remember, it’s not a problem.

One more thing: whenever it is offered, use two-factor authentication (2FA) for applications that really matter, like bank accounts. The extra step is simple with SMS and authentication apps, and well worth it for the significant extra protection 2FA offers.

So go ahead, protect your data—JUST DO IT!

Happy protecting!

 

SQL Server Always Encrypted—Is It Right for My Data?

Chances are your company holds sensitive transactional data like personal, financial or health records. This data is often subject to regulations and is also coveted by cybercriminals. Unfortunately, unless encrypted, it is vulnerable to compromise both at rest on the server that stores it and in transit to/from clients that request it.

For Microsoft SQL Server users, Transparent Data Encryption (TDE) has long been available to protect data at rest in the event that database files or backups are compromised. But TDE still leaves encrypted data vulnerable to a privileged user who can access it via the database.

Further, TDE offers nothing to secure SQL Server data in transit. HTTPS can help, but still leaves data exposed to man-in-the-middle (MITM) and other attacks.

To help mitigate these security weaknesses, SQL Server 2016 introduced the Always Encrypted feature to protect sensitive data at rest, in motion and even in memory. Outside an application client’s connection, the data remains encrypted—across the network, server, storage and database.

With Always Encrypted, only users and applications with access to valid keys can decrypt the data, so MITM attacks, insider threats and the like are effectively neutralized. In addition, Always Encrypted lets you encrypt data at the column level, whereas TDE requires encrypting the entire database.

Who should use Always Encrypted?

New privacy regulations and other compliance and data governance mandates are driving more and more SQL Server users to leverage Always Encrypted. It can be a big help with separating those who own and can view the data from those who manage the data but should otherwise not access it.

For example, Always Encrypted lets you protect Personal Information (PI) that you store in the cloud. Even when the data is outside your direct control, the cloud service provider, third-party DBAs, your on-premises admins and other privileged users can’t access it.

Because it is available in every edition of SQL Server 2016 and newer (as of SP1), you can try Always Encrypted without purchasing additional licenses. However, the real investment lies in modifying the client application, which handles the encryption/decryption outside the SQL Server or Azure SQL Database environment.

Always Encrypted cryptography

Always Encrypted is aptly named: from the standpoint of the database engine the data is literally “always encrypted.” But some queries on encrypted data are still supported, depending on the encryption type chosen for each column.

You can choose either of two encryption types:

  1. Deterministic encryption, which always generates the same encrypted value for a given plaintext value. This enables the database engine to perform specific operations (point lookups, grouping, equality joins) and also supports indexing.
  2. Randomized encryption, which is even more secure but makes the data write/display-only. Thus, it can’t be evaluated or used in any database operations and cannot be indexed.

The downside of deterministic encryption is that unauthorized users could potentially extrapolate information about the data in encrypted columns by examining patterns within the encrypted values, especially if the set of possible values is small (e.g., true/false). Randomized encryption generates a less predictable result, but precludes indexing, searching, grouping or joining operations on encrypted columns.

For columns of sensitive data that will be used as search or grouping parameters (e.g., US Social Security Numbers), you’ll need to use deterministic encryption. Randomized encryption can be used for data that isn’t grouped with other records and won’t be used to join tables.
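To make this concrete, here is a minimal T-SQL sketch showing both encryption types in one table definition. It assumes a column encryption key named MyCEK already exists (key setup is covered under “How Always Encrypted works” below); the table and column names are illustrative.

    CREATE TABLE dbo.Customers (
        CustomerId INT IDENTITY(1,1) PRIMARY KEY,
        -- Deterministic: supports point lookups, equality joins, grouping and
        -- indexing; string columns require a BIN2 collation
        SSN CHAR(11) COLLATE Latin1_General_BIN2
            ENCRYPTED WITH (
                COLUMN_ENCRYPTION_KEY = MyCEK,
                ENCRYPTION_TYPE = DETERMINISTIC,
                ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
            ) NOT NULL,
        -- Randomized: stronger, but the column cannot be searched, grouped,
        -- joined or indexed
        MedicalNotes NVARCHAR(400)
            ENCRYPTED WITH (
                COLUMN_ENCRYPTION_KEY = MyCEK,
                ENCRYPTION_TYPE = RANDOMIZED,
                ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
            )
    );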

History and enhancements

As noted above, Always Encrypted has been available across all SQL Server and Azure SQL Database editions and SQL Database service tiers since SQL Server 2016 (13.x) SP1. Prior to that, it was only available in the Enterprise and Developer Editions.

Beginning with SQL Server 2019 (15.x), the new Always Encrypted with secure enclaves feature extends the database operations you can perform on encrypted data to include pattern matching, other comparison operators and in-place encryption.

A secure enclave is basically a protected memory area specifically for processing Always Encrypted data within SQL Server. To the rest of the SQL Server engine, and to other processes on the host machine, a secure enclave is a “black box” that shields all code and data inside from outside processes, including debuggers.

To validate a secure enclave before sending it encryption keys, etc., both the client-side driver and SQL Server must contact an external attestation service. The process also relies on “enclave-enabled” column master and encryption keys.

While not without complexities and performance overhead, secure enclaves significantly extend the kinds of operations you can perform on Always Encrypted data.
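To give a feel for this, here is a sketch of a query that only works against an enclave-enabled setup; it assumes dbo.Employees.SSN is encrypted with an enclave-enabled column encryption key and that attestation is configured (all names are hypothetical).

    -- Pattern matching on an encrypted column requires SQL Server 2019+ with
    -- secure enclaves; without an enclave, LIKE on this column would fail.
    -- The client driver (or SSMS with Always Encrypted parameterization)
    -- transparently encrypts @pattern before it reaches the server.
    DECLARE @pattern VARCHAR(11) = '%6789';
    SELECT EmployeeId
    FROM dbo.Employees
    WHERE SSN LIKE @pattern;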

How Always Encrypted works

Before it can process Always Encrypted data, a client application must be configured to use an approved driver, which automatically encrypts and decrypts columns of sensitive data (for .NET clients, for example, this is enabled by setting Column Encryption Setting=Enabled in the connection string). The driver encrypts sensitive data before passing it to the database engine and automatically rewrites queries to preserve their semantics. When the database returns query results, the driver transparently decrypts them before handing them to the application.

This works because only the client-side application, not the SQL Server environment, has access to the encryption keys. Other applications can retrieve the encrypted values but can do nothing with them.

To encrypt a column of data with Always Encrypted, you must first generate a column encryption key and a column master key. The former encrypts the data and the latter encrypts the former.

The column encryption key resides on the SQL Server instance, while the database engine stores metadata pointing to the column master key’s location. The column master key itself resides in the Windows Certificate Store, Azure Key Vault, a hardware security module or some other trusted external key store. The database engine never sees or stores either key as plaintext.
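Here is a rough sketch of that key metadata in T-SQL, assuming the column master key is a certificate in the Windows Certificate Store; the key path shown is illustrative.

    -- Metadata pointing to the column master key's location (the key itself
    -- never leaves the trusted key store)
    CREATE COLUMN MASTER KEY MyCMK
    WITH (
        KEY_STORE_PROVIDER_NAME = N'MSSQL_CERTIFICATE_STORE',
        KEY_PATH = N'CurrentUser/My/1A2B3C4D...'  -- illustrative certificate thumbprint
    );

    -- The column encryption key is stored encrypted under the CMK. Its
    -- ENCRYPTED_VALUE blob is generated client-side (e.g., by the SSMS Always
    -- Encrypted wizard or the SqlServer PowerShell module), so the plaintext
    -- key never reaches the database engine:
    -- CREATE COLUMN ENCRYPTION KEY MyCEK
    -- WITH VALUES (
    --     COLUMN_MASTER_KEY = MyCMK,
    --     ALGORITHM = 'RSA_OAEP',
    --     ENCRYPTED_VALUE = <tool-generated ciphertext>
    -- );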

Always Encrypted use cases

What are some good reasons to try Always Encrypted? Here are a few of the top use cases:

  • To improve security of data in transit beyond what SSL can provide.
  • To meet the demands of regulated industries like financial services and telecommunications around protecting Personally Identifiable Information (PII) like credit card numbers, customer names/addresses, etc.
  • To improve security when outsourcing SQL Server DBA services, by ensuring separation of duties between third-party DBAs and in-house application administrators.
  • To improve security in scenarios where on-premises client instances need to access sensitive data stored on Microsoft Azure. Because the column master key resides on-premises in a trusted key store, Microsoft admins cannot access the cloud-based data.
  • To improve security in situations where both the client application and database environment are hosted on Azure (as SQL Database or SQL Server instances running in a virtual machine). In this case, both the data and keys are potentially exposed to Microsoft admins on the cloud platform hosting the client tier. However, the data is still always encrypted in the cloud database.

In short, Always Encrypted makes the most sense for protecting PII and other types of data that you need to store, but don’t want to search on or display to application users.

Limitations with Always Encrypted

As noted above, Always Encrypted limits the kinds of operations you can perform on data; e.g., copying data between columns using UPDATE, SELECT INTO, etc. It also restricts the data types you can encrypt. Excluded types include XML, IMAGE, TEXT, NTEXT and GEOGRAPHY, as well as user-defined data types.

If you need to query Always Encrypted data from multiple applications, such as reporting tools and data warehouses, you will need to give them access to the appropriate database drivers and decryption certificates. This can be more challenging if you need to replicate encrypted data to other SQL Server instances.

Performance can also be a limiting factor with using Always Encrypted. Encryption/decryption obviously involve compute and storage overhead that will impact both the duration of queries and the volume of storage your application needs. You’ll probably want to do some testing with your specific CPU, RAM and disk setup to gauge the production impact.

Next steps

Want to talk with a database security expert before you dive into Always Encrypted? Contact Buda Consulting to schedule a free consultation.

Database Patch News — June 2020 (Issue 4)

Welcome to Database Patch News, Buda Consulting’s newsletter of current patch information for Oracle and Microsoft SQL Server. Here you’ll find information on available patches—including security patches—and desupported versions recently made available.

Why should you care about patching vulnerabilities and bugs? Two big reasons:

    1. Unpatched systems are a top cyber attack target. Patch releases literally advertise vulnerabilities to the hacker community. The longer you wait to patch, the greater your security risk. 
    2. Along with running a supported database version, applying the latest patches ensures that you can get support from the vendor in case of an issue. Patching also helps eliminate downtime and lost productivity associated with bugs. 
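Not sure where you stand? These standard queries (run with sufficient privileges) show the version and patch level you are currently on.

    -- SQL Server: version, patch level and edition
    SELECT @@VERSION;

    -- Oracle: database version
    SELECT banner FROM v$version;

    -- Oracle: patches applied via datapatch (12c and later)
    SELECT patch_id, description, status FROM dba_registry_sqlpatch;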

Here are the latest patch updates for Oracle and SQL Server:

Oracle Patches:

Apr 14 2020 Quarterly Patch Updates:

19c – Release Update 19.7 available.

18c – Release Update 18.10 available.

12.2.0.1 – Apr 2020 Release Update 12.2.0.1.200414 available.
Regular support ends Mar 2023 and extended support ends Mar 2026.

12.1.0.2 – Currently in extended support.
The last freely available patch was July 2019 for 12.1.0.2.
Patch Set Update 12.1.0.2.200414 (Apr 14 2020) is available but requires an extended support purchase to access it.
Patches will be released until July 2021 for this version.

11.2.0.4 – Entered extended support in December 2017.
The last freely available patch was October 2018 for 11.2.0.4.
Patch Set Update 11.2.0.4.200414 is available but requires an extended support purchase to access it.

 

SQL Server Patches:

SQL Server 2019
Cumulative Update 3 (latest build) released Mar 12, 2020
Mainstream support ends Jan 7, 2025
Extended support ends Jan 8, 2030

SQL Server 2017
Cumulative Update 20 (latest build) released Apr 7, 2020
Mainstream support ends Oct 11, 2022
Extended support ends Oct 12, 2027

SQL Server 2016 Service Pack 2
Cumulative Update 12 released Feb 25, 2020
Mainstream support ends Jul 13, 2021
Extended support ends Jul 14, 2026

SQL Server 2014 Service Pack 3
Cumulative Update 3 released Apr 16, 2019
Mainstream support ended Jul 9, 2019
Extended support ends Jul 9, 2024

SQL Server 2012 Service Pack 4
Released Oct 5, 2017
Mainstream support ended Jul 11, 2017
Extended support ends Jul 12, 2022

Note: All other SQL Server versions not mentioned are no longer supported.

 

In-Place Oracle Database Encryption with Zero Downtime

Have you been wanting to encrypt your Oracle database “since forever,” but feel like you just can’t afford the downtime? If a lot of data is involved, taking it all offline and encrypting it could be very time-consuming. So you’ve been putting the process off, while keeping your fingers crossed that your company’s network security will somehow protect you from a data breach and associated legal, compliance and reputational impacts. 

But did you know that you can now encrypt existing tablespaces in place, either online or offline, in Oracle? In case you missed it, Oracle Enterprise Edition 12.2 (released in 2017) extended Transparent Data Encryption (TDE) with a much-needed capability: encrypting an existing tablespace while it remains online.

If you’ve been running an earlier Oracle version and haven’t seen a compelling reason to upgrade, this capability could be it. Online conversion is a game-changer for those who want to “do the right thing” and encrypt their data at rest, but haven’t wanted to incur the downtime.

At a high level, here is how the conversion works (sketched in SQL below):

    • First, encrypt the system tablespaces (these must be converted separately from user tablespaces).
    • Next, encrypt the user tablespaces, one at a time.
    • Finally, drop and recreate any temporary tablespaces (these cannot be converted online).
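In SQL terms, the conversion looks something like this sketch. It assumes Enterprise Edition 12.2+, an open keystore (wallet) and Oracle Managed Files (otherwise add a FILE_NAME_CONVERT clause); the tablespace names are illustrative, and you should check your version’s documentation for SYSTEM/SYSAUX specifics.

    -- System tablespaces are converted separately from user tablespaces
    ALTER TABLESPACE system ENCRYPTION ONLINE ENCRYPT;
    ALTER TABLESPACE sysaux ENCRYPTION ONLINE ENCRYPT;

    -- Then each user tablespace, one at a time
    ALTER TABLESPACE app_data ENCRYPTION ONLINE ENCRYPT;

    -- Temporary tablespaces cannot be converted online: create an encrypted
    -- replacement, make it the default, then drop the old one
    CREATE TEMPORARY TABLESPACE temp_enc TEMPFILE SIZE 2G
        ENCRYPTION USING 'AES256' ENCRYPT;
    ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_enc;
    DROP TABLESPACE temp INCLUDING CONTENTS AND DATAFILES;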

That’s basically all there is to it! There are some technical issues that your DBA and/or security group will need to work out, such as key management and disk space. (You must have enough available disk space during the conversion to duplicate your largest tablespace.)

Of course, you need to back up your entire database before you start the encryption process. If you decide to tackle encryption gradually, then just back up each tablespace before you convert it.

Taking the important step of encrypting your sensitive data at rest will significantly improve your security posture.

So what are you waiting for? Get encrypting!

To schedule a free consultation on your database security, including encryption requirements, contact Buda Consulting.

Is a Cloud Data Warehouse Right for My Company?

You’ve probably heard the catchphrase, “Data is the new oil.” You need it to run your business, and its potential value is huge… But first you have to drill for it.

Companies that can collect, process and analyze their data resources efficiently and effectively will gain competitive advantage. A robust data warehouse is often critical to success. 

Data warehouses have traditionally been on-premises systems. But a cloud-based or hybrid alternative may now be a better choice for many organizations. This post compares on-premises versus cloud data warehouse architectures and explains the pros and cons of each.

What is a data warehouse?

A data warehouse is a central repository that stores and processes both historical and new data from multiple databases and applications (including potentially SaaS platforms) to improve the speed and scope of problem-solving and decision-making. Data scientists and other roles direct queries, reporting and analytics requests to the data warehouse via business intelligence (BI) and analytics tools and dashboards. Data warehouses support business decisions in many areas, from how to price products to where to cut costs to whether to launch a new service.

Columnar versus row-based data storage

Why not just query your relational databases (RDBMS) directly? One reason is that BI/analytics requests could significantly impact performance of business-critical RDBMS and associated “systems of record.”

Also, the row-based structure of RDBMS, which is ideal for transactional usage, isn’t as well suited to analytical processing. This is why many data warehouses, especially cloud-based offerings, store data in a columnar format.

For example, aggregation queries (like “What is the average age of customers who bought widgets?”) can be much faster in a columnar database because the data elements involved are stored closer together on the storage media, making read times (latency) lower. But if you need to query multiple fields per record (like “What is the name, email address and date of birth of each person who bought a widget in 2018?”) the advantages of columnar databases diminish.
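Under a hypothetical customers/orders schema, those two queries might look like this; the first touches only a couple of columns, while the second must reassemble many fields per row.

    -- Aggregation over few columns: a columnar store reads just the relevant
    -- columns and skips the rest of each row
    SELECT AVG(c.age)
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    WHERE o.product = 'widget';

    -- Wide per-record retrieval: many columns per row must be reassembled,
    -- so the columnar advantage diminishes
    SELECT c.name, c.email, c.date_of_birth
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    WHERE o.product = 'widget'
      AND o.order_date BETWEEN DATE '2018-01-01' AND DATE '2018-12-31';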

Similarly, write times can be much slower in columnar databases. This is less problematic in data warehouses versus transactional applications because new data is generally added less frequently, via so-called “data dumps.”

Data warehouse structures

The more data sources you need to access and the more kinds of questions you want to answer, the more complex your data warehouse structure might need to become. 

A basic data warehouse provides summary data derived from similar types of operational systems, like RDBMS. As you add more data sources, you may need to set up a “staging area” to convert diverse data into a more homogeneous structure that is easier to query. Some solutions then add “data marts,” which tailor data from individual lines of business to support the specific analysis needs of different user roles.

ELT or ETL?

Most data warehouses pull data from different line-of-business sources, such as marketing, sales and accounting, and/or from the organization overall. All this data can either be aggregated within a central repository and then transformed into a suitable format (an Extract-Load-Transform (ELT) approach), or initially pulled into a temporary database to be transformed before being moved to a central repository (an Extract-Transform-Load (ETL) process).

The ETL approach is typically the choice for on-premises data warehouses, as hardware limitations dictate processing the data off-platform. ETL can also help ensure privacy and compliance by cleansing the data before loading. The massive scalability of cloud data warehouse platforms makes them amenable to ELT, which can enable users to transform raw data according to their specific needs after it’s loaded.
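As a simple illustration of the “T” in ELT: raw data lands in the warehouse untransformed and is reshaped there, where compute scales on demand. The table and column names below are hypothetical, and CREATE TABLE AS syntax varies slightly by platform.

    -- Extract + Load: a bulk loader (e.g., a COPY-style utility) has already
    -- landed the raw feed as-is in sales_raw.
    -- Transform: clean and retype the data inside the warehouse.
    CREATE TABLE sales_clean AS
    SELECT
        CAST(order_id AS INTEGER)           AS order_id,
        UPPER(TRIM(region))                 AS region,
        CAST(order_total AS DECIMAL(12,2))  AS order_total,
        CAST(order_ts AS TIMESTAMP)         AS order_ts
    FROM sales_raw
    WHERE order_id IS NOT NULL;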

On-premises data warehouse architectures

An on-premises data warehouse requires dedicated storage, servers and administrative resources. As such, it can be slow and costly to scale up as data volumes expand.

Traditional data warehouses have a three-tier architecture:

    1. The bottom tier consists of a database server that extracts data from multiple sources, e.g., RDBMS and SaaS applications.
    2. The middle tier consists of an online analytical processing (OLAP) server that transforms the data into a more analytics-friendly structure.
    3. The upper tier consists of the client-facing applications, which provide tools for data analysis, reporting and data mining.

Cloud data warehouse architectures

A cloud data warehouse, also called Data Warehouse-as-a-Service (DWaaS), is a SaaS offering. You pay as you go for storage and compute power, and your data is stored, queried and analyzed over the internet. 

Since the service provider manages the physical hardware and IT resources, cost-effective scalability and much lower setup costs are among the key advantages of cloud data warehouses. Gartner estimates that 75% of all databases will be deployed on or migrated to the cloud by 2022, with cost and flexibility being perhaps the primary drivers for this rapid shift.

Some generic architecture components common to many cloud data warehouse architectures include:

    • Clusters are large groupings of shared computing resources called nodes.
    • Nodes are discrete computing resources with dedicated CPU, RAM and storage. Queries execute across multiple nodes in a cluster, depending on where the relevant data is stored.
    • Slices are partitions within each node that operate in parallel to accelerate query execution.

These elements together enable massively parallel processing, another big performance advantage of cloud data warehouses. 

Cloud data warehouse architectures can vary significantly. Amazon Redshift uses a model similar to the above and requires users to provision resources much as they would for an on-premises data warehouse. Google BigQuery, in contrast, is serverless; Google manages system resources dynamically with no user input or visibility.

Pros and cons of on-premises and cloud data warehouses

Where to host your data warehouse? That depends on a balance of cost, scalability/growth, query efficiency and other factors that are unique to your business. Here are the top issues to consider:

Cost

Cloud data warehouses are much less expensive, especially upfront: you pay only for the storage and CPU time you actually use, and there are no physical machines to buy and no maintenance, admin or upgrade costs. According to Forrester Research, most companies save between 20% and 80% with a cloud data warehouse solution.

Security and compliance

Security and compliance concerns are the top roadblocks to moving data warehouses to the cloud. Leading DWaaS providers offer robust security controls that are probably better than many businesses could deliver on their own. However, in virtualized cloud environments you may not know where your data resides or where and when it is moved. You also may not know when a vendor has trusted your data to a “fourth party” whose security posture might be suspect. Concerns like these can make it hard to prove compliance.

Some of the security topics to explore with DWaaS providers include:

    • Data encryption at rest and in transit
    • User identity management and authentication
    • Security and privacy policies for replication and archiving of data
    • Compliance with regulations like HIPAA and PCI DSS, and with cybersecurity frameworks like SOC 2 and ISO 27001

Scalability

To scale an on-premises data warehouse, you need to buy new hardware and/or software (often more than you need right away). In the cloud, you get the capacity you need on demand, supporting growth across both peak and off-peak times.

Reliability and availability

With on-premises systems, reliability and availability depend on your IT resources. In cloud environments, you can hold the service provider to SLAs, which are typically pretty stringent (e.g., 99.9% uptime). DWaaS providers may also be able to offer high availability or improved resilience by replicating data across different geographical regions.

Data loading

Third-party tools can make data loading faster and easier. However, data loading is generally more challenging with cloud platforms because the data has to travel over external networks. Further, cloud data loading tools are often vendor-specific, which can make moving to a new DWaaS a technological and contractual struggle.

Time to market

Thanks to faster setup and rollout times, DWaaS typically offers faster time to market (and faster time to value) than an on-premises data warehouse you must stand up from scratch. 

Data latency

The volume, type(s) and location of data all influence latency, which is the time required to store or retrieve data. Because of their distributed architectures, cloud data warehouses generally have more latency issues than onsite data warehouses. The fresher you need your data to be (one minute versus one hour, for instance), the bigger problem latency can present. Especially if you need near real-time data analysis, it’s important to test and measure latency before choosing a cloud provider.

Integrations

The leading cloud service providers increasingly offer pre-built integrations with growing ecosystems of cloud services that can help you load or process your data (assuming no compliance problems), such as identity and access management and BI/big data analytics tools. In an on-premises scenario, comparable integrations would be largely on you.

Choosing the right data warehouse solution

Before you implement a data warehouse, carefully consider your unique technical, business and user requirements and research likely options in detail. Given the high levels of time, effort and cost involved, you don’t want any surprises.

If you don’t have database architecture and design expertise in-house, a trusted advisor can help ensure you ask all the right questions and make all the right moves. 

For a free consultation to discuss your needs and questions in-depth, contact Buda Consulting.