Thursday, February 25, 2010

Complex Row Level Security

Row level security (constraining a user’s view of the data to rows which meet pre-defined criteria) is a common requirement. This post will explore this topic, using a simple schema with a single fact table and three dimension tables, built around the theme of retail sales.

The data model in this schema has dimension tables Weeks, Items and Stores. Users can see all weeks. So security does not have to be concerned with the Weeks table. In the dimension tables there are just three items and three stores.

clip_image002 clip_image004

Here is the fact table (partially shown). Store, Item, and Week are foreign keys to the dimension tables. “QS” is the fact (Quantity Sold), which has a SUM aggregation rule.

clip_image006

The Business Model looks like this.

clip_image008

The first requirement is to restrict the data visibility of some users to just certain products in all stores. For example, user A should be able to see Product A in all stores, but not products B and C. The second requirement is to restrict other users to only a subset of products in a subset of stores. For example, user B should be able to see data just for Product B but only in Stores 1 and 2. User C should see only Product C but only in Stores 2 and 3.

A good way to implement this is to use an initialization block to set session variables using row-wise initialization. A database table, such as the one in this schema called RowWiseVars, contains the data visibility rules for users A, B, and C. When each user logs on, an init block will read the table, creating and populating session variables.

clip_image010

The init block contains the following SQL. It sets the session variables Product and Store (the values in the table’s VAR column) and their respective values (the values in the VALUE column) for each user. With row-wise initialization, the names of the columns are immaterial. The values in the first column in the SQL define and name the session variables. The values in the second column populate the session variables that are being defined.

clip_image012

Notice in the following screen shot that the session variables are not listed in the Edit Data Target dialog of the initialization block. Instead, the Row-Wise initialization radio button is clicked on.

clip_image014

Now defining an rpd group called “RetailUsers”, we set the filters on the Business Model that will be used for this group. Queries could be dimensional only, fact only, or dimensions with facts. Filters are needed for all three cases.

The fact table filter will cause a join between the fact table and the Item and Store dimension tables and apply a filter on those tables even when the columns in the dimension tables are not involved in the user’s query. The expression builder does not show the session variables in its Session Variables folder, since they are not defined in the RPD but instead are set by row-wise initialization. The session variable names have to be entered manually, as shown in yellow.

clip_image016

Filters are also defined on the RETAILITEMS and RETAILSTORES dimension tables. This will filter the values that will be returned to the user when queries are constructed that do not have to involve the logical fact table.

clip_image018

Now when user A logs on, the initialization block runs with the following SQL.

clip_image020

Which produces these results:

clip_image022

The variable Store is set to Store 1, Store 2, and Store 3. The variable Product is set to Product A.

When user A queries the Items dimension, he only sees product A.

clip_image024 clip_image026

The physical SQL generated is

select distinct T2372.ITEMNAME as c1
from
RETAILITEMS T2372
where ( T2372.ITEMNAME = 'Product A' )
order by c1

When user A queries just the fact table (i.e. queries for QS only), the result is the sum of QS for Product A in the three stores, which is 3.

clip_image028

The physical SQL is

select sum(T2364.QS) as c1
from
RETAILSTORES T2376,
RETAILITEMS T2372,
RETAILFACTS T2364
where ( T2364.ITEM = T2372.ITEM
and T2364.STORE = T2376.STORE
and T2372.ITEMNAME = 'Product A'
and (T2376.STORENAME in
('Store 1', 'Store 2', 'Store 3')))

Note, because this rpd group needs to have data filtered by products and stores using two session variables, it is necessary to list all stores for user A in the RowWiseVars table. Alternatively, if A was limited by product only, we could have defined a separate rpd group and used a single session variable for that group. In that case the RowWiseVars table would not have included store information for user A.

This meets the requirements when the “legal data” is the intersection of the dimensional values. However, it may be that some requirements are more complex and cannot be accommodated by an intersection of dimension values. For example, suppose user D is allowed to see Product A, but only in Store1 and 2, and Product B, but only in Store 3. To handle this requirement we need to create another security table. Let’s call this table ComplexSecurity. It contains the list of legal product/store combinations for each user – in this case just User D.

clip_image030

To filter the data correctly, the ComplexSecurity table must be joined to the fact table, like this.

clip_image032

clip_image034

In addition to constraining fact table data, it is also necessary to constrain the values returned by dimensional queries (e.g. when browsing using a dashboard prompt). To do this, we can use the session variable strategy that we used in the previous case and add these rows to the RowWiseVars table.

clip_image036

We need to define a new RPD group with the dimensional and fact table filters.

clip_image038

When D queries stores alone, he sees all three stores: clip_image040

The physical SQL includes the Store session variable values.

select distinct T2376.STORENAME as c1
from
RETAILSTORES T2376
where ( T2376.STORENAME in
('Store 1', 'Store 2', 'Store 3') )
order by c1

When D queries items, he sees just A and B:

clip_image042.

The physical SQL includes the Product session variable values.

select distinct T2372.ITEMNAME as c1
from
RETAILITEMS T2372
where ( T2372.ITEMNAME in ('A', 'B') )
order by c1

When D includes the fact QS in the query, the results are

clip_image044.

You can see that these results are correct by examining the relevant rows in the fact table for D.

clip_image046

Note how the ComplexSecurityTable is included in the physical SQL.

select   T2372.ITEMNAME   as C1
,T2376.STORENAME as C2
,Sum(T2364.QS)   as C3
from    
COMPLEXSECURITY T2567
,RETAILSTORES T2376
,RETAILITEMS T2372
,RETAILFACTS T2364
where
T2364.ITEM = T2567.PRODUCT
and T2364.ITEM = T2372.ITEM
and T2364.store = T2376.store
and T2364.store = T2567.store
and T2567.USERID = 'D'
and T2372.ITEMNAME in ('Product A','Product B')
and T2376.STORENAME in
('Store 1','Store 2','Store 3')
group by T2372.ITEMNAME
,T2376.STORENAME
order by C1 ,C2

There are many possible variations on this theme. The important points are:

1) Use row-wise variables to filter dimensional queries.

2) Use row-wise variables to filter fact table queries when the intersection of dimensions defined by the row-wise variables meets the security requirements.

3) Use a table with the appropriate legal dimensional tuples to limit fact table queries. Join this table to the fact table physically and include it in the business model. Include this table in queries involving the logical fact table by configuring the security filter appropriately for the group with complex security requirements.

4) When the complex security involves attributes at a higher level than the grain of the fact table, then include those legal tuples in the complex security table and join the complex security table to the dimension tables.