Thursday, February 25, 2010

Complex Row Level Security

Row level security (constraining a user’s view of the data to rows which meet pre-defined criteria) is a common requirement. This post will explore this topic, using a simple schema with a single fact table and three dimension tables, built around the theme of retail sales.

The data model in this schema has dimension tables Weeks, Items and Stores. Users can see all weeks. So security does not have to be concerned with the Weeks table. In the dimension tables there are just three items and three stores.

clip_image002 clip_image004

Here is the fact table (partially shown). Store, Item, and Week are foreign keys to the dimension tables. “QS” is the fact (Quantity Sold), which has a SUM aggregation rule.


The Business Model looks like this.


The first requirement is to restrict the data visibility of some users to just certain products in all stores. For example, user A should be able to see Product A in all stores, but not products B and C. The second requirement is to restrict other users to only a subset of products in a subset of stores. For example, user B should be able to see data just for Product B but only in Stores 1 and 2. User C should see only Product C but only in Stores 2 and 3.

A good way to implement this is to use an initialization block to set session variables using row-wise initialization. A database table, such as the one in this schema called RowWiseVars, contains the data visibility rules for users A, B, and C. When each user logs on, an init block will read the table, creating and populating session variables.


The init block contains the following SQL. It sets the session variables Product and Store (the values in the table’s VAR column) and their respective values (the values in the VALUE column) for each user. With row-wise initialization, the names of the columns are immaterial. The values in the first column in the SQL define and name the session variables. The values in the second column populate the session variables that are being defined.


Notice in the following screen shot that the session variables are not listed in the Edit Data Target dialog of the initialization block. Instead, the Row-Wise initialization radio button is clicked on.


Now defining an rpd group called “RetailUsers”, we set the filters on the Business Model that will be used for this group. Queries could be dimensional only, fact only, or dimensions with facts. Filters are needed for all three cases.

The fact table filter will cause a join between the fact table and the Item and Store dimension tables and apply a filter on those tables even when the columns in the dimension tables are not involved in the user’s query. The expression builder does not show the session variables in its Session Variables folder, since they are not defined in the RPD but instead are set by row-wise initialization. The session variable names have to be entered manually, as shown in yellow.


Filters are also defined on the RETAILITEMS and RETAILSTORES dimension tables. This will filter the values that will be returned to the user when queries are constructed that do not have to involve the logical fact table.


Now when user A logs on, the initialization block runs with the following SQL.


Which produces these results:


The variable Store is set to Store 1, Store 2, and Store 3. The variable Product is set to Product A.

When user A queries the Items dimension, he only sees product A.

clip_image024 clip_image026

The physical SQL generated is

select distinct T2372.ITEMNAME as c1
where ( T2372.ITEMNAME = 'Product A' )
order by c1

When user A queries just the fact table (i.e. queries for QS only), the result is the sum of QS for Product A in the three stores, which is 3.


The physical SQL is

select sum(T2364.QS) as c1
where ( T2364.ITEM = T2372.ITEM
and T2364.STORE = T2376.STORE
and T2372.ITEMNAME = 'Product A'
and (T2376.STORENAME in
('Store 1', 'Store 2', 'Store 3')))

Note, because this rpd group needs to have data filtered by products and stores using two session variables, it is necessary to list all stores for user A in the RowWiseVars table. Alternatively, if A was limited by product only, we could have defined a separate rpd group and used a single session variable for that group. In that case the RowWiseVars table would not have included store information for user A.

This meets the requirements when the “legal data” is the intersection of the dimensional values. However, it may be that some requirements are more complex and cannot be accommodated by an intersection of dimension values. For example, suppose user D is allowed to see Product A, but only in Store1 and 2, and Product B, but only in Store 3. To handle this requirement we need to create another security table. Let’s call this table ComplexSecurity. It contains the list of legal product/store combinations for each user – in this case just User D.


To filter the data correctly, the ComplexSecurity table must be joined to the fact table, like this.



In addition to constraining fact table data, it is also necessary to constrain the values returned by dimensional queries (e.g. when browsing using a dashboard prompt). To do this, we can use the session variable strategy that we used in the previous case and add these rows to the RowWiseVars table.


We need to define a new RPD group with the dimensional and fact table filters.


When D queries stores alone, he sees all three stores: clip_image040

The physical SQL includes the Store session variable values.

select distinct T2376.STORENAME as c1
where ( T2376.STORENAME in
('Store 1', 'Store 2', 'Store 3') )
order by c1

When D queries items, he sees just A and B:


The physical SQL includes the Product session variable values.

select distinct T2372.ITEMNAME as c1
where ( T2372.ITEMNAME in ('A', 'B') )
order by c1

When D includes the fact QS in the query, the results are


You can see that these results are correct by examining the relevant rows in the fact table for D.


Note how the ComplexSecurityTable is included in the physical SQL.

select   T2372.ITEMNAME   as C1
,T2376.STORENAME as C2
,Sum(T2364.QS)   as C3
T2364.ITEM = T2567.PRODUCT
and T2364.ITEM = T2372.ITEM
and =
and =
and T2567.USERID = 'D'
and T2372.ITEMNAME in ('Product A','Product B')
and T2376.STORENAME in
('Store 1','Store 2','Store 3')
group by T2372.ITEMNAME
order by C1 ,C2

There are many possible variations on this theme. The important points are:

1) Use row-wise variables to filter dimensional queries.

2) Use row-wise variables to filter fact table queries when the intersection of dimensions defined by the row-wise variables meets the security requirements.

3) Use a table with the appropriate legal dimensional tuples to limit fact table queries. Join this table to the fact table physically and include it in the business model. Include this table in queries involving the logical fact table by configuring the security filter appropriately for the group with complex security requirements.

4) When the complex security involves attributes at a higher level than the grain of the fact table, then include those legal tuples in the complex security table and join the complex security table to the dimension tables.