sexta-feira, 13 de março de 2015

LOD Expressions #2 - A better understanding


In the post LOD Expressions #1 - OVER (PARTITION BY) x LOD we talked a little bit about LOD Expression and how it compares to over (partition by...).
As we said in that post, LOD expressions allows you to achieve similar results than over (partition by...).

Now we are going to understand the reason why we said "similar". In other words, we will understand the difference between them.

We will use a pretty simple example to make easier to understand. 

Many of you might say this simple example can be solved with table calculation and/or data blending (as we did in the post Compare a filtered dimension to all values). And you are right, you can re-build this example with those approaches, but the intention here it to explain what happens in the back-end when you use LOD, so you can can apply on more complex cases.

So, lets start.
In our database, we have a Sales table (financial.sales_region):





















We want to compare the sum of sales per employee with the sum of sales of the region.

This is our desired output:





















If your approach is to write a query using olap function, you can use this simple query:
select Category, Sub-Category, Sales, sum(sales) over (partition by Category)
from (
    select Category, Sub-Category, sum(sales) as Sales
    from table
    group by Category, Sub-Category
) A



If you are using Tableau 9.0, you can use this calculated field "By Region":
{FIXED [Region]: SUM([Sales])}

By doing that, this is the query that Tableau will create to achieve the desired output:

SELECT "t0"."employee" AS "employee",

  "t3"."__measure__1" AS "sum:calculation_86303112206550",
  "t0"."sum:sales:ok" AS "sum:sales:ok"
FROM (
  SELECT "sales_region"."employee" AS "employee",
    SUM("sales_region"."sales") AS "sum:sales:ok"
  FROM "financial"."sales_region" "sales_region"
  GROUP BY 1
) "t0"
  INNER JOIN (
  SELECT SUM("t2"."__measure__0") AS "__measure__1",
    "t1"."employee" AS "employee"
  FROM (
    SELECT "sales_region"."region" AS "region",
      "sales_region"."employee" AS "employee"
    FROM "financial"."sales_region" "sales_region"
    GROUP BY 1,
      2
  ) "t1"
    INNER JOIN (
    SELECT SUM("sales_region"."sales") AS "__measure__0",
      "sales_region"."region" AS "region"
    FROM "financial"."sales_region" "sales_region"
    GROUP BY 2
  ) "t2" ON (("t1"."region" = "t2"."region") OR (("t1"."region" IS NULL) AND ("t2"."region" IS NULL)))
  GROUP BY 2
) "t3" ON (("t0"."employee" = "t3"."employee") OR (("t0"."employee" IS NULL) AND ("t3"."employee" IS NULL)))


















As you can see, they have different technical solution.


With Tableau's approach you can interact direct with those sub-queries, which gives you some advantages.

For example, you want to exclude the Employee A from your analysis.

To do that, you will use a quick filter.













When you add a filter, you can whether or not use it as a context filter.

Depending on which option you chose, you will interact with specifics sub-queries.

Not using as context filter

If you did not add the filter to context, Tableau will exclude Employee A from your output, but will keep it on the calculation of the Region.
Which means that Employee A will still be part of the "Sum os Sales by Region".

This is the query that Tableau will perform:

SELECT "t0"."employee" AS "employee",

  "t3"."__measure__1" AS "sum:calculation_86303112206550",
  "t0"."sum:sales:ok" AS "sum:sales:ok"
FROM (
  SELECT "sales_region"."employee" AS "employee",
    SUM("sales_region"."sales") AS "sum:sales:ok"
  FROM "financial"."sales_region" "sales_region"
  WHERE (NOT ("sales_region"."employee" = 'A'))
  GROUP BY 1
) "t0"
  INNER JOIN (
  SELECT SUM("t2"."__measure__0") AS "__measure__1",
    "t1"."employee" AS "employee"
  FROM (
    SELECT "sales_region"."region" AS "region",
      "sales_region"."employee" AS "employee"
    FROM "financial"."sales_region" "sales_region"
    WHERE (NOT ("sales_region"."employee" = 'A'))
    GROUP BY 1,
      2
  ) "t1"
    INNER JOIN (
    SELECT SUM("sales_region"."sales") AS "__measure__0",
      "sales_region"."region" AS "region"
    FROM "financial"."sales_region" "sales_region"
    GROUP BY 2
  ) "t2" ON (("t1"."region" = "t2"."region") OR (("t1"."region" IS NULL) AND ("t2"."region" IS NULL)))
  GROUP BY 2

) "t3" ON (("t0"."employee" = "t3"."employee") OR (("t0"."employee" IS NULL) AND ("t3"."employee" IS NULL)))



As you can see, the filter was not included in sub-query "t2", because that is the sub-query which generate the sales by region.



Using context filter

If you add the filter to context, Tableau will include the filter "Employee <> 'A'" to your whole analysis, by doing that, the sales by region will no longer include the sales of Employee A.

This is the query generated:

SELECT "t0"."employee" AS "employee",

  "t3"."__measure__1" AS "sum:calculation_86303112206550",
  "t0"."sum:sales:ok" AS "sum:sales:ok"
FROM (
  SELECT "sales_region"."employee" AS "employee",
    SUM("sales_region"."sales") AS "sum:sales:ok"
  FROM "financial"."sales_region" "sales_region"
  WHERE (NOT ("sales_region"."employee" = 'A'))
  GROUP BY 1
) "t0"
  INNER JOIN (
  SELECT SUM("t2"."__measure__0") AS "__measure__1",
    "t1"."employee" AS "employee"
  FROM (
    SELECT "sales_region"."employee" AS "employee",
      "sales_region"."region" AS "region"
    FROM "financial"."sales_region" "sales_region"
    WHERE (NOT ("sales_region"."employee" = 'A'))
    GROUP BY 2,
      1
  ) "t1"
    INNER JOIN (
    SELECT "sales_region"."region" AS "region",
      SUM("sales_region"."sales") AS "__measure__0"
    FROM "financial"."sales_region" "sales_region"
    WHERE (NOT ("sales_region"."employee" = 'A'))
    GROUP BY 1
  ) "t2" ON (("t1"."region" = "t2"."region") OR (("t1"."region" IS NULL) AND ("t2"."region" IS NULL)))
  GROUP BY 2

) "t3" ON (("t0"."employee" = "t3"."employee") OR (("t0"."employee" IS NULL) AND ("t3"."employee" IS NULL)))















Now, all the sub-queries have the filter "Employee <> 'A'".



Technical information:
Tableau version: 9.0.0 (9000.15.0304.1722) 64-bit
Database: Teradata 14.0.3

Nenhum comentário:

Postar um comentário