The V4 Challenge for SQL and OLAP
|
Introduction
This
paper presents five challenges for your
data management tools. Can you quickly and
easily solve these problems using one or
more of the tools in your data management
toolkit? Solutions to each of the problems
is given using V4â
, a revolutionary new database tool,
developed and supported by MKS Inc. All
of the problems presented are based on
relational tables taken from actual
applications. Only the necessary columns
required for each particular challenge are
listed. It is assumed that the data is in
relational form. If an OLAP product is to
be used, then your "solution"
should include the effort required to map
from flat tables to the multi-dimensional
cube. V4
uses new concepts. Its syntax may appear
foreign to you. The point here is not
to explain the basics of V4 or the
particulars of a solution. What we want
to show is how easily and quickly one can
create solutions using V4. Note that each
challenge presented has been implemented
in V4 for a real client. The results shown
here are based on real data, but some
numbers and descriptions have been altered
due to the proprietary nature of the data.
The time to implement ranged from about 30
minutes for Challenge II to 8 hours for
Challenge V. You may not want to actually
implement a solution to all of these
challenges, but at a minimum, estimate the
level of effort that would be required. These
challenges are about manipulating data,
not about providing a pretty GUI front-end
or generating multi-color reports. Most
products, including V4, directly or
indirectly provide these ease-of-use
facilities. The example reports shown
below are images taken directly from
Microsoft Excel using the V4 Add-In. Challenge
I – Ranking Of Sales Reps By Month and
Sales Manager This
challenge takes information from two
tables. The first, Sale, is a record of
each sale. The columns in the table are
Date (date of the sale), SaleAmount
(dollar amount of the sale), and Emp (the
sales representative responsible for the
sale). The second table, Emp, contains a
record for each employee. Employees
include sales representatives and their
managers. The columns are Id (identifier
of the employee), Name (the employee’s
name), and Mgr (the employee who is the
manager of the sales rep). Sale(Date
SaleAmount Emp) Emp(Id
Name Mgr) The
required output shows months across and
managers down. For each manager and month,
the top five sales representatives are
shown as well as their corresponding sales
for that month (i.e. two columns for each
month). The top sales rep for each month
is highlighted in red. Note that any range
and/or list of months may be specified,
and that a manager may not have five sales
reps reporting to him/her. Bind [Rank Month.. Emp.. Int..] List(Sort([Emps] Reverse::[Sales],0) Nth::Int*) Bind [TopRep Month..] List(Sort(Emp.. Reverse::[Sales],0) Nth::1) Bind [RankedRep Month.. Emp..] If({[Rank Int:1] = [TopRep]} TopEmp:=[[Rank Int:1] Id] [[Rank Int:1] Id]) Bind [RepRank Month..] Do( Tally(Emp.. (ListOf::Emp* By::Emp.Mgr ByList::[MgrList] Bind::[Emps])) Tally(Sale.. If::I{Sale.Date == Month*} (Sum::Sale.SaleAmount By::(Sale.Emp DTInfo(Sale.Date Month?)) Bind::[Sales])) SSDim(Dim:TopEmp Style::"Color:Red,Italic") EchoS("" "" TC:=Str("{2}" Month*)/Month*) EchoS("" TR:"Rank" (TR:"Rep" TR:"$ Sales")/Month*) Sort([MgrList] By::Emp.Id Do::(EchoS(Emp.Name 1 ([RankedRep] [[Rank Int:1] Sales],0)/Month*) Enum(Int:2..5 If::DefQ([Rank Int*]) @EchoS("" Int* ([[Rank Int*] Id] [[Rank Int*] Sales],0)/Month*)) EchoS() )) EchoS(End::"SalesRep Ranking by Manager") ) Evaluating [RepRank Month:Jul-99..Dec-99] would result in output looking like-
Challenge II – Extract Lists of Customers Based on Keywords in Name Given a list of keywords, scan through the customer table, first extracting all customers whose name contains the first keyword, then customers whose name contains the second (but not the first), and so forth for all keywords. The number of keywords is variable with a maximum of 100 keywords. The single table required for this analysis is– Cus(Id Name City State) The single V4 rule required to perform this is shown below-
To run this, extracting all customers with "Band", then "Orchestra", then "Music" would be done by simply evaluating "[CusGroups (Band Orchestra Music)]". The output would be a tab delimited file with columns for the keyword, the customer Id, the customer name, city and state.
Challenge III – Variable Bucket Accounts Receivable Aging The challenge is to generate what is called an Aging Report. It looks at all invoices with balances due, aggregates by the sales representative responsible for the sale, and "ages" the totals. Aging is done by determining how old the invoice is, or how long the balance has been due. For example, an invoice dated 1-Feb-00 would be considered "over 30 days old" as of 10-Mar00. Two tables are required. The Emp table gives the name and identifier of each sales rep. The Inv(oice) table contains the current balance due, the employee (sales rep), and invoice date. The analysis should handle variable number of aging buckets. Emp(Id Name) Inv(BalanceAmt Emp InvDate) The V4 rules to perform this analysis are shown below. Evaluating "[CusAgeByRep Periods:30,45,60,90,120 Date:1-Jun-00]" generated the report immediately following the V4 rules.
Challenge IV – Number of Unique Customers Purchasing Items This challenge scans purchases or sales for two separate time periods. The number of unique customers purchasing each item in the two time periods is calculated and output. The output is sorted by the vendor supplying the inventory and then by item. It is often the case that the most elegant solution is not the most efficient. This challenge is computationally expensive due to the requirement for unique customers. The example solution given below was run off of 3.9 million sales records in less than 7.5 CPU minutes on a 400 megahertz PC. Over eight thousand lines of output were generated in the actual report. Only a few are shown here as an example. The three tables required are shown below. The Ven(dor) table contains the name and identifier for each vendor. The IM (inventory) table has the identifier, name, and vendor of each item. The Sale(s) table contains the customer, inventory item, and time period of each sale. Ven(Id Name) IM(Id Name Ven) Sale(Cus IM Period) Evaluating "[NumCusVen UP1:9901..9903 UP2:9907..9909]" results in the report shown following the V4 rules. Bind [NumCusVen UP1.. UP2..] Do(Tally(Sale.. (By::Sale.IM ByList::[NIMList]) (UCount::Sale.Cus By::Sale.IM Bind::[CusCount1] If::{Sale.Period == UP1*}) (UCount::Sale.Cus By::Sale.IM Bind::[CusCount2] If::{Sale.Period == UP2*})) EchoS(T:"Id" :"Description" TR:"Period 1" :"Period 2" Echo::0) Enum(Sort([NIMList] By::IM.Ven By::IM*) @EchoS(IM* IM.Desc [CusCount1],0 [CusCount2],0) After::(IM.Ven EchoS("" Str("Vendor " IM.Ven.Id " - " IM.Ven.Name)))) EchoS(End::Str("Number NABCA Licensees Purchasing Product\lPeriod " UP1* " versus " UP2*)) )
Challenge V – Raw Materials Requirements Based on Sales Projection This is the last and most complex challenge. The problem is to take weekly sales projections of finished goods, explode them through bills-of-material and generate a consolidated list of raw materials needed to satisfy the projection. Additionally, the onhand quantities of the raw materials and the projected receipts (total quantity and detail) from open purchase orders should be shown. Raw material shortages should also be shown by week. Seven tables necessary for this analysis. The IM table contains entries for each inventory item (identifier and description). The IML table contains entries giving the on-hand quantity (uOHQty) for each item at each warehouse location (Loc). The PO and POD tables describe open purchase orders with detail for each outstanding inventory item- the quantity remaining to be received (uRemainQty) and the Expected due date. The bill-of-material (BOM) table has an entry for each finished good and the BOMD has entries for each raw material component needed to make the finished product. The Yield is the amount/quantity of the finished product. The BOMIM is the component raw material, and BOMQty is the quantity/amount of the raw material. The SPW table has the weekly projected sales for the finished goods. Note that in the "real world" these quantities may be given in different units. The recipe in the bill-of-material may specify a quantity in grams, but the warehouse quantity (in the IML table) may be in kilograms or pounds. IM(Id Desc) IML(IM Loc uOHQty) PO(Id Loc) POD(PO IM uRemainQty Expected) BOM(Id IM Yield) BOMD(BOM ParentIM BOMIM BOMQty) SPW(Id UWeek uQty IM) The rules to perform this analysis are shown below. To run the project for four weeks beginning with 29-May-00, for warehouse location Main on finished goods projection Test one would evaluate "[ProjRMSPW Week:29-May-00..19-Jun-00 Loc:Main SPWId:Test]".
|
MKS,
Inc. • 992 Old Eagle School Road •
Wayne, PA 19087-1803 |