Hi Folks,
Following is a simplified example from a current project, showing how to achieve 100x or better performance by executing cross joins with CE functions rather than SQL.
In testing, the CE function approach has stayed at around 700 ms across various dataset sizes, so the performance improvement may be greater than 100x.
/*
For certain use cases in HANA a 'cartesian product' is required, also known as a cross join.
A typical example is replacing [nested for loops over data sets with tuple calculations]
with a [CROSS JOIN + calculated column].

A real use case for cross join is in CRM IPM Availability Requests.
Media companies can maintain information for products (e.g. movies) at different 'rights scopes':
- Media (e.g. Free TV, Pay TV, Cable...)
- Territory (e.g. regions, countries, states, cities, counties)
- Languages

A salesperson will want to find out what the availability is for certain products in certain rights scopes.
In the example below, a salesperson can search for availability for 1000+ products, 60 medias,
240 territories, 4 languages - resulting in over 57 million combinations.

Code below shows how to calculate cross join with SQL and CE functions,
showing 100x better performance with CE functions.
*/

--SET SCHEMA TEST;

/* Generator tables used to create fake data. */
-- Note: on a first run the DROP statements fail because the objects do not exist yet; skip them.
DROP TABLE GENERATOR1;
CREATE COLUMN TABLE GENERATOR1 (G1 NCHAR(1));
DROP TABLE GENERATOR2;
CREATE COLUMN TABLE GENERATOR2 (G2 NCHAR(1));
DROP TABLE GENERATOR3;
CREATE COLUMN TABLE GENERATOR3 (G3 INTEGER);

INSERT INTO GENERATOR1 VALUES ('A');
INSERT INTO GENERATOR1 VALUES ('B');
INSERT INTO GENERATOR1 VALUES ('C');
INSERT INTO GENERATOR1 VALUES ('D');
INSERT INTO GENERATOR1 VALUES ('E');
INSERT INTO GENERATOR1 VALUES ('F');
INSERT INTO GENERATOR1 VALUES ('G');
INSERT INTO GENERATOR1 VALUES ('H');
INSERT INTO GENERATOR1 VALUES ('I');
INSERT INTO GENERATOR1 VALUES ('J');

INSERT INTO GENERATOR2 VALUES ('!');
INSERT INTO GENERATOR2 VALUES ('@');
INSERT INTO GENERATOR2 VALUES ('#');
INSERT INTO GENERATOR2 VALUES ('$');
INSERT INTO GENERATOR2 VALUES ('&');
INSERT INTO GENERATOR2 VALUES ('*');

INSERT INTO GENERATOR3 VALUES (1);
INSERT INTO GENERATOR3 VALUES (2);
INSERT INTO GENERATOR3 VALUES (3);
INSERT INTO GENERATOR3 VALUES (4);
INSERT INTO GENERATOR3 VALUES (5);
INSERT INTO GENERATOR3 VALUES (6);
INSERT INTO GENERATOR3 VALUES (7);
INSERT INTO GENERATOR3 VALUES (8);
INSERT INTO GENERATOR3 VALUES (9);
INSERT INTO GENERATOR3 VALUES (10);

-- 1000 unique GUIDs for products (10 x 10 x 10)
DROP TABLE PRODUCT_GUID;
CREATE COLUMN TABLE PRODUCT_GUID AS
(SELECT T1.G1 || T2.G1 || T3.G1 AS P
 FROM GENERATOR1 T1
 CROSS JOIN GENERATOR1 T2
 CROSS JOIN GENERATOR1 T3);

-- 60 medias (10 x 6)
DROP TABLE MEDIA;
CREATE COLUMN TABLE MEDIA AS
(SELECT T1.G1 || T2.G2 AS M
 FROM GENERATOR1 T1
 CROSS JOIN GENERATOR2 T2);

-- 240 territories (10 x 6 x 4)
DROP TABLE TERRITORY;
CREATE COLUMN TABLE TERRITORY AS
(SELECT T1.G1 || T2.G2 || T3.G3 AS T
 FROM GENERATOR1 T1
 CROSS JOIN GENERATOR2 T2
 CROSS JOIN (SELECT TOP 4 G3 FROM GENERATOR3) T3);

-- 4 languages
DROP TABLE LANGUAGE;
CREATE COLUMN TABLE LANGUAGE AS
(SELECT TOP 4 G3 AS L FROM GENERATOR3);

DROP TYPE TT_TAB;
CREATE TYPE TT_TAB AS TABLE (P NVARCHAR(32), M NVARCHAR(30), T NVARCHAR(30), L NVARCHAR(30));

-- Read-only cross join procedure in SQL: Products x Media x Territory x Language
DROP PROCEDURE CROSS_JOIN_SQL;
CREATE PROCEDURE CROSS_JOIN_SQL (OUT var_out TT_TAB)
READS SQL DATA WITH RESULT VIEW CJ_SQL_VIEW AS
BEGIN
    var_out = SELECT *
              FROM PRODUCT_GUID
              CROSS JOIN MEDIA
              CROSS JOIN TERRITORY
              CROSS JOIN LANGUAGE;
END;

-- Read-only cross join procedure in CE functions: Products x Media x Territory x Language
DROP PROCEDURE CROSS_JOIN_CE;
CREATE PROCEDURE CROSS_JOIN_CE (OUT var_out TT_TAB)
READS SQL DATA WITH RESULT VIEW CJ_CE_VIEW AS
BEGIN
    -- 'query' tables
    a = CE_COLUMN_TABLE(PRODUCT_GUID, [P]);
    b = CE_COLUMN_TABLE(MEDIA, [M]);
    c = CE_COLUMN_TABLE(TERRITORY, [T]);
    d = CE_COLUMN_TABLE(LANGUAGE, [L]);

    -- add dummy field F (constant 1 in every row), used for 'fake' cross join
    a1 = CE_PROJECTION(:a, [P, CE_CALC('1', INTEGER) AS F]);
    b1 = CE_PROJECTION(:b, [M, CE_CALC('1', INTEGER) AS F]);
    c1 = CE_PROJECTION(:c, [T, CE_CALC('1', INTEGER) AS F]);
    d1 = CE_PROJECTION(:d, [L, CE_CALC('1', INTEGER) AS F]);

    -- 'fake' cross join: F matches for every row pair, so joining on F yields all combinations
    ab = CE_JOIN(:a1, :b1, [F], [F, P, M]);
    cd = CE_JOIN(:c1, :d1, [F], [F, T, L]);
    abcd = CE_JOIN(:ab, :cd, [F], [F, P, M, T, L]);

    -- drop the dummy field from the result
    var_out = CE_PROJECTION(:abcd, [P, M, T, L]);
END;

-- server processing time is about 70 sec
SELECT * FROM CJ_SQL_VIEW;

-- server processing time is about 700 ms
SELECT * FROM CJ_CE_VIEW;

-- optional: verify same number of records in each
-- SELECT COUNT(*) FROM CJ_SQL_VIEW;
-- SELECT COUNT(*) FROM CJ_CE_VIEW;

-- optional: verify that results match
-- SELECT * FROM CJ_SQL_VIEW ORDER BY P, M, T, L;
-- SELECT * FROM CJ_CE_VIEW ORDER BY P, M, T, L;
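
For anyone who wants to convince themselves of the dummy-field trick without a HANA system, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite has no CE functions, so this only illustrates the idea behind the 'fake' cross join (an equi-join on a constant column returns every combination); it says nothing about HANA performance. The table contents and the function names (build_tables, cross_join_rows, dummy_key_join_rows) are made up for illustration.

```python
import sqlite3


def build_tables(conn):
    """Create two tiny tables standing in for MEDIA and TERRITORY (sample values are hypothetical)."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE media (m TEXT)")
    cur.execute("CREATE TABLE territory (t TEXT)")
    cur.executemany("INSERT INTO media VALUES (?)",
                    [("FreeTV",), ("PayTV",), ("Cable",)])
    cur.executemany("INSERT INTO territory VALUES (?)",
                    [("US",), ("DE",)])
    conn.commit()


def cross_join_rows(conn):
    """Plain SQL cross join, like CROSS_JOIN_SQL in the script."""
    sql = "SELECT m, t FROM media CROSS JOIN territory"
    return sorted(conn.execute(sql).fetchall())


def dummy_key_join_rows(conn):
    """Equi-join on a constant dummy column f, mirroring the CE_PROJECTION + CE_JOIN steps."""
    sql = """
        SELECT a.m, b.t
        FROM (SELECT m, 1 AS f FROM media) AS a
        JOIN (SELECT t, 1 AS f FROM territory) AS b
          ON a.f = b.f
    """
    return sorted(conn.execute(sql).fetchall())


conn = sqlite3.connect(":memory:")
build_tables(conn)
plain = cross_join_rows(conn)
faked = dummy_key_join_rows(conn)

# Row count of the full example in the script: products x medias x territories x languages
expected_full_size = 1000 * 60 * 240 * 4

print(len(plain), plain == faked, expected_full_size)
```

Both queries return the same 3 x 2 = 6 rows here, and the last number is the size of the full result in the HANA script (57,600,000 rows for exactly 1000 products).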