Oracle OLAP features performance testing Part 1

Since I grew up in a small sea village I always loved fishing. But when I was young it was always hard to wait for a fish, so I think I preferred diving and instead of waiting I went after fishes :) After years of OLTP performance testing OLAP performance testing was a similar experience, I had to train myself again for patience, here in Türkiye there is an old saying; “Only a patient dervish attains his wish.”

Before getting into details I have mentioned importance of testing several times, again this time the results proved that somethings even written in the documentation may not be the optimum for your case.

We are one of the largest Telecom operators in Europe and our data warehouse database is of course a big one. We are on Solaris 10 and using Hitachi storage devices with Oracle 9i Release 2. With company’s aggressive marketing needs our data warehouse needed to be re-structured from bottom to top. So this was a great opportunity for me to advice both continuing with 10g Release 2 and using Oracle’s OLAP features like star transformation and materialized views which the development team once upon an Oracle release tested and didn’t like the outcomes.

There was no upgrade need, for the new 10g Release 2 environment and we had the luxury of setting up a completely different database. This was a big advantage of course for both not upgrading a xx TB database, getting rid of the old fragmented dictionary and staring with a fresh one.

Hüsnü, one of the two DBAs, made lots of interesting load tests with Orion and I hope he also will be blogging some of the interesting results he got. Below are some of the important resources helped us during our tests;

Oracle’s Data Warehousing Documentation
Oracle Database 10g: The Top 20 Features for DBAs by Arup Nanda
Oracle By Example Series – Business Intelligence and Data Warehousing
Using Basic Database Functionality for Data Warehousing
Using Oracle’s Materialized Views Capabilities

So the initial step was to setup a test environment for a specific end user reporting need, after getting the example queries for two big fact and several dimension tables, we immediately started to load several partitions of the facts and all data for the dimensions. Our old friend SQL*Loader with its direct path and nologging options was there for us, which I believe one of the best tools Oracle has ever developed. So here are the next steps of the tests briefly for your informations and comments;

a. Collecting Object and System Statistics after the load

With 10g CBO is the only path, RBO is unsupported anymore. So we gathered object statistics at schema, dictionary and fixed table levels;

-- schema level
exec dbms_stats.gather_schema_stats( ownname =>'ABC', degree=>16, options=>'GATHER AUTO', estimate_percent=>dbms_stats.auto_sample_size, cascade=>TRUE, method_opt=>'FOR ALL COLUMNS SIZE AUTO', granularity=>'ALL');
-- dictionary level
exec dbms_stats.gather_dictionary_stats( degree=>16, options=>'GATHER AUTO', cascade=>TRUE);
-- fixed table level
exec dbms_stats.gather_fixed_objects_stats ;

And also collected system statistics;

SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN';
exec dbms_stats.gather_system_stats('start');
-- produce some load, for both parallel multi-block and single-block reads
exec dbms_stats.gather_system_stats('stop');
SELECT pname, pval1 FROM sys.aux_stats$ WHERE sname = 'SYSSTATS_MAIN';

After 10g there is an automated scheduler job, since RBO is unsupported, to gather statistics. To be more in control we disabled this job;

SELECT owner, job_name,enabled
FROM DBA_SCHEDULER_JOBS
WHERE JOB_NAME = 'GATHER_STATS_JOB';

BEGIN
  DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
  COMMIT;
END;
/

b. Automatic Storage Management(ASM) and optimizer parameters

Not only it eliminates the performance and budget cost of a file system layer, but especially because of its management advantages ASM with 10g Release 2 was our another test topic. Hüsnü may blog about his Orion test results of ASM versus Veritas file system, here I will only mention that we used external redundancy and set two important ASM parameters as Note:368055.1 advised;

Subject: Deployment of very large databases (10TB to PB range) with Automatic Storage Management (ASM)
_asm_ausize=16777216
_asm_stripesize=1048576

ASM disks are divided into allocation units of 1MB by default, but for data warehouses setting _asm_ausize higher like 16MB is important. Also like statistics for the optimizer there are several important OLAP settings which has to be tuned on 10g Release 2;

SELECT name, value FROM v$system_parameter a
WHERE a.NAME IN ('compatible', 'optimizer_features_enable', 'optimizer_mode', 'parallel_execution_message_size',
        'pga_aggregate_target', 'workarea_size_policy', 'query_rewrite_enabled',
        'query_rewrite_integrity', 'parallel_max_servers', 'disk_asynch_io',
        'db_file_multiblock_read_count', 'star_transformation_enabled')

Here we left behind our 9iR2 parameters and decided to move on with fresh parameters after 10gR2, our strategy was to leave a parameter to its default unless some tests prove that it has to be changed. Hüsnü again made several detailed tests for parameter settings of like db_file_multiblock_read_count and parallel_execution_message_size and we choosed the optimum values from those results.

c. Bitmap indexes and star transformation

After all above initial steps here comes the segment level initial setups. As we know Bitmap İndexes are best for OLAP usage because of their bitmap level(not row level) locking behavior. Also for a better bitmap index the cardinality, the amount of different values of the column, is important.

For Oracle to do star transformation,
1- We created single-column bitmap indexes on fact table’s all dimension keys. The query scenarios filters were all on these bitmap indexed columns on fact table to dimensions.
2- We created unique key constraints on dimension tables’ joined columns.
3- We also created appropriate indexes on dimension filtered columns.

Here we were excited to see if the second large fact table’s id column would be used or not, because of its high cardinality it was guessed to be having higher costs compared to a full-parallel table scan and hash join alternative. Depending on the filtering factors, selecting at most 5% to 10% of the fact, star transformation always performed better than full-parallel table scan and hash join alternative. Also bitmap index on the second fact table’s reference column was used by the optimizer. When the filtering was not selective than full-parallel table scan and hash join took its multi-block read advantage and performed much more better.

Below sql scripts were run several times and we have taken the average values, we needed to write some shell script in order to keep going at nights :)

himalaya@oracle $ nohup sqlplus ../.. @...sql > ...log 2> ...err &

-- inside sql
-- full-parallel table scan and hash join test script
spool ...log
conn ../..
SET LINESIZE 2000
SET AUTOTRACE TRACEONLY
SET TIME ON
SET TIMING ON
ALTER SESSION ENABLE PARALLEL DML ;
alter session set max_dump_file_size=unlimited ;
alter session set timed_statistics = true ;
-- alter session set STATISTICS_LEVEL = ALL ;
-- alter session set "_rowsource_execution_statistics" = true ;
alter session set tracefile_identifier = senaryo1_full ;

-- be sure that this test is not doing star trans. or query rewr.
ALTER SESSION SET star_transformation_enabled=FALSE;
SELECT /*+ norewrite use_hash(cs,ct,tr,ta,pa,sr,dc) parallel(cs) parallel(ct) parallel(tr) parallel(ta) parallel(pa) parallel(sr) pa
rallel(dc) */
/* SENARYO 1 FULL */ ..

-- star transformation test script
spool ...log
conn ../..
SET LINESIZE 2000
SET AUTOTRACE TRACEONLY
SET TIME ON
SET TIMING ON
ALTER SESSION ENABLE PARALLEL DML ;
alter session set max_dump_file_size=unlimited ;
alter session set timed_statistics = true ;
-- alter session set STATISTICS_LEVEL = ALL ;
-- alter session set "_rowsource_execution_statistics" = true ;
alter session set tracefile_identifier = senaryo1_bi ;

-- be sure that this test is not doing query rewr. and star trans. is enabled
ALTER SESSION SET star_transformation_enabled=TRUE;
SELECT /*+ norewrite STAR_TRANSFORMATION */
/* SENARYO 1 BI */ ...

d. Materialized views and query rewrite

And as a second option for read performance we wanted to see was materialized views. Of course since these are the already calculated summaries, they were the best over both star transformation and full access alternatives.

-- materialized view test script
spool ...log
conn ../..
SET LINESIZE 2000
SET AUTOTRACE TRACEONLY
SET TIME ON
SET TIMING ON
ALTER SESSION ENABLE PARALLEL DML ;
alter session set max_dump_file_size=unlimited ;
alter session set timed_statistics = true ;
-- alter session set STATISTICS_LEVEL = ALL ;
-- alter session set "_rowsource_execution_statistics" = true ;
alter session set tracefile_identifier = senaryo1_mv ;

-- force to use mat.view
ALTER SESSION SET star_transformation_enabled=FALSE;
SELECT /*+ rewrite_or_error */
/* SENARYO 1 MV */ ..

d. Microstrategy reporting tool

We use MSTR for the end user reporting needs, so we needed to be sure on MSTR’s behavior after 10gR2. Some questions were, if its driver was up to date and more important since MSTR also does some query rewrite and caching will the same queries from MSTR have the same execution plans like SQL*Plus tests we had, this was our primary concern.

Here we created a database logon trigger for MSTR database user to catch its produced queries to be sure what was going on;

CREATE OR REPLACE TRIGGER trace_trig
AFTER LOGON
ON DATABASE
DECLARE
   sqlstr VARCHAR2(200) := 'ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 8''';
BEGIN
   IF (USER = 'MSTR') THEN
      execute immediate sqlstr;
   END IF;
END trace_trig;
/

And we saw that MSTR was adding some dimension key columns to the original queries which affected materialized view query rewrite option. Also MSTR was having much more waits compared to the simple SQL*Plus client which will be another area to study deeper later.

During this first part I wanted to briefly explain our testing needs, setups and read performance tests. In the second part more important issues will be the topic, the write cost of each option because of daily ETL and data quality updates.

Especially incremental refresh results over MATERIALIZED VIEW LOGs and DBMS_MVIEW.REFRESH will be important since they performed the best results during reading tests in part 1. Here we will also be getting the most out of 10gR2, with DBMS_ADVISOR.TUNE_MVIEW, DBMS_ADVISOR.CREATE_FILE and DBMS_MVIEW.EXPLAIN_MVIEW supplied functionalities.

PS : I also want to advice these three blogs which I think are really good starting points for data warehousing with Oracle;

– The Oracle Sponge – Oracle Data Warehouse Design and Architecture by David Aldridge – http://oraclesponge.wordpress.com/
– Pete Scott’s random notes – The rantings of a middle-aged computer consultant and generally nice person – http://pjsrandom.wordpress.com/
– Rittman Mead Consulting – “Delivering Oracle Business Intelligence” – http://www.rittmanmead.com/blog/

Survey > Which Oracle Expert do you want to see in İstanbul?

Last year I made my wish list and until now Steven Feuerstein is my only wish came true. But last year I was very lucky to attend their seminars and meet with Lutz Hartmann, Julian Dyke, Joze Senegacnik, Alejandro Vargas and of course Jonathan Lewis in İstanbul.

These seminars were very beneficial for me that from now on I really do not want to attend any Oracle University education but these kind of seminars. So I decided to create my new wish list for 2008 and distribute to our OracleTURK community over an Apex survey to see what people’s feedbacks will be. Here is my list, really hardly eliminated to 10 and sorted randomly;

  • John Scott with a two/three days Oracle Apex Best Practices Seminar
  • Steve Adams with a two/three days Oracle Internals Seminar
  • Pete Finnigan with a two/three days Oracle Security Best Practices Seminar
  • Adrian Billington with a two/three days Oracle Database Developer Best Practices Seminar
  • Werner Puschitz with a two/three days Oracle on Linux Best Practices Seminar
  • Laurent Schneider with a two/three days Oracle Advanced SQL Seminar
  • Tanel Poder with a two/three days Oracle on Linux Advanced Tuning Techniques Seminar
  • Richard Foote with a two/three days Oracle Indexes Best Practices Seminar
  • Joel Goodman with a two/three days Oracle MAA Best Practices Seminar
  • Christian Antognini with a two/three days Oracle Optimizer Best Practices Seminar

And last special words goes for Thomas Kyte; dear Tom you are and will be always at the top of any kind of these wish lists of mine, please come to İstanbul and be our guest :)

ps : and if you want to know how this survey application can be possible at a quite sunday morning, here is my cookbook;

– check out this otn article for some inspiration; http://www.oracle.com/technology/oramag/oracle/06-mar/o26browser.html

– and here is my customized metadata and survey application

it just took 18 minutes for me to develop and deploy this application, over internet, so hail and god bless Apex ;)

Log Buffer #79: A Carnival of the Vanities for DBAs

Dear guest, this is the 79th edition of Log Buffer a Carnival of the Vanities for DBAs, the weekly review of database blogs. Thanks to Dave Edwards of Pythian for helping me during the week especially for non-Oracle blogs. Log Buffer is published every Friday afternoon and if you want to contribute please get in touch with Dave. This is the the Log Buffer Guidelines for the Log Buffer editors. You may also consider subscribing to the Log Buffer Feed.

Since I did Log Buffer #56 and this is my second time, I tried harder and I hope you like this Log Buffer #79. First of all, I want to announce below two new categories I added to my Log Buffer, if others may like these can also be a part of the tradition I guess :)

Oracle blogs hot topics of the week

It all started for me when I saw Justin Kestelyn‘s post with subject Blog Tagged Again. Then immediately all Oracle bloggers started to tag each other(including me:) and Howard Roger‘s critique(may not be reached at the moment since Howard shutdown immediate his site to protest blog tagging) came after. Tim Hall had different thoughts and when all bloggers were so close to a divide into two one hero, Richard Foote, came with a different approach to tagging and in my opinion he saved the idea. If you are already tagged, you may take a side as you wish, but it would be great to follow every blogger tagged by a subject like Greg Rahn or Jonathan Lewis‘s giving their 8 things on Oracle 11g Optimizer New Features :) Also with this chance I want to support some part of Howard’s critique and give a hint on efficient RSS feed following;

– WordPress blog owners please set – Options > Reading > Syndication Feeds from “Full text” to “Summary”

– Blogger blog owners please set – Settings > Site Feed > Allow Blog Feeds from “Full” to “Short”

OTN forums and Oracle-L list hot discussions of the week

And here is my second new category, the first subject is “star transformation tuning and db file sequential read wait” and in this discussion you will find how lots of more i/o may mean less response time and lots of valuable hints on bitmap indexes. Also Jonathan Lewis was requested for help during the discussion and he answered :) Second subject I picked from OTN forums this week is Strange ORDER BY results which includes nice order by (case when .. tricks.

And my favorite Oracle discussion list, Oracle-L, like always it was again very busy this week, and this is the thread I choosed; “Oracle recommends rebuilding IOTs in AQ to reduce redo” which have replies from Jonathan Lewis and Greg Rahn arguing on excess redo produced and the coalesce option over rebuilding for AQs.

And at last comes the traditional part of the Log Buffer with this week’s blog highlights starts, I had really hard time to choose because of the unusual blogging traffic of this week, please sorry me if I missed some but feel free to comment and add yours :)

Oracle

Richard Foote, after an OTN forum thread, warns for the DBMS_STATS METHOD_OPT default behavior changed in 10g with a demo.

Yousef Rifai at DBA-Village Weekly Newsletter hints on set appinfo on option of SQL*Plus to find out what sql script is running in a nested sql script setting.

This week was Tanel Poder‘s week I guess; do you also use sql*plus as a calculator? If it is XE it is also free :) Tanel also showed why does an Oracle parameter count change during session lifetime this week.

Paul Vallee of Pythian Group argues on how to qualify a good backups and provides a paper for the bad habits of a dba.

Christian Bilien is looking after the memory with processes using pmap –x at Where has all my memory gone?

Gints Plivna shows the differences between MINUS-MINUS ALL and INTERSECT-INTERSECT ALL.

Marco Gralike has some reasons for not to use Oracle VM.

Oracle 11g specials of this week came from Tyler Muth on SQL Performance Analyzer and Greg Rahn on Real-Time SQL Monitoring Using DBMS_SQLTUNE.REPORT_SQL_MONITOR. We just got over our 10g Release 2 migrations and started to feel confident with its new features, and now all these articles :)

Oracle ACEs specials starts with Hans Forbrich who has a very busy schedule and wants to have a break on OTN forums for a while, we will miss you Hans take care!

Jonathan Lewis shows the filter plan bug with an example.

John Scott argues his UKOUG 2007 presentation feedback results, I was also there, this was an expected result for me.

Kevin Closson answers on how to choose from the Last of the Non-NUMA Xeon-based Servers.

Patrick Wolf warns that Oracle APEX 3.1 Eval Instance is refreshed.

Nicolas Gasparotto is still on his Peoplesoft Road and we started to miss him at OTN forums :)

Tim Hall is doing 11g new features series, as usual these resources are best of the best, here is the Data Recovery Advisor and RMAN Enhancements. I personally believe that soon Tim Hall and Dan Morgan will beat the documentation :)

Dan Norris shares Interesting Metalink findings and Steven Feuerstein who will be in İstanbul next month for a seminar argues on a functions should return data only through the RETURN clause, of course this is a PL/SQL function :)

MySQL

On MySQL Performance Blog it is announced that the most up to date version of MySql 5.0.54 is now available for CentOS starting with CentOS4.

DbaDojo provides an example and resources on how to setup Multiple MySQL instances on EC2.

Xaprb lists what is new in Maatkit.

Sheeri Cabral of Phytian Group works with SHOW VARIABLES and shows Variables MySQL Does Not Know About :)

Sql Server

On Jeff’s SQL Server Blog an interesting database modeling dilemma is discussed.

On Danny’s SQL Server and Internals Viewer Blog an example on how to get data into XML than to get it back out is showed.

On SimonS Blog on SQL Server Stuff SQL Server 2008 Spatial Performance of database calls are benchmarked.

On Joseph Sack’s SQL Server Blog an example is given on how SET options impact the Query Optimizer index choice

PostgreSQL

Devrim GUNDUZ announces that PostgreSQL RPM Yum repository and RPM Buildfarm are now available.

Magnus Hagander shows how to fake the dbo role.

Josh Berkus discusses why to stay current with updates.

Other

Glenn Fawcett started a throughput computing series.

And of course for much more you can always visit Eddie Awad‘s http://orana.info/ or subscribe to its category RSSs.

Thanks for reading my Log Buffer and I hope you enjoyed it. Looking forward to meet you again in the Oracle BlogSphere.

Warm regards from İstanbul,
Tonguç

I am tagged also, so here are 8 things about me :)

This week I am preparing the Log Buffer #79 and it is very busy in deed, I am trying to apply the archives and hope to catch up tonight with the recovery :)

This tagging thing is this week’s hot issue, and it reached me as well by Rob van Wijk, thank you Rob :) Like Rob I’ll make an exception for this post since I have my non-Oracle blog at blogspot and I guess this will be my first non-Oracle post in my wordpress Oracle blogging career. And here we go;

1) I was born in February 1977 and have one younger brother, who is a microsoft and macromedia trainer(funny ha:) and doing very well, I am very proud of him, soon he will a Mvp I guess :)

2) I love to(and try to) play the guitar, rock’n’roll was always my escape exit from life, and I am a 3* Cmas diver. Once after the university I really thought to leave this world and pay my bills with my hobby, then Oracle turned out to be my hobby, ironically :)

3) I love Fenerbahce! This comes from my parents, I (nearly:)die for my team, no other words are needed I guess, watch our game at the Champions League, I will be there :)

4) I was born and grown in Fethiye, a beautiful Mediterranean village, where I think to go back and die happily with my grant children :) My family owns a small hotel there, so I am proudly announce here that any Oracle blogger will have %25(good one really:) discount if they will contact me if they want to visit my hometown, Fethiye :)

5) Recently I have been hit by a truck and my car was nearly destroyed, I was so lucky to get out without any harm. Happy to be still alive, lots of things to do :)

6) I am not married yet, but my grand parents are making a lot of pressure on me for a baby :)

7) I love traveling, I have been to New York, Orlando, Miami, San Francisco, Los Angles, London, Paris, Berlin, Cairo, Red Sea and some of the best places of my country, Türkiye. I want to travel to Asia and Australia soon and I also believe that I live in the best city of the world, İstanbul :)

8) This will be an all time confession; I was once an Informix lover and resisted to migrate to Oracle and even prepared a presentation at that time to my company’s management and tried to convince them not to migrate :) Also this year I am very interested with Postgre SQL, since it is open source and very similar to Oracle, I think it is a good start point to learn the internal stuff deeper.

And now it’s my turn to tag, as far as I followed these are my blogger friends who are still not tagged, tadaaaaa :)

1. Coskans Approach to Oracle http://coskan.wordpress.com/
2. Alejandro Vargas’ Blog http://blogs.oracle.com/AlejandroVargas/
3. Jeremy Schneider’s Blog http://www.ardentperf.com/category/technical
4. Software Engineer Thoughts by Pawel Barut http://pbarut.blogspot.com/
5. Patrick Wolf’s Blog http://inside-apex.blogspot.com
6. OraStory by Dominic Brooks http://orastory.wordpress.com/
7. Databasesystems info by Kubilay Çilkara http://kubilaykara.wordpress.com
8. Arul’s Oracle Zone http://oraclezone.wordpress.com/

See you tomorrow with the Log Buffer, I will also have little surprises ;)

ps : Howard Roger’s did it again; http://www.dizwell.net/prod/archives/57 world needs Howard style people more than ever today, don’t walk and question the path everyone walks :)

Oracle myths and information pollution revisited

This post is parallel to these two very valuable recent posts;
http://structureddata.org/2007/12/16/oracle-myth-busting-show-dont-tell/
http://richardfoote.wordpress.com/2007/12/12/why-are-there-so-many-oracle-related-myths-the-inconvenient-truth/

Search engines are like knives; you can cut bread or easily your hand with them. The pollution they bring into our lives is unbelievable, usually we have limited time resource for investigating on an Oracle topic, also you should never want to loose the trust people have on you which you have building up for years of hard work.

So I created this one for my Google searches, I was inspired from dizwell‘s search and now I have been using it for months and this never failed me up to now. It is really easy to use it; in the below text search and replace “resource+manager” with any combination of words you are investigating on Oracle, concatenated with “+” sign, and start to copy&paste the produced links from top to down into your browser until you are satisfied with the results;

Effective Oracle Investigation on Search Engine Google

Bottom line is, be careful with polluted internet environment, never buy any “guru”‘s word immediately, test it and make sure you are convinced of the results on your own environment. Never believe all you read or hear, just because it is printed or someone is more experienced doesn’t mean the advised to be TRUE for you..
https://tonguc.wordpress.com/beware-of-question-authorities/

ps: warning for his fans like me, dizwell has moved – http://www.dizwell.net/prod/

oracle-developer.net’s feed

If you are an Oracle developer, Adrian Billington’s http://www.oracle-developer.net and Dan Morgan’s Psoug Oracle SQL & PL/SQL Library http://psoug.org/library.html are most probably also two of your favorite resources.

I didn’t know that oracle-developer.net has an rss feed, so I wanted to share; http://www.oracle-developer.net/feed.xml

Recently 11g new features are being discussed. Also if you didn’t see yet, these two will be great for starters :)

Sql Plan Enhancements in 10g
http://www.oracle-developer.net/display.php?id=316

Pl/Sql Optimisation in 10g
http://www.oracle-developer.net/display.php?id=314

ps: does anyone know if psoug has a similar rss feed also?

New partition pruning features of 10g

As I mentioned earlier I started working with our data warehouse team on their migration project; Planning and Testing the 9iR2 to 10gR2 migration of our XXX Tb DWH

I will be testing Oracle’s 10gR2 features for performance. So before tests in details, I also wanted to mention some important very large Oracle database init.ora 10g parameters additional to ASM and CBO configurations I already mentioned; Listing 177a – Some important very large Oracle database init.ora 10g parameters

During some initial tests I experienced new partition pruning features of 10g; “PARTITION RANGE OR”-“KEY(OR)” and “PARTITION RANGE MULTI-COLUMN”-“KEY(MC)” access paths; Code Listing 177b – New partition pruning features of 10g

These tests will be really cool :)

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :
Note:368055.1
Subject: Deployment of very large databases (10TB to PB range) with Automatic Storage Management (ASM)

Oracle Database 10g (10.2 & 10.1) New Features in Data Warehousing

Oracle10g-Features für Warehouses by Sven Vetter

Top Ten New Data Warehousing Features In Oracle Database 10g by Mark Rittman

Oracle Database 10g: Top Features for DBAs Release 2 Features Addendum by Arup Nanda
Part 4: Data Warehousing and Integration Features