Frequently Asked Questions

Questions

  • Why can't we just send some AWR reports? : AWR reports are great, but they have a few problems and limitations for our purpose:

  • We need AWR reports in a consistent format (HTML, single instance, correct timezone) that cover a long period (default 10 days; with a standard 1-hour interval this results in about 240 AWR reports per database instance)

  • We need the AWRs for each database instance. DBCollect detects all instances automagically (in most cases). The alternative is to manually start SQL*Plus for each instance and generate all reports
  • AWR only provides performance and limited configuration metrics. There is no database size/config information such as sizes of tablespaces, redo logs, temp files, segments, ASM disks/diskgroups, archive/flashback/bct files. DBCollect runs additional (SELECT) queries to get this additional database information
  • No OS configuration or hardware information (such as CPU type & model, memory, etc)
  • No disk/network configuration
  • No UNIX SAR/sysstat performance data
  • No compression, backup, archiving details
  • AWRs are sometimes generated using non-English locale (cannot be parsed)
  • AWRs are sometimes generated in txt format instead of html (hard to parse, error-prone)
  • AWRs are sometimes provided as RAC versions (completely different layout, hard to parse)
  • Usually only a few AWRs are provided, sometimes with a very large interval (many hours or even days) which is not detailed enough to do accurate sizings or performance analysis
  • No way to know if there are other instances on the same system for which we need to know details

  • Is dbcollect safe to run? : dbcollect is designed to run as a non-root user, but it has to be the oracle user, a user with sysdba privileges, or any other user using a credentials file. The SQL scripts only contain SELECT statements, so they cannot modify database data. The Python tools cannot delete or overwrite any file except files in /tmp, or the output ZIP file if a different location is specified in the arguments. External commands are not executed as root and are verified to only gather system info, not modify anything (some commands may be executed if using 'sudoers' access). CPU consumption is limited by default to either 50% or a maximum of 8 CPUs. These restrictions should make one confident that dbcollect is safe to run on production systems. For additional safety, consider using a credentials file.

  • Why is dbcollect written in Python 2? This is no longer supported! : Python 3 is not available by default on many older systems, e.g. Linux (RHEL/OEL/CentOS) and Solaris. On EL6 I even had to backport support for Python 2.6. : Update: dbcollect now works on both Python 2 and Python 3, and Python 3 is the preferred version.

  • How long will it take to run dbcollect? : This mostly depends on how many AWR/Statspack reports need to be generated and how many CPUs are available. Collecting the OS information usually takes only a few seconds. In normal environments, an AWR report (HTML) takes about 1-2 seconds, and a Statspack report even less. For a single-instance environment with a 10-day collection period and a 1-hour interval, the number of reports is about 240, so dbcollect will run for under 10 minutes. There are some known Oracle issues with AWR generation that result in much longer times. The latest version of dbcollect predicts the remaining time so you have an idea. As of version 1.11, dbcollect runs AWR reports in parallel on each instance, making it much faster.
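
The numbers in the answer above follow from simple arithmetic. Here is a minimal sketch, assuming a sequential run at the upper end of 2 seconds per report. The function name and its parameters are made up for illustration and are not part of dbcollect:

```python
def estimate_awr_runtime(days=10, interval_hours=1, seconds_per_report=2.0):
    """Rough estimate for one instance: (number of reports, total seconds).

    Hypothetical helper for illustration only; it ignores parallel
    generation and the known Oracle issues that can slow AWR reports down.
    """
    reports = int(days * 24 / interval_hours)
    return reports, reports * seconds_per_report


reports, seconds = estimate_awr_runtime()
print(f"{reports} reports, about {seconds / 60:.0f} minutes")
# prints: 240 reports, about 8 minutes
```

At 1 second per report the same period would take about 4 minutes, so "under 10 minutes" holds for the whole 1-2 second range.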

  • Does dbcollect gather confidential data? : dbcollect only retrieves system configuration files, SAR/AWR/Statspack etc. In AWR and Statspack reports, however, a number of SQL queries (statements) can be visible. For AWR, dbcollect can remove the sections containing SQL statements to prevent collecting pieces of potentially confidential data. The values of bind parameters are not visible. See the --strip option. Passwords or user credentials are never collected.

  • dbcollect appears to be a binary package. How do I know what it is doing? : dbcollect is actually a Python ZipApp package. You can list its contents with unzip, and the Python code and SQL scripts can be extracted using standard zip/unzip tools.
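
If unzip is not at hand, Python's standard zipfile module can read a ZipApp as well, because it is a regular ZIP archive with an interpreter line in front. This is a minimal sketch; the helper name is hypothetical, and it assumes the sources inside the package end in .py and .sql:

```python
import zipfile


def list_zipapp_sources(path):
    """Return the sorted names of Python and SQL files inside a ZipApp.

    Hypothetical helper for illustration; it only reads the archive
    listing and never extracts or executes anything.
    """
    with zipfile.ZipFile(path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.endswith((".py", ".sql"))
        )
```

Calling `list_zipapp_sources("dbcollect")` in the directory holding the download returns the file names, which you can then extract and read with any zip tool.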

  • How do I know my download has not been tampered with? : If you downloaded dbcollect from GitHub releases using HTTPS, you should be good. If you want to make sure, get the MD5 hash and I can check for you whether it is the correct one: md5sum dbcollect
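
On a system without md5sum, the same hash can be computed with Python's standard hashlib module. This is a minimal sketch; the function name is hypothetical, and the file is read in chunks so the whole download is never held in memory:

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

`md5_of_file("dbcollect")` should return the same hex string that `md5sum dbcollect` prints in its first column.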

  • I want to check what information dbcollect has gathered : Inspect the zip file /tmp/dbcollect-.zip and check its contents.