How-To: Configuring automatic SAP HANA Cleanup with SAP HANACleaner

How-To: Configuring automatic SAP HANA Cleanup with SAP HANACleaner

You are interested in scheduling regular SAP HANA cleanup activities automatically.

Environment

SAP HANA

Cause

Certain SAP HANA cleanup tasks like purging the backup catalog or deleting old trace files (SAP Note 2119087) need to be implemented individually. SAP HANACleaner is now available to perform these tasks automatically

SAP HANACleaner is implemented via Python script.

This script is an expert tool designed by SAP support. You are allowed to use it, but SAP doesn’t take over any responsibility for problems originating from the use of this tool.

Resolution

SAP HANACleaner can be used for the following cleanup tasks:

Task SAP Note
Cleanup of backup catalog entries 2096851
Cleanup of backups 1642148
Cleanup of trace files 2380176
Cleanup of backup.log and backint.log 1642148
Cleanup of SAP HANA alerts 2147247
Cleanup of free log segments 2083715
Cleanup of internal events 2147247
Cleanup of multiple row store containers 2222277
Cleanup of data file fragmentation 1870858
Cleanup of SAP HANACleaner logs 2399996
Optimize compression of tables not compressed 2112604
Optimize compression of tables with columns not compressed 2112604
Optimize compression of tables with large UDIV overhead 2112604

You can install SAP HANACleaner in the following way:

  • Download the attached script hanacleaner.py
  • Copy it to a directory on your SAP HANA database server

Attention: Text-based “copy and paste” can result in unforeseen issues, so you should either download the file directly to the database server or make sure that you use a file-based copy approach that doesn’t modify the file content.

Once it is installed, you can start it. The following command provides you with an overview of the way how SAP HANACleaner works and the available configuration options:

python hanacleaner.py --help

When SAP HANACleaner is called without additional options (i.e. “python hanacleaner.py”) no actions are performed. You always have to specify specific options that suit your needs.

The following command line options exist to adjust the behavior:


        ----  BACKUP ENTRIES in BACKUP CATALOG (and possibly BACKUPS)  ----
-be     minimum retained number of data backup (i.e. complete data backups and data snapshots) entries in the catalog, this
        number of entries of data backups will remain in the backup catalog, all older log backup entries will also be removed
        with BACKUP CATALOG DELETE BACKUP_ID <id> (see SQL reference for more info) default: -1 (not used)
-bd     min retained days of data backup (i.e. complete data backups and data snapshots) entries in the catalog [days], the
        youngest successful data backup entry in the backup catalog that is older than this number of days is the oldest
        successful data backup entry not removed from the backup catalog, default -1 (not used)
        Note: if both -be and -bd is used, the most conservative, i.e. the flag that removes the least number entries, decides
        Note: As mentioned in SAP Note 1812057 backup entries made via backint cannot be recovered, i.e. use -be and -bd with care
        if you want to be able to recover from older data backups (it is possible to recover from a specific data backup without
        the backup catalog)
-bb     delete backups also [true/false], backups are deleted when the related backup catalog entries are deleted with
        BACKUP CATALOG DELETE BACKUP_ID <id> COMPLETE (see SQL reference for more info), default: false
-bo     output catalog [true/false], displays backup catalog before and after the cleanup, default: false
-br     output removed catalog entries [true/false], displays backup catalog entries that were removed, default: false
        Note: Please do not use -bo and -br if your catalog is huge (>10000) entries.
        ----  TRACE FILES  ----
-tc     retention days for trace file content [days], trace file content older than these number of days is removed
        from (almost) all trace files in all hosts (even currently opened tracefiles), default: -1 (not used)
-tf     retention days for trace files [days], trace files, in all hosts, that are older than this number of days are removed
        (except for the currently opened trace files), only files with certain extensions like .trc, .log etc are taken into
        account, backup.log and backint.log, are excepted, please see -zb and -zp instead, default: -1 (not used)
-to     output traces [true/false], displays trace files before and after the cleanup, default: false
-td     output deleted traces [true/false], displays trace files that were deleted, default: false
        ----  DUMP FILES  ----
-dr     retention days for dump files [days], manually created dump files (a.k.a. fullysytem dumps and runtime dumps) that are
        older than this number of days are removed, default: -1 (not used)
        ----  BACKUP LOGS  ----
-zb     backup logs compression size limit [mb], if there are any backup.log or backint.log file (see -zp below) that is bigger
        than this size limit, then it is compressed and renamed, default: -1 (not used)
-zp     zip path, specifies the path (and all subdirectories) where to look for the backup.log and backint.log files,
        default is the directory specified by the alias cdtrace
-zl     zip links [true/false], specifies if symbolic links should be followed searching for backup logs in subdirectories
        of the directory defined by zp (or by alias cdtrace), default: false
-zo     print zipped backup logs, display the backup.log and backint.log that were zipped, default: false
        ----  ALERTS  ----
-ar     min retained alerts days [days], min age (today not included) of retained statistics server alerts, default: -1 (not used)
-ao     output alerts [true/false], displays statistics server alerts before and after the cleanup, default: false
-ad     output deleted alerts [true/false], displays statistics server alerts that were deleted, default: false
        ----  OBJECT LOCKS ENTRIES with UNKOWN OBJECT NAME  ----
-kr     min retained unknown object lock days [days], min age (today not included) of retained object lock entries with unknown
        object name, in accordance with SAP Note 2147247, default: -1 (not used)
        ----  OBJECT HISTORY  ----
-om     object history table max size [mb], if the table _SYS_REPO.OBJECT_HISTORY is bigger than this threshold this table
        will be cleaned up according to SAP Note 2479702, default: -1 (not used)
-oo     output cleaned memory from object table [true/false], displays how much memory was cleaned up from object history
        table, default: false
        ---- LOG SEGMENTS  ----
-lr     max free logsegments per service [number logsegments], if more free logsegments exist for a service the statement
        ALTER SYSTEM RECLAIM LOG is executed, default: -1 (not used)
        ---- EVENTS  ----
-eh     min retained days for handled events [day], minimum retained days for the handled events, handled events that are older
        are removed by first being acknowledged and then deleted, this is done for all hosts, default: -1 (not used)
-eu     min retained days for unhandled events [day], minimum retained days for events, events that are older are removed by
        first being handled and acknowledged and then deleted, this is done for all hosts, default: -1 (not used)
        ----  AUDIT LOG  ----
-ur     retention days for audit log table [days], audit log content older than these number of days is removed,
        default: -1 (not used)
        ----  DATA VOLUMES FRAGMENTATION  ----
-fl     fragmentation limit [%], maximum fragmentation of data volume files, of any service, before defragmentation of that
        service is started: ALTER SYSTEM RECLAIM DATAVOLUME '<host>:<port>’ 120 DEFRAGMENT,        default: -1 (not used)
-fo     output fragmentation [true/false], displays data volume statistics before and after defragmentation, default: false
        ----  MULTIPLE ROW STORE TABLE CONTAINERS   ----
-rc     row store containers cleanup [true/false], switch to clean up multiple row store table containers, default: false
        Note: Unfortunately there is NO nice way to give privileges to the DB User to be allowed to do this. Either you can
        run hanacleaner as SYSTEM user (NOT recommended) or grant DATA ADMIN to the user (NOT recommended)
-ro     output row containers [true/false], displays row store tables with more than one container before cleanup, default: false
        ---- COMPRESSION OPTIMIZATION ----
        1. Both following two flags, -cc, and -ce, must be > 0 to control the force compression optimization on tables that never
        was compression re-optimized (i.e. last_compressed_record_count = 0):
-cc     max allowed raw main records, if table has more raw main rows --> compress if -ce, default: -1 (not used) e.g. 10000000
-ce     max allowed estimated size [GB], if estimated size is larger --> compress if -cc, default: -1 (not used) e.g. 1
        2. All following three flags, -cr, -cs, and -cd, must be > 0 to control the force compression optimization on tables with
        columns with compression type 'DEFAULT' (i.e. no additional compression algorithm in main)
-cr     max allowed rows, if a column has more rows --> compress if -cs&-cd, default: -1 (not used) e.g. 10000000
-cs     max allowed size [MB], if a column is larger --> compress if -cr&-cd, default: -1 (not used) e.g. 500
-cd     min allowed distinct count [%], if a column has less distinct quota --> compress if -cr&-cs, default -1 (not used) e.g. 5
        3. Both following two flags, -cu and -cq, must be > 0 to control the force compression optimization on tables whose UDIV
        quota is too large, i.e. #UDIVs/(#raw main + #raw delta)
-cq     max allowed UDIV quota [%], if the table has larger UDIV quota --> compress if -cu, default: -1 (not used) e.g. 150
-cu     max allowed UDIVs, if a column has more then this number UDIVs --> compress if -cq, default: -1 (not used) e.g. 10000000
        4. Flag -cb must be > 0 to control the force compression optimization on tables with columns with SPARSE (<122.02) or
        PREFIXED and a BLOCK index
-cb     max allowed rows, if a column has more rows and a BLOCK index and SPARSE (<122.02) or PREFIXED then this table should
        be compression re-optimized, default -1 (not used) e.g. 100000
        Following three flags are general; they control all three, 1., 2., 3., 4., compression optimization possibilities above
-cp     per partition [true/false], switch to consider flags above per partition instead of per column, default: false
-cm     merge before compress [true/false], switch to perform a delta merge on the tables before compression, default: false
-co     output compressed tables [true/false], switch to print all tables that were compression re-optimized, default: false
        ---- INTERVALL  ----
-hci    hana cleaner interval [days], number days that hanacleaner waits before it restarts, default: -1 (exits after 1 cycle)
        NOTE: Do NOT use if you run hanacleaner in a cron job!
        ---- INPUT  ----
-ff     flag file, full path to a file that contains input flags, each flag in a new line, all lines in the file that does not
        start with a flag are considered comments, if this flag is used no other flags should be given, default: '' (not used)
        ---- EXECUTE  ----
-es     execute sql [true/false], execute all crucial housekeeping tasks (useful to turn off for investigation with -os=true),
        default: true
        ---- OUTPUT  ----
-os     output sql [true/false], prints all crucial housekeeping tasks (useful for debugging with -es=false), default: false
-op     output path, full path of the folder for the output logs (if not exists it will be created), default = "" (not used)
-or     output retention days, logs in the path specified with -op are only saved for this number of days, default: -1 (not used)
-so     standard out switch [true/false], switch to write to standard out, default:  true
        ---- SERVER FULL CHECK ----
-fs     file system, path to server to check for disk full situation before hanacleaner runs, default: blank, i.e. df -h is used
                     Could also be used to specify a couple of servers with e.g. -fs "|grep sapmnt"
-if     ignore filesystems, before hanacleaner starts it checks that there is no disk full situation in any of the filesystems,
        this flag makes it possible to ignore some filesystems, with comma seperated list, from the -df h command, default: ''
        ----  SSL  ----
-ssl    turns on ssl certificate [true/false], makes it possible to use SAP HANA Cleaner despite SSL, default: false
        ----  USER KEY  ----
-k      DB user key, this one has to be maintained in hdbuserstore, i.e. as <sid>adm do
        > hdbuserstore SET <DB USER KEY> <ENV> <USERNAME> <PASSWORD>                     , default: SYSTEMKEY
        It could also be a list of comma seperated userkeys (useful in MDC environments), e.g.: SYSTEMKEY,TENANT1KEY,TENANT2KEY

The following table lists some examples how to call SAP HANACleaner for different purposes:

Command Details
python hanacleaner.py No execution of actions (“hanacleaner needs input arguments”)
python hanacleaner.py -be 10 -bd 30 -td true Clean up backup catalog entries and backups that are older than 30 days and that don’t be long to the ten newest backups
python hanacleaner.py -tc 42 -tf 42 -ar 42 -bd 42 -zb 50 -eh 2 -eu 42 Clean up statistics server alerts, traces and backup catalog entries older than 42 days, rename and compress backup.log and backint.log when size exceeds 50 MB, handle / acknowledge events after 2 / 42 days

More details and examples can be found in the attached presentation hanacleaner_intro.pdf.

Be the first to comment

Leave a Reply

Your email address will not be published.


*