
ODA X8-2: the new generation


Introduction

ODA X7-2 marked the maturity of the Oracle Database Appliance evolution, with 3 models suitable for nearly all customers using Oracle databases. The X7-2M was my favorite, with a lot of cores available (and restrainable), enough memory for dozens of databases, and strong storage performance and capacity (with a maximum of 50TB RAW NVMe). And it took up only 1U of space in your rack. Today, 2 years later, Oracle has just introduced the new generation of ODAs together with the availability of the long-awaited 19c release, aka 12.2.0.3. Let's discover what's interesting for you and me.

What are the changes on the hardware side?

If you were waiting for breakthrough technology, you will be disappointed. Nothing amazing here. Basically, the hardware is quite the same as the previous generation. You can still choose between the S, M and HA appliances, S and M being single nodes, and HA a two-node configuration.

Disk size didn't increase, at least for the S and M models: 6.4TB for a single disk, still great but no evolution since 2017. The most important change is the form factor. An X7-2 compute node was previously 1U; X8-2 nodes are now taller, at 2U. Is it an advantage? For sure, 2U means more disks. But it's only interesting for the X8-2M because, as usual, the X8-2S does not support more than 2 disks, and the X8-2HA has no disks in the compute nodes. As a result, the X8-2M with the maximum extension can reach 76.8TB RAW capacity, 50% more than the previous generation. It's quite a big improvement for this model. HA now has bigger SSDs (7.68TB vs 3.2TB on the X7-2HA). For huge databases, there is no more reason to use disk-based technologies.

Some customers complained about the lack of network interfaces (only 1 usable on the X7-2S/M); you now have 3 of them, 1 standard and 2 optional. Copper, fiber, or a mix.

Regarding the CPU, the S model now shares the same 16-core CPUs as the others, but it still has 1 CPU. M has 2 CPUs, and HA 2×2 CPUs. You probably won't miss the 2 extra cores of the X7-2M/HA CPUs.

What’s new regarding the software?

As you may know, the software will be available for older ODAs too, so everyone, except first-gen owners, will benefit from the software update. The 19c package is not yet available, but stay tuned, it should be here very soon.

The software improvements you will benefit from include:

  • support for the 19c database: the long-awaited long-term release is now available on ODA
  • it wasn't certain Oracle would keep an 11.2.0.4 database engine available for this brand new release, but it's actually still there. ODA is now capable of running 2 strong terminal releases covering most customer needs: 11gR2 and 19c
  • RAC disappeared from 19c SE2, no regrets
  • odacli should be more stable and more powerful: 18.3 brought the end of oakcli, and since then HA and lite ODAs have shared the same admin tool, odacli. 19c will be the third version of this unified tool after 18.5, released several months ago.

What are the differences between the 3 models?

Oracle decided to keep 3 models: the first one (S) for an entry price point with 16 cores (6 more than the X7-2S), 192GB of memory and non-expandable disks. The second one (M) for most of us, with lots of CPU cores (32: 4 less than the previous gen), a comfortable amount of memory (384GB) and a maximum of 76.8TB RAW capacity (+50%). The third one is for RAC lovers, because it's sometimes still a choice. This HA ODA gains an impressive increase in disk capacity, with a maximum of 369TB thanks to the new 7.68TB SSDs on the high-performance version. There is still an offer for those who don't care about maximum I/O speed: ODA HA High Capacity, which uses a mix of SSDs and HDDs.

Model         | DB Edition | Nodes | U    | RAM (GB) | RAM max (GB) | RAW (TB) | RAW max (TB) | Base price
ODA X8-2S     | SE2/EE     | 1     | 2    | 192      | 384          | 12.8     | 12.8         | 18'500$
ODA X8-2M     | SE2/EE     | 1     | 2    | 384      | 768          | 12.8     | 76.8         | 30'000$
ODA X8-2HA HP | SE2/EE     | 2     | 8/12 | 2×384    | 2×768        | 46       | 369          | 77'000$
ODA X8-2HA HC | SE2/EE     | 2     | 8/12 | 2×384    | 2×768        | 298      | 596          | 77'000$

Which one should you choose?

If your database(s) can comfortably fit in the S model, don't hesitate, as you will probably never need more. The ODA X8-2S is the perfect choice for those using Standard Edition 2. Take a second one with DBVisit Standby and it's a real bargain for a disaster-protected Oracle database environment.

The most interesting model is still the M, as it was in the previous generation. The M is quite affordable and extremely dense in terms of available storage (76.8TB in 2U). And it's upgradable in case you don't buy it fully loaded from the start.

If you still want/need RAC and all that clustered complexity, the HA is for you: it's basically two ODA X8-2M nodes without local storage plus a SAS disk enclosure. But don't forget that two ODA X8-2M will be less expensive than one X8-2HA if you can replace RAC with Data Guard.

What about the licenses and the support?

ODA is not sold with the database licenses: you need to bring yours or buy them at the same time. With Standard Edition 2, you'll need 1 license per ODA on the S and M models, and 2 licenses for the HA model, but you will not be able to use RAC, so this combination makes no sense. If you're using Enterprise Edition, you'll need at least 1 license on the S and M models (2 activated cores) and at least 2 licenses on HA (2 activated cores per node). Enabling your EE license on an ODA will actually decrease the number of active cores on the server to make sure you are compliant, but it doesn't prevent you from using an unlicensed option.
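For reference, limiting the enabled cores on an ODA is usually done with odacli; a minimal sketch (the core count is only an example, adjust it to your licenses and check the exact syntax for your ODA release):

odacli update-cpucore -c 4
odacli describe-cpucore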

Regarding the support, as with other hardware vendors, you'll have to pay for your ODA to be supported in case of hardware or software failure. Support for the database licenses is the same as on other platforms. Don't forget that if you're still using an 11gR2 or 12cR1 database, paying standard support is not enough: extended support is recommended, but it comes at an additional cost. Even without extended support, though, MOS will not prevent you from downloading and applying the 11.2.0.4 patches, as they are part of the global ODA patch.

Conclusion

Everyone is focused on the Cloud now. But on-premises solutions are still great to keep everything under control, at home. These new appliances confirm that the ODA is a strong product in the Oracle engineered systems line, and that's why most customers love them. Not considering ODA for your next infrastructure means definitely missing a mainstream solution for your databases.

This article ODA X8-2: the new generation appeared first on Blog dbi services.


OpenShift 4.1 Partner Introduction Day


Today was the OpenShift Partner Introduction Day by Red Hat. The event took place in Lausanne, and attendees came from different backgrounds.

After a presentation of the Red Hat company, the speaker explained what OpenShift is and why people should adopt it.
What we will retain is that OpenShift is trusted enterprise Kubernetes.

With OpenShift we get, for example:
- Automated, full-stack installation from the container host to application services
- Seamless Kubernetes deployment to any cloud or on-premises environment
- Autoscaling of cloud resources
- One-click updates for platform, services, and applications

The new features of version 4.1 were presented. The speaker also showed the Red Hat OpenShift business value.

The notions of Ansible Playbooks, Operators, CRI-O, Helm, etc. were also explained.
The speaker also gave a short demonstration of creating a small project with OpenShift.
Below, the speaker during the demonstration.
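For readers who did not attend, here is a minimal sketch of what such a demo typically looks like with the oc client (the project name, application source and server URL are illustrative assumptions, not what was actually shown):

oc login https://api.ocp.example.com:6443
oc new-project demo
oc new-app https://github.com/sclorg/nodejs-ex --name=nodejs-demo
oc expose service/nodejs-demo
oc get pods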

This was a very interesting event. It remained general and allowed people to understand where OpenShift sits in the containerization landscape. But keep in mind that there are a lot of components to understand when using OpenShift.

This article OpenShift 4.1 Partner Introduction Day appeared first on Blog dbi services.

When Read-Scale availability groups and Windows Failover Cluster are not good friends


A couple of days ago, with some fellow French data platform MVPs (@thesqlgrrrl and @Conseilit), we discussed an issue around Read-Scale availability groups, and it reminded me that I had forgotten to share a weird behavior I experienced with them.

Basically, Read-Scale availability groups are clusterless infrastructures, meaning there is no need to install an underlying cluster. You will not benefit from resource orchestration and automatic failover, but this is the intended behavior: their sole purpose and design is to scale out a read workload.

Let’s say you have installed a Read-Scale availability group that includes 2 replicas with one primary and one secondary.
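As a reminder, such a clusterless AG is created with CLUSTER_TYPE = NONE. The following is only a minimal sketch; the instance names, endpoint URLs and database name are illustrative assumptions:

-- On the primary replica
CREATE AVAILABILITY GROUP [AGSCALE]
WITH (CLUSTER_TYPE = NONE)
FOR DATABASE [MyReadScaleDB]
REPLICA ON
    N'NODE1\SQL17' WITH (ENDPOINT_URL = N'tcp://node1:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL,
        SEEDING_MODE = AUTOMATIC, SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)),
    N'NODE2\SQL17' WITH (ENDPOINT_URL = N'tcp://node2:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL,
        SEEDING_MODE = AUTOMATIC, SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL));

-- On the secondary replica
ALTER AVAILABILITY GROUP [AGSCALE] JOIN WITH (CLUSTER_TYPE = NONE);
ALTER AVAILABILITY GROUP [AGSCALE] GRANT CREATE ANY DATABASE;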

If there is a network failure between the replicas, the normal behavior should be:

  • The primary replica continues to handle the R/W workload. Depending on the replication type (sync/async), the current queries may be blocked until the session timeout is reached, and then everything should go back to normal. Obviously, transactions will fill up the transaction log until the network outage is fixed.

  • The secondary replica continues to handle the RO traffic, assuming you connect to it directly and do not use the transparent redirection capability of the AG listener, which may otherwise lead you to face the following error message:

Error: Microsoft ODBC Driver 17 for SQL Server : Unable to access the ‘<database>’ database because no online secondary replicas are enabled for read-only access. Check the availability group configuration to verify that at least one secondary replica is configured for read-only access. Wait for an enabled replica to come online, and retry your read-only operation..

But here comes the interesting part of the story. Let's introduce the Windows Failover Cluster into the game. As said previously, there is no need to install such an HA feature on Windows to make a Read-Scale AG work correctly. But suppose now you want to test read-scale capabilities on an existing AG infrastructure that relies on the underlying cluster for HA. Why do this? Well, because in my case, with my customer, we didn't want to provision a dedicated infrastructure only to test Read-Scale AGs and we wanted to take advantage of the existing one. Anyway, referring to the Microsoft documentation, and with my Google-fu as well, we didn't find any known evidence of "incompatibility" when using a Read-Scale AG in conjunction with the Windows Failover Cluster. So, we just went ahead, and we added a new SQL Server 2017 Read-Scale AG to an existing infrastructure that already included other AGs for HA (and therefore, implicitly, an underlying WSFC).

The newly installed Read-Scale AG was configured correctly, as shown below:

Simulating a failure scenario by voluntarily shutting down the WSFC gave surprising results, because we didn't expect to see connection issues with the new Read-Scale AG on both the primary and the secondary. Let's clarify this point: as mentioned above, for AGs with HA capabilities, this result is consistent with a failed Windows Failover Cluster that has compromised the AG infrastructure. But with Read-Scale AGs, because they are not tied to any HA mechanism, we didn't expect the AG to go into a "Resolving" state with no access to the primary database, as confirmed by the following error message:

Msg 983, Level 14, State 1, Line 3
Unable to access availability database ‘<database>’ because the database replica is not in the PRIMARY or SECONDARY role.
Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role.
Try the operation again later.

Looking at the SQL Server error log gave us some clues:


Always On Availability Groups: Local Windows Server Failover Clustering node is no longer online. This is an informational message only. No user action is required.
Always On: The availability replica manager is going offline because the local Windows Server Failover Clustering (WSFC) node has lost quorum. This is an informational message only. No user action is required.
Always On: The local replica of availability group ‘AGSCALE’ is stopping. This is an informational message only. No user action is required.
The state of the local availability replica in availability group ‘AGSCALE’ has changed from ‘PRIMARY_NORMAL’ to ‘RESOLVING_NORMAL’. The state changed because the local instance of SQL Server is shutting down. For more information, see the SQL Server error log or cluster log. If this is a Windows Server Failover Clustering (WSFC) availability group, you can also see the WSFC management console.
Always On Availability Groups connection with secondary database terminated for primary database ‘<database>’ on the availability replica ‘WIN20192\SQL17’ with Replica ID: {5e3d8c42-9c52-449a-aa90-a16665aca055}. This is an informational message only. No user action is required.
The availability group database “<database>” is changing roles from “PRIMARY” to “RESOLVING” because the mirroring session or availability group failed over due to role synchronization. This is an informational message only. No user action is required.
State information for database ‘<database>’ – Hardened Lsn: ‘(42:10776:1)’ Commit LSN: ‘(42:10768:3)’ Commit Time: ‘Sep 19 2019 1:45PM’
Always On: The availability replica manager is starting. This is an informational message only. No user action is required.
Always On Availability Groups: Waiting for local Windows Server Failover Clustering service to start. This is an informational message only. No user action is required.
Error: 983, Severity: 14, State: 73.
Unable to access availability database ‘<database>’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.
Always On Availability Groups: Local Windows Server Failover Clustering service started. This is an informational message only. No user action is required.

Attempt to access non-existent or uninitialized availability group with ID ‘{890B3B2C-3BD5-EE9D-1594-899F989700CF}.’. This is usually an internal condition, such as the availability group is being dropped or the local WSFC node has lost quorum. In such cases, and no user action is required.
Error: 983, Severity: 14, State: 1.
Unable to access availability database ‘<database>’ because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.

The Windows Failover Cluster shut down due to quorum loss (expected result), but it also brought the Read-Scale AG down… The above output suggests that even with CLUSTER_TYPE=NONE there is still a strong relationship between the AG and the underlying cluster. By the way, if you look at sys.dm_hadr_cluster_members, you get information about the existing cluster.

SELECT 
	member_name,
	member_type_desc AS member_type,
	member_state_desc AS member_state,
	number_of_quorum_votes
FROM sys.dm_hadr_cluster_members

 

From an architecture standpoint, I would say that mixing AGs with HA capabilities and Read-Scale AGs may not be a good practice and makes little sense, and we probably came across a borderline scenario. However, my feeling is that, technically speaking, it should work as described in the Microsoft documentation 🙂 I noticed this behavior with SQL Server 2017 (on different CUs) and Windows Server 2016, and I got the same behavior with Windows Server 2019 as well.

Please feel free to comment if there is something I omitted!

Anyway, we finally dedicated an environment to Read-Scale AGs and we got consistent results this time 🙂

See you!

 

 

 

This article When Read-Scale availability groups and Windows Failover Cluster are not good friends appeared first on Blog dbi services.

Oracle 19c : Point-In-Time Recovery in a PDB


Point-In-Time Recovery is also possible in a multitenant environment. As with a non-CDB, a recovery catalog may or may not be used. In this blog we will see how to recover a dropped tablespace in a PDB, and we will also see the importance of using a recovery catalog.
A PITR of a PDB does not affect the remaining PDBs: while a PITR is running in one PDB, people can keep using the other PDBs. In this blog we are using an Oracle 19c database with local undo mode enabled:

SQL> 
  1  SELECT property_name, property_value
  2  FROM   database_properties
  3* WHERE  property_name = 'LOCAL_UNDO_ENABLED'

PROPERTY_NAME        PROPE
-------------------- -----
LOCAL_UNDO_ENABLED   TRUE
SQL>

SELECT con_id, tablespace_name FROM   cdb_tablespaces WHERE  tablespace_name LIKE 'UNDO%';

    CON_ID TABLESPACE_NAME
---------- ------------------------------
         3 UNDOTBS1
         4 UNDOTBS1
         1 UNDOTBS1

SQL>

We suppose that:
- We have a tablespace named MYTABPDB2
- We have a valid backup of the whole database
- A recovery catalog is not used

Now, connected to PDB2, let's drop the tablespace after creating a restore point.

SQL> show con_name;

CON_NAME
------------------------------
PDB2

SQL> create restore point myrestpoint;

Restore point created.

SQL>
SQL> drop tablespace mytabpdb2 including contents and datafiles;

Tablespace dropped.

SQL>

And now let’s perform a PITR to the restore point myrestpoint

1- Connect to the root container

[oracle@oraadserver ~]$ rman target /

Recovery Manager: Release 19.0.0.0.0 - Production on Fri Sep 20 13:07:07 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1546409981)

RMAN>

2- Close the PDB

RMAN> ALTER PLUGGABLE DATABASE PDB2 close;

using target database control file instead of recovery catalog
Statement processed

RMAN>

3- Do the PITR

RMAN> run
{
  SET UNTIL RESTORE POINT myrestpoint;
   RESTORE PLUGGABLE DATABASE pdb2;
   RECOVER PLUGGABLE DATABASE pdb2;
}2> 3> 4> 5> 6>

executing command: SET until clause

Starting restore at 20-SEP-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=54 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00013 to /u01/app/oracle/oradata/ORCL/pdb2/system01.dbf
channel ORA_DISK_1: restoring datafile 00014 to /u01/app/oracle/oradata/ORCL/pdb2/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00015 to /u01/app/oracle/oradata/ORCL/pdb2/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00016 to /u01/app/oracle/oradata/ORCL/pdb2/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/ORCL/92359E387C754644E0531502A8C02C00/backupset/2019_09_20/o1_mf_nnndf_TAG20190920T141945_gr9jzry9_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/ORCL/92359E387C754644E0531502A8C02C00/backupset/2019_09_20/o1_mf_nnndf_TAG20190920T141945_gr9jzry9_.bkp tag=TAG20190920T141945
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
Finished restore at 20-SEP-19

Starting recover at 20-SEP-19
current log archived
using channel ORA_DISK_1


starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 20-SEP-19

RMAN>

4- Open the PDB on resetlogs mode

RMAN> alter pluggable DATABASE  pdb2 open resetlogs;

Statement processed

RMAN>

I did not get any error from RMAN, but when looking at the alert log file, I found the following errors:

PDB2(4):Pluggable database PDB2 dictionary check beginning
PDB2(4):Tablespace 'MYTABPDB2' #7 found in data dictionary,
PDB2(4):but not in the controlfile. Adding to controlfile.
PDB2(4):File #25 found in data dictionary but not in controlfile.
PDB2(4):Creating OFFLINE file 'MISSING00025' in the controlfile.
PDB2(4):Pluggable Database PDB2 Dictionary check complete
PDB2(4):Database Characterset for PDB2 is AL32UTF8

It seems there is an issue with the recovery of the MYTABPDB2 tablespace. Connected to PDB2, I can see:

SQL> select FILE_NAME,TABLESPACE_NAME from dba_data_files where TABLESPACE_NAME='MYTABPDB2';

FILE_NAME
--------------------------------------------------------------------------------
TABLESPACE_NAME
------------------------------
/u01/app/oracle/product/19.0.0/dbhome_3/dbs/MISSING00025
MYTABPDB2

The tablespace was not recovered as expected.
What happened? In fact, this issue is expected according to Doc ID 2435452.1, where we can find:
If the point in time recovery of the pluggable database is performed without the catalog, then it is expected to fail

As we are not using a recovery catalog, backup information is stored in the control file, and it seems that the current control file is no longer aware of data file 25.
As specified in the document, we have to use a recovery catalog.

Now let's connect to a catalog and perform the same PITR again.
After connecting to the catalog, we take a full backup. Then we drop the tablespace and run the same recovery command again while connected to the catalog, using a time just before the tablespace was dropped.
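For completeness, the catalog registration and the full backup (not shown in the transcript) could look like this minimal sketch, assuming the catalog schema already exists:

[oracle@oraadserver ~]$ rman target / catalog rman/rman@rmancat
RMAN> REGISTER DATABASE;
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;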

[oracle@oraadserver trace]$ rman catalog rman/rman@rmancat

Recovery Manager: Release 19.0.0.0.0 - Production on Fri Sep 20 15:28:29 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to recovery catalog database

RMAN> connect target /

connected to target database: ORCL (DBID=1546409981)

After closing PDB2, we run the following block:

RMAN> run
{
  SET UNTIL TIME "to_date('20-SEP-2019 15:27:00','DD-MON-YYYY HH24:MI:SS')";
   RESTORE PLUGGABLE DATABASE pdb2;
   RECOVER PLUGGABLE DATABASE pdb2;
}
2> 3> 4> 5> 6>
executing command: SET until clause

Starting restore at 20-SEP-19
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00013 to /u01/app/oracle/oradata/ORCL/pdb2/system01.dbf
channel ORA_DISK_1: restoring datafile 00014 to /u01/app/oracle/oradata/ORCL/pdb2/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00015 to /u01/app/oracle/oradata/ORCL/pdb2/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00016 to /u01/app/oracle/oradata/ORCL/pdb2/users01.dbf
channel ORA_DISK_1: restoring datafile 00026 to /u01/app/oracle/oradata/ORCL/pdb2/mytabpdb201.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/ORCL/92359E387C754644E0531502A8C02C00/backupset/2019_09_20/o1_mf_nnndf_TAG20190920T152554_gr9nws0x_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/ORCL/92359E387C754644E0531502A8C02C00/backupset/2019_09_20/o1_mf_nnndf_TAG20190920T152554_gr9nws0x_.bkp tag=TAG20190920T152554
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15

datafile 26 switched to datafile copy
input datafile copy RECID=5 STAMP=1019489668 file name=/u01/app/oracle/oradata/ORCL/pdb2/mytabpdb201.dbf
Finished restore at 20-SEP-19
starting full resync of recovery catalog
full resync complete

Starting recover at 20-SEP-19
using channel ORA_DISK_1


starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 20-SEP-19

RMAN>

We then open PDB2 in resetlogs mode and verify with SQL*Plus:

SQL> select FILE_NAME,TABLESPACE_NAME from dba_data_files where TABLESPACE_NAME='MYTABPDB2';

FILE_NAME
--------------------------------------------------------------------------------
TABLESPACE_NAME
------------------------------
/u01/app/oracle/oradata/ORCL/pdb2/mytabpdb201.dbf
MYTABPDB2


SQL>

And this time the PITR works fine. The tablespace was restored.

Conclusion

As seen in this blog, it is recommended to use a recovery catalog when performing PITR operations in a multitenant environment.

This article Oracle 19c : Point-In-Time Recovery in a PDB appeared first on Blog dbi services.

Rolling Upgrade of a Galera Cluster with ClusterControl


Rolling Upgrade is easy

In this blog, I will show you how easy it is with ClusterControl to perform a Galera cluster "Rolling Upgrade" without any loss of service.
Let's say we want to upgrade from Percona XtraDB Cluster version 5.6 to 5.7.
The same procedure can be used to upgrade MariaDB (from 10.1 to 10.2 or 10.3).

Prerequisites

First of all, make sure that all nodes of your Galera cluster are synchronized.
From the Dashboard, on the Overview tab, in the Galera Nodes window, all 3 figures in the "Last Committed" column must be identical.
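If you prefer to check this directly from MySQL on each node, the standard Galera status variables can be queried; a short sketch:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_last_committed';
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
The last one should report 'Synced' on every node.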

Then disable the "cluster & node auto-recovery" from the GUI, either by clicking on both icons until they turn red, or by temporarily setting the 2 following parameters on the ClusterControl server in the /etc/cmon.d/cmon_N.cnf file (N stands for the Cluster ID):
- enable_cluster_autorecovery=0
- enable_node_autorecovery=0
Don't forget to restart the cmon service:
systemctl restart cmon
This is crucial, otherwise ClusterControl will try to restart the Galera node every time you stop it during the upgrade process.

Now we have to put the first Galera node in maintenance mode for one hour.


The cluster status bar should now look as follows.

Cluster Upgrade

Log on to the first node using your favorite terminal emulator ("putty" or "MobaXterm"), open 2 sessions and stop the Percona service on this node.
# service mysql status
SUCCESS! MySQL (Percona XtraDB Cluster) running (19698)
# service mysql stop
Shutting down MySQL (Percona XtraDB Cluster).. SUCCESS!
# service mysql status
ERROR! MySQL (Percona XtraDB Cluster) is not running

Now remove the existing Percona XtraDB Cluster and Percona XtraBackup packages:
#[root@node1 yum.repos.d]# yum remove percona-xtrabackup* Percona-XtraDB-Cluster*
Loaded plugins: fastestmirror, ovl
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package Percona-XtraDB-Cluster-56.x86_64 1:5.6.44-28.34.1.el6 will be erased
---> Package Percona-XtraDB-Cluster-client-56.x86_64 1:5.6.44-28.34.1.el6 will be erased
---> Package Percona-XtraDB-Cluster-galera-3.x86_64 0:3.34-1.el6 will be erased
---> Package Percona-XtraDB-Cluster-server-56.x86_64 1:5.6.44-28.34.1.el6 will be erased
---> Package Percona-XtraDB-Cluster-shared-56.x86_64 1:5.6.44-28.34.1.el6 will be erased
---> Package percona-xtrabackup.x86_64 0:2.3.10-1.el6 will be erased
--> Finished Dependency Resolution
Removed:
Percona-XtraDB-Cluster-56.x86_64 1:5.6.44-28.34.1.el6 Percona-XtraDB-Cluster-client-56.x86_64 1:5.6.44-28.34.1.el6 Percona-XtraDB-Cluster-galera-3.x86_64 0:3.34-1.el6
Percona-XtraDB-Cluster-server-56.x86_64 1:5.6.44-28.34.1.el6 Percona-XtraDB-Cluster-shared-56.x86_64 1:5.6.44-28.34.1.el6 percona-xtrabackup.x86_64 0:2.3.10-1.el6
Complete!

Install the new packages
#[root@node1 yum.repos.d]# yum install Percona-XtraDB-Cluster-57
Resolving Dependencies
--> Running transaction check
---> Package Percona-XtraDB-Cluster-57.x86_64 0:5.7.26-31.37.1.el6 will be installed
--> Processing Dependency: Percona-XtraDB-Cluster-client-57 = 5.7.26-31.37.1.el6 for package: Percona-XtraDB-Cluster-57-5.7.26-31.37.1.el6.x86_64
--> Processing Dependency: Percona-XtraDB-Cluster-server-57 = 5.7.26-31.37.1.el6 for package: Percona-XtraDB-Cluster-57-5.7.26-31.37.1.el6.x86_64
--> Running transaction check
---> Package Percona-XtraDB-Cluster-client-57.x86_64 0:5.7.26-31.37.1.el6 will be installed
---> Package Percona-XtraDB-Cluster-server-57.x86_64 0:5.7.26-31.37.1.el6 will be installed
--> Processing Dependency: Percona-XtraDB-Cluster-shared-57 = 5.7.26-31.37.1.el6 for package: Percona-XtraDB-Cluster-server-57-5.7.26-31.37.1.el6.x86_64
--> Processing Dependency: percona-xtrabackup-24 >= 2.4.12 for package: Percona-XtraDB-Cluster-server-57-5.7.26-31.37.1.el6.x86_64
--> Processing Dependency: qpress for package: Percona-XtraDB-Cluster-server-57-5.7.26-31.37.1.el6.x86_64
--> Running transaction check
---> Package Percona-XtraDB-Cluster-shared-57.x86_64 0:5.7.26-31.37.1.el6 will be installed
---> Package percona-xtrabackup-24.x86_64 0:2.4.15-1.el6 will be installed
---> Package qpress.x86_64 0:11-1.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
Installed:
Percona-XtraDB-Cluster-57.x86_64 0:5.7.26-31.37.1.el6
Dependency Installed:
Percona-XtraDB-Cluster-client-57.x86_64 0:5.7.26-31.37.1.el6 Percona-XtraDB-Cluster-server-57.x86_64 0:5.7.26-31.37.1.el6
Percona-XtraDB-Cluster-shared-57.x86_64 0:5.7.26-31.37.1.el6 percona-xtrabackup-24.x86_64 0:2.4.15-1.el6
qpress.x86_64 0:11-1.el6
Complete!

Start the node outside the cluster (in standalone mode) by setting the wsrep_provider variable to none.
$ ps -edf|grep -i mysql
$ mysqld --user=mysql --wsrep-provider='none'

Now run mysql_upgrade in the second session:
[root@node1 /]# mysql_upgrade -u root -p
Enter password:
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
mysql.db OK
mysql.engine_cost OK
mysql.user OK
Upgrading the sys schema.
Checking databases.
sys.sys_config OK
Upgrade process completed successfully.
Checking if update is needed

When the upgrade is over, stop the mysqld process.
You can either kill the mysqld process ID or use mysqladmin shutdown with the MySQL root user credentials.
$ mysqladmin shutdown -uroot -p
Now, from the first session, you can restart the upgraded node so that it rejoins the Galera cluster.
$ service mysql start
Starting Percona-Xtradb-server.
190612 13:04:33 mysqld_safe Logging to '/var/log/mysql/mysqld.log'.
190612 13:04:33 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
.. SUCCESS!

Post-exec tasks

From the GUI, disable the maintenance mode and check the new version by logging in to the instance:
[root@node1 /]# mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 64
Server version: 5.7.27-30-57 Percona XtraDB Cluster (GPL), Release rel30, Revision 64987d4, WSREP version 31.39, wsrep_31.39
Once the first node is upgraded, you can repeat exactly the same procedure for all the other nodes of the cluster. At the end, ClusterControl should display the same version for all nodes.

Conclusion

Rolling upgrade of a Galera cluster with ClusterControl is really easy and fast, with little or no impact on the service.

This article Rolling Upgrade of a Galera Cluster with ClusterControl appeared first on Blog dbi services.

Managing Licenses with AWS License Manager


Introduction

Computing environments have become more and more agile over the last few years. Companies need to provide solutions that help people quickly set up new resources, start and stop them, scale them according to need and, finally, remove them. In such environments, it can be tricky to track license compliance when resources change on an hourly basis.

Having a look at AWS services, I saw that AWS provides a license management tool named "AWS License Manager". I took a few minutes in order to:

  • Understand which resources this service is able to monitor
  • Understand how it works
  • Test it with an on-premises Linux server running an Oracle database

License Manager Service

The first step in order to use License Manager is to select it in the list of AWS Services.

AWS Services List

After having clicked on AWS License Manager, the AWS License Manager window will appear.

"<yoastmark

Now, we simply have to create a license configuration with the required license terms according to the software vendor. You can set up different kinds of metrics such as:

  • vCPUs
  • Cores
  • Sockets
  • Instances

License Manager also provides the possibility to enforce a license limit, meaning that it prevents further license usage once the available licenses are exhausted.

AWS Create License configuration

In the context of on-premises license monitoring, it is important to note that the Socket and Core license types are not accepted. Therefore, in this example I used vCPUs.

"<yoastmark

Error while trying to associate Socket License to an on-premise host
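For reference, the same kind of license configuration can also be created from the AWS CLI; a minimal sketch (the configuration name is a placeholder):

aws license-manager create-license-configuration \
    --name "oracle-db-vcpu" \
    --license-counting-type vCPU \
    --license-count 1 \
    --license-count-hard-limit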

AWS Systems Manager

Once the license configuration is created, it is now necessary to use another AWS service, AWS Systems Manager. This service allows you to view and control your infrastructure on AWS. AWS Systems Manager lets you view and control not only your Amazon EC2 instances, but also on-premises servers and virtual machines (including VMs in other cloud environments). Some Systems Manager capabilities are not free; however, in the context of this example everything is free.

 

AWS Systems Manager Agent (SSM Agent)

In order to benefit from AWS Systems Manager, we need to install the AWS Systems Manager Agent (SSM Agent) on our on-premises host. SSM Agent is Amazon software that can be installed and configured on an Amazon EC2 instance, an on-premises server, or a virtual machine (VM) and provides a solution to update, manage, and configure resources. SSM Agent is installed by default on instances created from Windows Server 2016 and Windows Server 2019, Amazon Linux and Ubuntu Server AMIs. However, if you are running an on-premises server, you need to install it yourself. The process is really straightforward, as presented below.

[root@vmrefdba01 ~]# mkdir /tmp/ssm
[root@vmrefdba01 ~]# curl https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm -o /tmp/ssm/amazon-ssm-agent.rpm
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18.9M  100 18.9M    0     0  3325k      0  0:00:05  0:00:05 --:--:-- 4368k
[root@vmrefdba01 ~]# sudo yum install -y /tmp/ssm/amazon-ssm-agent.rpm
Loaded plugins: refresh-packagekit, ulninfo
Setting up Install Process
Examining /tmp/ssm/amazon-ssm-agent.rpm: amazon-ssm-agent-2.3.707.0-1.x86_64
Marking /tmp/ssm/amazon-ssm-agent.rpm to be installed
public_ol6_UEK_latest                                    | 2.5 kB     00:00
public_ol6_UEK_latest/primary_db                         |  64 MB     00:07
public_ol6_latest                                        | 2.7 kB     00:00
public_ol6_latest/primary_db                             |  18 MB     00:07
Resolving Dependencies
--> Running transaction check
---> Package amazon-ssm-agent.x86_64 0:2.3.707.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

======================================================================================================================================
 Package                            Arch                     Version                        Repository                           Size
======================================================================================================================================
Installing:
 amazon-ssm-agent                   x86_64                   2.3.707.0-1                    /amazon-ssm-agent                    61 M

Transaction Summary
======================================================================================================================================
Install       1 Package(s)

Total size: 61 M
Installed size: 61 M
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : amazon-ssm-agent-2.3.707.0-1.x86_64                                                                                1/1
amazon-ssm-agent start/running, process 3896
  Verifying  : amazon-ssm-agent-2.3.707.0-1.x86_64                                                                                1/1

Installed:
  amazon-ssm-agent.x86_64 0:2.3.707.0-1

Complete!

 

Creating an activation

Once the agent is installed, we have to create a new "Activation" in the AWS Systems Manager service by clicking on "Create activation". At the end of the creation, you will get an Activation Code and an Activation ID (in the green field below). You have to keep this information for the agent configuration.

AWS Systems Manager Activation
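The same activation can also be created from the AWS CLI; a minimal sketch (the instance name and the IAM service role name are assumptions, the role must already exist and be trusted by SSM):

aws ssm create-activation \
    --default-instance-name "onprem-linux" \
    --iam-role "SSMServiceRole" \
    --registration-limit 1 \
    --region us-east-2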

Agent Configuration

In order to register your on-premises instance with AWS, you simply have to execute the following command with the Activation Code and Activation ID provided by AWS Systems Manager:

sudo amazon-ssm-agent -register -code "<Activation Code>" -id "<Activation ID>" -region "us-east-2"

2019-09-19 13:53:05 INFO Successfully registered the instance with AWS SSM using Managed instance-id: mi-0756a9f0dc25be3cd

 

Once registered, the Managed Instance should appear in AWS Systems Manager as presented below.

AWS Systems Manager – Managed Instances

The platform type is detected, as well as the kernel version, IP address and computer name. AWS Systems Manager also provides a package inventory and many other kinds of inventory, such as network inventory, file inventory, and so on.

AWS Systems Manager – Application Inventory

 

Association between License Configuration and Resource ID

We now have to make the link between the Managed Instance (resource) and the license configuration. The goal, of course, is to define which license configuration will be applied to which resource. In order to proceed, we go into AWS License Manager and select "Search Inventory" in the menu. Then we simply select the resource and click on "Associate license configuration".

"<yoastmark

AWS License Manager – Search Inventory

The following window will appear, allowing you to define which license configuration matches which resource:

"<yoastmark

Having a look at the AWS License Manager dashboard, you can see that 1 out of 1 license is consumed, since I dedicated one vCPU to my virtual machine and assigned a 1-vCPU license to this instance.

"<yoastmark

AWS License Manager – Dashboard

Core Messages

  • AWS License Manager offers more functionalities for EC2 instances than for on-premises servers.
  • AWS License Manager offers functionalities to monitor sockets, vCPUs, cores and instances.
  • AWS License Manager definitely helps to manage licenses, but doesn't fit all requirements and license models.
  • AWS Systems Manager is a powerful tool providing several functionalities.

Strengths

  • AWS License Manager is free.
  • AWS License Manager offers the possibility to monitor on-premises resources.
  • AWS License Manager provides a solution to prevent an instance from running if license compliance is not met.
  • AWS License Manager and AWS Systems Manager are straightforward to install and configure.
  • AWS License Manager and AWS Systems Manager offer good documentation.
  • AWS Systems Manager offers many free functionalities (Patch Manager, Session Manager, Insights Dashboard, and so on).
  • AWS Systems Manager offers many functionalities and is the basis of several other AWS tools, such as AWS Config, which allows you to monitor instance compliance.

Weaknesses

  • AWS License Manager is not able, by default, to monitor option usage such as Oracle database options (Partitioning, Active Data Guard, and so on).
  • AWS License Manager is not able to calculate Oracle Processors, meaning it does not take core factors into consideration.
  • AWS Systems Manager is not able to monitor sockets or cores on on-premises resources, only vCPUs.

This article Managing Licenses with AWS License Manager appeared first on Blog dbi services.

Red Hat Enterprise Linux 8 – Stratis


The initial release (8.0.0) of Red Hat Enterprise Linux 8 has been available since May 2019.
I've already blogged about one of its new features (AppStream) during the Beta. In this post I will present Stratis, a new local storage-management solution available in RHEL 8.

Introduction

LVM, fdisk, ext*, XFS… there are plenty of terms, tools and technologies available for managing disks and file systems on a Linux server. Generally speaking, setting up the initial storage configuration is not so difficult, but when it comes to managing this storage (most of the time, extending it), that's where things can get a bit more complicated.
The goal of Stratis is to provide an easy way to work with local storage, from the initial setup to the use of more advanced features.
Like Btrfs or ZFS, Stratis is a "volume-managing filesystem" (VMF). A VMF's particularity is that it merges the volume-management and filesystem layers into one, using the concept of a "pool" of storage created from one or more block devices.

Stratis is implemented as a userspace daemon used to configure and monitor the existing components:
[root@rhel8 ~]# ps -ef | grep stratis
root 591 1 0 15:31 ? 00:00:00 /usr/libexec/stratisd --debug
[root@rhel8 ~]#

To interact with the daemon, a CLI is available (stratis-cli):
[root@rhel8 ~]# stratis --help
usage: stratis [-h] [--version] [--propagate] {pool,blockdev,filesystem,fs,daemon} ...
Stratis Storage Manager
optional arguments:
-h,              --help show this help message and exit
--version        show program's version number and exit
--propagate      Allow exceptions to propagate
subcommands:
{pool,blockdev,filesystem,fs,daemon}
pool             Perform General Pool Actions
blockdev         Commands related to block devices that make up the pool
filesystem (fs)  Commands related to filesystems allocated from a pool
daemon           Stratis daemon information
[root@rhel8 ~]#

Among the Stratis features we can mention:
> Thin provisioning
> Filesystem snapshots
> Data integrity check
> Data caching (cache tier)
> Data redundancy (raid1, raid5, raid6 or raid10)
> Encryption

Stratis is only 2 years old and the current version is 1.0.3. Therefore, certain features are not yet available, such as redundancy for example:
[root@rhel8 ~]# stratis daemon redundancy
NONE: 0
[root@rhel8 ~]#

Architecture

Stratis architecture is composed of 3 layers:
Block device
A blockdev is the storage used to make up the pool. That could be:
> Hard drives / SSDs
> iSCSI
> mdraid
> Device Mapper Multipath
> …

Pool
A pool is a set of Block devices.

Filesystem
Filesystems are created from the pool. Stratis supports up to 2^4 filesystems per pool. Currently you can only create XFS filesystems on top of a pool.

Let’s try…

I have a new empty 5G disk on my system. This is the blockdev I want to use:
[root@rhel8 ~]# lsblk /dev/sdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb    8:16   0   5G  0 disk
[root@rhel8 ~]#

I create a pool composed of this single blockdev…
[root@rhel8 ~]# stratis pool create pool01 /dev/sdb

…and verify :
[root@rhel8 ~]# stratis pool list
Name    Total Physical Size Total Physical Used
pool01                5 GiB              52 MiB
[root@rhel8 ~]#

On top of this pool I create an XFS filesystem called "data"…
[root@rhel8 ~]# stratis fs create pool01 data
[root@rhel8 ~]# stratis fs list
Pool Name   Name        Used       Created             Device                      UUID
pool01      data        546 MiB   Sep 04 2019 16:50   /stratis/pool01/data        dc08f87a2e5a413d843f08728060a890
[root@rhel8 ~]#

…and mount it on the /data directory:
[root@rhel8 ~]# mkdir /data
[root@rhel8 ~]# mount /stratis/pool01/data /data
[root@rhel8 ~]# df -h /data
Filesystem                                                                                      Size Used Avail Use% Mounted on
/dev/mapper/stratis-1-8fccad302b854fb7936d996f6fdc298c-thin-fs-f3b16f169e8645f6ac1d121929dbb02e 1.0T 7.2G 1017G 1%   /data
[root@rhel8 ~]#

Here the 'df' command reports the used and free sizes as seen and reported by XFS. In fact this is the thin device:
[root@rhel8 ~]# lsblk /dev/mapper/stratis-1-8fccad302b854fb7936d996f6fdc298c-thin-fs-f3b16f169e8645f6ac1d121929dbb02e
NAME                                                                                           MAJ:MIN  RM  SIZE  RO  TYPE     MOUNTPOINT
/dev/mapper/stratis-1-8fccad302b854fb7936d996f6fdc298c-thin-fs-f3b16f169e8645f6ac1d121929dbb02e  253:7   0    1T   0  stratis  /data
[root@rhel8 ~]#

This is not very meaningful, because the real storage usage is lower due to thin provisioning, and also because Stratis will automatically grow the filesystem if it nears XFS's currently sized capacity.
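To make this mount persistent across reboots, an /etc/fstab entry like the following sketch can be used; the x-systemd.requires option ensures stratisd is started before the mount is attempted (Red Hat also suggests mounting by UUID):

/stratis/pool01/data   /data   xfs   defaults,x-systemd.requires=stratisd.service   0 0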

Let’s extend the pool with a new disk of 1G…
[root@rhel8 ~]# lsblk /dev/sdc
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdc    8:32   0   1G  0 disk
[root@rhel8 ~]#
[root@rhel8 ~]# stratis pool add-data pool01 /dev/sdc

…and check :
[root@rhel8 ~]# stratis blockdev
Pool Name  Device Node  Physical Size   State  Tier
pool01     /dev/sdb              5 GiB  In-use  Data
pool01     /dev/sdc              1 GiB  In-use  Data
[root@rhel8 pool01]# stratis pool list
Name   Total Physical Size    Total Physical Used
pool01                6 GiB                602 MiB
[root@rhel8 ~]#

A nice feature of Stratis is the possibility to duplicate a filesystem with a snapshot.
For this test I create a new file on the "data" filesystem:
[root@rhel8 ~]# touch /data/new_file
[root@rhel8 ~]# ls -l /data
total 0
-rw-r--r--. 1 root root 0 Sep 4 20:43 new_file
[root@rhel8 ~]#

The operation is straightforward:
[root@rhel8 ~]# stratis fs snapshot pool01 data data_snap
[root@rhel8 ~]#

You can notice that Stratis doesn't make a difference between a filesystem and a snapshot filesystem. They are the same kind of "object":
[root@rhel8 ~]# stratis fs list
Pool Name   Name        Used       Created             Device                      UUID
pool01      data        546 MiB   Sep 04 2019 16:50   /stratis/pool01/data        dc08f87a2e5a413d843f08728060a890
pool01      data_snap   546 MiB   Sep 04 2019 16:57   /stratis/pool01/data_snap   a2c45e9a15e74664bab5de992fa884f7
[root@rhel8 ~]#

I can now mount the new Filesystem…
[root@rhel8 ~]# mkdir /data_snap
[root@rhel8 ~]# mount /stratis/pool01/data_snap /data_snap
[root@rhel8 ~]# df -h /data_snap
Filesystem                                                                                       Size  Used  Avail  Use%  Mounted on
/dev/mapper/stratis-1-8fccad302b854fb7936d996f6fdc298c-thin-fs-a2c45e9a15e74664bab5de992fa884f7  1.0T  7.2G  1017G  1%    /data_snap
[root@rhel8 ~]#

…and check that my test file is here :
[root@rhel8 ~]# ls -l /data_snap
total 0
-rw-r--r--. 1 root root 0 Sep 4 20:43 new_file
[root@rhel8 ~]#

Nice! But… can I snapshot a filesystem "online", meaning while data is being written to it?
Let's create another snapshot from one session while a second session is writing to the /data filesystem.
From session 1 :
[root@rhel8 ~]# stratis fs snapshot pool01 data data_snap2

And from session 2, at the same time:
[root@rhel8 ~]# dd if=/dev/zero of=/data/bigfile.txt bs=4k iflag=fullblock,count_bytes count=4G

Once done, the new filesystem is present…
[root@rhel8 ~]# stratis fs list
Pool Name Name Used Created Device UUID
pool01 data_snap2 5.11 GiB Sep 27 2019 11:19 /stratis/pool01/data_snap2 82b649724a0b45a78ef7092762378ad8

…and I can mount it :
[root@rhel8 ~]# mkdir /data_snap2
[root@rhel8 ~]# mount /stratis/pool01/data_snap2 /data_snap2
[root@rhel8 ~]#

But the file inside seems to have changed (corrupted):
[root@rhel8 ~]# md5sum /data/bigfile.txt /data_snap2/bigfile.txt
c9a5a6878d97b48cc965c1e41859f034 /data/bigfile.txt
cde91bbaa4b3355bc04f611405ae4430 /data_snap2/bigfile.txt
[root@rhel8 ~]#

So, the answer is no. Stratis is not able to duplicate a filesystem consistently online (at least for the moment). Thus I would strongly recommend unmounting the filesystem before creating a snapshot.
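If unmounting is not an option, an alternative worth testing is to quiesce the filesystem with xfs_freeze around the snapshot; a sketch:

[root@rhel8 ~]# xfs_freeze -f /data
[root@rhel8 ~]# stratis fs snapshot pool01 data data_snap3
[root@rhel8 ~]# xfs_freeze -u /data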

Conclusion

Stratis is an easy-to-use tool for managing local storage on a RHEL 8 server. But due to its immaturity, I would not recommend using it in a production environment yet. Moreover, some interesting features like RAID management or data integrity checking are not available for the moment, but I'm quite sure that the tool will evolve quickly!

If you want to know more, everything is here.
Enjoy testing Stratis and stay tuned to follow its evolution…

This article Red Hat Enterprise Linux 8 – Stratis appeared first on Blog dbi services.

Using non-root SQL Server containers on Docker and K8s


This is something I had been waiting for a while, in fact since SQL Server 2017… and the news came out on Wednesday, September 9th 2019. Running non-root SQL Server containers is now possible on the next version of SQL Server (2019), and it has been backported to SQL Server 2017 as well. Non-root SQL Server containers will likely be one of the hidden gems among the SQL Server new features, but this is definitely good news for me because it will facilitate the adoption of SQL Server containers in production from a security standpoint.

At this stage, there is no need to explain why it is not a best practice to run SQL Server containers, or applications more generally, with root privileges within a container. For further information, I invite you to take a look at the different threats implied by such a configuration with your Google-fu.

Let's start with Docker environments. First, Microsoft provides a Dockerfile to build an image for either SQL Server 2017 or SQL Server 2019. We may notice the Dockerfile is based on the official SQL Server Docker image and performs some extra configuration for non-root capabilities. Here is the interesting part:

# Example of creating a SQL Server 2019 container image that will run as a user 'mssql' instead of root
# This example is based on the official image from Microsoft and effectively changes the user that SQL Server runs as
# and allows for dumps to generate as a non-root user


FROM mcr.microsoft.com/mssql/server:2019-latest

# Create non-root user and update permissions
#
RUN useradd -M -s /bin/bash -u 10001 -g 0 mssql
RUN mkdir -p -m 770 /var/opt/mssql && chgrp -R 0 /var/opt/mssql

# Grant sql the permissions to connect to ports <1024 as a non-root user
#
RUN setcap 'cap_net_bind_service+ep' /opt/mssql/bin/sqlservr

# Allow dumps from the non-root process
# 
RUN setcap 'cap_sys_ptrace+ep' /opt/mssql/bin/paldumper
RUN setcap 'cap_sys_ptrace+ep' /usr/bin/gdb

# Add an ldconfig file because setcap causes the os to remove LD_LIBRARY_PATH
# and other env variables that control dynamic linking
#
RUN mkdir -p /etc/ld.so.conf.d && touch /etc/ld.so.conf.d/mssql.conf
RUN echo -e "# mssql libs\n/opt/mssql/lib" >> /etc/ld.so.conf.d/mssql.conf
RUN ldconfig

USER mssql
CMD ["/opt/mssql/bin/sqlservr"]

 

Note the different sections where the mssql user is created and then used when running the image. So, the new image specification implies running the sqlservr process as this mssql user, as shown below:

$ docker exec -ti sql19 top

 

The user process is well identified by its name because it is already defined in the /etc/passwd file within the container namespace:

$ docker exec -ti sql19 cat /etc/passwd | grep mssql
mssql:x:10001:0::/home/mssql:/bin/bash

 

Let's go ahead and talk about persisting SQL Server database files on external storage. In this case, we need to refer to the Microsoft documentation to configure volumes and the underlying storage permissions according to the scenario we have to deal with.

If you don't specify any user (and group) when spinning up the container, the sqlservr process will run with the identity of the mssql user created inside the container and as part of the root group. The underlying host filesystem must be configured accordingly (either a user with the same UID = 10001 or the root group GID = 0). Otherwise, chances are you will experience permission issues with the following error message:

SQL Server 2019 will run as non-root by default.
This container is running as user mssql.
To learn more visit https://go.microsoft.com/fwlink/?linkid=2099216.
/opt/mssql/bin/sqlservr: Error: Directory [/var/opt/mssql/system/] could not be created.  Errno [13]

 

If you want to run the container with a custom user and group created on your own, you must be aware of the different database file placement scenarios. The first one consists in using the default configuration, with all the SQL Server logs, data and transaction log files under the /var/opt/mssql path. In this case, your custom user's UID and GID can be part of the security context of the folder hierarchy on the host as follows:

$ ls -l | grep sqlserver
drwxrwx---. 6 mssql mssql 59 Sep 27 19:08 sqlserver

$ id mssql
uid=1100(mssql) gid=1100(mssql) groups=1100(mssql),100(users)

 

The docker command below specifies the UID and GID of my custom user through the -u parameter:

docker run -d \
 --name sql19 \
 -u $(id -u mssql):$(id -g mssql) \
 -e "MSSQL_PID=Developer" \
 -e "ACCEPT_EULA=Y" \
 -e "SA_PASSWORD=Password1" \
 -e "MSSQL_AGENT_ENABLED=True" \
 -e "MSSQL_LCID=1033" \
 -e "MSSQL_MEMORY_LIMIT_MB=2048" \
 -v "/u00/sqlserver:/var/opt/mssql" \
 -p 1451:1433 -d 2019-latest-non-root

 

Note that the username is missing and is replaced by the UID of the mssql user I created on my own.

This is normal behavior because my user is not known within the container namespace. There is no record for my user with UID = 1100. The system only knows the mssql user with UID = 10001, as shown below:

I have no name!@e698c3db2180:/$ whoami
whoami: cannot find name for user ID 1100

$ cat /etc/passwd | grep mssql | cut -d":" -f 1,3
mssql:10001

 

Out of curiosity, we may wonder how SQL Server chooses the correct user for the sqlservr process. Indeed, I created two users with the same name but with different UIDs, and after some investigation I think that taking a look at the uid_entrypoint definition in the microsoft/mssql-docker GitHub project can help understand this behavior:

If we don't specify the UID/GID during the container's creation, the whoami command will fail and the mssql user's UID defined in the Dockerfile (cf. USER mssql) will be used.

The second scenario consists in applying some SQL Server best practices in terms of database file placement. In a previous blog post, I wrote about a possible implementation based on a flexible architecture for SQL Server on Linux, which may also fit containers. In this case, database files will be stored outside of the default /var/opt/mssql path, and the non-root container then has the restriction that it must run as part of the root group, as mentioned in the Microsoft documentation:

The non-root container has the restriction that it must run as part of the root group unless a volume is mounted to '/var/opt/mssql' that the non-root user can access. The root group doesn’t grant any extra root permissions to the non-root user.

 

Here is my implementation of the flexible architecture template with the required Linux permissions in my context:

$ ls -ld /u[0-9]*/sql*2/
drwxrwx---. 2 mssql root    6 Sep 24 22:02 /u00/sqlserver2/
drwxrwx---. 2 mssql root 4096 Sep 27 14:20 /u01/sqlserverdata2/
drwxrwx---. 2 mssql root   25 Sep 27 14:20 /u02/sqlserverlog2/
drwxrwx---. 2 mssql root    6 Sep 24 22:04 /u03/sqlservertempdb2/
drwxrwx---. 2 mssql root    6 Sep 27 10:09 /u98/sqlserver2/

 

… with:

  • /u00/sqlserver2 (binaries structure that will contain remaining files in /var/opt/mssql path)
  • /u01/sqlserverdata2 (data files including user, system and tempdb databases)
  • /u02/sqlserverlog2 (transaction log files)
  • /u98/sqlserver2 (database backups)

And accordingly, my docker command and parameters to start my SQL Server container that will sit on my flexible architecture:

docker run -d \
 --name sql19 \
 -u $(id -u mssql):0 \
 -e "MSSQL_PID=Developer" \
 -e "ACCEPT_EULA=Y" \
 -e "SA_PASSWORD=Password1" \
 -e "MSSQL_AGENT_ENABLED=True" \
 -e "MSSQL_LCID=1033" \
 -e "MSSQL_MEMORY_LIMIT_MB=2048" \
 -e "MSSQL_MASTER_DATA_FILE=/u01/sqlserverdata/master.mdf" \
 -e "MSSQL_MASTER_LOG_FILE=/u02/sqlserverlog/mastlog.ldf" \
 -e "MSSQL_DATA_DIR=/u01/sqlserverdata" \
 -e "MSSQL_LOG_DIR=/u02/sqlserverlog" \
 -e "MSSQL_BACKUP_DIR=/u98/sqlserver" \
 -v "/u00/sqlserver2:/var/opt/mssql" \
 -v "/u01/sqlserverdata2:/u01/sqlserverdata" \
 -v "/u02/sqlserverlog2:/u02/sqlserverlog" \
 -v "/u98/sqlserver2:/u98/sqlserver" \
 -p 1451:1433 -d 2019-latest-non-root

 

The mssql user created on my own from the host (with UID = 1100) is used by the sqlservr process:

The system and user database files are placed according to my specification:

master> create database test;
Commands completed successfully.
Time: 0.956s
master> \n ldd %%
+--------+----------------+---------------------------------+-----------+
| DB     | logical_name   | physical_name                   | size_MB   |
|--------+----------------+---------------------------------+-----------|
| master | master         | /u01/sqlserverdata/master.mdf   | 71        |
| master | mastlog        | /u02/sqlserverlog/mastlog.ldf   | 32        |
| tempdb | tempdev        | /u01/sqlserverdata/tempdb.mdf   | 128       |
| tempdb | templog        | /u01/sqlserverdata/templog.ldf  | 128       |
| tempdb | tempdev2       | /u01/sqlserverdata/tempdb2.ndf  | 128       |
| tempdb | tempdev3       | /u01/sqlserverdata/tempdb3.ndf  | 128       |
| tempdb | tempdev4       | /u01/sqlserverdata/tempdb4.ndf  | 128       |
| model  | modeldev       | /u01/sqlserverdata/model.mdf    | 128       |
| model  | modellog       | /u01/sqlserverdata/modellog.ldf | 128       |
| msdb   | MSDBData       | /u01/sqlserverdata/MSDBData.mdf | 236       |
| msdb   | MSDBLog        | /u01/sqlserverdata/MSDBLog.ldf  | 12        |
| test   | test           | /u01/sqlserverdata/test.mdf     | 128       |
| test   | test_log       | /u02/sqlserverlog/test_log.ldf  | 128       |
+--------+----------------+---------------------------------+-----------+

 

I can correlate the above output with the corresponding files persisted on the underlying storage, according to my flexible architecture specification:

$ sudo ls -lR /u[0-9]*/sqlserver*2/
/u00/sqlserver2/:
total 4
drwxrwx---. 2 mssql root 4096 Sep 28 17:39 log
drwxr-xr-x. 2 mssql root   25 Sep 28 17:39 secrets

/u00/sqlserver2/log:
total 428
-rw-r-----. 1 mssql root  10855 Sep 28 17:39 errorlog
-rw-r-----. 1 mssql root  10856 Sep 28 17:37 errorlog.1
-rw-r-----. 1 mssql root      0 Sep 28 17:37 errorlog.2
-rw-r-----. 1 mssql root  77824 Sep 28 17:37 HkEngineEventFile_0_132141586653320000.xel
-rw-r-----. 1 mssql root  77824 Sep 28 17:39 HkEngineEventFile_0_132141587692350000.xel
-rw-r-----. 1 mssql root   2560 Sep 28 17:39 log_1.trc
-rw-r-----. 1 mssql root   2560 Sep 28 17:37 log.trc
-rw-r-----. 1 mssql root   6746 Sep 28 17:37 sqlagent.1
-rw-r-----. 1 mssql root   6746 Sep 28 17:39 sqlagent.out
-rw-r-----. 1 mssql root    114 Sep 28 17:39 sqlagentstartup.log
-rw-r-----. 1 mssql root 106496 Sep 28 17:37 system_health_0_132141586661720000.xel
-rw-r-----. 1 mssql root 122880 Sep 28 17:41 system_health_0_132141587698940000.xel

/u00/sqlserver2/secrets:
total 4
-rw-------. 1 mssql root 44 Sep 28 17:39 machine-key

/u01/sqlserverdata2/:
total 105220
-rw-r-----. 1 mssql root      256 Sep 27 14:20 Entropy.bin
-rw-r-----. 1 mssql root  4653056 Sep 28 17:39 master.mdf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 modellog.ldf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 model.mdf
-rw-r-----. 1 mssql root 14024704 Sep 27 14:20 model_msdbdata.mdf
-rw-r-----. 1 mssql root   524288 Sep 27 14:20 model_msdblog.ldf
-rw-r-----. 1 mssql root   524288 Sep 27 14:20 model_replicatedmaster.ldf
-rw-r-----. 1 mssql root  4653056 Sep 27 14:20 model_replicatedmaster.mdf
-rw-r-----. 1 mssql root 15466496 Sep 28 17:39 msdbdata.mdf
-rw-r-----. 1 mssql root   786432 Sep 28 17:39 msdblog.ldf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 tempdb2.ndf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 tempdb3.ndf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 tempdb4.ndf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 tempdb.mdf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 templog.ldf
-rw-r-----. 1 mssql root  8388608 Sep 28 17:39 test.mdf

/u02/sqlserverlog2/:
total 10240
-rw-r-----. 1 mssql root 2097152 Sep 28 17:39 mastlog.ldf
-rw-r-----. 1 mssql root 8388608 Sep 28 17:39 test_log.ldf

/u03/sqlservertempdb2/:
total 0

/u98/sqlserver2/:
total 0

 

What next? Because in production your containers will run on top of an orchestrator like Kubernetes, the question is how to implement such a privilege restriction in this context. Kubernetes provides security contexts at different levels, including the pod and the container levels. In this blog post example, I applied the security context at the pod level, within the pod template specification.

Let’s set the context. Here is a picture of my K8s environment:

$ kubectl get nodes
NAME                     STATUS   ROLES    AGE   VERSION
k8m.dbi-services.test    Ready    master   97d   v1.14.1
k8n1.dbi-services.test   Ready    <none>   97d   v1.14.1
k8n2.dbi-services.test   Ready    <none>   97d   v1.14.1

 

I used the new local-storage Storage class (available with K8s v.1.14+):

$ kubectl get sc
NAME            PROVISIONER                    AGE
local-storage   kubernetes.io/no-provisioner   4d

$ kubectl describe sc local-storage
Name:                  local-storage
IsDefaultClass:        No
Annotations:           <none>
Provisioner:           kubernetes.io/no-provisioner
Parameters:            <none>
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>

 

I configured a persistent volume based on this local-storage class, pointing to /mnt/localstorage on my k8n1 node. The access mode and the Retain reclaim policy are configured to meet the best practices for databases.

$ cat StoragePV.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
  storageClassName: local-storage
  local:
    path: /mnt/localstorage
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k8n1.dbi-services.test

 

For the sake of simplicity, I applied the default configuration with all SQL Server related files stored in /var/opt/mssql. I configured the underlying storage and folder permissions accordingly with my custom mssql user (UID = 10001) and group (GID = 10001) created on the k8n1 host. Note that the UID matches that of the mssql user created within the container.
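The host-side preparation behind this might look like the following (a minimal sketch based on the UID/GID, path and permissions shown in this post):

# on the k8n1 host: create a matching mssql user/group (UID/GID = 10001) and
# prepare the local path backing the persistent volume
sudo groupadd -g 10001 mssql
sudo useradd -u 10001 -g 10001 mssql
sudo mkdir -p /mnt/localstorage
sudo chown mssql:mssql /mnt/localstorage
sudo chmod 770 /mnt/localstorage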

$ hostname
k8n1.dbi-services.test

$ id mssql
uid=10001(mssql) gid=10001(mssql) groups=10001(mssql)

$ ls -ld /mnt/localstorage/
drwxrwx--- 6 mssql mssql 59 Sep 26 20:57 /mnt/localstorage/

 

My deployment file is as follows. It includes the security context that specifies a non-root configuration with my custom user’s UID / GID created previously (runAsUser and runAsGroup parameters):

$ cat ReplicaSet.yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: mssql-deployment-2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mssql-2
    spec:
      securityContext:
        runAsUser: 10001
        runAsGroup: 10001
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql-2
        image: trow.kube-public:31000/2019-latest-non-root
        ports:
        - containerPort: 1433
        env:
        - name: MSSQL_PID
          value: "Developer"
        - name: ACCEPT_EULA
          value: "Y"
        - name: MSSQL_SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: sql-secrets
              key: sapassword
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data-2
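The deployment above references a persistent volume claim named mssql-data-2. As an illustration only, such a claim against the local-storage class could look like this (a sketch; the claim name and the 5Gi request simply mirror the PV defined earlier):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data-2
spec:
  storageClassName: local-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF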

 

Obviously, if the security permissions on the underlying persistent volume are not correct, you will get an error when provisioning the MSSQL pod, because the sqlservr process does not get the privileges to create or access the SQL Server related files, as shown below:

$ kubectl get pod
NAME                                 READY   STATUS   RESTARTS   AGE
mssql-deployment-2-8b4d7f7b7-x4x8w   0/1     Error    2          30s

$ kubectl logs mssql-deployment-2-8b4d7f7b7-x4x8w
SQL Server 2019 will run as non-root by default.
This container is running as user mssql.
To learn more visit https://go.microsoft.com/fwlink/?linkid=2099216.
/opt/mssql/bin/sqlservr: Error: Directory [/var/opt/mssql/system/] could not be created.  Errno [13]

 

If everything is well configured, your container should run and interact correctly with the corresponding persistent volume, in the security context defined in your YAML specification.
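For instance, a quick sanity check could look like this (a sketch; the pod name is resolved through the app=mssql-2 label used in the deployment):

# the pod should be up and running
kubectl get pod -l app=mssql-2

# the processes inside the pod should run with the UID/GID set in the security context (10001/10001)
POD=$(kubectl get pod -l app=mssql-2 -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it $POD -- id

# and the SQL Server files should be persisted on the local volume of the k8n1 host,
# owned by the host mssql user, e.g.:
# sudo ls -lR /mnt/localstorage/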

All this stuff applies to SQL Server 2017.

See you!

Cet article Using non-root SQL Server containers on Docker and K8s est apparu en premier sur Blog dbi services.


Migrating Oracle database from windows to ODA

I have recently been working on an interesting customer project where I had to migrate Windows Oracle Standard Edition databases to ODA. The ODAs are X7-2M models running version 18.5, which comes with Red Hat Enterprise Linux 6.10 (Santiago). Both the Windows databases and the target ODA databases are running PSU 11.2.0.4.190115, but this method would definitely also work for Oracle 12c and 18c databases. The databases are licensed with Standard Edition, so migrating through Data Guard was not possible. Through this blog I would like to share the experience I gained on this topic, as well as the method and steps I used to successfully migrate those databases.

Limitations

Windows and Linux platforms being on the same endian format, I initially thought that it would not be more complicated than simply duplicating the Windows database to an ODA instance using the last backup. ODA databases are OMF databases, so it could not be easier, as no convert parameter is needed.
After having created a single instance database on the ODA, exported the current database pfile and adapted it for the ODA, and created the needed TNS connections, I ran a single RMAN duplicate command:

RMAN> run {
2> set newname for database to new;
3> duplicate target database to 'ODA_DBNAME' backup location '/u99/app/oracle/backup';
4> }

Note: if the database is huge, for example more than a terabyte, and your SGA is small, you might want to increase it. A bigger SGA will lower the restore time; a minimum of 50 GB would be a good compromise. Also, if your ODA is from the ODA X7 family you will benefit from the NVMe technology. In my experience, the duplication of a 1.5 TB database, with the backup stored locally, did not take more than 40 minutes.
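As an illustration, bumping the SGA of the auxiliary ODA instance before running the duplicate could be as simple as the following (a sketch; the 50G values are examples to adapt to your environment):

# example only: increase the SGA of the not-yet-duplicated auxiliary instance
sqlplus -s / as sysdba <<'EOF'
alter system set sga_max_size=50G scope=spfile;
alter system set sga_target=50G scope=spfile;
shutdown immediate
startup nomount
EOF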

I was more than happy to see the first duplication step successfully achieved:

Finished restore at 17-JUL-2019 16:45:10

And I was expecting the same for the next recovery part.

Unfortunately, this did not end as expected and I quickly got the following restore errors:

Errors in memory script
RMAN-03015: error occurred in stored script Memory Script
RMAN-06136: ORACLE error from auxiliary database: ORA-01507: database not mounted
ORA-06512: at "SYS.X$DBMS_RCVMAN", line 13661
ORA-06512: at line 1
RMAN-03015: error occurred in stored script Memory Script
RMAN-20000: abnormal termination of job step
RMAN-11003: failure during parse/execution of SQL statement: alter database recover logfile '/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_07_17/o1_mf_1_25514_glyf3yd3_.arc'
RMAN-11001: Oracle Error:
ORA-10562: Error occurred while applying redo to data block (file# 91, block# 189)
ORA-10564: tablespace DBVISIT
ORA-01110: data file 91: '/u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/o1_mf_dbvisit_glyczqcj_.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 501874
ORA-00600: internal error code, arguments: [4502], [0], [], [], [], [], [], [], [], [], [], [] RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 07/17/2019 16:45:32
RMAN-05501: aborting duplication of target database

While troubleshooting the problem I understood that migrating a database from Windows to Linux might not be so simple. The following Oracle Doc IDs describe the problem:
Restore From Windows To Linux using RMAN Fails (Doc ID 2003327.1)
Cross-Platform Database Migration (across same endian) using RMAN Transportable Database (Doc ID 1401921.1)
RMAN DUPLICATE/RESTORE/RECOVER Mixed Platform Support (Doc ID 1079563.1)

The problem comes from the fact that recovering redo transactions between the Windows and Linux platforms is not supported if the database is not a standby one. For a standard (non-standby) database, the only possibility would be to go through a cold backup, which in my case was impossible given the database size, the time taken to execute a backup and the short maintenance window.

Looking for alternatives and doing further tests, I found a solution that I am going to describe in the next steps.

Restoring the database from the last backup

In order to restore the database, I ran the following steps.

  1. Start the ODA instance in no mount :

  2. SQL> startup nomount

  3. Restore the last available control file from backup with rman :

  4. RMAN> connect target /
     
    RMAN> restore controlfile from '/mnt/backupNFS/oracle/ODA_DBNAME/20190813_233004_CTL_ODA_DBNAME_1179126808_S2864_P1.BCK';

  5. Mount the database :

  6. SQL> alter database mount;

  7. Catalog the backup path :

  8. RMAN> connect target /
     
    RMAN> catalog start with '/mnt/backupNFS/oracle/ODA_DBNAME';

  9. And finally restore the database :

  10. RMAN> connect target /
     
    RMAN> run {
    2> set newname for database to new;
    3> restore database;
    4> switch datafile all;
    5> }

Convert the primary database to a physical standby database

In order to be able to recover the database we will convert the primary database to a physical standby one.

  1. We can check the actual status and see that our database is a primary one in mounted state :

  2. SQL> select status,instance_name,database_role,open_mode from v$database,v$Instance;
     
    STATUS INSTANCE_NAME DATABASE_ROLE OPEN_MODE
    ------------ ---------------- ---------------- --------------------
    MOUNTED ODA_DBNAME PRIMARY MOUNTED

  3. We will convert the database to a physical standby

  4. SQL> alter database convert to physical standby;
     
    Database altered.

  5. We need to restart the database.

  6. SQL> shutdown immediate
     
    SQL> startup mount

  7. We can check new database status

  8. SQL> select status,instance_name,database_role,open_mode from v$database,v$Instance;
     
    STATUS INSTANCE_NAME DATABASE_ROLE OPEN_MODE
    ------------ ---------------- ---------------- --------------------
    MOUNTED ODA_DBNAME PHYSICAL STANDBY MOUNTED

Get the current Windows database SCN

We are now ready to recover the database, and the application can be stopped. The next steps are executed during the maintenance window. The Windows database listener can be stopped to make sure there are no new connections.
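Stopping the listener on the Windows source host can be as simple as the following (a sketch; the default listener name LISTENER is an assumption):

lsnrctl stop LISTENER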

  1. We will make sure there is no existing application session on the database :

  2. SQL> set linesize 300
    SQL> set pagesize 500
    SQL> col machine format a20
    SQL> col service_name format a20
     
    SQL> select SID, serial#, username, machine, process, program, status, service_name, logon_time from v$session where username not in ('SYS', 'PUBLIC') and username is not null order by status, username;

  3. We will create a restore point :

  4. SQL> create restore point for_migration_14082019;
     
    Restore point created.

  5. We will get the last online log transactions archived :

  6. SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
     
    System altered.

  7. We will retrieve the SCN corresponding to the restore point :

  8. SQL> col scn format 999999999999999
     
    SQL> select scn from v$restore_point where lower(name)='for_migration_14082019';
     
    SCN
    ----------------
    13069540631

  9. We will back up the last archive logs. This is executed on the Windows database using our dbi services internal DMK tool (https://www.dbi-services.com/offering/products/dmk-management-kit/):

  10. servicedbi@win_srv:E:\app\oracle\local\dmk_custom\bin\ [ODA_DBNAME] ./rman_backup_ODA_DBNAME_arc.bat
     
    E:\app\oracle\local\dmk_custom\bin>powershell.exe -command "E:\app\oracle\local\dmk_ha\bin\check_primary.ps1 ODA_DBNAME 'dmk_rman.ps1 -s ODA_DBNAME -t bck_arc.rcv -c E:\app\oracle\admin\ODA_DBNAME\etc\rman.cfg
     
    [OK]::KSBL::RMAN::dmk_dbbackup::ODA_DBNAME::bck_arc.rcv
     
    Logfile is : E:\app\oracle\admin\ODA_DBNAME\log\ODA_DBNAME_bck_arc_20190814_141754.log
     
    RMAN return Code: 0
    2019-08-14_02:19:01::check_primary.ps1::MainProgram ::INFO ==> Program completed

Recover the database

The database can now be recovered up to our SCN 13069540631.

  1. We will first need to catalog new archive log backups :

  2. RMAN> connect target /
     
    RMAN> catalog start with '/mnt/backupNFS/oracle/ODA_DBNAME';

  3. And recover the database till SCN 13069540632 :

  4. RMAN> connect target /
     
    RMAN> run {
    2> set until scn 13069540632;
    3> recover database;
    4> }
     
    archived log file name=/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30098_go80084r_.arc RECID=30124 STAMP=1016289320
    archived log file name=/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30099_go80084x_.arc thread=1 sequence=30099
    channel default: deleting archived log(s)
    archived log file name=/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30099_go80084x_.arc RECID=30119 STAMP=1016289320
    archived log file name=/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30100_go8008bg_.arc thread=1 sequence=30100
    channel default: deleting archived log(s)
    archived log file name=/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30100_go8008bg_.arc RECID=30121 STAMP=1016289320
    media recovery complete, elapsed time: 00:00:02
    Finished recover at 14-AUG-2019 14:35:23

  5. We can check the alert log and see that recovery has been performed until SCN 13069540632:

  6. oracle@ODA02:/u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/ [ODA_DBNAME] taa
    ORA-279 signalled during: alter database recover logfile '/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30098_go80084r_.arc'...
    alter database recover logfile '/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30099_go80084x_.arc'
    Media Recovery Log /u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30099_go80084x_.arc
    ORA-279 signalled during: alter database recover logfile '/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30099_go80084x_.arc'...
    alter database recover logfile '/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30100_go8008bg_.arc'
    Media Recovery Log /u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30100_go8008bg_.arc
    Wed Aug 14 14:35:23 2019
    Incomplete Recovery applied until change 13069540632 time 08/14/2019 14:13:46
    Media Recovery Complete (ODA_DBNAME)
    Completed: alter database recover logfile '/u03/app/oracle/fast_recovery_area/ODA_DBNAME_RZA/archivelog/2019_08_14/o1_mf_1_30100_go8008bg_.arc'

  7. We can check the new ODA database current SCN :

  8. SQL> col current_scn format 999999999999999
     
    SQL> select current_scn from v$database;
     
    CURRENT_SCN
    ----------------
    13069540631

Convert database to primary again

The database can now be converted back to a primary one.

SQL> alter database activate standby database;
 
Database altered.


SQL> select status,instance_name,database_role,open_mode from v$database,v$Instance;
 
STATUS INSTANCE_NAME DATABASE_ROLE OPEN_MODE
------------ ---------------- ---------------- --------------------
MOUNTED ODA_DBNAME PRIMARY MOUNTED

At this step, if the Windows source database were running version 11.2.0.3, we could successfully upgrade the new ODA database to 11.2.0.4 following the common Oracle database upgrade process.
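For the 11.2.0.3 case, that upgrade step would roughly follow the classic manual procedure from the 11.2.0.4 home, along these lines (a sketch only, without the usual pre- and post-upgrade checks):

# run from the 11.2.0.4 Oracle home on the ODA (sketch only)
sqlplus -s / as sysdba <<'EOF'
shutdown immediate
startup upgrade
@?/rdbms/admin/catupgrd.sql
EOF

# catupgrd.sql shuts the instance down at the end; restart and recompile invalid objects
sqlplus -s / as sysdba <<'EOF'
startup
@?/rdbms/admin/utlrp.sql
EOF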

And finally we can open our database: it has now been migrated from Windows to Linux.


SQL> alter database open;
 
Database altered.


SQL> select status,instance_name,database_role,open_mode from v$database,v$Instance;
 
STATUS INSTANCE_NAME DATABASE_ROLE OPEN_MODE
------------ ---------------- ---------------- --------------------
OPEN ODA_DBNAME PRIMARY READ WRITE


oracle@ODA02:/u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/ [ODA_DBNAME] ODA_DBNAME
********* dbi services Ltd. *********
STATUS : OPEN
DB_UNIQUE_NAME : ODA_DBNAME_RZA
OPEN_MODE : READ WRITE
LOG_MODE : ARCHIVELOG
DATABASE_ROLE : PRIMARY
FLASHBACK_ON : NO
FORCE_LOGGING : YES
VERSION : 11.2.0.4.0
*************************************

Post migration steps

A few post-migration steps need to be executed.

Create the redo logs again

The redo logs are still stamped with the Windows path and have therefore been created in the $ORACLE_HOME/dbs folder. In this step we will create new OMF ones.

  1. Checking current online log members :

  2. SQL> set linesize 300
    SQL> set pagesize 500
    SQL> col member format a100
     
    SQL> select a.GROUP#, b.member, a.status, a.bytes/1024/1024 MB from v$log a, v$logfile b where a.GROUP#=b.GROUP#;
     
    GROUP# MEMBER STATUS MB
    ---------- ---------------------------------------------------------------------------------------------------- ---------------- ----------
    6 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_6_1.LOG UNUSED 500
    6 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_6_2.LOG UNUSED 500
    5 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_5_2.LOG UNUSED 500
    5 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_5_1.LOG UNUSED 500
    4 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_4_2.LOG UNUSED 500
    4 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_4_1.LOG UNUSED 500
    3 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_3_2.LOG UNUSED 500
    3 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_3_1.LOG UNUSED 500
    2 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_2_2.LOG UNUSED 500
    2 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_2_1.LOG UNUSED 500
    1 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_1_2.LOG CURRENT 500
    1 /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_1_1.LOG CURRENT 500

  3. Drop the unused redo log groups, keeping only one of them in addition to the current group:

  4. SQL> alter database drop logfile group 6;
     
    Database altered.
     
    SQL> alter database drop logfile group 5;
     
    Database altered.
     
    SQL> alter database drop logfile group 4;
     
    Database altered.
     
    SQL> alter database drop logfile group 3;
     
    Database altered.
     
    SQL> alter database add logfile group 3 size 500M;
     
    Database altered.

  5. Create the recently dropped groups again:

  6. SQL> alter database add logfile group 3 size 500M;
     
    Database altered.
     
    SQL> alter database add logfile group 4 size 500M;
     
    Database altered.
     
    SQL> alter database add logfile group 5 size 500M;
     
    Database altered.
     
    SQL> alter database add logfile group 6 size 500M;
     
    Database altered.

  7. Drop the last unused redo log group and create it again :

  8. SQL> alter database drop logfile group 2;
     
    Database altered.
     
    SQL> alter database add logfile group 2 size 500M;
     
    Database altered.

  9. Execute a log file switch and a checkpoint so that the current redo group becomes inactive:

  10. SQL> alter system switch logfile;
     
    System altered.
     
    SQL> alter system checkpoint;
     
    System altered.

  11. Drop it and create it again :

  12. SQL> alter database drop logfile group 1;
     
    Database altered.
     
    SQL> alter database add logfile group 1 size 500M;
     
    Database altered.

  13. Check redo group members :

  14. SQL> select a.GROUP#, b.member, a.status, a.bytes/1024/1024 MB from v$log a, v$logfile b where a.GROUP#=b.GROUP#;
     
    GROUP# MEMBER STATUS MB
    ---------- ---------------------------------------------------------------------------------------------------- ---------------- ----------
    3 /u03/app/oracle/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_3_go81rj4t_.log INACTIVE 500
    3 /u02/app/oracle/oradata/ODA_DBNAME_RZA/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_3_go81rjqn_.log INACTIVE 500
    4 /u03/app/oracle/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_4_go81ron1_.log UNUSED 500
    4 /u02/app/oracle/oradata/ODA_DBNAME_RZA/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_4_go81rp6o_.log UNUSED 500
    5 /u03/app/oracle/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_5_go81rwhs_.log UNUSED 500
    5 /u02/app/oracle/oradata/ODA_DBNAME_RZA/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_5_go81rx1g_.log UNUSED 500
    6 /u03/app/oracle/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_6_go81s1rk_.log UNUSED 500
    6 /u02/app/oracle/oradata/ODA_DBNAME_RZA/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_6_go81s2bx_.log UNUSED 500
    2 /u03/app/oracle/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_2_go81sgdf_.log CURRENT 500
    2 /u02/app/oracle/oradata/ODA_DBNAME_RZA/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_2_go81sgxd_.log CURRENT 500
    1 /u03/app/oracle/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_1_go81vpls_.log UNUSED 500
    1 /u02/app/oracle/oradata/ODA_DBNAME_RZA/redo/ODA_DBNAME_RZA/onlinelog/o1_mf_1_go81vq4v_.log UNUSED 500

  15. Delete the previous redo log member files that still carry the Windows-style names:

  16. oracle@ODA02:/u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/ [ODA_DBNAME] cdh
     
    oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/ [ODA_DBNAME] cd dbs
     
    oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] ls -ltrh *REDO*.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_6_2.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_6_1.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_5_2.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_5_1.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_4_2.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_4_1.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_3_2.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_3_1.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_2_2.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 14:59 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_2_1.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 15:05 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_1_2.LOG
    -rw-r----- 1 oracle asmadmin 501M Aug 14 15:05 I:FAST_RECOVERY_AREAODA_DBNAME_SITE1ONLINELOGREDO_1_1.LOG
     
    oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] rm *REDO*.LOG

Create the temp file again

  1. Checking the current temp file, we can see that the path is still the Windows one:

  2. SQL> set linesize 300
    SQL> col name format a100
     
    SQL> select b.name, b.status, b.bytes/1024/1024 MB, a.name from v$tablespace a, v$tempfile b where a.TS#=b.TS#;
     
    NAME STATUS MB NAME
    ---------------------------------------------------------------------------------------------------- ------- ---------- -------------------------------------------
    F:\ORADATA\ODA_DBNAME\TEMPORARY_DATA_1.DBF ONLINE 8192 TEMPORARY_DATA

  3. We can check that the default temporary tablespace is TEMPORARY_DATA

  4. SQL> col property_value format a50
     
    SQL> select property_name, property_value from database_properties where property_name like '%DEFAULT%TABLESPACE%';
     
    PROPERTY_NAME PROPERTY_VALUE
    ------------------------------ --------------------------------------------------
    DEFAULT_TEMP_TABLESPACE TEMPORARY_DATA
    DEFAULT_PERMANENT_TABLESPACE USER_DATA

  5. Let’s create a new temp tablespace and make it the default one

  6. SQL> create temporary tablespace TEMP tempfile size 8G;
     
    Tablespace created.
     
    SQL> alter database default temporary tablespace TEMP;
     
    Database altered.
     
    SQL> select property_name, property_value from database_properties where property_name like '%DEFAULT%TABLESPACE%';
     
    PROPERTY_NAME PROPERTY_VALUE
    ------------------------------ --------------------------------------------------
    DEFAULT_TEMP_TABLESPACE TEMP
    DEFAULT_PERMANENT_TABLESPACE USER_DATA

  7. Drop previous TEMPORARY_DATA tablespace

  8. SQL> drop tablespace TEMPORARY_DATA including contents and datafiles;
     
    Tablespace dropped.
     
    SQL> select b.file#, b.name, b.status, b.bytes/1024/1024 MB, a.name from v$tablespace a, v$tempfile b where a.TS#=b.TS#;
     
    FILE# NAME STATUS MB NAME
    ---------- ---------------------------------------------------------------------------------------------------- ------- ----------
    3 /u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/o1_mf_temp_go83m1tp_.tmp ONLINE 8192 TEMP

  9. Create TEMPORARY_DATA tablespace again and make it the default one :

  10. SQL> create temporary tablespace TEMPORARY_DATA tempfile size 8G;
     
    Tablespace created.
     
    SQL> select b.file#, b.name, b.status, b.bytes/1024/1024 MB, a.name from v$tablespace a, v$tempfile b where a.TS#=b.TS#;
     
    FILE# NAME STATUS MB NAME
    ---------- ---------------------------------------------------------------------------------------------------- ------- ----------
    1 /u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/o1_mf_temporar_go83wfd7_.tmp ONLINE 8192 TEMPORARY_DATA
    3 /u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/o1_mf_temp_go83m1tp_.tmp ONLINE 8192 TEMP
     
    SQL> alter database default temporary tablespace TEMPORARY_DATA;
     
    Database altered.
     
    SQL> select property_name, property_value from database_properties where property_name like '%DEFAULT%TABLESPACE%';
     
    PROPERTY_NAME PROPERTY_VALUE
    ------------------------------ --------------------------------------------------
    DEFAULT_TEMP_TABLESPACE TEMPORARY_DATA
    DEFAULT_PERMANENT_TABLESPACE USER_DATA

  11. And finally drop the intermediate temp tablespace:

  12. SQL> drop tablespace TEMP including contents and datafiles;
     
    Tablespace dropped.
     
    SQL> select b.file#, b.name, b.status, b.bytes/1024/1024 MB, a.name from v$tablespace a, v$tempfile b where a.TS#=b.TS#;
     
    FILE# NAME STATUS MB NAME
    ---------- ---------------------------------------------------------------------------------------------------- ------- ----------
    1 /u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/o1_mf_temporar_go83wfd7_.tmp ONLINE 8192 TEMPORARY_DATA

  13. An appropriate max size can be given to the newly created temp tablespace:

  14. SQL> alter database tempfile '/u02/app/oracle/oradata/ODA_DBNAME_RZA/ODA_DBNAME_RZA/datafile/o1_mf_temporar_go83wfd7_.tmp' autoextend on maxsize 31G;
     
    Database altered.

  15. Remove the wrong temp file (still carrying the Windows-style name) stored in $ORACLE_HOME/dbs:

  16. oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] ls -ltr
    -rw-r--r-- 1 oracle oinstall 2851 May 15 2009 init.ora
    -rw-r--r-- 1 oracle oinstall 64 Jul 25 08:10 initODA_DBNAME.ora.old
    -rw-r----- 1 oracle oinstall 2048 Jul 25 08:10 orapwODA_DBNAME
    -rw-r--r-- 1 oracle oinstall 67 Jul 25 08:31 initODA_DBNAME.ora
    -rw-r----- 1 oracle asmadmin 8589942784 Aug 14 08:14 F:ORADATAODA_DBNAMETEMPORARY_DATA_1.DBF
    -rw-rw---- 1 oracle asmadmin 1544 Aug 14 14:59 hc_ODA_DBNAME.dat
    -rw-r----- 1 oracle asmadmin 43466752 Aug 14 15:48 snapcf_ODA_DBNAME.f
     
    oracle@RZA-ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] rm F:ORADATAODA_DBNAMETEMPORARY_DATA_1.DBF

Apply specific ODA parameters

The following ODA-specific parameters can be applied to the newly created instance.


SQL> alter system set "_datafile_write_errors_crash_instance"=false scope=spfile;
 
System altered.
 
SQL> alter system set "_db_writer_coalesce_area_size"=16777216 scope=spfile;
 
System altered.
 
SQL> alter system set "_disable_interface_checking"=TRUE scope=spfile;
 
System altered.
 
SQL> alter system set "_ENABLE_NUMA_SUPPORT"=FALSE scope=spfile;
 
System altered.
 
SQL> alter system set "_FILE_SIZE_INCREASE_INCREMENT"=2143289344 scope=spfile;
 
System altered.
 
SQL> alter system set "_gc_policy_time"=0 scope=spfile;
 
System altered.
 
SQL> alter system set "_gc_undo_affinity"=FALSE scope=spfile;
 
System altered.
 
SQL> alter system set db_block_checking='FULL' scope=spfile;
 
System altered.
 
SQL> alter system set db_block_checksum='FULL' scope=spfile;
 
System altered.
 
SQL> alter system set db_lost_write_protect='TYPICAL' scope=spfile;
 
System altered.
 
SQL> alter system set sql92_security=TRUE scope=spfile;
 
System altered.
 
SQL> alter system set use_large_pages='only' scope=spfile;
 
System altered.

The “_fix_control” parameter is specific to Oracle 12c and not compatible with Oracle 11g. See Doc ID 2145105.1.
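A quick way to make sure it is not set on the freshly migrated 11g instance (a sketch):

sqlplus -s / as sysdba <<'EOF'
-- should return no rows on an 11g instance
select name, value from v$spparameter where name = '_fix_control' and isspecified = 'TRUE';
-- if it were set, it could be removed with:
-- alter system reset "_fix_control" scope=spfile;
EOF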

Register database in grid

After applying the ODA-specific instance parameters, we can register the database in the grid infrastructure and start it with the grid.


oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl add database -d ODA_DBNAME_RZA -o /u01/app/oracle/product/11.2.0.4/dbhome_1 -c SINGLE -i ODA_DBNAME -x RZA-ODA02 -m ksbl.local -p /u02/app/oracle/oradata/ODA_DBNAME_RZA/dbs/spfileODA_DBNAME.ora -r PRIMARY -s OPEN -t IMMEDIATE -n ODA_DBNAME -j "/u02/app/oracle/oradata/ODA_DBNAME_RZA,/u03/app/oracle"
 
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl start database -d ODA_DBNAME_RZA
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl status database -d ODA_DBNAME_RZA
Instance ODA_DBNAME is running on node rza-oda02
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] ODA_DBNAME
********* dbi services Ltd. *********
STATUS : OPEN
DB_UNIQUE_NAME : ODA_DBNAME_RZA
OPEN_MODE : READ WRITE
LOG_MODE : ARCHIVELOG
DATABASE_ROLE : PRIMARY
FLASHBACK_ON : NO
FORCE_LOGGING : YES
VERSION : 11.2.0.4.0
*************************************

We can check that everything is working properly:

oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl stop database -d ODA_DBNAME_RZA
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl status database -d ODA_DBNAME_RZA
Instance ODA_DBNAME is not running on node rza-oda02
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] ODA_DBNAME
********* dbi services Ltd. *********
STATUS : STOPPED
*************************************
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl start database -d ODA_DBNAME_RZA
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] srvctl status database -d ODA_DBNAME_RZA
Instance ODA_DBNAME is running on node rza-oda02
 
oracle@ODA02:/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/ [ODA_DBNAME] ODA_DBNAME
********* dbi services Ltd. *********
STATUS : OPEN
DB_UNIQUE_NAME : ODA_DBNAME_RZA
OPEN_MODE : READ WRITE
LOG_MODE : ARCHIVELOG
DATABASE_ROLE : PRIMARY
FLASHBACK_ON : NO
FORCE_LOGGING : YES
VERSION : 11.2.0.4.0
*************************************

Conclusion

By going through a physical standby database, I was able to successfully migrate the Windows databases to Linux ones on the ODA. I migrated source 11.2.0.4 databases, but also an 11.2.0.3 database by adding an upgrade step to the process.

Cet article Migrating Oracle database from windows to ODA est apparu en premier sur Blog dbi services.

A Small Footprint Docker Container with Documentum command-line Tools

The aim here is to have a minimalist container with the usual Documentum clients idql, iapi, dmawk and dmqdocbroker.
Documentum Administrator (DA) could also be a useful addition to our toolbox. We will show how to containerize it in a future article. A word of warning is in order here: the title’s catchy “small footprint” qualifier is relative. Don’t forget that said container will contain a certified O/S, a JRE, the DFCs and a few support shared libraries. Under these conditions, at more than 600 Mb, it is as small as it can get.
Since those clients are part of the content server (CS) and not packaged separately, we must first start with a full installation of the content server binaries and next we’ll remove all the non-essential parts. We chose the latest available CS version, currently v16.4.
The cute little utility dctm-wrapper discussed in Connecting to a Repository via a Dynamically Edited dfc.properties File (part I) is also included because it makes it unnecessary to manually edit the container’s dfc.properties file each time a new repository on a different machine needs to be accessed; this will be done dynamically by the utility itself.
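Conceptually, for a target given as docbase:host:port, the wrapper ends up doing something equivalent to the following before invoking the real client (a simplified sketch, not the actual dctm-wrapper code; see the referenced article for the real thing):

# simplified idea only: point the DFC at the requested docbroker, then call the client
target_host=container02
target_port=1489
sed -i -e "s/^dfc.docbroker.host\[0\]=.*/dfc.docbroker.host[0]=${target_host}/" \
       -e "s/^dfc.docbroker.port\[0\]=.*/dfc.docbroker.port[0]=${target_port}/" \
       $DOCUMENTUM/config/dfc.properties
idql dmtest02 -Udmadmin -Pxxxx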
The instructions to create the image are given in a dockerfile. We chose to base the image on the CentOS distribution because, as a Red Hat derivative, it is implicitly certified and does not require any subscription. However, any other Documentum-certified Linux distribution such as Red Hat, SUSE or Ubuntu will do. Some adaptation may be in order though, e.g. adding libraries to $LD_LIBRARY_PATH or symlinking libraries under another version suffix. Check the installation manual for details.
The working directory is quite flat:

dmadmin@dmclient:~/builds/documentum/clients$ ll -R
.:
total 24
-rw-rw-r-- 1 dmadmin dmadmin 68 Aug 1 10:28 dctm-secrets
drwxrwxr-x 2 dmadmin dmadmin 4096 Aug 2 11:55 files
-rw-rw-r-- 1 dmadmin dmadmin 7107 Aug 28 23:49 Dockerfile
 
./files:
total 2151304
-rw-rw-r-- 1 dmadmin dmadmin 872 Aug 1 10:28 linux_install_properties
-rwxrwxr-x 1 dmadmin dmadmin 1393568256 Aug 1 10:28 content_server_16.4_linux64_oracle.tar
-rwxrwxr-x 1 dmadmin dmadmin 809339975 Aug 1 10:28 CS_16.4.0080.0129_linux_ora_P08.tar.gz
-rw-rw-r-- 1 dmadmin dmadmin 349 Aug 1 10:28 CS_patch08.properties
-rwxrwxr-x 1 dmadmin dmadmin 7090 Aug 2 11:55 dctm-wrapper

There is only one sub-directory, files. Read on for more details.

The clients dockerfile

Let’s take a look at the clients dockerfile:

# syntax = docker/dockerfile:1.0-experimental
# we are using the secret mount type;
# cec - dbi-services - July 2019

FROM centos:latest
MAINTAINER "cec@dbi"

RUN yum install -y sudo less unzip gunzip tar iputils hostname gawk wget bind-utils net-tools && yum clean all

ARG soft_repo
ARG INSTALL_OWNER
ARG INSTALL_OWNER_UID
ARG INSTALL_OWNER_GROUP
ARG INSTALL_OWNER_GID
ARG INSTALL_HOME
ARG INSTALL_TMP
ARG PRODUCT_MAJOR_VERSION
ARG JBOSS
ARG DOCUMENTUM
ARG DOCUMENTUM_SHARED
ARG DM_HOME
ARG MNT_SHARED_FOLDER
ARG SCRIPTS_DIR

ENV LC_ALL C

USER root
RUN mkdir -p ${INSTALL_TMP} ${SCRIPTS_DIR} ${DOCUMENTUM} ${MNT_SHARED_FOLDER}

# copy the uncompressed files;
COPY ${soft_repo}/CS_patch08.properties ${soft_repo}/linux_install_properties ${soft_repo}/dctm-wrapper ${INSTALL_TMP}/.

# copy and expand the packages;
ADD ${soft_repo}/content_server_16.4_linux64_oracle.tar ${INSTALL_TMP}/.
ADD ${soft_repo}/CS_16.4.0080.0129_linux_ora_P08.tar.gz ${INSTALL_TMP}/.

RUN groupadd --gid ${INSTALL_OWNER_GID} ${INSTALL_OWNER_GROUP}                                                                                       && \
    useradd --uid ${INSTALL_OWNER_UID} --gid ${INSTALL_OWNER_GID} --shell /bin/bash --home-dir /home/${INSTALL_OWNER} --create-home ${INSTALL_OWNER} && \
    usermod -a -G wheel ${INSTALL_OWNER}                                                                                                             && \
    chown -R ${INSTALL_OWNER}:${INSTALL_OWNER_GROUP} ${INSTALL_HOME} ${INSTALL_TMP} ${SCRIPTS_DIR} ${DOCUMENTUM} ${MNT_SHARED_FOLDER}                && \
    chmod 775 ${INSTALL_HOME} ${INSTALL_TMP} ${SCRIPTS_DIR} ${DOCUMENTUM} ${MNT_SHARED_FOLDER}

# set the $INSTALL_OWNER's password passed in the secret file;
RUN --mount=type=secret,id=dctm-secrets,dst=/dctm-secrets . /dctm-secrets && echo ${INSTALL_OWNER}:"${INSTALL_OWNER_PASSWORD}" | /usr/sbin/chpasswd
RUN rm /tmp/dctm-secrets

# make the CLI comfortable again;
USER ${INSTALL_OWNER}
RUN echo >> ${HOME}/.bash_profile                                          && \
    echo "set -o vi" >> ${HOME}/.bash_profile                              && \
    echo "alias ll='ls -alrt'" >> ${HOME}/.bash_profile                    && \
    echo "alias psg='ps -ef | grep -i'" >> ${HOME}/.bash_profile           && \
    echo "export DM_HOME=${DM_HOME}"           >> ${HOME}/.bash_profile    && \
    echo "export DOCUMENTUM=${DOCUMENTUM}"     >> ${HOME}/.bash_profile    && \
    echo "export PATH=.:${SCRIPTS_DIR}:\$PATH" >> ${HOME}/.bash_profile    && \
    echo >> ${HOME}/.bash_profile                                          && \
    echo "[[ -f \${DM_HOME}/bin/dm_set_server_env.sh ]] && . \${DM_HOME}/bin/dm_set_server_env.sh 2>&1 > /dev/null || true" >> ${HOME}/.bash_profile && \
    echo >> ${HOME}/.bash_profile                                          && \
    mv ${INSTALL_TMP}/dctm-wrapper ${SCRIPTS_DIR}/. && chmod +x ${SCRIPTS_DIR}/dctm-wrapper && \
    ln -s ${SCRIPTS_DIR}/dctm-wrapper ${SCRIPTS_DIR}/widql && ln -s ${SCRIPTS_DIR}/dctm-wrapper ${SCRIPTS_DIR}/wiapi && ln -s ${SCRIPTS_DIR}/dctm-wrapper ${SCRIPTS_DIR}/wdmawk

# install and then cleanup useless stuff from image;
WORKDIR ${DOCUMENTUM}
RUN . ${HOME}/.bash_profile && cd ${INSTALL_TMP} && ./serverSetup.bin -f ./linux_install_properties && echo $? && \
    tar xvf CS_16.4.0080.0129_linux_ora_P08.tar && \
    chmod +x ./patch.bin && ./patch.bin LAX_VM $DOCUMENTUM/java64/JAVA_LINK/jre/bin/java -f CS_patch08.properties && echo $? && \
    cd ${DOCUMENTUM} && rm -r ${INSTALL_TMP} /tmp/install* && \
    rm -r tcf && rm -r tools && rm -r ${JBOSS} && rm -r jmsTools && rm -r uninstall && rm -r dba && rm -r java64/1.8.0_152/db && rm -r java64/1.8.0_152/include && rm java64/1.8.0_152/javafx-src.zip && \
    rm java64/1.8.0_152/src.zip && rm -r java64/1.8.0_152/man && rm -r temp/* && \
    cd ${DOCUMENTUM}/product/16.4/ && \
    rm -r diagtools && rm -r lib && rm -r convert && rm -r unsupported && rm -r oracle && rm -r install && mkdir smaller_bin && mv bin/dm_set_server_env.sh smaller_bin/. && mv bin/iapi* smaller_bin/. && \
    mv bin/idql* smaller_bin/. && mv bin/dmqdocbroker* smaller_bin/. && mv bin/dmawk* smaller_bin/. && mv bin/libkmclient_shared.so* smaller_bin/. && mv bin/libdmcl*.so* smaller_bin/. && \
    mv bin/libsm_sms.so* smaller_bin/. && mv bin/libsm_clsapi.so* smaller_bin/. && mv bin/libsm_env.so* smaller_bin/. && mv bin/java.ini smaller_bin/. && rm -r bin && mv smaller_bin bin && \
    cd ${DOCUMENTUM}/dfc/ && for f in *; do mv $f _${f}; done && mv _aspectjrt.jar aspectjrt.jar && mv _commons-lang-2.6.jar commons-lang-2.6.jar && mv _dfc.jar dfc.jar && mv _log4j.jar log4j.jar && rm -r _* 

# keep the container rolling;
CMD bash -c "while true; do sleep 60; done"

# build the image;
# copy/paste the commented lines below starting with the cat command uncommented:
# cat - <<'eot' | gawk '{gsub(/#+ */, ""); print}'
# export INSTALL_OWNER=dmadmin
# export INSTALL_OWNER_UID=1000 
# export INSTALL_OWNER_GROUP=${INSTALL_OWNER}
# export INSTALL_OWNER_GID=1000
# export INSTALL_HOME=/app
# export INSTALL_TMP=${INSTALL_HOME}/tmp
# export SCRIPTS_DIR=${INSTALL_HOME}/scripts
# export PRODUCT_MAJOR_VERSION=16.4
# export JBOSS=wildfly9.0.1
# export DOCUMENTUM=${INSTALL_HOME}/dctm 
# export DOCUMENTUM_SHARED=${DOCUMENTUM}
# export DM_HOME=${DOCUMENTUM}/product/${PRODUCT_MAJOR_VERSION}
# export MNT_SHARED_FOLDER=${INSTALL_HOME}/shared
# time DOCKER_BUILDKIT=1 docker build --squash --no-cache --progress=plain --secret id=dctm-secrets,src=./dctm-secrets \
#  --build-arg soft_repo=./files                              \
#  --build-arg INSTALL_OWNER=${INSTALL_OWNER}                 \
#  --build-arg INSTALL_OWNER_UID=${INSTALL_OWNER_UID}         \
#  --build-arg INSTALL_OWNER_GROUP=${INSTALL_OWNER_GROUP}     \
#  --build-arg INSTALL_OWNER_GID=${INSTALL_OWNER_GID}         \
#  --build-arg INSTALL_HOME=${INSTALL_HOME}                   \
#  --build-arg INSTALL_TMP=${INSTALL_TMP}                     \
#  --build-arg PRODUCT_MAJOR_VERSION=${PRODUCT_MAJOR_VERSION} \
#  --build-arg JBOSS=${JBOSS}                                 \
#  --build-arg DOCUMENTUM=${DOCUMENTUM}                       \
#  --build-arg DOCUMENTUM_SHARED=${DOCUMENTUM_SHARED}         \
#  --build-arg DM_HOME=${DM_HOME}                             \
#  --build-arg MNT_SHARED_FOLDER=${MNT_SHARED_FOLDER}         \
#  --build-arg SCRIPTS_DIR=${SCRIPTS_DIR}                     \
#  --tag="dbi/dctm-clients:v1.0"                              \
#  .
# eot
# retag the squashed image as it has not tag:
# docker tag <image_id> dbi/dctm-clients:v1.0

# run the image and remove the container on exit;
# docker run -d --rm --hostname=container-clients dbi/dctm-clients:v1.0

# run the image and keep the container on exit;
# docker run -d --name container-clients --hostname=container-clients dbi/dctm-clients:v1.0

# for trans-host container access, connect the container to an existing overlay or macvlan network, e.g.:
# docker network connect dctmolnet01 container-clients

The pragma on line 1 says that we will be using an experimental feature that needs a special syntax, see below for details.
On line 5, we access the docker on-line registry to download an image of Centos and base our own image on it. We will also throw in a few utilities that can be helpful later.
On lines 10 to 23 we use ARG to receive the parameters passed to the build. We could have used ENV instead and hard-coded them in the dockerfile but the problem with that alternative is that those environment variables would have persisted in the containers derived from the image and since they are not used outside the build, they would just unnecessarily pollute the environment. Docker has currently no way to remove environment variables. They can be set to an empty value though, but this is not enough.
On lines 31 to 35, the needed files, the CS and patch installation archives along with their property files plus anything else, are imported into the image from the local host’s directory ${soft_repo}. This directory must be local to the current one and both should only contain the exact needed files in order to keep the build context small. In effect, during the context creation, the current directory and everything under it will be recursively read, which is quite a slow process.
On line 39, the user dmadmin is added to the wheel group. This is a convenient trick in Centos to make dmadmin a sudoer of any command.
On line 44, the experimental feature “secrets” is finally used. This enhancement lets us pass a file containing confidential information, typically passwords, without leaving any traces in the image’s overlay filesystems. The RUN statement is executed atomically and, at the end, the mounted file is dismounted silently. Outside that RUN statement, it is like nothing had happened but we quietly managed to change dmadmin user’s password. Here is its content:

dmadmin@dmclient:~/builds/documentum/da$ cat dctm-secrets
export INSTALL_OWNER=dmadmin
export INSTALL_OWNER_PASSWORD=dmadmin

It is a bash file that is sourced in the RUN statement to give access to the key-value tuples through environment variables.
On lines 59 and 60, dctm-wrapper is installed as explained in the aforementioned article, i.e. by symlinking it to widql, wiapi and wdmawk. When any of these programs is executed, it will be resolved to the wrapper, which knows what to do next. Please, refer to that article for details.
On line 64, the Documentum content server is installed in silent mode; the parameter file is the default sample one:

dmadmin@dmclient:~/builds/documentum/clients$ cat files/linux_install_properties
# silent install response file
# used to install the binaries;
 
INSTALLER_UI=silent
KEEP_TEMP_FILE=false
 
####installation
##default documentum home directory, on linux, this attribute is supposed to be set in the env so don't need
##SERVER.DOCUMENTUM=/opt/documentum
##app server port
APPSERVER.SERVER_HTTP_PORT=9080
##app server password
APPSERVER.SECURE.PASSWORD=jboss
##enable cas as default
SERVER.CAS_LICENSE=XXXXXXXXX
 
# provide the lockboxpassphrase for existing lockbox files. the file name and lockboxpassphrase must be start with 1 and sequentially increased and cannot skip any number
# for example, SERVER.LOCKBOX_FILE_NAME1=lockbox1.lb, SERVER.LOCKBOX_FILE_NAME2=lockbox2.lb, SERVER.LOCKBOX_FILE_NAME3=lockbox3.lb
# same rule applies to SERVER.LOCKBOX_PASSPHRASE.PASSWORD=
SERVER.LOCKBOX_FILE_NAME1=lockbox.lb
SERVER.LOCKBOX_PASSPHRASE.PASSWORD1=

Since we do not create repositories but just use a minimalist set of files needed by the clients, that file could be simplified to:

# silent install response file
INSTALLER_UI=silent
KEEP_TEMP_FILE=false

On line 65, the patch P08, the latest available one for CS v16.4 as of this writing, is extracted and on the next line it is installed in silent mode too. Its edited property file CS_patch08.properties looks like this:

dmadmin@dmclient:~/builds/documentum/clients$ cat files/CS_patch08.properties
# Sun Apr 28 13:24:15 CEST 2019
# Replay feature output
# ---------------------
# This file was built by the Replay feature of InstallAnywhere.
# It contains variables that were set by Panels, Consoles or Custom Code.
 
#all
#---
INSTALLER_UI=silent
USER_SELECTED_PATCH_ZIP_FILE=CS_16.4.0080.0129_linux_ora.tar.gz
common.installLocation=/app/dctm

Lines 68 to 74 delete all the files from the content server package that are not used by the clients. Small files are not worth the effort so they are left behind.
The DFCs have been empirically kept to a minimum. Normally, there should not be any surprise because the iapi, idql and dmawk are legacy clients that only scratch the surface of the DFCs and don’t use any other jars.
On line 77, an endless loop is defined in order to keep the container running. In effect, if no process is running (any longer) inside a container, that container will be shut down. Here, the container does nothing per se; it just provides tools to be invoked externally and would exit immediately if it weren’t for the sleeping loop.
Starting with line 79, the commented out instructions to build and run the image are listed, for we feel it is handy to have them all in one place. They could also be wrapped up into a script file if this is more convenient.

Building the image

To build the image, use the command below:

export INSTALL_OWNER=dmadmin
export INSTALL_OWNER_UID=1000
export INSTALL_OWNER_GROUP=${INSTALL_OWNER}
export INSTALL_OWNER_GID=1000
export INSTALL_HOME=/app
export INSTALL_TMP=${INSTALL_HOME}/tmp
export SCRIPTS_DIR=${INSTALL_HOME}/scripts
export PRODUCT_MAJOR_VERSION=16.4
export JBOSS=wildfly9.0.1
export DOCUMENTUM=${INSTALL_HOME}/dctm
export DOCUMENTUM_SHARED=${DOCUMENTUM}
export DM_HOME=${DOCUMENTUM}/product/${PRODUCT_MAJOR_VERSION}
export MNT_SHARED_FOLDER=${INSTALL_HOME}/shared
time DOCKER_BUILDKIT=1 docker build --squash --no-cache --progress=plain --secret id=dctm-secrets,src=./dctm-secrets \
--build-arg soft_repo=./files                              \
--build-arg INSTALL_OWNER=${INSTALL_OWNER}                 \
--build-arg INSTALL_OWNER_UID=${INSTALL_OWNER_UID}         \
--build-arg INSTALL_OWNER_GROUP=${INSTALL_OWNER_GROUP}     \
--build-arg INSTALL_OWNER_GID=${INSTALL_OWNER_GID}         \
--build-arg INSTALL_HOME=${INSTALL_HOME}                   \
--build-arg INSTALL_TMP=${INSTALL_TMP}                     \
--build-arg PRODUCT_MAJOR_VERSION=${PRODUCT_MAJOR_VERSION} \
--build-arg JBOSS=${JBOSS}                                 \
--build-arg DOCUMENTUM=${DOCUMENTUM}                       \
--build-arg DOCUMENTUM_SHARED=${DOCUMENTUM_SHARED}         \
--build-arg DM_HOME=${DM_HOME}                             \
--build-arg MNT_SHARED_FOLDER=${MNT_SHARED_FOLDER}         \
--build-arg SCRIPTS_DIR=${SCRIPTS_DIR}                     \
--tag="dbi/dctm-clients:v1.0"                              \
.

Don’t forget the tiny little dot at the bottom 😉
The instructions are given on lines 81 to 112 in the dockerfile too. Just uncomment the first line (the cat command), copy/paste all the lines up to the dot included and add a line containing only the word “eot” to close the here-document; afterwards, copy/paste all the generated text on the shell and Bob’s your uncle.
The image creation takes a little under 5.5 minutes on my old, faithful machine, mostly taken up by the Documentum installer (3 min) and next by the --squash option on line 14 (42 s). This too is an experimental option; it merges all the file system layers into a single one and removes deleted files. The net result is a much more compact image: it goes from 5.06 GB down to 625 MB. We *have* to use it here, otherwise there would be no point in going through the effort of removing unnecessary files to produce as compact an image as possible.
The resulting size is now exactly the one effectively taken up by all the files in the image as shown from within a running image:

[dmadmin@container-clients dctm]$ sudo du -ms /
...
622 /

There are pros and cons to using --squash, but that option has no perceptible drawbacks in our case.
Let’s now look at the produced images’ sizes from the outside:

dmadmin@dmclient:~/builds/documentum/clients$ docker image ls
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
<none>                <none>              85b59eb66c00        3 minutes ago       625MB
dbi/dctm-clients      v1.0                bb0038f8a577        4 minutes ago       5.06GB

Note that the --squash option produced a new image, here with the id 85b59eb66c00. We’d expect it to replace the original image instead of creating a new one, but it’s OK too: we now have both, the original one and the squashed one. Note also how that image’s size has shrunk almost ten-fold. Let’s tag it the way it should have been:

dmadmin@dmclient:~/builds/documentum/clients$ docker tag 85b59eb66c00 dbi/dctm-clients:v1.0
dmadmin@dmclient:~/builds/documentum/clients$ docker image ls
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
dbi/dctm-clients      v1.0                85b59eb66c00        About an hour ago   625MB
...

Running the image

To run the image as a temporary container, use the command below:

dmadmin@dmclient:~/builds/documentum/clients$ docker run -d --rm --hostname=container-clients dbi/dctm-clients:v1.0

To run the image and keep the container under the name container-clients on exit:

dmadmin@dmclient:~/builds/documentum/clients$ docker run -d --name container-clients --hostname=container-clients dbi/dctm-clients:v1.0
# check it:
dmadmin@dmclient:~/builds/documentum/clients$ docker container ls
CONTAINER ID        IMAGE                   COMMAND                  CREATED             STATUS              PORTS                    NAMES
97dd2f14e3ef        dbi/dctm-clients:v1.0   "/bin/sh -c 'bash -c…"   5 seconds ago       Up 3 seconds                                 container-clients
...

Good. Let’s now move on and test the container. Note that usually we specify a network the container should be connected to, either when starting it or later, as shown in the example below with a custom bridge network named dctmbrnet:

# connect the container to a custom bridge network after it has been started:
docker network connect dctmbrnet container-clients
 
# connect the container to a custom bridge network when starting it up:
dmadmin@dmclient:~/builds/documentum/clients$ docker run -d --name container-clients --hostname=container-clients --network dctmbrnet dbi/dctm-clients:v1.0

Testing the container-clients container from the inside

In this scenario, we first need to enter the container and then run the usual command-line clients.
Let’s suppose the container is connected to some network that allows it to access remote containerized repositories. Such a network can be a simple bridge network if the repositories’ containers and the container-clients container are on the same host. Or it can be an overlay network or a macvlan network if the containers are distributed across several hosts. Or it can be any non-docker network, e.g. one managed by Kubernetes. So, let’s first enter the container-clients container and then test the tools:

# enter container-clients from its host:
dmadmin@dmclient:~/builds/documentum/clients$ docker exec -it container-clients /bin/bash -l
 
# connect to the container02's dmtest02 repository:
[dmadmin@container-clients dctm]$ widql dmtest02:container02:1489 -Udmadmin -Pdmadmin
 
 
OpenText Documentum idql - Interactive document query interface
Copyright (c) 2018. OpenText Corporation
All rights reserved.
Client Library Release 16.4.0070.0035
 
 
Connecting to Server using docbase dmtest02
[DM_SESSION_I_SESSION_START]info: "Session 0100c35080002a52 started for user dmadmin."
 
 
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
1> quit
Bye
Connection to dmclient closed.

Remember that widql/wiapi/etc… end up calling the native Documentum tools idql/iapi/etc… So, the containerized clients do work as expected.

Testing the container-clients container from its host

Since we are logged on the container’s host, we can pass commands to the container on the command-line, as shown below:

dmadmin@dmclient:~/builds/documentum/clients$ docker exec -it container-clients bash -l widql dmtest02:container02:1489 -Udmadmin -Pdmadmin
...
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
1> quit
Bye
Connection to dmclient closed.

Note that we use widql here because we don’t want to manually edit the container-clients’ dfc.properties file in advance; widql will do it on-the-fly for us.
The above command can be shortened by defining the following aliases:

alias widqlc='docker exec -it container-clients bash -l widql'
alias wiapic='docker exec -it container-clients bash -l wiapi'
alias dmawkc="docker exec -it container-clients bash -l wdmawk"
alias wdmqc='docker exec -it container-clients bash -l dmqdocbroker'

Their usage would be as simple as:

dmadmin@dmclient:~/builds/documentum/clients$ widqlc dmtest02:container02:1489 -Udmadmin -Pdmadmin
dmadmin@dmclient:~/builds/documentum/clients$ wdmqc -t container02 -p 1489 -i
dmadmin@dmclient:~/builds/documentum/clients$ dmawkc -v dmtest02:container02:1489 '{print}' ~/.bash_profile

container-clients will connect to the repositories on the host’s behalf. As such, container-clients works as a proxy for the host.

Testing the container-clients container from a different host

In this scenario, we are logged on a machine different from the container’s host. Since we did not install ssh in the container, it is not possible to run something like “ssh dmadmin@container-clients widql dmtest02:container02:1489”, even supposing that the container is accessible through the network (we’d need to connect it to a macvlan network for example). But, if allowed to, we could access its host via ssh and ask it politely to send a command to the container, as exemplified below:

dmadmin@dmclient2:~$ ssh -t dmadmin@dmclient docker exec -it container-clients bash -l widql dmtest02:container02:1489 -Udmadmin -Pdmadmin
...
Connecting to Server using docbase dmtest02
[DM_SESSION_I_SESSION_START]info: "Session 0100c35080002b1a started for user dmadmin."
 
 
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
1> quit
Bye
Connection to dmclient closed.

which again could be simplified through a local alias:

alias widqlc="ssh -t dmadmin@dmclient docker exec -it container-clients bash -l widql"

and invoked thusly:

dmadmin@dmclient2:~$ widqlc dmtest02:container02:1489 -Udmadmin -Pdmadmin
...
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
1> quit
Bye
Connection to dmclient closed.

Testing the container-clients container from another container

A similar scenario would be for a container (instead of a normal host) to use container-clients to access a remote repository. In this case, the ssh client is needed in the source container and the ssh server on the host dmclient (which generally has it already) and, supposing it has been installed, the same “ssh -t dmadmin@dmclient docker exec…” command as above, or an alias thereof, could be used.
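As a sketch, reusing the article’s hostnames and credentials, the call from inside such a source container would look like this (assuming an ssh client has been installed in it and the dmadmin account on dmclient is reachable):

# from inside the source container, go through the docker host dmclient which relays the command:
ssh -t dmadmin@dmclient docker exec -it container-clients bash -l widql dmtest02:container02:1489 -Udmadmin -Pdmadmin
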
Of course, if container-clients has the whole ssh package installed and enabled, there is no longer any need to go through its host machine to invoke the tools; a command like “ssh -t dmadmin@container-clients widql dmtest02:container02:1489” could be used directly, or again a shortcut for it, from any node on the network, be it a container or a traditional host. The dmadmin account’s credentials are now needed, but this is not a big deal.

Conclusion

The container-clients container saves us from installing the Documentum clients over and over again on administrative machines, when that is possible at all. Having such a container available also avoids having to log on to servers just to use their locally installed clients, again if that is possible at all, with less risk of polluting or corrupting the installation. We have used it only with containers so far but, network permitting, it could also be used to access remote, non containerized repositories just as easily and transparently. Although quite useful as-is, it could be made even more so by adding Documentum Administrator to it. This is shown in the next article, Documentum Administrator in a Container.

This article A Small Footprint Docker Container with Documentum command-line Tools appeared first on Blog dbi services.

Documentum Administrator in a Container


In the article A Small Footprint Docker Container with Documentum command-line Tools, we presented a possible way to containerize the usual Documentum clients idql, iapi, dmawk and dmqdocbroker. Those are command-line tools mostly used in batch processing or for one-off manual tasks such as investigating or troubleshooting. OpenText also offers Documentum Administrator (aka DA), a java WDK-based client which is very useful for occasional manual tasks. With its GUI running in a browser, it gives a 2-dimensional, point-and-click view of a repository’s objects. As it is so useful, let’s add it to our transportable toolbox. Ideally, it should be included in the aforementioned container, which we did, but at more than 1 GB for DA alone, the footprint is now much larger, almost by a factor of 3! We tried to keep it as small as possible by using an existing remote global registry repository, but the image’s size is still slightly below 1 GB now. Anyway, traditional storage is cheap nowadays and the idea of having a complete installation of DA in just a few minutes, as many times as we want, is a tremendous incentive, so let’s see how we did it.

Overview

Since we want to enhance our Documentum clients’ toolbox, it makes sense to incrementally install DA in the same image. As this image already embeds a JDK, there is no need to install one: the application server and DA will use the clients’. Remember that all those clients are DFC programs and therefore need a JVM to run. An alternative solution would be to containerize DA separately from the command-line clients; in that case, java would have to be installed as a top-most pre-requisite.
In order to keep the image’s size as small as possible, and also to avoid any additional licensing issue, we will deploy DA v16.4 on the open source tomcat application server. Tomcat’s installation is straightforward: just uncompress its tarball in its destination directory. Deployment of DA is similarly simple: extract its war file’s content in a directory, say da, under tomcat’s webapps sub-directory. Once done, the configuration must be applied by editing several files according to the OpenText™ Documentum® Platform and Platform Extensions Version 16.4 Installation Guide and the OpenText™ Documentum® Web Development Kit and Webtop Deployment Guide. Of course, since we are building images, all these steps will be scripted.
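As a rough sketch of those manual steps (the paths and archive names are the ones used later in the dockerfile; /path/to/da.war is a placeholder and the JDK’s jar tool is assumed to be in the PATH):

# 1. uncompress the tomcat tarball into its destination directory;
tar xf apache-tomcat-8.5.43.tar.gz -C /app/tomcat
# 2. explode DA's war file into a da directory under tomcat's webapps;
mkdir -p /app/tomcat/apache-tomcat-8.5.43/webapps/da
cd /app/tomcat/apache-tomcat-8.5.43/webapps/da
jar xf /path/to/da.war
# 3. finally, edit the configuration files as per the OpenText installation and WDK deployment guides;
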
DA requires a global registry repository to store preferences and other settings. While this dependency looks superficial and non-essential to its working, it is mandatory: DA won’t work at all if that repository is missing. Here are the related error messages output in the catalina.out log file:

21:01:04,494 ERROR [localhost-startStop-1] com.documentum.web.common.Trace - Preference repository is not available, reason: Preference Repository is not reachable.
com.documentum.web.env.PreferenceRepository$DisabledException: Preference Repository is not reachable.
at com.documentum.web.env.PreferenceRepository.initialize(PreferenceRepository.java:306)
at com.documentum.web.env.PreferenceRepository.(PreferenceRepository.java:217)
at com.documentum.web.env.PreferenceRepository.(PreferenceRepository.java:61)
at com.documentum.web.env.PreferenceRepository$ApplicationListener.notifyApplicationStart(PreferenceRepository.java:76)
...

A nastier one, whose relationship with a missing global registry repository is less than obvious, is also output later:

21:01:30,299 ERROR [http-nio-8080-exec-8] com.documentum.web.common.Trace - Encountered error in error message component jsp
java.lang.NullPointerException
at com.documentum.web.formext.config.ConfigThemeResolver.getTheme(ConfigThemeResolver.java:167)
at com.documentum.web.formext.config.ConfigThemeResolver.getStylesheets(ConfigThemeResolver.java:207)
at com.documentum.web.common.BrandingService$ThemeResolverWrapper.getStylesheets(BrandingService.java:257)
at com.documentum.web.form.WebformIncludes.renderStyleSheetIncludes(WebformIncludes.java:175)
...

Apparently, themes are also stored in that repository, or depend on information in it. Anyway, in order to ease up the image creation, we will use an existing remote repository from another container. In the alternative, stand-alone DA solution, the global registry would be installed alongside DA in the same container.
All the containers are linked here through a docker overlay network because they sit on distinct hosts.

The dockerfile

Here it is:

# syntax = docker/dockerfile:1.0-experimental
# we are using the secret mount type;
# cec - dbi-services - August 2019

FROM dbi/dctm-clients:v1.0
MAINTAINER "cec"

ARG soft_repo
ARG INSTALL_OWNER
ARG INSTALL_BASE
ARG INSTALL_HOME
ARG INSTALL_TMP
ARG INITIAL_DOCBROKER
ARG INITIAL_DOCBROKER_PORT
ARG DOCBASE_ADMIN
ARG DOCUMENTUM
ARG DOCUMENTUM_OWNER
ARG GLOBAL_REGISTRY
ARG JAVA_HOME
ARG TOMCAT_VERSION
ARG CATALINA_BASE
ARG CATALINA_HOME
ARG CATALINA_TMPDIR
ARG JRE_HOME
ARG CLASSPATH

USER root
RUN sed --in-place --regexp-extended -e 's|(%wheel)|\# \1|' -e 's|^#+ *(%wheel[^N]+NOPASSWD)|\1|' /etc/sudoers && \
    useradd --shell /bin/bash --home-dir /home/${INSTALL_OWNER} --create-home ${INSTALL_OWNER} && usermod -a -G wheel ${INSTALL_OWNER}

# set the $INSTALL_OWNER's password passed in the secret file;
RUN --mount=type=secret,id=dctm-secrets,dst=/tmp/dctm-secrets . /tmp/dctm-secrets && echo ${INSTALL_OWNER}:"${INSTALL_OWNER_PASSWORD}" | /usr/sbin/chpasswd

RUN mkdir -p ${INSTALL_TMP} ${CATALINA_BASE}                                                   && \
    chown -R ${INSTALL_OWNER}:${INSTALL_OWNER} ${INSTALL_HOME} ${INSTALL_TMP} ${CATALINA_BASE} && \
    chmod -R 755 ${INSTALL_HOME} ${INSTALL_TMP} ${CATALINA_BASE}

# copy the binary files;
COPY ${soft_repo}/da.war ${soft_repo}/DA_16.4.0120.0023.zip ${soft_repo}/${TOMCAT_VERSION}.tar.gz ${soft_repo}/docbrokers.tar ${INSTALL_TMP}/.
COPY ${soft_repo}/entrypoint.sh ${INSTALL_HOME}/.
RUN chown ${INSTALL_OWNER}:${INSTALL_OWNER} ${INSTALL_TMP}/* ${INSTALL_HOME}/entrypoint.sh && chmod u=rwx,g=rx,o=r ${INSTALL_HOME}/entrypoint.sh

# make the CLI comfortable again;
USER ${INSTALL_OWNER}
RUN echo >> ${HOME}/.bash_profile                                                                               && \
    echo "set -o vi" >> ${HOME}/.bash_profile                                                                   && \
    echo "alias ll='ls -alrt'" >> ${HOME}/.bash_profile                                                         && \
    echo "alias psg='ps -ef | grep -i'" >> ${HOME}/.bash_profile                                                && \
    echo "export JAVA_HOME=${JAVA_HOME}" >> ${HOME}/.bash_profile                                               && \
    echo "export TOMCAT_VERSION=${TOMCAT_VERSION}" >> ${HOME}/.bash_profile                                     && \
    echo "export CATALINA_BASE=${CATALINA_BASE}" >> ${HOME}/.bash_profile                                       && \
    echo "export CATALINA_HOME=${CATALINA_HOME}" >> ${HOME}/.bash_profile                                       && \
    echo "export CATALINA_TMPDIR=${CATALINA_TMPDIR}" >> ${HOME}/.bash_profile                                   && \
    echo "export JRE_HOME=\${JAVA_HOME}/jre" >> ${HOME}/.bash_profile                                           && \
    echo "export CLASSPATH=\${CATALINA_HOME}/bin/bootstrap.jar:\${CATALINA_HOME}/bin/tomcat-juli.jar" >> ${HOME}/.bash_profile && \
    echo "export PATH=.:\${CATALINA_HOME}/bin:\${JAVA_HOME}/bin:\$PATH" >> ${HOME}/.bash_profile                && \
    echo >> ${HOME}/.bash_profile

# install tomcat and configuration as per the OpenText™Documentum® Platform and Platform Extensions Version 16.4 Installation Guide;
USER ${INSTALL_OWNER}
RUN tar xvf ${INSTALL_TMP}/${TOMCAT_VERSION}.tar.gz --directory ${CATALINA_BASE}/../. && \
    sed --in-place --regexp-extended "$ a \\\n# custo for DA\norg.apache.jasper.compiler.Parser.STRICT_WHITESPACE=false\njnlp.com.rsa.cryptoj.fips140loader=true" ${CATALINA_HOME}/conf/catalina.properties && \
    sed --in-place --regexp-extended -e "/<load-on-startup>3<\/load-on-startup>/i <init-param\>\\n   <param-name>enablePooling<\/param-name>\\n   <param-value>false<\/param-value>\\n<\/init-param>" -e "s|(<session-timeout>)[0-9]+|\11440|" -e "/<session-config>/a <cookie-config>\\n<http-only>false<\/http-only>\\n<!--secure>true<\/secure-->\\n<\/cookie-config>" ${CATALINA_HOME}/conf/web.xml && \
    sed --in-place --regexp-extended "s|(<Context)|\1 useHttpOnly=\"false\"|" ${CATALINA_HOME}/conf/context.xml && \
    sed --in-place --regexp-extended "/<Connector port=\"8080\"/a compression=\"on\"\\ncompressionMinSize=\"2048\"\\ncompressableMimeType=\"text\/html,text\/xml,application\/xml,text\/plain,text\/css,text\/javascript,text\/json,application\/x-javascript,application\/javascript,application\/json\"\\nuseSendfile=\"false\"" ${CATALINA_HOME}/conf/server.xml && \
    touch ${CATALINA_HOME}/bin/setenv.sh && sed --in-place --regexp-extended '$ a JAVA_OPTS="-server -XX:+UseParallelOldGC -Xms256m -Xmx1024m"' ${CATALINA_HOME}/bin/setenv.sh && \
    sed --in-place --regexp-extended -e '/(<Valve .*)/i <\!--' -e '/<Manager /i -->' ${CATALINA_HOME}/webapps/manager/META-INF/context.xml && \
    sed --in-place --regexp-extended -e '/(<Valve .*)/i <\!--' -e '/<Manager /i -->' ${CATALINA_HOME}/webapps/host-manager/META-INF/context.xml && \
    sed --in-place --regexp-extended "/<\/tomcat-users>/i \ \ <role rolename=\"manager-gui\"\/>\\n\ \ <role rolename=\"manager-script\"\/>\\n\ \ <role rolename=\"manager-jmx\"\/>\\n\ \ <role rolename=\"manager-status\"\/>\\n\ \ <user username=\"tomcat\" password=\"tomcat\" roles=\"manager-gui,admin-gui,admin-script,manager-script,manager-jmx,manager-status\"\/>" ${CATALINA_HOME}/conf/tomcat-users.xml && \
    tar xvf ${INSTALL_TMP}/docbrokers.tar --directory ${CATALINA_BASE}/webapps/.

# deployment of da.war and edition of WEB-INF/classes/dfc.properties;
RUN . ${HOME}/.bash_profile && mkdir ${CATALINA_HOME}/webapps/da && cd ${CATALINA_HOME}/webapps/da && mv ${INSTALL_TMP}/da.war . && jar xvf ./da.war && rm ${CATALINA_HOME}/webapps/da/da.war && \
    mv ${INSTALL_TMP}/DA_16.4.0120.0023.zip ${CATALINA_HOME}/webapps/da && unzip -o DA_16.4.0120.0023.zip && rm DA_16.4.0120.0023.zip && \
    touch ${CATALINA_HOME}/webapps/da/WEB-INF/classes/dfc.properties && mkdir ${INSTALL_HOME}/dfc_data && \
    sed --in-place --regexp-extended -e "s|^(dfc\.data\.dir *=.*)|# \1|" -e "1s|^|dfc\.data\.dir=${INSTALL_HOME}\/dfc_data\\n|" -e "1s|^|#include ${DOCUMENTUM}/config/dfc.properties\\n|" -e "$ a dfc.docbroker.host[0]=${INITIAL_DOCBROKER}\\ndfc.docbroker.port[0]=${INITIAL_DOCBROKER_PORT}" -e '/^dfc\.diagnostics\.resources\.enable *=.*/d' -e '$ a dfc\.diagnostics\.resources\.enable=false' -e '/^dfc\.config\.check_interval *=.*/d' -e '$ a dfc\.config\.check_interval = 5' ${CATALINA_HOME}/webapps/da/WEB-INF/classes/dfc.properties && \
    sed --in-place --regexp-extended "s|(<session-timeout>)[0-9]+|\11440|" ${CATALINA_HOME}/webapps/da/WEB-INF/web.xml

USER root
RUN --mount=type=secret,id=dctm-secrets,dst=/tmp/dctm-secrets . /tmp/dctm-secrets; su - ${INSTALL_OWNER} -c ". \${HOME}/.bash_profile; cd \${CATALINA_HOME}/webapps/da; passwords=\"`cd \${CATALINA_HOME}/webapps/da ; \${JAVA_HOME}/bin/java -cp WEB-INF/classes:WEB-INF/lib/dfc.jar:WEB-INF/lib/commons-io-1.2.jar com.documentum.web.formext.session.TrustedAuthenticatorTool ${DMC_WDK_PREFERENCES_PASSWORD} ${DMC_WDK_PRESETS_PASSWORD}`\"; gawk -v passwords=\"\$passwords\" 'BEGIN {ind = 0; while (match(passwords, /Encrypted: \[[^]]+\]/)) {pass_tab[ind++] = substr(passwords, RSTART + 12, RLENGTH - 13); passwords = substr(passwords, RSTART + RLENGTH)}} \
{if (match(\$0, /<presets>/)) {print; while (getline > 0 && !match(\$0, /<password><\/password>/)) print; print \"<password>\" pass_tab[1] \"</password>\"} else if (match(\$0, /<preferencesrepository>/)) {print; while (getline > 0 && !match(\$0, /<password><\/password>/)) print; print \"<password>\" pass_tab[0] \"</password>\"} else print}' ${CATALINA_HOME}/webapps/da/wdk/app.xml > /tmp/app.xml; mv /tmp/app.xml ${CATALINA_HOME}/webapps/da/wdk/app.xml; \
    sed --in-place --regexp-extended -e '/<cookie_validation>.+$/{N;s|true|false|}' ${CATALINA_HOME}/webapps/da/wdk/app.xml; \
DM_BOF_REGISTRY_PASSWORD_ENCRYPTED=`cd ${CATALINA_HOME}/webapps/da; \${JAVA_HOME}/bin/java -cp WEB-INF/classes:WEB-INF/lib/dfc.jar com.documentum.fc.tools.RegistryPasswordUtils ${DM_BOF_REGISTRY_PASSWORD}`; \
    sed --in-place --regexp-extended \"$ a dfc.globalregistry.repository=${GLOBAL_REGISTRY}\\ndfc.globalregistry.username=${DM_BOF_REGISTRY_USER}\\ndfc.globalregistry.password=\${DM_BOF_REGISTRY_PASSWORD_ENCRYPTED}\" ${CATALINA_HOME}/webapps/da/WEB-INF/classes/dfc.properties" && \
    find ${INSTALL_HOME} -user root -exec chown -R ${INSTALL_OWNER}:${INSTALL_OWNER} {} \;

# cleanup;
USER root
RUN rm -r ${INSTALL_TMP} /tmp/dctm-secrets

# start tomcat and DA;
USER ${INSTALL_OWNER}
WORKDIR ${INSTALL_HOME}
ENTRYPOINT ["/app/tomcat/entrypoint.sh"]

# build the image;
# cat - <<'eot' | gawk '{gsub(/#+ */, ""); print}'
# export INSTALL_OWNER=tomcat
# export INSTALL_BASE=/app
# export INSTALL_HOME=${INSTALL_BASE}/tomcat
# export INSTALL_TMP=${INSTALL_BASE}/tmp
# export INITIAL_DOCBROKER=container02
# export INITIAL_DOCBROKER_PORT=1489
# export DOCBASE_ADMIN=dmadmin
# export DOCUMENTUM=${INSTALL_BASE}/dctm
# export DOCUMENTUM_OWNER=dmadmin
# export GLOBAL_REGISTRY=dmtest02
# export JAVA_HOME=${DOCUMENTUM}/java64/JAVA_LINK
# export TOMCAT_VERSION=apache-tomcat-9.0.22
# export TOMCAT_VERSION=apache-tomcat-8.5.43
# export CATALINA_BASE=${INSTALL_HOME}/${TOMCAT_VERSION}
# export CATALINA_HOME=${CATALINA_BASE}
# export CATALINA_TMPDIR=${CATALINA_HOME}/temp
# export JRE_HOME=${JAVA_HOME}/jre
# export CLASSPATH=${CATALINA_HOME}/bin/bootstrap.jar:${CATALINA_HOME}/bin/tomcat-juli.jar
# cp entrypoint.sh docbrokers.tar files/.
# time DOCKER_BUILDKIT=1 docker build --squash --no-cache --progress=plain --secret id=dctm-secrets,src=./dctm-secrets \
#  --build-arg INSTALL_OWNER=${INSTALL_OWNER}                                                       \
#  --build-arg INSTALL_BASE=${INSTALL_BASE}                                                         \
#  --build-arg INSTALL_HOME=${INSTALL_HOME}                                                         \
#  --build-arg INSTALL_TMP=${INSTALL_TMP}                                                           \
#  --build-arg INITIAL_DOCBROKER=${INITIAL_DOCBROKER}                                               \
#  --build-arg INITIAL_DOCBROKER_PORT=${INITIAL_DOCBROKER_PORT}                                     \
#  --build-arg DOCBASE_ADMIN=${DOCBASE_ADMIN}                                                       \
#  --build-arg DOCUMENTUM=${DOCUMENTUM}                                                             \
#  --build-arg DOCUMENTUM_OWNER=${DOCUMENTUM_OWNER}                                                 \
#  --build-arg bSetGRPassword=${bSetGRPassword}                                                     \
#  --build-arg GLOBAL_REGISTRY=${GLOBAL_REGISTRY}                                                   \
#  --build-arg JAVA_HOME=${JAVA_HOME}                                                               \
#  --build-arg TOMCAT_VERSION=${TOMCAT_VERSION}                                                     \
#  --build-arg CATALINA_BASE=${CATALINA_BASE}                                                       \
#  --build-arg CATALINA_HOME=${CATALINA_HOME}                                                       \
#  --build-arg CATALINA_TMPDIR=${CATALINA_TMPDIR}                                                   \
#  --build-arg JRE_HOME=${JRE_HOME}                                                                 \
#  --build-arg CLASSPATH=${CLASSPATH}                                                               \
#  --tag="dbi/dctm-clients-da:v1.0" --build-arg soft_repo=./files                                   \
#  .
# eot

# retag the image & cleanup unused images;
# docker tag <image-id> dbi/dctm-clients-da:v1.0
# docker system prune

# run the image and remove the container on exit;
# docker run -d --rm --hostname=container-clients --network=dctmolnet01 --publish=8080:8080 dbi/dctm-clients-da:v1.0

# run the image and keep the container on exit;
# docker run -d --name container-clients --hostname=container-clients --network=dctmolnet01 --publish=8080:8080 dbi/dctm-clients-da:v1.0

As said, we chose to base DA’s image upon the container-clients’ one to inherit its O/S and JDK.
On lines 8 to 25, the build parameters are received in ARG variables. Unlike ENV variables, those won’t persist in the image, which is fine as they are not needed after the build any more.
On line 29, the account tomcat, owner of the application server installation, is created. For convenience, the tomcat user is also a sudoer, which under CentOS is easily achieved just by adding the user to the wheel group. On line 28, we also edit the sudoers file to allow a passwordless sudo command. On line 32, tomcat’s password is taken from the sourced secret file and set. Secrets are a useful, albeit still experimental, docker feature aimed at passing confidential information to the build without leaving any trace in the intermediate images, which is ideal for passwords.
Furthermore (see lines 67 to 69), the user tomcat has also been given full access to all of tomcat’s web applications, such as the manager and the host-manager; there is no reason not to.
On line 39, the binary packages are copied from a local sub-directory into the image. Here is the working directory’s layout:

dmadmin@dmclient:~/builds/documentum/da$ ll -R
total 72
-rw-rw-r-- 1 dmadmin dmadmin 459 Aug 25 13:45 dctm-secrets
-rw-rw-r-- 1 dmadmin dmadmin 150 Aug 28 22:58 entrypoint.sh
-rw-rw-r-- 1 dmadmin dmadmin 17451 Sep 1 14:09 Dockerfile
drwxrwxr-x 2 dmadmin dmadmin 4096 Sep 4 13:46 files
-rw-rw-r-- 1 dmadmin dmadmin 20480 Sep 6 13:47 docbrokers.tar
 
./files:
total 131852
-rw-rw-r-- 1 dmadmin dmadmin 9717059 Aug 20 11:44 apache-tomcat-8.5.43.tar.gz
-rw-rw-r-- 1 dmadmin dmadmin 114358079 Aug 27 17:45 DA_16.4.0120.0023.zip

The files sub-directory only contains the minimum binary archives to install into the image.
On line 70, the docbrokers application is deployed. More on this later.
On lines 73 to 77, DA is deployed and the recommended customizations are applied. Note on line 76 the #include statement of the client’s dfc.properties file. The reason will be explained later, although you can already figure out why (think of widql, etc.). On line 77, the session timeout is also increased to one day, for it is really annoying and stupid to have to authenticate 30 times a day in each application while investigating; doing that at the workstation level is plenty enough.
On lines 80 and 81, the presets and preferences accounts’ passwords are encrypted and inserted directly in wdk/app.xml. Documentum however recommends copying and editing the relevant sections into custom/app.xml, which would be a better practice. Just don’t tell anyone.
On lines 83 and 84, the same is done for the global registry’s user dm_bof_registry. All those accounts’ passwords must have been set beforehand in the global registry repository. Also, they are stored as clear-text values in the secret file which is passed to the build. Here is an example of the secret file used here:

dmadmin@dmclient:~/builds/documentum/da$ cat dctm-secrets
export INSTALL_OWNER=tomcat
export INSTALL_OWNER_PASSWORD=tomcat
 
export DMC_WDK_PREFERENCES_USER=dmc_wdk_preferences_owner
export DMC_WDK_PREFERENCES_PASSWORD=dmc_wdk_preferences_password
 
export DMC_WDK_PRESETS_USER=dmc_wdk_presets_owner
export DMC_WDK_PRESETS_PASSWORD=dmc_wdk_presets_password
 
export DM_BOF_REGISTRY_USER=dm_bof_registry
export DM_BOF_REGISTRY_PASSWORD=dm_bof_registry
 
export DOCBASE_ADMIN=dmadmin
export DOCBASE_ADMIN_PASSWORD=dmadmin

While the presets and preferences accounts’ passwords must be encrypted using the installed DA classes (and are therefore received in clear-text form), dm_bof_registry’s password could have been passed in encrypted form since it was already encrypted during the content server’s installation and stored in its local dfc.properties file. On the security side, however, this would have made no difference because the encrypted password can be used interchangeably with the clear-text password to log in! So instead of introducing an asymmetry here, we chose to (re-)encrypt all the passwords in one place using their respective tools (yes, they use different encryption algorithms and programs).
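For reference, here is a sketch of how those two tools are invoked manually; the class names and classpath entries are the ones used in the RUN lines above, and the passwords shown are placeholders:

cd ${CATALINA_HOME}/webapps/da
# encrypt the presets and preferences passwords to be pasted into wdk/app.xml;
${JAVA_HOME}/bin/java -cp WEB-INF/classes:WEB-INF/lib/dfc.jar:WEB-INF/lib/commons-io-1.2.jar \
   com.documentum.web.formext.session.TrustedAuthenticatorTool my_preferences_password my_presets_password
# encrypt the dm_bof_registry password to be added to WEB-INF/classes/dfc.properties;
${JAVA_HOME}/bin/java -cp WEB-INF/classes:WEB-INF/lib/dfc.jar \
   com.documentum.fc.tools.RegistryPasswordUtils my_dm_bof_registry_password
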
Note the mess on some RUN lines: nobody in their right mind would do it that way, but we just wanted to push quoting and escaping on the RUN command-line to the extreme, a kind of challenge, especially with a user account switch in between: being root is necessary to access the secrets but the rest must be done as user tomcat inside one long quoted string of commands. The complexity also comes from the need to access the dockerfile’s environment variables along with the in-line script’s; the latter must be escaped. Strangely enough, even the gawk script’s internal $0 variable must be escaped even though the script is bracketed inside single quotes. Normal people would use an external bash script invoked by the RUN statements. Let’s hope that the HTML rendition of the dockerfile for the article did not skip special characters such as the < and >.
On line 85, we restore proper ownership of some files that inexplicably ended up owned by root, and on line 89 the mount point of the secret file is removed.
On line 94, we define an entrypoint for starting the tomcat server. Here is its content:

dmadmin@dmclient:~/builds/documentum/da$ cat entrypoint.sh 

#!/bin/bash
. /home/tomcat/.bash_profile
${CATALINA_BASE}/bin/startup.sh
tail -F ${CATALINA_HOME}/logs/catalina.out

Note the -F option on the tail command; we follow the log file by name so this keeps working even after a log file rotation (by default, every 24 hours).
After the web application server is started, the running container will just tail the main log to stdout. The command

docker logs --follow --timestamps <da-container>

can then be used to follow the container’s stdout from outside the container.
Lines 97 to 137 explain how to build the image. See the next paragraph for doing this.
Line 116 copies the entrypoint.sh script and the docbrokers application into the sub-directory from which the dockerfile will import files into the image.
Note on line 142 the prune command. It is very useful to remove unused images and build caches, along with other objects, which reclaims lots of GB. We wish we had known this command earlier, it would have spared us lots of volume enlargement sessions in VirtualBox/GParted! As it also removes stopped containers, be sure this is acceptable to you. We reckon docker containers are meant to be easily disposable and recreated, but sometimes it takes time (e.g. about half an hour for a containerized repository) and it can be annoying to involuntarily destroy such containers.
Lines 145 and 148 show two ways to run the image, either as an ephemeral container or as a permanent one. This is discussed later in its own paragraph.

Building the image

In order to extract the statements from the dockerfile, copy/paste into a shell prompt the text starting at the “cat” command down to and including the “.” (lines 97 to 137 of the dockerfile) and add an “eot” here-document terminator on a new line, then copy/paste all the generated statements at a prompt. Those statements could as well be wrapped into a bash script if so wished. Or just recalled from the command-line’s history if a few cycles of adjustments and testing are necessary.
Here are the build’s statements:

export INSTALL_OWNER=tomcat
export INSTALL_BASE=/app
export INSTALL_HOME=${INSTALL_BASE}/tomcat
export INSTALL_TMP=${INSTALL_BASE}/tmp
export INITIAL_DOCBROKER=container02
export INITIAL_DOCBROKER_PORT=1489
export DOCBASE_ADMIN=dmadmin
export DOCUMENTUM=${INSTALL_BASE}/dctm
export DOCUMENTUM_OWNER=dmadmin
export GLOBAL_REGISTRY=dmtest02
export JAVA_HOME=${DOCUMENTUM}/java64/JAVA_LINK
export TOMCAT_VERSION=apache-tomcat-8.5.43
export CATALINA_BASE=${INSTALL_HOME}/${TOMCAT_VERSION}
export CATALINA_HOME=${CATALINA_BASE}
export CATALINA_TMPDIR=${CATALINA_HOME}/temp
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=${CATALINA_HOME}/bin/bootstrap.jar:${CATALINA_HOME}/bin/tomcat-juli.jar
cp entrypoint.sh docbrokers.tar files/.
time DOCKER_BUILDKIT=1 docker build --squash --no-cache --progress=plain --secret id=dctm-secrets,src=./dctm-secrets \
--build-arg INSTALL_OWNER=${INSTALL_OWNER}                                                       \
--build-arg INSTALL_BASE=${INSTALL_BASE}                                                         \
--build-arg INSTALL_HOME=${INSTALL_HOME}                                                         \
--build-arg INSTALL_TMP=${INSTALL_TMP}                                                           \
--build-arg INITIAL_DOCBROKER=${INITIAL_DOCBROKER}                                               \
--build-arg INITIAL_DOCBROKER_PORT=${INITIAL_DOCBROKER_PORT}                                     \
--build-arg DOCBASE_ADMIN=${DOCBASE_ADMIN}                                                       \
--build-arg DOCUMENTUM=${DOCUMENTUM}                                                             \
--build-arg DOCUMENTUM_OWNER=${DOCUMENTUM_OWNER}                                                 \
--build-arg GLOBAL_REGISTRY=${GLOBAL_REGISTRY}                                                   \
--build-arg JAVA_HOME=${JAVA_HOME}                                                               \
--build-arg TOMCAT_VERSION=${TOMCAT_VERSION}                                                     \
--build-arg CATALINA_BASE=${CATALINA_BASE}                                                       \
--build-arg CATALINA_HOME=${CATALINA_HOME}                                                       \
--build-arg CATALINA_TMPDIR=${CATALINA_TMPDIR}                                                   \
--build-arg JRE_HOME=${JRE_HOME}                                                                 \
--build-arg CLASSPATH=${CLASSPATH}                                                               \
--tag="cec/dctm-clients-da:v1.0" --build-arg soft_repo=./files                                   \
.

The build takes up to 5 minutes, with 40s spent in the --squash option alone, so if you modify and enhance the dockerfile, you may want to activate this option only at the end, once the dockerfile is fully debugged, in order to speed up the intermediate build cycles.
Let’s see the image size:

dmadmin@dmclient:~/builds/documentum/da$ docker image ls
REPOSITORY            TAG                 IMAGE ID            CREATED              SIZE
<none>                <none>              72e7e3a93146        10 seconds ago       948MB
cec/dctm-clients-da   v1.0                3f7caa1979cf        About a minute ago   1.38GB
cec/dctm-clients      v1.0                e8f7a22479a6        About an hour ago    625MB

We notice more than 300 MB of additional size (roughly a 50% increase) after DA is installed in container-clients. The --squash option has removed about 400 MB of overlaid filesystem space, from 1.38 GB down to 948 MB. The toolbox with the Documentum command-line clients plus Documentum Administrator now weighs 948 MB, quite the toolbox!

Running the image

Launching the image to get either a temporary or a permanent container is possible, see lines 145 and 148 respectively in the dockerfile. When fired up, the container starts tomcat and deploys all its applications. For our tests, the container is plugged into a docker overlay network to be able to access repositories running in other containers, and DA is accessed using the host’s hostname on port 8080, e.g. http://192.168.56.10:8080/da. If several such containers run on the same host, just publish their internal port 8080 to different host ports to avoid any conflict, as sketched below.
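A sketch (the container names are invented for the example; the image, network and base command are the ones shown in the dockerfile comments above):

# first DA container, reachable on host port 8080:
docker run -d --name container-clients-da1 --hostname=container-clients-da1 --network=dctmolnet01 --publish=8080:8080 dbi/dctm-clients-da:v1.0
# second DA container, reachable on host port 8081:
docker run -d --name container-clients-da2 --hostname=container-clients-da2 --network=dctmolnet01 --publish=8081:8080 dbi/dctm-clients-da:v1.0
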
Once running, the familiar login screen is displayed. At this point, the only accessible repository presented in the drop-down control is the global registry, which is generally not very useful. To add new repositories, we could enter the container and edit ${CATALINA_BASE}/webapps/da/WEB-INF/classes/dfc.properties, but we would then fall into the same limitation as with the command-line tools and their ${DOCUMENTUM}/config/dfc.properties file, which we solved using dctm-wrapper and its symlinks widql, wiapi, etc. (see the article Connecting to a Repository via a Dynamically Edited dfc.properties File (part I)). Can we do the same here with DA? We can of course, else we wouldn’t have asked the rhetorical question 😉 . Please, see the next three paragraphs.

Using the command-lines tools in the DA container

As said before, the DA image is built upon the command-line tools’ image. Therefore, it includes them and they are still available in a DA container. The difference with A Small Footprint Docker Container with Documentum command-line Tools is that the default container’s user is now tomcat instead of dmadmin. Hence, the docker exec commands must specify the user that owns the tools, as shown below:

dmadmin@dmclient:~/builds/documentum/clients$ docker exec -it --user dmadmin container-clients-da bash -l widql dmtestgr02:containergr02:1489 -Udmadmin -Pdmadmin
 
 
OpenText Documentum idql - Interactive document query interface
Copyright (c) 2018. OpenText Corporation
All rights reserved.
Client Library Release 16.4.0070.0035
 
 
Connecting to Server using docbase dmtest02
[DM_SESSION_I_SESSION_START]info: "Session 0100c3508000351c started for user dmadmin."
 
 
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
1> quit
Bye

The rest of the linked article still applies here.

Using the containerized DA

In order to access repositories, their docbroker must be added to DA’s ${CATALINA_BASE}/webapps/da/WEB-INF/classes/dfc.properties file. Besides manually editing this file, which is always tedious, it is possible to use a smarter trick: connect to the repository of interest through the container’s w% tools such as widql (see the aforementioned article), specifying that the new docbroker should be appended and kept in the dfc.properties file on exiting. Since this file is by default re-read every 30 seconds (our image even lowers dfc.config.check_interval to 5 seconds), the new configuration will be available after this delay at most. The drop-down list of reachable repositories on the login screen will then display all the repositories that project to that new docbroker along with the existing docbrokers’ ones.
Example:

# first, display dmadmin's current dfc.properties file in container-client-da:
[dmadmin@container-clients-da ~]$ cat /app/dctm/config/dfc.properties
dfc.data.dir=/app/dctm
dfc.tokenstorage.dir=/app/dctm/apptoken
dfc.tokenstorage.enable=false
dfc.docbroker.host[0]=dummy
dfc.docbroker.port[0]=1489

The host/port pair at index 0 is assigned dummy values so that values at the same index position in DA’s own dfc.properties file are not hidden (they are used for the global registry).
Let’s now connect using the container’s widql tool as dmadmin from its host (as an example; network permitting, this could be done from anywhere):

dmadmin@dmclient:~/builds/documentum/da$ widqlc dmtestgr02:containergr02:1489 --keep --append
...
Connected to OpenText Documentum Server running Release 16.4.0080.0129 Linux64.Oracle
1> quit

Now, let’s display dmadmin’s current dfc.properties file in the container again:

[dmadmin@container-clients ~]$ cat /app/dctm/config/dfc.properties
dfc.data.dir=/app/dctm
dfc.tokenstorage.dir=/app/dctm/apptoken
dfc.tokenstorage.enable=false
dfc.docbroker.host[0]=dummy
dfc.docbroker.port[0]=1489
dfc.docbroker.host[1]=containergr02
dfc.docbroker.port[1]=1489

The new entries have been added at index position 1.
Log off/on DA and the repository dmtestgr02 should be listed in the Repository drop-down control.
Note that the widqlc command used is an alias for:

dmadmin@dmclient:~/builds/documentum/da$ alias widqlc
alias widqlc='docker exec -it --user dmadmin dctm-clients-da bash -l widql'

as explained in the previous paragraph.
But is there a still better, clearer way than switching tools? Yes, indeed. It is the purpose of the custom docbrokers application that is deployed along DA. Read on.

The docbrokers application

What about calling up a web page to show or edit DA’s dfc.properties file? E.g.:

http://da-container:8080/docbrokers/show
http://da-container:8080/docbrokers/append?dmtest02:container02:1489
http://da-container:8080/docbrokers/remove?index=n

These are the functions implemented by the docbrokers application:
show just displays in a web page the current content of the DA’s dfc.properties file;
append adds a couple of dfc.docbroker.host/dfc.docbroker.port to DA’s dfc.properties file using the next available index;
remove deletes the host/port pair at the given index.
The default page, index, is loaded when no action is specified in the URL, e.g. http://da-container:8080/docbrokers. It simply displays a list of the supported actions. Here is its content:

#! /bin/bash
# cec - dbi-services - September 2019
 
# the index.html page;
# Usage:
#        http://...:.../docbrokers
# it displays the available functions in a html page;
 
cat <<eot
Content-type: text/html
 
 
<html>
<head><title>docbrokers application's available functions</title></head>
<body bgcolor=#F3E2A9 text=black>
<b>To show the current content of the $CATALINA_BASE/webapps/da/WEB-INF/classes/dfc.properties: </b><button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/docbrokers/show', '_blank');">Show</button>
<br>
Use: http://${HTTP_HOST}/docbrokers/show
<br>
<br>
<b>To append an entry in the dfc.properties file: </b><button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/docbrokers/append', '_blank');">Append</button>
<br>
Use: http://${HTTP_HOST}/docbrokers/append?[repository:]machine:port
<br>
<br>
<b>To remove the index-th entry from the dfc.properties file: </b><button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/docbrokers/remove?index=n', '_blank');">Remove</button>
<br>
Use: http://${HTTP_HOST}/docbrokers/remove?index=n
<br>
<br>
</body>
<button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/da/component/logoff', '_blank');">To logged out screen</button>
</html>
eot

To keep things simple, append reuses the bash dctm-wrapper script, which forces us to do some CGI programming. Fortunately, Tomcat includes a normally disabled CGI servlet, and it will let us do just that.
Here is the show function:

#! /bin/bash
# cec - dbi-services - September 2019
# the show function;
# Usage:
#        http://...:.../docbrokers/show
# it displays in a html page the content of the dfc.properties file pointed to by $DFC_CONFIG;
 
cat <<eot
Content-type: text/html
 
 
<html>
<head><title>Show dfc.properties entries</title></head>
<body bgcolor=#F3E2A9 text=black>
<b>Here is the current content of the $CATALINA_BASE/webapps/da/WEB-INF/classes/dfc.properties:</b>
<br>
<br>
$(cat $CATALINA_BASE/webapps/da/WEB-INF/classes/dfc.properties | gawk '{printf("%s\n<br>\n", $0)}')
<br>
</body>
<button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/da/component/logoff', '_blank');">To logged out screen</button>
</html>
eot

Here is the append function:

#! /bin/bash
# cec - dbi-services - September 2019
# the append function;
# Usage:
#        http://...:.../docbrokers/append?repo:machine:port
# it invokes the dctm-wrapper script to add an entry into the dfc.properties file pointed to by $DFC_CONFIG;
# the above will append the entries dfc.docbroker.host[i]=machine and dfc.docbroker.port[i]=port
# where i is the next available index;
 
export DFC_CONFIG=/app/tomcat/apache-tomcat-8.5.43/webapps/da/WEB-INF/classes/dfc.properties
 
cat <<eot
Content-type: text/html
 
 
<meta http-equiv="pragma" content="no-cache" />
<html>
<head><title>Docbrokers append</title></head>
<body bgcolor=#F3E2A9 text=black>
<br>
<b>$DFC_CONFIG file before:</b>
<br>
$(cat $DFC_CONFIG | gawk '{printf("%s\n<br>\n", $0)}')
<br>
eot
 
[[ ! -z "$QUERY_STRING" ]] && chmod a+r $DFC_CONFIG; sudo su - dmadmin -c "export DFC_CONFIG=${DFC_CONFIG}; /app/scripts/dctm-wrapper "$(if [[ -z $(echo $QUERY_STRING | cut -d: -f 3) ]]; then echo dummy:$QUERY_STRING; else echo $QUERY_STRING; fi)" --append" > /tmp/$$; mv /tmp/$$ $DFC_CONFIG
 
cat <<eot
<b>$DFC_CONFIG file after:</b>
<br>
$(cat $DFC_CONFIG | gawk '{printf("%s\n<br>\n", $0)}')
<br>
 
</body>
<button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/da/component/logoff', '_blank');">To logged out screen</button>
</html>
eot

Here is the remove function:

#! /bin/bash
# cec - dbi-services - September 2019
# the remove function;
# Usage:
#        http://...:.../docbrokers/remove?entry=n
# it removes the nth entries dfc.docbroker.host[n] & dfc.docbroker.port[n] from the dfc.properties file pointed to by $DFC_CONFIG;
 
export DFC_CONFIG=${CATALINA_BASE}/webapps/da/WEB-INF/classes/dfc.properties
 
cat <<eot
Content-type: text/html
 
 
<html>
<head><title>Docbrokers remove</title></head>
<body bgcolor=#F3E2A9 text=black>
 
<b>$DFC_CONFIG file before:</b>
<br>
$(cat $CATALINA_BASE/webapps/da/WEB-INF/classes/dfc.properties | gawk '{printf("%s\n<br>\n", $0)}')
<br>
eot
 
# syntax is: http://da-container:8080/docbrokers/remove?index=n
gawk -v QUERY_STRING="$QUERY_STRING" 'BEGIN {
   nb_fields = split(QUERY_STRING, tab, "=")
   if (nb_fields != 2 || tab[1] != "index" || !match(tab[2], /^[0-9]+$/))
      tab[2]="n"
   index_to_remove = tab[2]
}
{
   if (!match($0, /^dfc.docbroker.host\[[0-9]+\]=/)) print
   else {
      match($0, /\[[0-9]+\]/); index_number = substr($0, RSTART + 1, RLENGTH - 2)
      if (index_number != index_to_remove) {
         print;
         getline; print
      }
      else getline
   }
}' $DFC_CONFIG > /tmp/$$; mv /tmp/$$ $DFC_CONFIG
 
cat <<eot
<b>$DFC_CONFIG file after:</b>
<br>
$(cat $DFC_CONFIG | gawk '{printf("%s\n<br>\n", $0)}')
<br>
 
</body>
<button type="button" autofocus onclick="window.open('http://${HTTP_HOST}/da/component/logoff', '_blank');">To logged out screen</button>
</html>
eot

All these scripts belong to the new docbrokers application; they are archived in docbrokers.tar which is later exploded in the webapps directory (see line 70 in the dockerfile). All the configuration changes at the tomcat level (e.g. to activate the CGI servlet) are located in the application’s own WEB-INF/context.xml and WEB-INF/web.xml so they don’t interfere with the other deployed applications.

With these functions, it is very easy to edit the containerized DA’s dfc.properties file. After the change, a click on the button “To logged out screen” will display DA’s logoff page from which the login page is one click away. This step is necessary to make sure the Repository drop-down control is properly updated to reflect the requested change.
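These URLs can also be driven without a browser, for instance with curl from any machine that can reach the DA container’s published port (a sketch; the IP and port are the ones used earlier in this article, the docbroker is the article’s example and index=1 is just an illustration):

# show the current content of DA's dfc.properties;
curl "http://192.168.56.10:8080/docbrokers/show"
# append a new docbroker entry;
curl "http://192.168.56.10:8080/docbrokers/append?dmtest02:container02:1489"
# remove the host/port pair at index 1;
curl "http://192.168.56.10:8080/docbrokers/remove?index=1"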

Conclusion

Our toolbox now has a few basic tools to help us with some of our daily administrative tasks. While quite useful as-is, the containerized DA could still be enhanced to remove its dependency on an external repository, e.g. by having a local, dedicated global registry along with its local database, typically a PostgreSQL one. The image would be much bigger but also much more transportable. Time permitting, we’ll implement this alternative.
Another improvement area would be to add a clean shutdown of the tomcat server, for example using signals, as presented in the article How to stop Documentum processes in a docker container, and more (part I), and maybe add some restarting and/or monitoring facility. Also, now that the Pandora’s box of deploying custom extensions in tomcat alongside DA has been opened, anything is possible in that container.

This article Documentum Administrator in a Container appeared first on Blog dbi services.

Documentum – Large documents in xPlore indexing


Documentum uses xPlore for its full-text indexing/search processes. If you aren’t very familiar with how xPlore works, you might wonder how it is possible to index large documents, or you might be confused about some documents not being indexed (and therefore not searchable). In this blog, I will try to explain how xPlore can be configured to index these big documents without causing too much trouble because, by default, these documents are just not indexed, which might be an issue. Documents tend to be bigger and bigger and therefore the default thresholds for xPlore indexing might be a little bit outdated…

In this blog, I will go through all the thresholds that can be configured on the different components and I will try to explain a little bit what it’s all about. Before starting, I believe a very short (and definitely not exhaustive) introduction to the xPlore indexing process is required. As soon as you install an IndexAgent, it triggers the creation of several things on the associated repository, including the registration of new events in the ‘dmi_registry‘. When working with documents, these events (‘dm_save‘, ‘dm_saveasnew‘, …) will generate new entries in the ‘dmi_queue_item‘. The IndexAgent will then access the ‘dmi_queue_item‘ and retrieve the documents that need indexing (add/update/remove from index). From there, a CPS is called to process the document (language identification, text extraction, tokenization, lemmatization, stemming, …). My point here is that there are two main sides to the indexing process: the IndexAgent and then the CPS. This is also true for the thresholds: you will need to configure them properly on both sides.

 

I. IndexAgent

 
On the IndexAgent side, there isn’t much configuration possible strictly related to the size of documents since there is only one parameter, but it’s arguably the most important one since it’s the first barrier that will block your indexing if not configured properly.

In the file indexagent.xml (found under $JBOSS_HOME/server/DctmServer_Indexagent/deployments/IndexAgent.war/WEB-INF/classes), in the exporter section, you can find the parameter ‘contentSizeLimit‘. This parameter controls the maximum size of a document that can be sent to indexing. This is the real size of the document (‘content_size‘/’full_content_size‘); it is not the size of its text once extracted. The reason for that is simple: this limit is on the IndexAgent side and the text hasn’t been extracted yet, so the IndexAgent does not know how big the extracted text will be. If the size of the document exceeds the value defined for ‘contentSizeLimit‘, then the IndexAgent will not even try to process it, it will just reject it and, in this case, you will see a message that the document exceeded the limit in both the IndexAgent logs and the ‘dmi_queue_item‘ object. Other documents of the same batch aren’t impacted: the parameter ‘contentSizeLimit‘ applies to each and every document. The default value for this parameter is 20 000 000 bytes (19.07 MB).
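A quick way to check the value currently in place is simply to grep the IndexAgent’s configuration file (a sketch; the path is the one given above and the IndexAgent directory name may differ in your installation):

grep -n "contentSizeLimit" $JBOSS_HOME/server/DctmServer_Indexagent/deployments/IndexAgent.war/WEB-INF/classes/indexagent.xml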

If you are going to change this value, then you might need some other updates. You can tweak some other parameters if you are seeing issues while indexing large documents, all of them inside this same indexagent.xml file. For example, you might want to look at the ‘content_clean_interval‘ (in milliseconds) which controls when the export of the document (the dftxml document) will be removed from the staging area of the IndexAgent (location of ‘local_content_area‘). If the value is too small, then the CPS might try to retrieve a file to process it for indexing while the IndexAgent has already removed it. The default value for this parameter is 1 200 000 (20 minutes).

 

II. CPS

 
On the CPS side, you can look at several other size-related parameters. You can find these parameters (and many others) in two main locations. The first is global to the Federation: indexserverconfig.xml (found under $XPLORE_HOME/config by default, but this can be changed, e.g. to a shared location for a multi-node FT). The second one is a CPS-specific configuration file: PrimaryDsearch_local_configuration.xml for a PrimaryDsearch or <CPS_Name>_configuration.xml for a CPS Only (found under $XPLORE_HOME/dsearch/cps/cps_daemon/).

The first parameter to look for is ‘max_text_threshold‘. This parameter controls the maximum text size of a document. This is the size of its text after extraction; it is not the real size of the document. If the text size of the document exceeds the value defined for ‘max_text_threshold‘, then the CPS will act according to the value defined for ‘cut_off_text‘. With ‘cut_off_text‘ set to true, the documents that exceed ‘max_text_threshold‘ will have only the first part of their text, up to ‘max_text_threshold‘, indexed; the CPS stops once it has reached the limit. In this case the CPS log will contain something like ‘doc**** is partially processed’ and the dftxml of this document will contain the mention ‘partialIndexed‘. This means that the CPS stopped at the defined limit and therefore the index might be missing some content. With ‘cut_off_text‘ set to false (default value), the documents that exceed ‘max_text_threshold‘ will be rejected and therefore not full-text indexed at all (only metadata is indexed). Other documents of the same batch aren’t impacted: the parameter ‘max_text_threshold‘ applies to each and every document. The default value for this parameter is 10 485 760 bytes (10 MB) and the maximum value possible is 2 147 483 648 (2 GB).

The second parameter to look for is ‘max_data_per_process‘. This parameter controls the maximum text size that a CPS batch should handle. The CPS indexes documents/items in batches (‘CPS-requests-batch-size‘). By default, a CPS will process up to 5 documents per batch but, if I’m not mistaken, it can be less if there aren’t enough documents to process. If the total text size to be processed by the CPS for the complete batch is above ‘max_data_per_process‘, then the CPS will reject the full batch and will therefore not full-text index the content of any of its documents. This is going to be an issue if you increase the previous parameters but miss/forget this one. Indeed, you might end up with very small documents not indexed because they were in a batch containing some big documents. To be sure that this parameter doesn’t block any batch, you can set it to ‘CPS-requests-batch-size‘*’max_text_threshold‘, as computed in the sketch below. The default value for this parameter is 31 457 280 bytes (30 MB) and the maximum value possible is 2 147 483 648 (2 GB).
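Using the sample values from the summary table below (5 documents per batch and a 40 MB ‘max_text_threshold‘), the sizing works out as follows:

# sizing max_data_per_process so that a full batch of large documents still fits;
batch_size=5                        # CPS-requests-batch-size
max_text_threshold=41943040         # 40 MB
echo $(( batch_size * max_text_threshold ))    # 209715200 bytes, i.e. 200 MB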

As for the IndexAgent, if you are going to change these values, then you might need some other updates. There are a few timeout values like ‘request_time_out‘ (default 600 seconds), ‘text_extraction_time_out‘ (between 60 and 300, default 300 seconds) or ‘linguistic_processing_time_out‘ (between 60 and 360, default 360 seconds) that will probably be exceeded if you are processing large documents, so you might need to tweak these values too.

 

III. Summary

 

Parameter              Limit on                      Short Description                              Default Value                 Sample Value
contentSizeLimit       IndexAgent (indexagent.xml)   Maximum size of the document                   20 000 000 bytes (19.07 MB)   104 857 600 bytes (100 MB)
max_text_threshold     CPS (*_configuration.xml)     Maximum text size of the document's content    10 485 760 bytes (10 MB)      41 943 040 bytes (40 MB)
max_data_per_process   CPS (*_configuration.xml)     Maximum text size of the CPS batch             31 457 280 bytes (30 MB)      5*41 943 040 bytes = 209 715 200 (200 MB)

 
In summary, the first factor to consider is ‘contentSizeLimit‘ on the IndexAgent side. All documents with a size (document size) bigger than ‘contentSizeLimit‘ won’t be submitted to full-text indexing, they will be skipped. The second factor is then either ‘max_text_threshold‘ or ‘max_data_per_process‘ or both, depending on the values you assign them. They both rely on the text size after extraction and they can both cause a document (or the whole batch) to be rejected from indexing.

Increasing the size thresholds is a somewhat complex exercise that needs careful thinking and alignment of numerous satellite parameters so that they can all work together without disrupting the performance or stability of the xPlore processes. These satellite parameters can be timeouts, cleanup, batch size, request size or even JVM size.

 

This article Documentum – Large documents in xPlore indexing appeared first on Blog dbi services.

SQL Server 2019 Accelerated Database Recovery – Instantaneous rollback and aggressive log truncation


In my previous article about Accelerated Database Recovery (ADR), I wrote mostly about the new Persistent Version Store (PVS), how important it is in the new SQL database engine recovery process and the potential impact it may have on the application workload. This time let’s talk a little bit more about the ADR benefits we may get with instantaneous rollback and aggressive log truncation. These two capabilities address some DBA pain points, especially when rollback or crash recovery kicks in with open long running transactions.

 

First, let’s set the context by running the following long running transaction without ADR enabled:

BEGIN TRAN;
UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1;
GO

UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1;
GO
ROLLBACK TRAN;

 

The above query generates 10GB of log records (roughly 90% of the total transaction log space size) as shown below:

SELECT 
	DB_NAME(database_id) AS database_name,
	total_log_size_in_bytes / 1024 / 1024 / 1024 AS total_GB,
	used_log_space_in_bytes / 1024 / 1024 / 1024 AS used_GB,
	used_log_space_in_percent
FROM sys.dm_db_log_space_usage

 

Before cancelling my previous query to trigger a rollback operation, let’s run the following concurrent update:

BEGIN TRAN;

DECLARE @begin_date DATETIME = GETDATE();

UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1
Where TransactionID = 1;

DECLARE @end_date DATETIME = GETDATE();

SELECT DATEDIFF(SECOND, @begin_date, @end_date);

 

As expected, the second query is blocked during the rollback process of the first one because they compete on the same resource:

SELECT 
	spid,
	blocked,
	lastwaittype,
	waitresource,
	cmd,
	program_name
FROM sys.sysprocesses
WHERE spid IN (64, 52)

 

In my case, the second query was blocked for 135s. Depending on your scenario, it could be less or more. I have experienced this annoying issue myself at some customer shops and I’m pretty sure many SQL Server DBAs have as well.
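Before the second run, ADR has to be switched on at the database level; a minimal sketch (the database name is a placeholder and the optional clause that places the PVS on a dedicated filegroup is omitted here):

-- enable Accelerated Database Recovery on the test database (SQL Server 2019);
ALTER DATABASE [MyTestDB] SET ACCELERATED_DATABASE_RECOVERY = ON;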

Let’s now perform the same test after enabling ADR. Executing the query below (used in my first test) gave interesting results.

BEGIN TRAN;
UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1;
GO

UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1;
GO

 

First, rolling back the transaction was pretty much instantaneous and the concurrent query executed faster, without being blocked by the ROLLBACK process. This is where the logical revert comes into play. As stated in the Microsoft documentation, when a rollback is triggered all locks are released immediately. Unlike the usual recovery process, ADR uses the additional PVS to cancel operations of identified aborted transactions by restoring the latest committed version of the concerned rows. The sys.dm_tran_aborted_transactions DMV provides a picture of aborted transactions:

SELECT *
FROM sys.dm_tran_aborted_transactions;
GO

 

For the sake of curiosity, I tried to dig further into the transaction log file to compare rollback operations between the usual recovery process and the ADR-based recovery process. I used a simpler scenario with a simple dbo.test_adr table with one id column, which consists in inserting 2 rows and updating them afterwards. To get log record data, the sys.fn_dblog function is your friend.

CREATE TABLE dbo.test_adr (
	id INT
);

CHECKPOINT;

BEGIN TRAN;

INSERT INTO dbo.test_adr ( id ) VALUES (1), (2);

UPDATE dbo.test_adr SET id = id + 1;

ROLLBACK TRAN;

SELECT 
	[Current LSN]
	,Operation
	,Context
	,[Transaction ID]
	,[Lock Information]
	,[Description]
       ,[AllocUnitName]
FROM sys.fn_dblog(NULL, NULL);

 

Without ADR, we retrieve the usual log records for rollback operations, including compensation records and the transaction’s end-of-rollback mark. In such a case, keep in mind that the locks are released only at the end of the rollback operation (LOP_ABORT_XACT).

With ADR, the story is a little bit different:

Let’s be clear: this is only speculation on my part and I just tried to correlate information from the Microsoft documentation. So, don’t take my word for it. When a transaction is rolled back, it is marked as aborted and tracked by the logical revert operation. The good news is that locks are released immediately afterwards. My guess is that the LOP_FORGET_XACT record corresponds to the moment when the transaction is marked as aborted, and from this moment on no blocking issues related to the ROLLBACK can occur. At the same time the logical revert is an asynchronous process that comes into play by providing instantaneous transaction rollback and undo for all versioned operations by using the PVS.

 

Second, coming back to the first test scenario …

BEGIN TRAN;
UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1;
GO

UPDATE dbo.bigTransactionHistory
SET Quantity = Quantity + 1;
GO

 

… I noticed the transaction log file space was used differently and even less compared to my previous test without ADR enabled. I performed the same test several times and I got results in the same order of magnitude.

I got some clues by adding some perfmon counters:

  • SQL Server databases:PVS in-row diff generated/sec
  • SQL Server databases:PVS off-row records generated/sec
  • SQL Server databases:Percent Log Used
  • SQL Server Buffer Manager:Checkpoints/sec

My two update operations use different storage strategies for storing row versions. Indeed, in the first shot, row versions fit in the data page whereas in the second shot SQL Server must go through the off-row storage to store additional versions. In addition, we are also seeing an interesting behavior of the ADR sLog component with the aggressive log truncation at the moment of the different checkpoint operations. Due to the changes in logging, only certain operations require log space, and because the sLog is processed and truncated on every checkpoint operation, aggressive log truncation becomes possible.

In my case, it kept the log space used by my long-running transaction under control even while the transaction remained open.
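
Beyond the perfmon counters, the PVS footprint can also be tracked with a dedicated DMV; a small sketch I used on the side (not part of the original capture):

SELECT 
	DB_NAME(database_id) AS database_name,
	persistent_version_store_size_kb / 1024 AS pvs_size_MB,
	current_aborted_transaction_count
FROM sys.dm_tran_persistent_version_store_stats;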

In this blog post we’ve seen how ADR may address database unavailability pain points through instantaneous rollback and aggressive log truncation. Good to know that we may benefit from such features in SQL Server 2019!

See you!

 

 

This article SQL Server 2019 Accelerated Database Recovery – Instantaneous rollback and aggressive log truncation first appeared on the dbi services blog.

Oracle 19c


Oracle 19c was released quite a while ago and some customers already run it in production. However, as it is the long-term support release, I thought I would blog about some interesting information and features around 19c to encourage people to migrate to it.

Download Oracle 19c:

https://www.oracle.com/technetwork/database/enterprise-edition/downloads
or
https://edelivery.oracle.com (search e.g. for “Database Enterprise Edition”)

Docker-Images:
https://github.com/oracle/docker-images/tree/master/OracleDatabase

Oracle provides different offerings for 19c:

On-premises:
– Oracle Database Standard Edition 2 (SE2)
– Oracle Database Enterprise Edition (EE)
– Oracle Database Enterprise Edition on Engineered Systems (EE-ES)
– Oracle Database Personal Edition (PE)

Cloud:
– Oracle Database Cloud Service Standard Edition (DBCS SE)
– Oracle Database Cloud Service Enterprise Edition (DBCS EE)
– Oracle Database Cloud Service Enterprise Edition -High Performance (DBCS EE-HP)
– Oracle Database Cloud Service Enterprise Edition -Extreme Performance (DBCS EE-EP)
– Oracle Database Exadata Cloud Service (ExaCS)

REMARK: When this Blog was released the Autonomous DB offerings provided by Oracle did not run on 19c yet (they actually ran on 18c).

Unfortunately some promising 19c new features are only available on Exadata. If that’s the case (like for Automatic Indexing) then you can still test the feature on EE after setting:


SQL> alter system set "_exadata_feature_on"=TRUE scope=spfile;

and a DB-Restart.

REMARK: DO THAT ON YOUR OWN TEST SYSTEMS ONLY AND USE INTERNAL ORACLE PARAMETERS ONLY WHEN ORACLE SUPPORT RECOMMENDS DOING SO.

Anyway, there are lots of new features and I wanted to share some of the interesting ones with you and provide some examples.

REMARK: You may check https://www.oracle.com/a/tech/docs/database19c-wp.pdf as well

1. Automatic Indexing (only available on EE-ES and ExaCS)

Oracle continually evaluates the executing SQL and the underlying tables to determine which indexes to automatically create and which ones to potentially remove.

Documentation:

You can use the AUTO_INDEX_MODE configuration setting to enable or disable automatic indexing in a database.

The following statement enables automatic indexing in a database and creates any new auto indexes as visible indexes, so that they can be used in SQL statements:


EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE','IMPLEMENT');

The following statement enables automatic indexing in a database, but creates any new auto indexes as invisible indexes, so that they cannot be used in SQL statements:


EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE','REPORT ONLY');

The following statement disables automatic indexing in a database, so that no new auto indexes are created, and the existing auto indexes are disabled:


EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_MODE','OFF');

Show a report of automatic indexing activity:


set serveroutput on size unlimited lines 200 pages 200
declare
report clob := null;
begin
report := DBMS_AUTO_INDEX.REPORT_LAST_ACTIVITY();
dbms_output.put_line(report);
end;
/

In a test I ran some statements repeatedly on a table T1 (which contains 32 times the data of all_objects). The table has no index:


SQL> select * from t1 where object_id=:b1;
SQL> select * from t1 where data_object_id=:b2;

After some time indexes were created automatically:


SQL> select table_name, index_name, auto from ind;
 
TABLE_NAME INDEX_NAME AUT
-------------------------------- -------------------------------- ---
T1 SYS_AI_5mzwj826444wv YES
T1 SYS_AI_gs3pbvztmyaqx YES
 
2 rows selected.
 
SQL> select dbms_metadata.get_ddl('INDEX','SYS_AI_5mzwj826444wv') from dual;
 
DBMS_METADATA.GET_DDL('INDEX','SYS_AI_5MZWJ826444WV')
------------------------------------------------------------------------------------
CREATE INDEX "CBLEILE"."SYS_AI_5mzwj826444wv" ON "CBLEILE"."T1" ("OBJECT_ID") AUTO
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "USERS"

2. Real-Time Statistics (only available on EE-ES and ExaCS)

The database automatically gathers real-time statistics during conventional DML operations. You can see in the Note-section of dbms_xplan.display_cursor when stats used to optimize a Query were gathered during DML:


SQL> select * from table(dbms_xplan.display_cursor);
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------
SQL_ID 7cd3thpuf7jxm, child number 0
-------------------------------------
 
select * from t2 where object_id=:y
 
Plan hash value: 1513984157
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 24048 (100)| |
|* 1 | TABLE ACCESS FULL| T2 | 254 | 31242 | 24048 (1)| 00:00:01 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
1 - filter("OBJECT_ID"=:Y)
 
Note
-----
- dynamic statistics used: statistics for conventional DML

3. Quarantine problematic SQL (only available on EE-ES and ExaCS)

Runaway SQL statements terminated by Resource Manager due to excessive consumption of processor and I/O resources can now be automatically quarantined. I.e. instead of letting the SQL run until it reaches a resource plan limit, the SQL is not executed at all.

E.g. create a resource plan which limits SQL-exec-time for User CBLEILE to 16 seconds:


begin
-- Create a pending area
dbms_resource_manager.create_pending_area();
...
dbms_resource_manager.create_plan_directive(
plan => 'LIMIT_RESOURCE',
group_or_subplan => 'TEST_RUNAWAY_GROUP',
comment => 'Terminate SQL statements when they exceed the' ||'execution time of 16 seconds',
switch_group => 'CANCEL_SQL',
switch_time => 16,
switch_estimate => false);
...
-- Set the initial consumer group of the 'CBLEILE' user to 'TEST_RUNAWAY_GROUP'
dbms_resource_manager.set_initial_consumer_group('CBLEILE','TEST_RUNAWAY_GROUP');
end;
/

A SQL statement with SQL_ID 12jc0zpmb85tm executed by CBLEILE runs into the 16-second limit:


SQL> select count(*) X
2 from kill_cpu
3 connect by n > prior n
4 start with n = 1
5 ;
from kill_cpu
*
ERROR at line 2:
ORA-00040: active time limit exceeded - call aborted
 
Elapsed: 00:00:19.85

So I quarantine the SQL now:


set serveroutput on size unlimited
DECLARE
quarantine_config VARCHAR2(80);
BEGIN
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(
SQL_ID => '12jc0zpmb85tm');
dbms_output.put_line(quarantine_config);
END;
/
 
SQL_QUARANTINE_1d93x3d6vumvs
 
PL/SQL procedure successfully completed.
 
SQL> select NAME,ELAPSED_TIME,ENABLED from dba_sql_quarantine;
 
NAME ELAPSED_TIME ENA
---------------------------------------- -------------------------------- ---
SQL_QUARANTINE_1d93x3d6vumvs ALWAYS YES

In another CBLEILE session:


SQL> select count(*) X
2 from kill_cpu
3 connect by n > prior n
4 start with n = 1
5 ;
from kill_cpu
*
 
ERROR at line 2:
ORA-56955: quarantined plan used
Elapsed: 00:00:00.00
 
SQL> !oerr ora 56955
56955, 00000, "quarantined plan used"
// *Cause: A quarantined plan was used for this statement.
// *Action: Increase the Oracle Database Resource Manager limits or use a new plan.

–> The SQL does not run for 16 seconds, but is stopped immediately (it is under quarantine). You can define the Plan-Hash-Value for which a SQL should be in quarantine and define quarantine thresholds, e.g. 20 seconds for the elapsed time. As long as the resource plan limit is below those 20 seconds, the SQL stays under quarantine. If the resource plan is defined with an execution time limit above 20 seconds, the SQL is executed.
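
As a hedged sketch (reusing the quarantine name from the example above), such an elapsed time threshold could be set with DBMS_SQLQ.ALTER_QUARANTINE:

BEGIN
DBMS_SQLQ.ALTER_QUARANTINE(
QUARANTINE_NAME => 'SQL_QUARANTINE_1d93x3d6vumvs',
PARAMETER_NAME => 'ELAPSED_TIME',
PARAMETER_VALUE => '20');
END;
/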

4. Active Standby DML Redirect (only available with Active Data Guard)

On Active Data Guard you may allow moderate write activity. These writes are then transparently redirected to the primary database and written there first (to ensure consistency) and then the changes are shipped back to the standby. This approach allows applications to use the standby for moderate write workloads.
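
As a hedged sketch (please check the documentation of your exact version before relying on it), DML redirection on the standby is controlled by the ADG_REDIRECT_DML setting:

-- allow DML redirection for all sessions on the Active Data Guard standby
SQL> alter system set adg_redirect_dml=true scope=both;
 
-- or enable it only for the current session
SQL> alter session enable adg_redirect_dml;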

5. Hybrid Partitioned Tables

Create partitioned tables where some partitions are inside and some partitions are outside the database (on filesystem, on a Cloud-Filesystem-service or on a Hadoop Distributed File System (HDFS)). This allows e.g. “cold” partitions to remain accessible, but on cheap storage.

Here is an example with 3 partitions external (data of 2016-2018) and 1 partition in the DB (data of 2019):


!mkdir -p /u01/my_data/sales_data1
!mkdir -p /u01/my_data/sales_data2
!mkdir -p /u01/my_data/sales_data3
!echo "1,1,01-01-2016,1,1,1000,2000" > /u01/my_data/sales_data1/sales2016_data.txt
!echo "2,2,01-01-2017,2,2,2000,4000" > /u01/my_data/sales_data2/sales2017_data.txt
!echo "3,3,01-01-2018,3,3,3000,6000" > /u01/my_data/sales_data3/sales2018_data.txt
 
connect / as sysdba
alter session set container=pdb1;
 
CREATE DIRECTORY sales_data1 AS '/u01/my_data/sales_data1';
GRANT READ,WRITE ON DIRECTORY sales_data1 TO cbleile;
 
CREATE DIRECTORY sales_data2 AS '/u01/my_data/sales_data2';
GRANT READ,WRITE ON DIRECTORY sales_data2 TO cbleile;
 
CREATE DIRECTORY sales_data3 AS '/u01/my_data/sales_data3';
GRANT READ,WRITE ON DIRECTORY sales_data3 TO cbleile;
 
connect cbleile/difficult_password@pdb1
 
CREATE TABLE hybrid_partition_table
( prod_id NUMBER NOT NULL,
cust_id NUMBER NOT NULL,
time_id DATE NOT NULL,
channel_id NUMBER NOT NULL,
promo_id NUMBER NOT NULL,
quantity_sold NUMBER(10,2) NOT NULL,
amount_sold NUMBER(10,2) NOT NULL
)
EXTERNAL PARTITION ATTRIBUTES (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY sales_data1
ACCESS PARAMETERS(
FIELDS TERMINATED BY ','
(prod_id,cust_id,time_id DATE 'dd-mm-yyyy',channel_id,promo_id,quantity_sold,amount_sold)
)
REJECT LIMIT UNLIMITED
)
PARTITION BY RANGE (time_id)
(
PARTITION sales_2016 VALUES LESS THAN (TO_DATE('01-01-2017','dd-mm-yyyy')) EXTERNAL
LOCATION ('sales2016_data.txt'),
PARTITION sales_2017 VALUES LESS THAN (TO_DATE('01-01-2018','dd-mm-yyyy')) EXTERNAL
DEFAULT DIRECTORY sales_data2 LOCATION ('sales2017_data.txt'),
PARTITION sales_2018 VALUES LESS THAN (TO_DATE('01-01-2019','dd-mm-yyyy')) EXTERNAL
DEFAULT DIRECTORY sales_data3 LOCATION ('sales2018_data.txt'),
PARTITION sales_2019 VALUES LESS THAN (TO_DATE('01-01-2020','dd-mm-yyyy'))
);
 
insert into hybrid_partition_table values (4,4,to_date('01-01-2019','dd-mm-yyyy'),4,4,4000,8000);
 
commit;
 
SQL> select * from hybrid_partition_table where time_id in (to_date('01-01-2017','dd-mm-yyyy'),to_date('01-01-2019','dd-mm-yyyy'));
 
PROD_ID CUST_ID TIME_ID CHANNEL_ID PROMO_ID QUANTITY_SOLD AMOUNT_SOLD
---------- ---------- --------- ---------- ---------- ------------- -----------
2 2 01-JAN-17 2 2 2000 4000
4 4 01-JAN-19 4 4 4000 8000
 
2 rows selected.
 
SQL> select * from table(dbms_xplan.display_cursor);
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
SQL_ID c5s33u5kanzb5, child number 0
-------------------------------------
select * from hybrid_partition_table where time_id in
(to_date('01-01-2017','dd-mm-yyyy'),to_date('01-01-2019','dd-mm-yyyy'))
 
Plan hash value: 2612538111
 
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 83 (100)| | | |
| 1 | PARTITION RANGE INLIST | | 246 | 21402 | 83 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS HYBRID PART FULL| HYBRID_PARTITION_TABLE | 246 | 21402 | 83 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 3 | TABLE ACCESS FULL | HYBRID_PARTITION_TABLE | | | | |KEY(I) |KEY(I) |
-------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
2 - filter((SYS_OP_XTNN("HYBRID_PARTITION_TABLE"."AMOUNT_SOLD","HYBRID_PARTITION_TABLE"."QUANTITY_SOLD","HYBRID_PARTITION_TABLE"."PROMO_ID","HYBRID_PARTITION_TABLE"."CHANNEL_ID","HYBRID_PARTITION_TABLE"."TIME_ID","HYBRID_PARTITION_TABLE"."CUST_ID","HYBRID_PARTITION_TABLE"."PROD_ID") AND INTERNAL_FUNCTION("TIME_ID")))
 
3 - filter((SYS_OP_XTNN("HYBRID_PARTITION_TABLE"."AMOUNT_SOLD","HYBRID_PARTITION_TABLE"."QUANTITY_SOLD","HYBRID_PARTITION_TABLE"."PROMO_ID","HYBRID_PARTITION_TABLE"."CHANNEL_ID","HYBRID_PARTITION_TABLE"."TIME_ID","HYBRID_PARTITION_TABLE"."CUST_ID","HYBRID_PARTITION_TABLE"."PROD_ID") AND INTERNAL_FUNCTION("TIME_ID")))

6. Memoptimized Rowstore

Enables fast data inserts into Oracle Database 19c from applications, such as Internet of Things (IoT), which ingest small, high volume transactions with a minimal amount of transactional overhead.
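
A minimal sketch of the write path (fast ingest) is shown below; the table name and columns are made up for this example and the exact clauses should be verified against the documentation:

-- table enabled for fast ingest
CREATE TABLE iot_measures (
sensor_id NUMBER,
measure_time DATE,
val NUMBER
) SEGMENT CREATION IMMEDIATE
MEMOPTIMIZE FOR WRITE;
 
-- inserts using the MEMOPTIMIZE_WRITE hint are buffered and written asynchronously
INSERT /*+ MEMOPTIMIZE_WRITE */ INTO iot_measures VALUES (1, SYSDATE, 42);
COMMIT;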

7. 3 PDBs per Multitenant-DB without having to pay for the Multitenant option

Beginning with 19c you are allowed to create 3 PDBs in a Container-DB without requiring the Multitenant option license from Oracle. As the single- or multi-tenant DB becomes a must in Oracle 20, it is a good idea to start using the container-DB architecture with 19c already.
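
As a reminder, creating an additional PDB is a single SQL statement. A minimal sketch (PDB name, admin user and file destination are assumptions):

SQL> create pluggable database pdb2 admin user pdbadmin identified by difficult_password create_file_dest='/u02/oradata';
SQL> alter pluggable database pdb2 open;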

Please let me know your experience with Oracle 19c.

This article Oracle 19c first appeared on the dbi services blog.

Creating archived redolog-files in group dba instead of oinstall


Since Oracle 11g, files created by the database belong by default to the Linux group oinstall. Changing the default group after creating the central inventory is difficult. In this blog I want to show how locally created archived redo logs can be created in group dba instead of oinstall.

One of my customers had the requirement to provide read access on archived redo logs to an application for logmining. To ensure the application can access the archived redo logs, we created an additional local archive log destination:


LOG_ARCHIVE_DEST_9 = 'LOCATION=/logmining/ARCHDEST/NCEE19C valid_for=(online_logfile,primary_role)'

and provided NFS-access to that directory for the application. To ensure that the application can access the archived redo, the remote user was part of a remote dba-group, which had the same group-id (GID) as the dba-group on the DB-server. Everything worked fine until we migrated to a new server and changed the setup to use oinstall as the default group for Oracle. The application could no longer read the files, because they were created with group oinstall:


oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C] ls -ltr
-rw-r-----. 1 oracle oinstall 24403456 Oct 9 21:21 1_32_1017039068.dbf
-rw-r-----. 1 oracle oinstall 64000 Oct 9 21:25 1_33_1017039068.dbf
-rw-r-----. 1 oracle oinstall 29625856 Oct 9 21:27 1_34_1017039068.dbf
oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C]

One possibility to work around this would have been to use the id-mapper on Linux, but there’s something better:

With the group sticky bit (the setgid bit) on Linux we can make sure that all files created in a directory inherit the group of the directory.

I.e.


oracle@19c:/logmining/ARCHDEST/ [NCEE19C] ls -l
total 0
drwxr-xr-x. 1 oracle dba 114 Oct 9 21:27 NCEE19C
oracle@19c:/logmining/ARCHDEST/ [NCEE19C] chmod g+s NCEE19C
oracle@19c:/logmining/ARCHDEST/ [NCEE19C] ls -l
drwxr-sr-x. 1 oracle dba 114 Oct 9 21:27 NCEE19C

Whenever an archived redo is created in that directory it will be in the dba-group:


SQL> alter system switch logfile;
 
System altered.
 
SQL> exit
 
oracle@19c:/logmining/ARCHDEST/ [NCEE19C] cd NCEE19C/
oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C] ls -ltr
-rw-r-----. 1 oracle oinstall 24403456 Oct 9 21:21 1_32_1017039068.dbf
-rw-r-----. 1 oracle oinstall 64000 Oct 9 21:25 1_33_1017039068.dbf
-rw-r-----. 1 oracle oinstall 29625856 Oct 9 21:27 1_34_1017039068.dbf
-rw-r-----. 1 oracle dba 193024 Oct 9 21:50 1_35_1017039068.dbf
oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C]

To make all files part of the dba-group use chgrp and use the newest archivelog as a reference:


oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C] chgrp --reference 1_35_1017039068.dbf 1_3[2-4]*.dbf
oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C] ls -ltr
-rw-r-----. 1 oracle dba 24403456 Oct 9 21:21 1_32_1017039068.dbf
-rw-r-----. 1 oracle dba 64000 Oct 9 21:25 1_33_1017039068.dbf
-rw-r-----. 1 oracle dba 29625856 Oct 9 21:27 1_34_1017039068.dbf
-rw-r-----. 1 oracle dba 193024 Oct 9 21:50 1_35_1017039068.dbf
oracle@19c:/logmining/ARCHDEST/NCEE19C/ [NCEE19C]

Hope this helps somebody.

This article Creating archived redolog-files in group dba instead of oinstall first appeared on the dbi services blog.


Patroni Operations – switchover and failover


In this post we will have a look at switchover and failover of a Patroni cluster, as well as at the maintenance mode Patroni offers, which gives you the opportunity to prevent an automatic failover.

Switchover

There are two possibilities to run a switchover, either in scheduled mode or immediately.

1. Scheduled Switchover

postgres@patroni1:/home/postgres/ [PG1] patronictl switchover
Master [patroni1]:
Candidate ['patroni2', 'patroni3'] []: patroni2
When should the switchover take place (e.g. 2019-10-08T11:31 )  [now]: 2019-10-08T10:32
Current cluster topology
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
Are you sure you want to schedule switchover of cluster PG1 at 2019-10-08T10:32:00+02:00, demoting current master patroni1? [y/N]: y
2019-10-08 10:31:14.89236 Switchover scheduled
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
 Switchover scheduled at: 2019-10-08T10:32:00+02:00
                    from: patroni1
                      to: patroni2
postgres@patroni1:/home/postgres/ [PG1]

That’s it. At the given time, the switchover will take place. All you see in the logfile is an entry like this

Oct  8 10:32:00 patroni1 patroni: 2019-10-08 10:32:00,006 INFO: Manual scheduled failover at 2019-10-08T10:32:00+02:00
Oct  8 10:32:00 patroni1 patroni: 2019-10-08 10:32:00,016 INFO: Got response from patroni2 http://192.168.22.112:8008/patroni: {"database_system_identifier": "6745341072751547355", "postmaster_start_time": "2019-10-08 10:09:40.217 CEST", "timeline": 2, "cluster_unlocked": false, "patroni": {"scope": "PG1", "version": "1.6.0"}, "state": "running", "role": "replica", "xlog": {"received_location": 83886560, "replayed_timestamp": null, "paused": false, "replayed_location": 83886560}, "server_version": 110005}
Oct  8 10:32:00 patroni1 patroni: 2019-10-08 10:32:00,113 INFO: manual failover: demoting myself
Oct  8 10:32:01 patroni1 patroni: 2019-10-08 10:32:01,256 INFO: Leader key released
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03,271 INFO: Local timeline=2 lsn=0/6000028
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03,279 INFO: master_timeline=3
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03,281 INFO: master: history=1#0110/5000098#011no recovery target specified
Oct  8 10:32:03 patroni1 patroni: 2#0110/6000098#011no recovery target specified
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03,282 INFO: closed patroni connection to the postgresql cluster
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03,312 INFO: postmaster pid=11537
Oct  8 10:32:03 patroni1 patroni: 192.168.22.111:5432 - no response
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03.325 CEST - 1 - 11537 -  - @ - 0LOG:  listening on IPv4 address "192.168.22.111", port 5432
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03.328 CEST - 2 - 11537 -  - @ - 0LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03.339 CEST - 3 - 11537 -  - @ - 0LOG:  redirecting log output to logging collector process
Oct  8 10:32:03 patroni1 patroni: 2019-10-08 10:32:03.339 CEST - 4 - 11537 -  - @ - 0HINT:  Future log output will appear in directory "pg_log".
Oct  8 10:32:04 patroni1 patroni: 192.168.22.111:5432 - accepting connections
Oct  8 10:32:04 patroni1 patroni: 192.168.22.111:5432 - accepting connections
Oct  8 10:32:04 patroni1 patroni: 2019-10-08 10:32:04,895 INFO: Lock owner: patroni2; I am patroni1
Oct  8 10:32:04 patroni1 patroni: 2019-10-08 10:32:04,895 INFO: does not have lock
Oct  8 10:32:04 patroni1 patroni: 2019-10-08 10:32:04,896 INFO: establishing a new patroni connection to the postgres cluster

2. Immediate switchover

Here you start the same way as for a scheduled switchover, but the switchover will take place immediately.

postgres@patroni1:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 |        | running |  1 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 | Leader | running |  1 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  1 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
postgres@patroni1:/home/postgres/ [PG1] patronictl switchover
Master [patroni2]:
Candidate ['patroni1', 'patroni3'] []: patroni1
When should the switchover take place (e.g. 2019-10-08T11:09 )  [now]:
Current cluster topology
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 |        | running |  1 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 | Leader | running |  1 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  1 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
Are you sure you want to switchover cluster PG1, demoting current master patroni2? [y/N]: y
2019-10-08 10:09:38.88046 Successfully switched over to "patroni1"
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  1 |           |
|   PG1   | patroni2 | 192.168.22.112 |        | stopped |    |   unknown |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  1 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
postgres@patroni1:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
postgres@patroni1:/home/postgres/ [PG1]

Failover

In contrast to the switchover, the failover is executed automatically when the Leader node becomes unavailable for an unplanned reason.
You can only adjust some Patroni parameters to affect the failover behavior.

The parameters for failover are also managed using patronictl. But they are not in the parameter section, they are above it. So let’s say we adjust one parameter and add another one to stop using the default.

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl edit-config
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl edit-config
---
+++
@@ -1,5 +1,6 @@
-loop_wait: 7
+loop_wait: 10
 maximum_lag_on_failover: 1048576
+master_start_timeout: 240
 postgresql:
   parameters:
     archive_command: /bin/true

Apply these changes? [y/N]: y
Configuration changed

Afterwards there is no need to restart the database. Changes take effect immediately. So the failover can be configured according to every special need. A list of all possible parameter changes can be found here.

Maintenance mode

In some cases it is necessary to do maintenance on a single node and you do not want Patroni to manage the cluster. This can be needed e.g. for release updates.
When Patroni is paused, it won’t change the state of PostgreSQL. For example it will not try to start the cluster when it is stopped.

So let’s do an example. We will pause the cluster, stop the replica, upgrade from 9.6.8 to 9.6.13 and afterwards start the replica again. If we did not pause the cluster, the database would be started automatically by Patroni.

postgres@patroni1:/home/postgres/ [PG1] patronictl pause
Success: cluster management is paused
You have new mail in /var/spool/mail/opendb
postgres@patroni1:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
 Maintenance mode: on

On the replica

postgres@patroni2:/home/postgres/ [PG1] pg_ctl stop -D /u02/pgdata/96/PG1/ -m fast

postgres@patroni2:/home/postgres/ [PG1] export PATH=/u01/app/postgres/product/PG96/db_13/bin:$PATH
postgres@patroni2:/home/postgres/ [PG1] export PORT=5432
postgres@patroni2:/home/postgres/ [PG1] which pg_ctl
/u01/app/opendb/product/PG96/db_13/bin/pg_ctl

postgres@patroni2:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/96/PG1 start
server starting
postgres@patroni2:/home/postgres/ [PG1] 2019-10-08 17:25:28.358 CEST - 1 - 23192 -  - @ - 0LOG:  redirecting log output to logging collector process
2019-10-08 17:25:28.358 CEST - 2 - 23192 -  - @ - 0HINT:  Future log output will appear in directory "pg_log".

postgres@patroni2:/home/postgres/ [PG1] psql -c "select version()" postgres
                                                           version
------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.13 dbi services build on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit
(1 row)

postgres@patroni2:/home/postgres/ [PG1] patronictl resume
Success: cluster management is resumed

postgres@patroni2:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  5 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  5 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  5 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+

You can do this on the other nodes as well.

Conclusion

Switchover is quite easy and in all the tests I did so far it was really reliable. The same goes for failover; here you just have to think about adjusting the parameters to your needs. Waiting 5 minutes for a failover is not the best solution in every case.

This article Patroni Operations – switchover and failover first appeared on the dbi services blog.

Creating a customized PostgreSQL container using buildah


Quite some time ago I blogged about how you could build your customized PostgreSQL container by using a Dockerfile and Docker build. In the meantime Red Hat replaced Docker in OpenShift and SUSE replaced Docker as well in CaaS. As a consequence other ways of building containers are needed, and one of them is buildah. You can use buildah to build from a Dockerfile as well, but in this post we will use a simple bash script to create the container.

We start by defining four variables that define PGDATA, the PostgreSQL major version, the full version string and the minor version, which will be used to create our standard installation location (these will also go into the entrypoint, see below):

#!/bin/bash
_PGDATA="/u02/pgdata"
_PGMAJOR=12
_PGVERSION=12.0
_PGMINOR="db_0"

As mentioned in the beginning, buildah will be used to create the container. For running the container we need something else, and that is podman. You can run the container buildah creates with plain Docker as well, if you want, as it is OCI compliant, but as Red Hat does not ship Docker anymore we will use the recommended way of doing it by using podman. So the natural next step in the script is to install buildah and podman:

dnf install -y buildah podman

Buildah can create containers from scratch, which means you start with a container that contains nothing except some meta data:

newcontainer=$(buildah from scratch)

Once we have the new scratch container it gets mounted so dnf can be used to install the packages we need into the container without actually using dnf in the container:

scratchmnt=$(buildah mount $newcontainer)
ls -la $scratchmnt
dnf install --installroot $scratchmnt --releasever 8 bash coreutils gcc openldap-devel platform-python-devel readline-devel bison flex perl-ExtUtils-Embed zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel bzip2 wget policycoreutils-python-utils make tar --setopt install_weak_deps=false --setopt=tsflags=nodocs --setopt=override_install_langs=en_US.utf8 -y

Using “buildah config” the container can be configured. Here it is about the author, environment variables, the default user and the entrypoint that will be used once the container is started:

buildah config --created-by "dbi services"  $newcontainer
buildah config --author "dbi services" --label name=dbiservices $newcontainer
buildah run $newcontainer groupadd postgres
buildah run $newcontainer useradd -g postgres -m postgres
buildah config --user postgres $newcontainer
buildah config --workingdir /home/postgres $newcontainer
buildah config --env PGDATABASE="" $newcontainer
buildah config --env PGUSERNAME="" $newcontainer
buildah config --env PGPASSWORD="" $newcontainer
buildah config --env PGDATA=${_PGDATA} $newcontainer
buildah config --env PGMAJOR=${_PGMAJOR} $newcontainer
buildah config --env PGMINOR=${_PGMINOR} $newcontainer
buildah config --env PGVERSION=${_PGVERSION} $newcontainer
buildah config --entrypoint /usr/bin/entrypoint.sh $newcontainer
buildah copy $newcontainer ./entrypoint.sh /usr/bin/entrypoint.sh
buildah run $newcontainer chmod +x /usr/bin/entrypoint.sh

What follows is basically installing PostgreSQL from source code:

buildah run --user root $newcontainer mkdir -p /u01 /u02
buildah run --user root $newcontainer chown postgres:postgres /u01 /u02
buildah run --user postgres $newcontainer wget https://ftp.postgresql.org/pub/source/v${_PGVERSION}/postgresql-${_PGVERSION}.tar.bz2 -O /home/postgres/postgresql-${_PGVERSION}.tar.bz2
buildah run --user postgres $newcontainer /usr/bin/bunzip2 /home/postgres/postgresql-${_PGVERSION}.tar.bz2
buildah run --user postgres $newcontainer /usr/bin/tar -xvf /home/postgres/postgresql-${_PGVERSION}.tar -C /home/postgres/
buildah run --user postgres $newcontainer /home/postgres/postgresql-12.0/configure --prefix=/u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR} --exec-prefix=/u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR} --bindir=/u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin --libdir=/u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/lib --includedir=/u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/include 
buildah run --user postgres $newcontainer /usr/bin/make -C /home/postgres all
buildah run --user postgres $newcontainer /usr/bin/make -C /home/postgres install
buildah run --user postgres $newcontainer /usr/bin/make -C /home/postgres/contrib install

Containers should be as small as possible so let’s do some cleanup:

buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/postgresql-${_PGVERSION}.tar
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/config
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/config.log
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/config.status
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/contrib
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/GNUmakefile
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/postgresql-12.0
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/src
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/doc
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/Makefile
buildah run --user postgres $newcontainer /usr/bin/rm -rf /home/postgres/.wget-hsts

When you want to run PostgreSQL inside a container you do not need any of the following binaries, so these can be cleaned as well:

buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/vacuumlo
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/vacuumdb
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/reindexdb
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pgbench
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_waldump
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_test_timing
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_test_fsync
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_standby
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_restore
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_recvlogical
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_receivewal
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_isready
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_dumpall
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_dump
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_checksums
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_basebackup
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/pg_archivecleanup
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/oid2name
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/dropuser
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/dropdb
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/createuser
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/createdb
buildah run --user postgres $newcontainer /usr/bin/rm -rf /u01/app/postgres/product/${_PGMAJOR}/${_PGMINOR}/bin/clusterdb

Last, but not least remove all the packages we do not require anymore and get rid of the dnf cache:

dnf remove --installroot $scratchmnt --releasever 8 gcc openldap-devel readline-devel bison flex perl-ExtUtils-Embed zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel bzip2 wget policycoreutils-python-utils make tar -y
dnf clean all -y --installroot $scratchmnt --releasever 8
# Clean up yum cache
if [ -d "${scratchmnt}" ]; then
rm -rf "${scratchmnt}"/var/cache/yum
fi
buildah unmount $newcontainer

Ready to publish the container:

buildah commit $newcontainer dbi-postgres

When you put all those steps into a script and run that you should see the just created container:

[root@doag2019 ~]$ buildah containers
CONTAINER ID  BUILDER  IMAGE ID     IMAGE NAME                       CONTAINER NAME
47946e4b4fc8     *                  scratch                          working-container
[root@doag2019 ~]$

… but now we also have a new image that can be started:

IMAGE NAME                                               IMAGE TAG            IMAGE ID             CREATED AT             SIZE
localhost/dbi-postgres                                   latest               dfcd3e8d5273         Oct 13, 2019 13:22     461 MB

Once we start that the entrypoint will be executed:

#!/bin/bash
# this are the environment variables which need to be set
PGDATA=${PGDATA}/${PGMAJOR}
PGHOME="/u01/app/postgres/product/${PGMAJOR}/${PGMINOR}"
PGAUTOCONF=${PGDATA}/postgresql.auto.conf
PGHBACONF=${PGDATA}/pg_hba.conf
PGDATABASENAME=${PGDATABASE}
PGUSERNAME=${PGUSERNAME}
PGPASSWD=${PGPASSWORD}
# create the database and the user
_pg_create_database_and_user()
{
${PGHOME}/bin/psql -c "create user ${PGUSERNAME} with login password '${PGPASSWD}'" postgres
${PGHOME}/bin/psql -c "create database ${PGDATABASENAME} with owner = ${PGUSERNAME}" postgres
${PGHOME}/bin/psql -c "create extension pg_stat_statements" postgres
}
# start the PostgreSQL instance
_pg_prestart()
{
${PGHOME}/bin/pg_ctl -D ${PGDATA} -w start
}
# Start PostgreSQL without detaching 
_pg_start()
{
exec ${PGHOME}/bin/postgres "-D" "${PGDATA}"
}
# stop the PostgreSQL instance
_pg_stop()
{
${PGHOME}/bin/pg_ctl -D ${PGDATA} stop -m fast
}
# initdb a new cluster
_pg_initdb()
{
${PGHOME}/bin/initdb -D ${PGDATA} --data-checksums
}
# adjust the postgresql parameters
_pg_adjust_config() {
if [ -z $PGMEMORY ]; then MEM="128MB"
else                      MEM=$PGMEMORY; fi
# PostgreSQL parameters
echo "shared_buffers='$MEM'" >> ${PGAUTOCONF}
echo "effective_cache_size='128MB'" >> ${PGAUTOCONF}
echo "listen_addresses = '*'" >> ${PGAUTOCONF}
echo "logging_collector = 'off'" >> ${PGAUTOCONF}
echo "log_truncate_on_rotation = 'on'" >> ${PGAUTOCONF}
echo "log_line_prefix = '%m - %l - %p - %h - %u@%d '" >> ${PGAUTOCONF}
echo "log_directory = 'pg_log'" >> ${PGAUTOCONF}
echo "log_min_messages = 'WARNING'" >> ${PGAUTOCONF}
echo "log_autovacuum_min_duration = '60s'" >> ${PGAUTOCONF}
echo "log_min_error_statement = 'NOTICE'" >> ${PGAUTOCONF}
echo "log_min_duration_statement = '30s'" >> ${PGAUTOCONF}
echo "log_checkpoints = 'on'" >> ${PGAUTOCONF}
echo "log_statement = 'none'" >> ${PGAUTOCONF}
echo "log_lock_waits = 'on'" >> ${PGAUTOCONF}
echo "log_temp_files = '0'" >> ${PGAUTOCONF}
echo "log_timezone = 'Europe/Zurich'" >> ${PGAUTOCONF}
echo "log_connections=on" >> ${PGAUTOCONF}
echo "log_disconnections=on" >> ${PGAUTOCONF}
echo "log_duration=off" >> ${PGAUTOCONF}
echo "client_min_messages = 'WARNING'" >> ${PGAUTOCONF}
echo "wal_level = 'replica'" >> ${PGAUTOCONF}
echo "wal_compression=on" >> ${PGAUTOCONF}
echo "max_replication_slots=20" >> ${PGAUTOCONF}
echo "max_wal_senders=20" >> ${PGAUTOCONF}
echo "hot_standby_feedback = 'on'" >> ${PGAUTOCONF}
echo "cluster_name = '${PGDATABASENAME}'" >> ${PGAUTOCONF}
echo "max_replication_slots = '10'" >> ${PGAUTOCONF}
echo "work_mem=8MB" >> ${PGAUTOCONF}
echo "maintenance_work_mem=64MB" >> ${PGAUTOCONF}
echo "shared_preload_libraries='pg_stat_statements'" >> ${PGAUTOCONF}
echo "autovacuum_max_workers=6" >> ${PGAUTOCONF}
echo "autovacuum_vacuum_scale_factor=0.1" >> ${PGAUTOCONF}
echo "autovacuum_vacuum_threshold=50" >> ${PGAUTOCONF}
echo "archive_mode=on" >> ${PGAUTOCONF}
echo "archive_command='/bin/true'" >> ${PGAUTOCONF}
# Authentication settings in pg_hba.conf
echo "host    all             all             0.0.0.0/0            md5"  >> ${PGHBACONF}
}
# initialize and start a new cluster
_pg_init_and_start()
{
# initialize a new cluster
_pg_initdb
# set params and access permissions
_pg_adjust_config
# start the new cluster
_pg_prestart
# set username and password
_pg_create_database_and_user
# restart database with correct pid
_pg_stop
_pg_start
}
# check if $PGDATA exists
if [ -e ${PGDATA} ]; then
# when $PGDATA exists we need to check if there are files
# because when there are files we do not want to initdb
if [ -e "${DEBUG}" ]; then
/bin/bash
elif [ -e "${PGDATA}/base" ]; then
# when there is the base directory this
# probably is a valid PostgreSQL cluster
# so we just start it
_pg_start
else
# when there is no base directory then we
# should be able to initialize a new cluster
# and then start it
_pg_init_and_start
fi
else
# create PGDATA
mkdir -p ${PGDATA}
# initialze and start the new cluster
_pg_init_and_start
fi

Starting that up using podman:

[root@doag2019 ~]$ podman run -e PGDATABASE=test -e PGUSERNAME=test -e PGPASSWORD=test --detach -p 5432:5432 localhost/dbi-postgres
f933df8216de83b3c2243860ace02f231748a05273c16d3ddb0308231004552f
CONTAINER ID  IMAGE                          COMMAND               CREATED             STATUS             PORTS                   NAMES
f933df8216de  localhost/dbi-postgres:latest  /bin/sh -c /usr/b...  About a minute ago  Up 59 seconds ago  0.0.0.0:5432->5432/tcp  nervous_leavitt

… and connecting from the host system:

[root@doag2019 ~]$ psql -p 5432 -h localhost -U test test
Password for user test:
psql (10.6, server 12.0)
WARNING: psql major version 10, server major version 12.
Some psql features might not work.
Type "help" for help.
test=> select version();
version
--------------------------------------------------------------------------------------------------------
PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
(1 row)
test=> \q

Once you have that scripted and ready it is a very convenient way of creating images. What I like most is that you can make changes afterwards without starting from scratch:

[root@doag2019 ~]$ podman inspect localhost/dbi-postgres
[
{
"Id": "dfcd3e8d5273116e5678806dfe7bbf3ca2276549db73e62f27b967673df8084c",
"Digest": "sha256:b2d65e569becafbe64e8bcb6d49b065188411f596c04dea2cf335f677e2db68e",
"RepoTags": [
"localhost/dbi-postgres:latest"
],
"RepoDigests": [
"localhost/dbi-postgres@sha256:b2d65e569becafbe64e8bcb6d49b065188411f596c04dea2cf335f677e2db68e"
],
"Parent": "",
"Comment": "",
"Created": "2019-10-13T11:22:15.096957689Z",
"Config": {
"User": "postgres",
"Env": [
"PGDATABASE=",
"PGUSERNAME=",
"PGPASSWORD=",
"PGDATA=/u02/pgdata",
"PGMAJOR=12",
"PGMINOR=db_0",
"PGVERSION=12.0"
],
"Entrypoint": [
"/bin/sh",
"-c",
"/usr/bin/entrypoint.sh"
],
"WorkingDir": "/home/postgres",
"Labels": {
"name": "dbiservices"
}
},
"Version": "",
"Author": "dbiservices",
"Architecture": "amd64",
"Os": "linux",
"Size": 460805033,
"VirtualSize": 460805033,
"GraphDriver": {
"Name": "overlay",
"Data": {
"MergedDir": "/var/lib/containers/storage/overlay/89de699f19781bb61eec12cf61a097a9daa31d7725fc3c078c76d0d6291cb074/merged",
"UpperDir": "/var/lib/containers/storage/overlay/89de699f19781bb61eec12cf61a097a9daa31d7725fc3c078c76d0d6291cb074/diff",
"WorkDir": "/var/lib/containers/storage/overlay/89de699f19781bb61eec12cf61a097a9daa31d7725fc3c078c76d0d6291cb074/work"
}
},
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:89de699f19781bb61eec12cf61a097a9daa31d7725fc3c078c76d0d6291cb074"
]
},
"Labels": {
"name": "dbiservices"
},
"Annotations": {},
"ManifestType": "application/vnd.oci.image.manifest.v1+json",
"User": "postgres",
"History": [
{
"created": "2019-10-13T11:22:15.096957689Z",
"created_by": "dbi services",
"author": "dbiservices"
}
]
}
]

Assume we want to add a new environment variable. All we need to do is this:

[root@doag2019 ~]$ buildah containers
CONTAINER ID  BUILDER  IMAGE ID     IMAGE NAME                       CONTAINER NAME
47946e4b4fc8     *                  scratch                          working-container
[root@doag2019 ~]$ buildah config --env XXXXXXX="xxxxxxxx" 47946e4b4fc8
[root@doag2019 ~]$ buildah commit 47946e4b4fc8 dbi-postgres
Getting image source signatures
Skipping fetch of repeat blob sha256:9b74f2770486cdb56539b4a112b95ad7e10aced3a2213d33878f8fd736b5c684
Copying config sha256:e2db86571bfa2e64e6079077fe023e38a07544ccda529ba1c3bfc04984f2ac74
606 B / 606 B [============================================================] 0s
Writing manifest to image destination
Storing signatures
e2db86571bfa2e64e6079077fe023e38a07544ccda529ba1c3bfc04984f2ac74

The new image with the new variable is ready:

[root@doag2019 ~]$ buildah images
IMAGE NAME                                               IMAGE TAG            IMAGE ID             CREATED AT             SIZE
                                                                              dfcd3e8d5273         Oct 13, 2019 13:22     461 MB
localhost/dbi-postgres                                   latest               e2db86571bfa         Oct 13, 2019 13:52     461 MB
[root@doag2019 ~]$ buildah inspect localhost/dbi-postgres
...
"Env": [
"PGDATABASE=",
"PGUSERNAME=",
"PGPASSWORD=",
"PGDATA=/u02/pgdata",
"PGMAJOR=12",
"PGMINOR=db_0",
"PGVERSION=12.0",
"XXXXXXX=xxxxxxxx"
],
...

Nice. If you are happy with the image the scratch container can be deleted.
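
Removing the working container afterwards is a single command (using the container ID from the listing above; within the build script you could pass $newcontainer instead):

[root@doag2019 ~]$ buildah rm 47946e4b4fc8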

This article Creating a customized PostgreSQL container using buildah first appeared on the dbi services blog.

Where can you find core developers asking people what is missing in PostgreSQL? pgconf.eu.2019


One of the major advantages of PostgreSQL conferences when you compare them to other conferences is that you can listen to talks where the actual developers are presenting their work. You have questions about a feature, you want to know more about this or that: just catch one of the developers and start to talk. It is as easy as that. Today it was even more impressive: Alvaro came to our booth and asked what we miss in PostgreSQL and what should be implemented to make our life easier. Where can you find that? Core developers directly going to people and asking how the product can be improved? That is pgconf.eu 2019 and we are already halfway through, one and a half days to go, but I can already say that it is again an amazing conference.

As usual the conference started with hard work and management had to build the booth 🙂

This year pgconfeu is bigger than ever and 562 people made it to the conference. You really can feel the spirit around PostgreSQL in every other corner. People talking here, discussions there. PostgreSQL companies talking to each other and having fun. Fun with Oleg at the reception desk:

Julia attended for the first time this year and will be giving her talk about Patroni automation with Ansible tomorrow. And there it was again: one of the creators of Patroni is here as well and Julia’s questions were answered directly.

Yesterday there was the social event organized by the conference and that was fun as well:

A PostgreSQL conference, as it is a community event, can only happen because of all the sponsors and volunteers. Volunteering for example means registering to be a room host. This is what I did this year and it is another great way of getting in touch with other PostgreSQL people. Here is Gabriele, the man behind barman:

That’s it for now, interesting sessions happening which I do not want to miss, like this one from Hervé:

Btw: You can find a lot more impressions on Twitter.

This article Where can you find core developers asking people what is missing in PostgreSQL? pgconf.eu.2019 first appeared on the dbi services blog.

pgconf.eu – Welcome to the community


On Tuesday I started my journey to Milan to attend my first pgconf.eu, which was also my first big conference. I was really excited about what was coming up for me. How would it be to become a visible part of the community? How would it be to give my first presentation in front of so many people?

The conference started with the welcome and opening session. It took place in a huge room, big enough to give all of the participants a seat. It is really amazing how big this community is, and it is still growing. So many people from all over the world (Japan, USA, Chile, Canada…) attending this conference.

And suddenly I realized: this is the room where I have to give my session. Some really strange feelings came up. This is my first presentation at a conference, this is the main stage, there is space for so many people! And I really hoped they would make it smaller for me. But there was something else: anticipation.

But first I want to give you some impressions from my time at the pgconf. Amazing to talk to one of the main developers of Patroni. I was really nervous when I just went to him and said: “Hi, may I ask you a question?” Of course he didn’t say NO. All the other ladies and gentlemen I met (the list is quite long) are just as nice and really open minded (is this because they all work with an open source database?). And of course a special thanks to Pavel Golub for the great picture. Find it in Daniel’s blog.
Besides meeting all these great people, I enjoyed some really informative and cool sessions.


Although I still hoped they were going to make the room smaller for my presentation, of course they didn’t. So I had only one chance:

And I did it, and afterwards I file it under “good experience”. A huge room is not so different from a small one.

As I am back home now, I want to say: Thanks pgconf.eu and dbi services for giving me this opportunity and thanks to the community for this warm welcome.

This article pgconf.eu – Welcome to the community first appeared on the dbi services blog.

Solr Sharding – Concepts & Methods


A few weeks ago, I published a series of blogs on Alfresco Clustering, including Solr Sharding. At that time, I planned to first explain what Solr Sharding really is and what the different concepts and methods around it are. Unfortunately, I didn’t get the time to write this blog, so I had to post the one related to Solr even before explaining the basics. Today, I’m here to right my wrong! Obviously, this blog has a focus on Alfresco-related Solr Sharding since that’s what I do.

I. Solr Sharding – Concepts

Sharding in general is the partitioning of a set of data in a specific way. There are several possibilities to do that, depending on the technology you are working with. In the scope of Solr, Sharding is therefore the split of the Solr index into several smaller indices. You might be interested in Solr Sharding because it improves the following points:

  • Fault Tolerance: with a single index, if you lose it, then… you lost it. If the index is split into several indices, then even if you are losing one part, you will still have all others that will continue working
  • High Availability: it provides more granularity than the single index. You might want for example to have a few small indices without HA and then have some others with HA because you configured them to contain some really important nodes of your repository
  • Automatic Failover: Alfresco knows automatically (with Dynamic Registration) which Shards are up-to-date and which ones are lagging behind, so it will automatically choose the best Shards to handle the search queries so that you get the best results possible. In combination with the Fault Tolerance above, this gives the best possible HA solution with the fewest possible resources
  • Performance improvements: better performance in indexing since you will have several Shards indexing the same repository, so each Shard has less work to do for example (depends on the Sharding Method). Better performance in searches since the search query will be processed by all Shards in parallel on smaller parts of the index instead of being one single query on the full index

Based on benchmarks, Alfresco considers that a Solr Shard can contain up to 50 to 80 million nodes. This is obviously not a hard limit, you can have a single Shard with 200 million nodes, but it is more of a best practice if you want to keep a fast and reliable index. With older versions of Alfresco (before version 5.1), you couldn’t create Shards because Alfresco didn’t support it. So, at that time, there was no other solution than having a single big index.

There is one additional thing that must be understood here: the 50 000 000 nodes soft limit is 50M nodes in the index, not in the repository. Let’s assume that you are using the DB_ID_RANGE method (see below for the explanation) with an assumed split of 65% live nodes, 20% archived nodes and 15% others (not indexed: renditions, other stores, …). So, if we are talking about the “workspace://SpacesStore” nodes (live ones) and we want to fill a Shard with 50M nodes, we will have to use a DB_ID_RANGE of 100*50M/65 ≈ 77M. Basically, the Shard should be more or less “full” once there are 77M IDs in the Database. For the “archive://SpacesStore” nodes (archived ones), it would be 100*50M/20 = 250M.
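
To make this sizing arithmetic easier to reuse, here is a minimal sketch (plain Python, not an Alfresco tool) that computes the DB_ID_RANGE width needed to fill a Shard with a target number of indexed nodes. The 65%/20% store shares are the assumptions from the example above and should be replaced by the actual split observed in your repository.

def range_width(target_nodes_in_shard, store_share_percent):
    """Number of DB IDs a DB_ID_RANGE must cover so that roughly
    target_nodes_in_shard nodes of the targeted store land in the Shard."""
    return round(100 * target_nodes_in_shard / store_share_percent)

# workspace://SpacesStore (live nodes, assumed ~65% of all DB IDs)
print(range_width(50_000_000, 65))   # ~77M IDs -> e.g. shard.range=0-77000000

# archive://SpacesStore (archived nodes, assumed ~20% of all DB IDs)
print(range_width(50_000_000, 20))   # 250M IDs -> e.g. shard.range=0-250000000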

Alright so what are the main concepts in the Solr Sharding? There are several terms that need to be understood:

  • Node: It’s a Solr Server (a Solr installed using the Alfresco Search Services). Below, I will use “Solr Server” instead because I already use “nodes” (lowercase) for the Alfresco Documents so using “Node” (uppercase) for the Solr Server, it might be a little bit confusing…
  • Cluster: It’s a set of Solr Servers all working together to index the same repository
  • Shard: A part of the index. In other words, it’s a representation (virtual concept) of the index composed of a certain set of nodes (Alfresco Documents)
  • Shard Instance: It’s one Instance of a specific Shard. A Shard is like a virtual concept while the Instance is the implementation of that virtual concept for that piece of the index. Several Shard Instances of the same Shard will therefore contain the same set of Alfresco nodes
  • Shard Group: It’s a collection of Shards (several indices) that forms a complete index. Shards are part of the same index (same Shard Group) if they:
    • Track the same store (E.g.: workspace://SpacesStore)
    • Use the same template (E.g.: rerank)
    • Have the same number of Shards max (“numShards“)
    • Use the same configuration (Sharding methods, Solr settings, …)

Shard is often (wrongly) used in place of Shard Instance, which might lead to some confusion… When you are reading “Shard”, sometimes it means the Shard itself (the virtual concept) and sometimes it means all its Shard Instances. This is how these concepts can look:
Solr Sharding - Concepts

II. Solr Sharding – Methods

Alfresco supports several methods for the Solr Sharding and they all have different attributes and different ways of working:

  • MOD_ACL_ID (ACL v1): Alfresco nodes and ACLs are grouped by their ACL ID and stored together in the same Shard. Different ACL IDs will be assigned randomly to different Shards (depending on the number of Shards you defined). Each Alfresco node using a specific ACL ID will be stored in the Shard already containing this ACL ID. This simplifies the search requests from Solr since ACLs and nodes are together, so permission checking is simple. If you have a lot of documents using the same ACL, then the distribution will not be even between Shards. Parameters:
    • shard.method=MOD_ACL_ID
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • ACL_ID (ACL v2): This is the same as the MOD_ACL_ID, the only difference is that it changes the method used to assign the ACLs to the Shards so they are more evenly distributed. But if you have a lot of documents using the same ACL, then you still have the same issue. Parameters:
    • shard.method=ACL_ID
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • DB_ID: This is the default Sharding Method in Solr 6 which will evenly distribute the nodes in the different Shards based on their DB ID (“alf_node.id“). The ACLs are replicated on each of the Shards so that Solr is able to perform the permission checking. If you have a lot of ACLs, then this will obviously make the Shards a little bit bigger, but this is usually insignificant. Parameters:
    • shard.method=DB_ID
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • DB_ID_RANGE: Pretty much the same thing as the DB_ID but instead of looking at each DB ID one by one, it will just dispatch the DB IDs from the same range into the same Shard. The ranges are predefined at the Shard Instance creation and you cannot change them later, but this is also the only Sharding Method that allows you to add new Shards dynamically (auto-scaling) without the need to perform a full reindex. The lower value of the range is included and the upper value is excluded (for Math lovers: [begin-end[ ;)). Since DB IDs are incremental (they increase over time), performing a search query with a date filter might end up as simple as checking inside a single Shard. Parameters:
    • shard.method=DB_ID_RANGE
    • shard.range=<begin-end>
    • shard.instance=<shard.instance>
  • DATE: Months are assigned to Shards sequentially, and nodes are then indexed into the Shard assigned to the month of their date property (a minimal sketch of this assignment follows the list below). Therefore, if you have 2 Shards, each one will contain 6 months (Shard 1 = Months 1,3,5,7,9,11 // Shard 2 = Months 2,4,6,8,10,12). It is possible to assign consecutive months to the same Shard using the “shard.date.grouping” parameter, which defines how many months should be grouped together (a semester for example). If there is no date on a node, the fallback method is to use DB_ID instead. Parameters:
    • shard.method=DATE
    • shard.key=exif:dateTimeOriginal
    • shard.date.grouping=<1-12>
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • PROPERTY: A property is specified as the base for the Shard assignment. The first time a node is indexed with a new value for this property, the node will be assigned randomly to a Shard. Each node coming in with the same value for this property will be assigned to the same Shard. Valid properties are either d:text (single line text), d:date (date only) or d:datetime (date+time). It is possible to use only a part of the property’s value using “shard.regex” (to keep only the first 4 digits of a date for example: shard.regex=^\d{4}). If this property doesn’t exist on a node or if the regex doesn’t match (if any is specified), the fallback method is to use DB_ID instead. Parameters:
    • shard.method=PROPERTY
    • shard.key=cm:creator
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
  • EXPLICIT_ID: Pretty much similar to the PROPERTY method but instead of using the value of a “random” property, this method requires a specific property (d:text) to explicitly define on which Shard the node should be indexed. Therefore, this will require an update of the Data Model to have one property dedicated to the assignment of a node to a Shard. In case you are using several types of documents, then you will potentially want to do that for all of them. If this property doesn’t exist on a node or if an invalid Shard number is given, the fallback method is to use DB_ID instead. Parameters:
    • shard.method=EXPLICIT_ID
    • shard.key=<property> (E.g.: cm:targetShardInstance)
    • shard.instance=<shard.instance>
    • shard.count=<shard.count>
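
As a small illustration of the DATE method described above, here is how the month-to-Shard assignment behaves. This is just an interpretation of the documented behaviour in plain Python (the Shard count and grouping values are examples), not Alfresco’s actual code.

# Illustration of the DATE method's month-to-Shard assignment (0-based Shard
# numbering, as used by shard.instance). Interpretation of the behaviour above.
def date_shard(month, shard_count, grouping=1):
    """Return the Shard Instance number a given month (1-12) is assigned to."""
    return ((month - 1) // grouping) % shard_count

# 2 Shards, no grouping: Shard 0 gets months 1,3,5,... and Shard 1 gets 2,4,6,...
print([date_shard(m, 2) for m in range(1, 13)])              # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# 2 Shards, shard.date.grouping=6 (semesters): months 1-6 -> Shard 0, 7-12 -> Shard 1
print([date_shard(m, 2, grouping=6) for m in range(1, 13)])  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]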

As you can see above, each Sharding Method has its own set of properties. You can define these properties in:

  • The template’s solrcore.properties file, in which case it will apply to all Shard Instance creations
    • E.g.: $SOLR_HOME/solrhome/templates/rerank/conf/solrcore.properties
  • The URL/Command used to create the Shard Instance, in which case it will only apply to the current Shard Instance creation
    • E.g.: curl -v “http://host:port/solr/admin/cores?action=newCore&…&property.shard.method=DB_ID_RANGE&property.shard.range=0-50000000&property.shard.instance=0”
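
When several DB_ID_RANGE Shard Instances are created this way, the ranges simply follow each other. Below is a minimal sketch (plain Python) that prints creation commands in the same shape as the curl example above; the elided newCore parameters are kept as “...”, and the host, port, range width and Shard count are placeholders/assumptions to adapt to your environment.

# Sketch: lay out contiguous, non-overlapping DB_ID_RANGE ranges and print the
# corresponding Shard Instance creation commands. Placeholders only: "..." stands
# for the other newCore parameters, exactly as in the example above.
BASE_URL = "http://host:port/solr/admin/cores?action=newCore&..."  # placeholder
RANGE_WIDTH = 50_000_000   # DB IDs per Shard (assumption, see the sizing above)
SHARD_COUNT = 4            # number of Shards to create (assumption)

for instance in range(SHARD_COUNT):
    begin = instance * RANGE_WIDTH        # lower bound, included
    end = begin + RANGE_WIDTH             # upper bound, excluded
    print(f'curl -v "{BASE_URL}'
          f'&property.shard.method=DB_ID_RANGE'
          f'&property.shard.range={begin}-{end}'
          f'&property.shard.instance={instance}"')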

Summary of the benefits of each method:
Solr Sharding - Benefits

First supported versions for the Solr Sharding in Alfresco:
Solr Sharding - Availability

Hopefully, this is a good first look into Solr Sharding. In a future blog, I will talk about the creation process and show some examples of what is possible. If you want to read more on the subject, don’t hesitate to take a look at the Alfresco documentation; it doesn’t explain everything, but it is still a very good starting point.

The article Solr Sharding – Concepts & Methods first appeared on the dbi services Blog.
